Abstract:
The efficient processing and association of multi-modal information is an important research field with a wide variety of applications, such as human-computer interaction, knowledge discovery, and document understanding. One promising approach is to develop a common platform that converts different modalities (such as images and text) into the same medium and associates them for efficient processing and understanding. This paper presents a novel methodology based on Local-Global (LG) graphs that automatically converts image context into natural language text sentences and then into speech, serving as an interactive model for locating missing objects in home environments. Simple illustrative examples are provided as a proof of concept.
Keywords:
graph theory; home computing; interactive systems; natural language processing; object recognition; speech synthesis; automatic image-to-text-to-voice conversion; home environment; interactive object location; local-global graph; natural language text sentence; Artificial intelligence; Data mining; Feature extraction; Image converters; Image edge detection; Image processing; Image recognition; Image retrieval; Image segmentation; Natural languages; Converting Images to NL-Text; Graphs; Image Analysis and Representation; Recognizing Objects