Author_Institution :
Comput. Sci. & Eng. Dept., Univ. of Texas at Arlington (UTA), Arlington, TX, USA
Abstract :
Summary form only given. Information processing has always been the primary goal, whether done manually or with sophisticated devices. Sometimes one gets the feeling that although we are drowning in data, we are not able to extract useful knowledge. What computing and computers have allowed us to do, over several decades, is to handle large volumes of raw data, in real time as needed, to infer meaningful knowledge and facilitate decision making through actionable knowledge. This has come about through various technological advances (in both hardware and software) that have tremendously increased our ability to generate, collect, store, and process very large amounts of data. This is true whether it is data on the web, personal data, or data collected by enterprises. In this talk, we first identify the causes (or drivers) that have helped us generate and accumulate large amounts of raw data. Then we give an overview of the earlier approaches for managing data and obtaining information. Finally, we explore potential current approaches for dealing with very large amounts of data that will allow us to filter, fuse, and reduce it to obtain actionable knowledge. We present database technology, stream and complex event processing (CEP), mining, and information retrieval & ranking as examples of potential approaches that need to be synergistically mixed and matched to achieve the desired outcome. Other aspects, such as parallel processing and cloud computing, will also be discussed briefly in the context of dealing with very large amounts of data.
Keywords :
cloud computing; data mining; file organisation; information retrieval; actionable knowledge; complex event processing; data collection; data generation; data management; data processing; data storage; database technology; file systems; information processing; information ranking; Computer science; Educational institutions; Information technology; Laboratories