Abstract :
Summary form only given. Large scale data processing analyses and makes sense of large amounts of data. Although the field itself is not new, it is finding many usecases under the theme "Bigdata" where Google itself, IBM Watson, and Google\´s Driverless car are some of success stories. Spanning many fields, Large scale data processing brings together technologies like Distributed Systems, Machine Learning, Statistics, and Internet of Things together. It is a multi-billion-dollar industry including use cases like targeted advertising, fraud detection, product recommendations, and market surveys. With new technologies like Internet of Things (IoT), these use cases are expanding to scenarios like Smart Cities, Smart health, and Smart Agriculture. Some usecases like Urban Planning can be slow, which is done in batch mode, while others like stock markets need results within Milliseconds, which are done in streaming fashion. There are different technologies for each case: MapReduce for batch processing and Complex Event Processing and Stream Processing for real-time usecases. Furthermore, the type of analysis range from basic statistics like mean to complicated prediction models based on machine Learning. In this talk, we will discuss data processing landscape: concepts, usecases, technologies and open questions while drawing examples from real world scenarios.
Keywords :
Big Data; parallel processing; Big Data; Google Driverless car; IBM Watson; Internet of Things; IoT; MapReduce; batch processing; complex event processing; distributed systems; fraud detection; large scale data processing; machine learning; market surveys; product recommendations; smart agriculture; smart cities; smart health; statistics; stock markets; stream processing; targeted advertising; urban planning; usecases;