Scala By the Bay has ended
Monday, August 17 • 3:20pm - 4:00pm
Why Apache Flink is the 4G of Big Data Analytics Frameworks?


Apache Flink is a community-driven, open source, memory-centric Big Data analytics framework. It provides the only hybrid (real-time streaming + batch) open source distributed data processing engine, and it supports many use cases.

Flink uses a mixture of Scala and Java internally and has very good Scala APIs; some of its libraries (FlinkML and Table) are almost pure Scala.

At its core, Flink is a streaming dataflow execution engine. On top of that engine, it provides APIs for batch processing (DataSet API), real-time streaming (DataStream API), and relational queries (Table API), as well as domain-specific libraries for machine learning (FlinkML) and graph processing (Gelly).

In this talk, you will learn in more detail:

  1. What Apache Flink is, how it fits into the Big Data ecosystem, and why it is the 4G (4th Generation) of Big Data analytics frameworks.
  2. How Apache Flink integrates with Apache Hadoop and other open source tools for data input and output as well as deployment.
  3. Why Apache Flink is an alternative to Apache Hadoop MapReduce, Apache Storm, and Apache Spark, and how it compares with those other Big Data analytics frameworks in benchmarking results.

Speakers

Slim Baltagi

Director, Big Data Engineering Fellow, Capital One
Slim Baltagi is currently a director of Big Data engineering at Capital One in Chicago. He has more than 17 years of IT and business experience and has spent the last four years of his life hadooping and more recently sparking and flinking! He has worked on more than 12 Big Data projects...


Monday August 17, 2015 3:20pm - 4:00pm PDT
Track B
