Friday, August 14 • 10:20am - 10:50am
Keynote III: Data Science at Scale with Spark

Apache Spark has been blessed as the replacement for MapReduce in Hadoop environments. It also runs in other deployment modes. Spark provides better performance and better user productivity, and it supports a wider range of application scenarios than MapReduce, including event stream processing, ad hoc queries, graph representations and algorithms, and iterative algorithms, such as those commonly used in machine learning.

This talk discusses Spark from a Data Science perspective: its strengths and weaknesses; the Scala, Java, Python, and R APIs it offers for common analytics problems; what's missing; and what's planned. We'll look at support for ad hoc queries over large data sets, stream processing, machine learning algorithms, graph processing, and the user experience.

Dean Wampler

Office of the CTO, Architect for Big Data Products and Services, Typesafe
Dean Wampler, Ph.D. (@deanwampler) leads the Big Data efforts at Typesafe, focusing on Spark, Mesos, Hadoop, Akka, and other tools. He is the author of "Programming Scala, Second Edition" and "Functional Programming for Java Developers", and the co-author of "Programming Hive", all from O'Reilly. Dean contributes to several open source projects, and he co-organizes and speaks at many technology conferences and Chicago-based user groups.

Friday August 14, 2015 10:20am - 10:50am
