Scala By the Bay has ended
Friday, August 14 • 10:20am - 10:50am
Keynote III: Data Science at Scale with Spark

Apache Spark has been blessed as the replacement for MapReduce in Hadoop environments. It also runs in other deployment modes. Spark provides better performance and better user productivity, and it supports a wider range of application scenarios than MapReduce, including event stream processing, ad hoc queries, graph representations and algorithms, and iterative algorithms, such as those commonly used in machine learning.

This talk discusses Spark from a Data Science perspective: its strengths and weaknesses; the Scala, Java, Python, and R APIs it offers for common analytics problems; what's missing; and what's planned. We'll look at support for ad hoc queries over large data sets, stream processing, machine learning algorithms, graph processing, and the user experience.

Dean Wampler

Office of the CTO, Architect for Big Data Products and Services, Typesafe
Dean Wampler, Ph.D. (@deanwampler) leads the Big Data efforts at Typesafe, focusing on Spark, Mesos, Hadoop, Akka, and other tools. He is the author of "Programming Scala, Second Edition" and "Functional Programming for Java Developers", and the co-author of "Programming Hive", all...

Friday August 14, 2015 10:20am - 10:50am PDT