Loading…
This event has ended. View the official site or create your own event → Check it out
This event has ended. Create your own
View analytic
Monday, August 17 • 3:20pm - 4:00pm
Title: Scala, FP and Spark - the Perfect Combo for Machine Learning

Sign up or log in to save this to your schedule and see who's attending!

While FP and Scala have already become the mainstays of middleware, web development and big data stacks (Akka, Play, Kafka, Spark), they tend not to have a big presence in the machine learning and NLP communities. For instance, the emerging deep learning toolkits are mostly Python‐based (Pylearn2, Theano, etc.). The same goes for general-purpose machine learning (Python's scikit-learn, countless R libraries). Performance seekers dissatisfied with slow scripting languages write typed Cython code, contorted C++ libraries bound to scripting language wrappers, or resort to random exotic solutions such as Lua. Some even dispense with all abstraction and write incomprehensible CUDA kernels. There has to be a better way. As a machine learning engineer, I want to write strongly typed functional code. Math has no place for side effects, and I don't want to waste time running a simulation for hours, only to find that I made a typo in my "stringly-typed" script. Unbeknownst to most, Scala's machine learning and NLP ecosystem is growing rapidly, from numeric processing (Spire, Breeze) to big data machine learning (MLLib, Mahout) to GPU‐based text parsing (Puck), to general‐purpose probabilistic programming (FACTORIE). In this talk, I'll do a quick overview of Scala's machine learning ecosystem, and show how easy it is to re-use existing components to build a new, scalable algorithm implementation. If you'd like to see how you can write vectorized linear regression running native BLAS code, based on an SGD/Adagrad implementation written from scratch. capable of running at scale on petabytes of data using Spark, this talk is for you.

Speakers
avatar for Marek Kolodziej

Marek Kolodziej

Principal Research Engineer, Nitro
Marek Kolodziej is a Principal Research Engineer at Nitro, Inc. He's been working on a diverse set of machine learning, distributed computing and big data problems for the past 6 years, and statistics and econometrics for the past 11. His current passion is deep learning and GPU computing. Marek got his PhD in Energy and Environmental Economics from Boston University.


Monday August 17, 2015 3:20pm - 4:00pm
Track A

Attendees (21)