Monday, August 17 • 2:50pm - 3:10pm
Enabling Enterprise Big Data Analytics with Scala at Alpine

The vision of Alpine Data Labs is to make data science so straightforward that it becomes a tool for business users as well as data scientists. To that end, we developed an intuitive visual UI that allows users to interact with Hadoop data and perform advanced analytics. However, architecting a highly scalable and effective platform presents specific challenges, including supporting multiple Hadoop distributions; supporting Pig, SQL, R, Hive, MapReduce, and Spark; and showing visual progress for every analysis. We have leveraged Scala to address each of these issues: we built an agent architecture that uses Akka to scale out to different Hadoop distributions, and we designed an R-Akka server that allows Alpine to scale out R sessions. We use Spray and Akka to expose Alpine's RESTful APIs and have implemented machine learning algorithms in Spark using Scala. We have also enhanced the Spark Yarn module, using Akka messaging as the communication channel.

In this talk, we will specifically focus on the Alpine Spark Integration:
  • Submitting a Spark job from a servlet engine
  • Enhancing the Spark client in Yarn cluster mode to add a Yarn application listener and the ability to stop a Yarn application
  • Yarn resource capacity callback
  • Messaging channel for logging, progress, and error handling via Akka
  • Redirecting the print stream and the Spark job progress listener to the Alpine UI
  • Live streaming of job progress to the Alpine UI via WebSocket

Chester Chen

Director of Engineering, Alpine Data
Chester Chen is the Director of Engineering and a hands-on architect at Alpine Data Labs. He manages the development of the analytics platform and contributes to some of its major components. He has been working with Scala on and off since Scala 2.7. He is the founder and organizer of the SF Big Analytics Meetup, as well as the main co-organizer of the SF Machine Learning Meetup. Before joining Alpine Data Labs, he played many roles as Technical...

Steven Hillion

Co-Founder, Alpine Data Labs
Steven Hillion is the co-founder of Alpine Data Labs, which is dedicated to making advanced analytics scalable, accessible, and operational. Steven has been leading large engineering and analytics projects for fifteen years. Before joining Alpine Data Labs, he founded the analytics group at Greenplum, leading a team of data scientists and also designing and developing new open-source and enterprise analytics software. Before that, he was...

Monday August 17, 2015 2:50pm - 3:10pm
Track B
