Machine Learning for Big Data Analytics:
Scaling In with Containers while Scaling Out on Clusters

Watch On Demand Anytime

Armed with nothing more than an Apache Spark-toting laptop, you have all the trappings required to prototype the application of Machine Learning to your data-science needs. From programmability in Scala, Java or Python to built-in support for Machine Learning via MLlib, Spark is an exceedingly effective enabler that allows you to rapidly produce results.

Of course, as soon as your prototyping proves successful, you'll want to scale out to embrace the volume, variety and velocity that characterize today's Big Data demands... in production. Because Spark is as comfortable on an isolated laptop as it is in a distributed-computing environment, addressing Big Data requirements in production boils down to effectively and efficiently embracing containers and clusters for Big Data Analytics.

And this is where offerings from Univa shine - i.e., in making the transition from prototype to production completely seamless. For some use cases, it makes sense to scale in Spark-based applications within Docker containers via Univa Grid Engine Container Edition or Navops by Univa; in others, Spark is interfaced (as a Mesos-compliant framework) with Univa Universal Resource Broker to permit scaling out on a cluster. In both scenarios, your production Spark applications are scheduled alongside other classes of workload - without a need for dedicated resources.


• Overview of Apache Spark as a platform for Deep Learning - from Python-based Jupyter Notebooks to Spark's Machine Learning library MLlib
• Overview of prototyping Machine Learning via Apache Spark on a laptop - without and within Docker containers
• Introductions to Univa Grid Engine Container Edition and Univa Universal Resource Broker plus Navops by Univa
• Overview of production Big Data Analytics platforms for Machine Learning
    • Docker-containerized Apache Spark and Univa Grid Engine Container Edition
    • Docker-containerized Apache Spark and Navops by Univa
    • Apache Spark plus Univa Universal Resource Broker
    • Introducing support for GPUs without and within Docker containers
• Use case example - using Machine Learning to classify data from Twitter without and within Docker containers
• Summary and next steps

Ian Lumb, System Architect, Univa Corporation.

As an HPC specialist, Ian Lumb has spent about two decades at the global intersection of IT and science. Ian received his B.Sc. from Montreal's McGill University, and then an M.Sc. from York University in Toronto. Although his undergraduate and graduate studies emphasized geophysics, Ian's current interests include workload orchestration and container optimization for workloads ranging from HPC to Big Data Analytics, in clusters and clouds.

Video Download
Video is available in .mp4 format.

To download a copy of this video, please fill out the form below and hit the Submit button.
We will email you a link to download and view this video in .mp4 format.
