Armed with nothing more than a laptop running Apache Spark, you have everything required to prototype the application of Machine Learning to your data-science needs. From programmability in Scala, Java or Python to built-in support for Machine Learning via MLlib, Spark is an exceedingly effective enabler that allows you to rapidly produce results.
Of course, as soon as your prototyping proves successful, you'll want to scale out to embrace the volume, variety and velocity that characterize today's Big Data demands... in production. Because Spark is as comfortable on an isolated laptop as it is in a distributed-computing environment, addressing Big Data requirements in production boils down to effectively and efficiently embracing containers and clusters for Big Data Analytics.
And this is where offerings from Univa shine: in making the transition from prototype to production completely seamless. For some use cases, it makes sense to scale in Spark-based applications within Docker containers via Univa Grid Engine Container Edition or Navops by Univa; in others, Spark is interfaced (as a Mesos-compliant framework) with Univa Universal Resource Broker to permit scaling out on a cluster. In both scenarios, your production Spark applications are scheduled alongside other classes of workload, without a need for dedicated resources.
Overview of Apache Spark as a platform for Deep Learning - from Python-based Jupyter Notebooks to Spark's Machine Learning library MLlib
Overview of prototyping Machine Learning via Apache Spark on a laptop - without and within Docker containers
Introductions to Univa Grid Engine Container Edition and Univa Universal Resource Broker plus Navops by Univa
Overview of production Big Data Analytics platforms for Machine Learning
Docker-containerized Apache Spark and Univa Grid Engine Container Edition
Docker-containerized Apache Spark and Navops by Univa
Apache Spark plus Univa Universal Resource Broker
Introducing support for GPUs without and within Docker containers
Use case example - using Machine Learning to classify data from Twitter without and within Docker containers
Summary and next steps
Speaker: Ian Lumb, System Architect, Univa Corporation.
As an HPC specialist, Ian Lumb has spent about two decades at the global intersection of IT and science. Ian received his B.Sc. from McGill University in Montreal and his M.Sc. from York University in Toronto. Although his undergraduate and graduate studies emphasized geophysics, Ian's current interests include workload orchestration and container optimization for applications ranging from HPC to Big Data Analytics, in clusters and clouds.
Video is available in .mp4 format.