Mastering Spark Data Science on Azure

Apache Spark is a fast, in-memory data-processing engine with expressive development APIs that let data scientists run streaming workloads, build sophisticated machine-learning models, and perform the other tasks common to extracting information from large datasets. Apache Spark for Azure HDInsight makes high-performance Spark clusters widely available, and Azure's Data Science Virtual Machine is well suited to learning Spark alongside tools such as Jupyter and Microsoft R Server. In this series, Microsoft data scientist Mark Tabladillo takes a deep dive into Apache Spark and its ecosystem and demonstrates how to use the Spark support in Azure to handle big-data workloads and build machine-learning models.

Course Title	Author	Duration	Topic(s)
Introduction to Spark	Mark Tabladillo	00:58:49	Data Science, Spark, Azure, Big Data
Business Intelligence Tools and Spark	Mark Tabladillo	01:02:01	Data Science, Spark, Azure, Big Data
Data Processing with Spark 2	Mark Tabladillo	00:48:21	Data Science, Spark, Azure, Big Data
Text Analytics with Spark ML	Mark Tabladillo	01:05:57	Data Science, Spark, Azure, Machine Learning, Big Data
Regression with Spark ML	Mark Tabladillo	01:06:39	Data Science, Spark, Azure, Machine Learning, Big Data
Classification with Spark ML	Mark Tabladillo	01:24:13	Data Science, Spark, Azure, Machine Learning, Big Data
Clustering with Spark ML	Mark Tabladillo	01:10:03	Data Science, Spark, Azure, Machine Learning, Big Data
Recommendation with Spark ML	Mark Tabladillo	00:59:49	Data Science, Spark, Azure, Machine Learning, Big Data