LearnAnalytics@MS presents: Scalable Data Science with Microsoft R Server + Spark and HDInsight | Reading UK
In this three-day workshop, you’ll gain hands-on experience with Microsoft R and HDInsight Spark for scalable data science and machine learning.
May 03 - 05, 2017  ·  Microsoft Technology Center Thames Valley


In this course, you will learn about the premium offering of the HDInsight platform and how to use Microsoft R Server as an application on top of HDInsight Spark to perform data analysis and machine learning at scale.


Prerequisites:

  • A subscription to Microsoft Azure, which you *must* have enabled prior to class. You will be using Azure throughout the course for all labs, work, and exercises. You can use your MSDN subscription (https://azure.microsoft.com/en-us/pricing/member-offers/msdn-benefits/), your employer may provide Azure resources to you, or you may receive instructions in your class invitation. You should have at least $50 to spend for the course.
  • Understanding of R - ability to write functions, ability to train models, etc.
  • PuTTY, Cygwin, or another SSH client or bash emulator (some Linux experience to go with it would be useful)
  • It’s also a good idea to have some general experience with predictive and classification modeling, and a basic understanding of statistics and machine learning concepts, e.g., cross-validation, ensemble models, model metrics, etc.

Modules Covered:

  • Hadoop as a Service: Provisioning HDInsight Spark Clusters, Managing Storage Accounts and Containers, Applications and Resource Management
  • Taming the Hadoop Zoo: Overview of HDInsight, Hadoop, and Spark
  • Client Tools, IDEs, Ambari and Spark Job Manager
  • Creating Spark Contexts and Importing Data to Spark DataFrames with R APIs
  • The Art of Being Lazy - Creating Functions that are Invariant and Robust to Data Sources and Compute Contexts
  • Modeling with MLlib
  • Modeling with Microsoft R Server (MRS)
  • Deploying Models to AzureML

Location: Microsoft Technology Center Thames Valley, Thames Valley Park, Building Three, Reading RG6 1WG, United Kingdom



Cost of Training: No-shows may be subject to a charge-back.


Travel and Hotel Accommodations: Partners are responsible for arranging and paying for their own flights, lodging, and transportation to the training location. Microsoft employees are responsible for arranging their own flights, lodging, and transportation in accordance with MS policies. To participate fully, all attendees are advised to arrive the day before and plan to depart late in the evening or the day after training ends. Hands-on activities and application typically continue until 5pm each day of training. For more information on this location, please visit https://www.microsoft.com/en-us/mtc/locations/thames-valley.aspx .


Arrival on-site and Training check-in: Be prepared with government-issued identification (or a Microsoft-issued badge) to check in for training with the Microsoft building receptionist. Attendees driving to the on-site training should park in designated Visitor parking spaces and notify the receptionist during check-in. The dress code is business casual.


Meals: Continental breakfast and lunch, as well as refreshments, are provided each day. Dinner is not provided.