– Apache Spark and Big Data Ecosystem Overview
– Role of Spark with respect to Hadoop, AWS, EMR, and popular big data technologies
– Analytics and ETL with SparkSQL and DataFrame/Dataset APIs
– Basics of Spark Execution and Memory
– Visualizing Data with Zeppelin (and possibly Tableau, time permitting)
– Intro to Machine Learning with SparkML
– Intro to Spark Streaming
– Spark on YARN: Clustering and Operations within EMR
– Business Cases and Architecture Patterns with Spark
Gaurav “GP” Pal is the Founder of stackArmor and a well-known expert in big data architectures on cloud-based platforms such as AWS, with many years of implementation experience on large data-centric platforms such as USAspending.gov and Recovery.gov.
Adam Breindel is a stackArmor Big Data Consultant focused on consulting and teaching Apache Spark. Adam’s experience includes work with banks on neural-net fraud detection, streaming analytics, cluster management code, and web apps, as well as development at a variety of startup and established companies in the travel, productivity, and entertainment industries. He is excited by the way that Spark and other modern big-data tech remove so many old obstacles to system design and make it possible to explore new categories of interesting, fun, hard problems.
Learn more about stackArmor and our Analytics offerings on our website: https://stackarmor.com/solutions-2/cdo-managed-data-platforms-for-chief-data-officers/