Learning Apache Spark 2
File size: 16613k
Resource description:
Learning Apache Spark 2 by Muhammad Asif Abbasi
English | 6 Jun. 2017 | ASIN: B01M7RO7US | 356 Pages | AZW3 | 16.22 MB

Key Features
Exclusive guide that covers how to get up and running with fast data processing using Apache Spark
Explore and exploit various possibilities with Apache Spark using real-world use cases in this book
Want to perform efficient data processing in real time? This book will be your one-stop solution.

Book Description
The Spark juggernaut keeps on rolling and gains more momentum each day. The core challenge is understanding the key capabilities in Spark (Spark SQL, Spark Streaming, Spark ML, SparkR, GraphX, etc.). Having understood the key capabilities, it is important to understand how Spark can be used: installed as a standalone framework or as part of an existing Hadoop installation, and configured with YARN or Mesos.
The next part of the journey after installation is using the key components: APIs, clustering, machine learning APIs, data pipelines, and parallel programming. It is important to understand why each framework component is key, how widely it is being used, its stability, and its pertinent use cases.
Once we understand the individual components, we will take a couple of real-life advanced analytics examples, such as:
Building a recommendation system
Predicting customer churn
The objective of these real-life examples is to give the reader confidence in using Spark for real-world problems.

What you will learn
Get an overview of Big Data analytics and its importance for organizations and data professionals.
Delve into Spark to see how it is different from existing processing platforms.
Understand the intricacies of various file formats, and how to process them with Apache Spark.
Learn how to deploy Spark with YARN, Mesos, or a standalone cluster manager.
Learn the concepts of Spark SQL, SchemaRDD, caching, and Spark UDFs, and work with Hive and Parquet file formats.
Understand the architecture of Spark MLlib while discussing some of the off-the-shelf algorithms that come with Spark.
Introduce yourself to SparkR and walk through the details of data munging, including selecting, aggregating, and grouping data using RStudio.
Walk through the importance of graph computation and the graph processing systems available in the market.
See a real-world example of Spark by building a recommendation engine using collaborative filtering.
Use a telco data set to predict customer churn using regression.

About the Author
Asif Abbasi has worked in the industry for over 15 years, in a variety of roles ranging from engineering solutions to selling solutions and everything in between. Asif is currently working with SAS, a market leader in analytic solutions, as a Principal Business Solutions Manager for the Global Technologies Practice. Based out of London, Asif has vast experience in consulting for major organizations and industries across the globe, and in running proof-of-concepts across various industries including but not limited to Telecommunications, Manufacturing, Retail, Finance, Services, Utilities, and Government. Asif has presented at various conferences and delivered workshops on topics such as Big Data, Hadoop, Teradata, and Analytics using Aster on Teradata and Hadoop. Asif is an Oracle Certified Java EE 5 Enterprise Architect, Teradata Certified Master, PMP, and Hortonworks Hadoop Certified Developer and Administrator. Asif also holds a Master's degree in Computer Science and Business Administration.