Big Data Analytics
File size: 1797k
Resource description:

Contents

Foreword
Acknowledgments
About the Author

1. Introduction: Why Look Beyond Hadoop Map-Reduce?
    Hadoop Suitability
    Big Data Analytics: Evolution of Machine Learning Realizations
    Closing Remarks
    References

2. What Is the Berkeley Data Analytics Stack (BDAS)?
    Motivation for BDAS
    BDAS Design and Architecture
    Spark: Paradigm for Efficient Data Processing on a Cluster
    Shark: SQL Interface over a Distributed System
    Mesos: Cluster Scheduling and Management System
    Closing Remarks
    References

3. Realizing Machine Learning Algorithms with Spark
    Basics of Machine Learning
    Logistic Regression: An Overview
    Logistic Regression Algorithm in Spark
    Support Vector Machine (SVM)
    PMML Support in Spark
    Machine Learning on Spark with MLbase
    References

4. Realizing Machine Learning Algorithms in Real Time
    Introduction to Storm
    Design Patterns in Storm
    Implementing Logistic Regression Algorithm in Storm
    Implementing Support Vector Machine Algorithm in Storm
    Naive Bayes PMML Support in Storm
    Real-Time Analytic Applications
    Spark Streaming
    References

5. Graph Processing Paradigms
    Pregel: Graph-Processing Framework Based on BSP
    Open Source Pregel Implementations
    GraphLab
    References

6. Conclusions: Big Data Analytics Beyond Hadoop Map-Reduce
    Overview of Hadoop YARN
    Other Frameworks over YARN
    What Does the Future Hold for Big Data Analytics?
    References

A. Code Sketches
    Code for Naive Bayes PMML Scoring in Spark
    Code for Linear Regression PMML Support in Spark
    Page Rank in GraphLab
    SGD in GraphLab