Apache Mahout Online Training

Introduction to Apache Mahout

Apache Mahout is an Apache top-level project (TLP) for building powerful,
scalable machine learning tools that analyze big data in a distributed
manner. Machine learning is the branch of artificial intelligence that
enables systems to learn from data; typical applications include spam
filtering and natural language processing. Apache Mahout supports clustering,
dimensionality reduction, and miscellaneous other algorithms. It is used by
Facebook, LinkedIn, and Twitter.

Course Curriculum
Unit 1: Introduction to Machine Learning and Apache Mahout
Topics - Machine Learning Fundamentals, Apache Mahout Basics, History of
Mahout, Supervised and Unsupervised Learning Techniques, Mahout and Hadoop,
Introduction to Clustering, Classification.
Unit 2: Mahout and Hadoop
Topics - Mahout on Apache Hadoop setup, Mahout and Myrrix.
Unit 3: Recommendation Engine
Topics - Recommendations using Mahout, Introduction to Recommendation
Systems, Content-Based Recommendation, Mahout Optimizations.

Unit 4: Implementing a Recommender and Recommendation Platform
Topics - User-Based Recommendation, User Neighbourhood, Item-Based
Recommendation, Implementing a Recommender using MapReduce, Platforms,
Similarity Measures (Manhattan Distance, Euclidean Distance, Cosine Similarity,
Pearson's Correlation Similarity, Log-Likelihood Similarity, Tanimoto),
Evaluating Recommendation Engines (Online and Offline), Recommenders in
Production.
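To give a feel for the similarity measures listed in Unit 4, here is a minimal sketch in plain Python. It is not Mahout's own API; the function names are hypothetical and chosen only for illustration. Rating vectors are assumed to be equal-length lists of numbers, and the Tanimoto measure is shown on sets of item IDs.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


def tanimoto_coefficient(items_a, items_b):
    """Size of the intersection divided by size of the union of two item sets."""
    a, b = set(items_a), set(items_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Distances shrink as two users become more alike, whereas the similarity scores grow, so a recommender would normally pick one convention and stick to it when ranking neighbours.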
Unit 5: Clustering
Topics - Clustering, Common Clustering Algorithms, K-means, Canopy Clustering,
Fuzzy K-means and Mean Shift etc., Representing Data, Feature Selection,
Vectorization, Representing Vectors, Clustering Documents through Example, TF-IDF,
Implementing Clustering in Hadoop, Classification.
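As a rough illustration of the K-means algorithm named in Unit 5, the sketch below alternates the two standard steps (assign each point to its nearest centroid, then move each centroid to the mean of its members). It is a single-machine toy in plain Python, not Mahout's distributed implementation; the function name `kmeans` and its parameters are assumptions made for this example, and `math.dist` requires Python 3.8 or later.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Cluster points (tuples of numbers) around the given starting centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # Converged: centroids stopped moving.
        centroids = new_centroids
    return centroids, assignments


points = [(1, 1), (1, 2), (9, 9), (10, 9)]
centroids, assignments = kmeans(points, initial_centroids=[(0, 0), (10, 10)])
```

The result depends on the starting centroids, which is one reason a cheaper pass such as Canopy Clustering is often used to choose them.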
Unit 6: Classification
Topics - Examples, Basics, Predictor Variables and Target Variables, Common
Algorithms, SGD, SVM, Naive Bayes, Random Forests, Training and Evaluating a
Classifier, Developing a Classifier.
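The training-and-evaluation loop from Unit 6 can be sketched with a tiny logistic regression fitted by stochastic gradient descent (SGD). This is a plain-Python illustration under simplifying assumptions (binary 0/1 target, fixed learning rate, no shuffling or regularization), not Mahout's classifier API; all names and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_sgd(features, labels, learning_rate=0.1, epochs=1000):
    """Fit weights and bias, updating after every single training example."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = prediction - y  # Gradient of the log loss w.r.t. the score.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted target value (0 or 1) for one predictor vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


def accuracy(weights, bias, features, labels):
    """Fraction of examples whose prediction matches the target."""
    correct = sum(predict(weights, bias, x) == y for x, y in zip(features, labels))
    return correct / len(labels)


# Toy data: one predictor variable, target is 1 for the larger values.
features = [[0.0], [1.0], [4.0], [5.0]]
labels = [0, 0, 1, 1]
weights, bias = train_sgd(features, labels)
train_accuracy = accuracy(weights, bias, features, labels)
```

In practice the accuracy would be measured on held-out data rather than on the training examples, since a classifier can fit its training set well and still generalize poorly.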

Unit 7: Mahout and Amazon EMR
Topics - Mahout on Amazon EMR, Mahout vs. R, Introduction to Tools like Weka,
Octave, MATLAB, SAS.
Unit 8: Project
Topics - A complete recommendation engine built on application logs and

