This document summarizes a presentation on machine learning and Hadoop. It discusses the current state and future directions of machine learning on Hadoop platforms. In industrial machine learning, well-defined objectives are rare, predictive accuracy has limits, and systems must precede algorithms. Currently, Hadoop is used for data preparation, feature engineering, and some model fitting. Tools include Pig, Hive, Mahout, and new interfaces like Spark. The future includes YARN for running diverse jobs and improved machine learning libraries. The document calls for academic work on feature engineering languages and broader model selection ontologies.