2. The Problem
• Goal:
o Move from prototype to production
• Roadblock:
o Prototyping Environment Cages Your:
• Feature preprocessing
• Models
• Ideas
6. H2OAssembly
o Build Rich Feature Preprocessing Assembly Lines
• Clean, reduce, and expand datasets by composing any of the hundreds of primitives available in H2O
• Build hygienic processing assembly lines that can be applied to new batches of data
• Export your feature preprocessing steps as a plain old Java object (POJO) and apply them to streaming tuples
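The assembly-line idea above can be sketched in plain Python. This is an illustration of the concept only, not the H2OAssembly API: the `Assembly` class, the step helpers (`select`, `percent_to_float`, `months_to_int`), and the column names `rate` and `term` are all hypothetical. It shows an ordered list of named steps being defined once and then reused unchanged on a training batch, a new batch, and a single streaming tuple.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Step = Tuple[str, Callable[[Row], Row]]


class Assembly:
    """Hypothetical stand-in for an assembly line; NOT the H2O API."""

    def __init__(self, steps: List[Step]) -> None:
        self.steps = list(steps)

    def apply_row(self, row: Row) -> Row:
        # Copy first so the caller's row is never mutated.
        out = dict(row)
        for _name, fn in self.steps:
            out = fn(out)
        return out

    def apply_batch(self, rows: List[Row]) -> List[Row]:
        # The same ordered steps run on every batch.
        return [self.apply_row(r) for r in rows]


def select(cols: List[str]) -> Callable[[Row], Row]:
    """Reduce: keep only the named columns."""
    def step(row: Row) -> Row:
        return {c: row[c] for c in cols}
    return step


def percent_to_float(col: str) -> Callable[[Row], Row]:
    """Clean: turn a string such as '10.65%' into the float 10.65."""
    def step(row: Row) -> Row:
        row = dict(row)
        row[col] = float(str(row[col]).strip().rstrip("%"))
        return row
    return step


def months_to_int(col: str, new_col: str) -> Callable[[Row], Row]:
    """Expand: derive an integer column from a string such as '36 months'."""
    def step(row: Row) -> Row:
        row = dict(row)
        row[new_col] = int(str(row[col]).split()[0])
        return row
    return step


# Assumed column names, chosen only for illustration.
assembly = Assembly([
    ("select", select(["rate", "term"])),
    ("rate_to_float", percent_to_float("rate")),
    ("term_months", months_to_int("term", "term_months")),
])

training_batch = [{"id": 1, "rate": "10.65%", "term": "36 months"}]
new_batch = [{"id": 2, "rate": " 7.90%", "term": "60 months"}]

prepared_training = assembly.apply_batch(training_batch)
prepared_new = assembly.apply_batch(new_batch)

# A single streaming tuple goes through the identical steps.
streamed = assembly.apply_row({"id": 3, "rate": "12.00%", "term": "36 months"})
```

Because each step returns a new row rather than editing its input, the same assembly can be applied repeatedly without one batch leaking state into the next, which is one plausible reading of "hygienic" here.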
9. Live Demo
• Lending Club Data: Predict Interest Rate
o Four-part dataset of loan data
o 500K rows, 52 columns
o Preprocess 5 columns within a 16-step assembly
o Build a simple GBM to predict interest rate
o Export everything into a Storm topology