How can you bring machine learning to the masses? Machine learning projects struggle with finding talent, the time needed to build and deploy models, and trust in the models that are built.
How can multiple teams in your organization build accurate ML models without being experts in data science or machine learning?
Are you wondering about the different flavors of AutoML?
H2O Driverless AI packages the techniques of expert data scientists in an easy-to-use application that helps scale your data science efforts. Driverless AI lets data scientists work on projects faster, using automation and state-of-the-art GPU computing power to complete in minutes tasks that used to take months.
With H2O Driverless AI, everyone, including expert and junior data scientists, domain scientists, and data engineers, can develop trusted machine learning models. This next-generation machine learning platform offers unique and advanced functionality for data visualization, feature engineering, model interpretability, and low-latency deployment.
H2O Driverless AI provides:
* Automatic data visualization
* Automatic feature engineering at Grandmaster level
* Automatic model selection
* Automatic model tuning and training
* Automatic parallelization using multiple CPUs or GPUs
* Automatic model ensembling
* Automatic machine learning interpretability (MLI)
* Automatic scoring code generation
Want to try it yourself? You can get a free trial here: H2O Driverless AI trial.
Come to this session and find out how to get started with automatic machine learning using H2O Driverless AI, and build powerful models with just a few clicks.
See you soon!
About H2O.ai
H2O.ai is a visionary Silicon Valley open source software company that created and reimagined what is possible. We are a company of makers who brought new platforms and technologies to market to drive the AI movement. We are the creators of H2O, the leading open source data science and machine learning platform, used by nearly half of the Fortune 500 and trusted by more than 14,000 organizations and hundreds of thousands of data scientists around the world.
Introduction to Automatic Machine Learning
1. Introduction to Automatic Machine Learning
Meetup
“AI to do AI”
Rafael Coss (Rafael.Coss@h2o.ai)
Director of Community
@racoss
@h2oai
Chris Carpenter (Chris.Carpenter@h2o.ai)
Leobardo Morales (lmorales@mx1.ibm.com)
2. H2O.ai Meetup Groups
Contact Rafael Coss
community@h2o.ai
If you want to …
- Give a talk about AI /
machine learning use case (it
is a great opportunity to
promote your work)
- Host a joint meetup with
H2O.ai
https://www.meetup.com/pro/h2oai
3. H2O.ai Community Slack Workspace
•Join the H2O.ai Community Slack Workspace today!
•https://www.h2o.ai/community/driverless-ai-community/#chat
•Use emoji to tag messages
•:question :use_case :mli :get_started :bugs …
•Reply to message using threads
•Check out Community Guide for more info:
•https://tinyurl.com/hac-community-guide
Online Chat to ask questions, discuss use cases, give feedback and more
4. H2O WORLD SAN FRANCISCO
February 4-5, 2019
Hilton San Francisco Union Square
world.h2o.ai
5. AI is Transforming the IT Industry
"AI is the fastest growing workload on
the planet”
300%
Increase in AI spend year
over year
“Demand for AI Talent
on the Rise”
200%
Increase in jobs requiring
AI skills
“Businesses are preparing for
the widespread adoption of
machine learning”
9/10
CIOs planning to use
machine learning
7. H2O.ai Company Overview
Company: Founded in Silicon Valley in 2012
Series C Investors: Wells Fargo, NVIDIA, Nexus Ventures, Paxion Ventures
Products:
• H2O Open Source Machine Learning (14,000 organizations)
• H2O Driverless AI – Automatic Machine Learning
Leadership: Market Leader recognized by Gartner, Forrester, InfoWorld, Constellation Research
Team: 130+ AI experts (Kaggle Grandmasters, Distributed Computing and Visualization experts)
Global: Mountain View, London, Prague, Chennai
8. World's largest data science community
(over 2 million members)
AI and ML education
Best known for AI competitions
Public datasets
Code and analysis sharing
http://www.kaggle.com
Grandmasters (their highest ranking): 1st, 4th, 48th, 33rd, 25th, 13th
10. Partner Ecosystem
“Confidential and property of H2O.ai. All rights reserved”
Strategic Partners
Cloud Providers
HW Vendors
System Integrators
Value Added Resellers
Data Stores
11. H2O.ai Product Suite
GPU-accelerated
machine learning package
Automatic feature
engineering, machine learning
and interpretability
• 100% open source – Apache V2 licensed
• Built for data scientists – interface using R, Python
on H2O Flow (interactive notebook interface)
• Enterprise Support subscriptions
• Built for domain users, analysts and
data scientists – GUI based interface
for end-to-end data science
• Fully automated machine learning
from ingest to deployment
• Licensed on a per seat basis
(annual subscription)
Open Source
In-memory, distributed
machine learning algorithms
with H2O Flow GUI
-3
H2O AI open source engine
integration with Spark
12. H2O.ai is a Recognized Leader in AI and ML
2018 Gartner Magic Quadrant
for Data Science and
Machine Learning Platforms
Forrester Wave: Notebook-Based
Predictive Analytics And Machine
Learning Solutions, Q3 2018
Top 3 Artificial Intelligence (AI)
and Machine Learning (ML)
Software Solution
“Technology leadership …
with a distinguished vision”
“the quasi-industry standard”
“its vision of creating an AI and
ML tool that ultimately aims to
allow almost everyone within the
business to create their own
predictive models”
“H2O.ai’s future is automated
machine learning”
“its bright future is in
Driverless AI”
13. Highly Regarded
by Customers
Dr. Robert Coop
AI and ML Manager
Stanley Black & Decker
“H2O Driverless AI feature
engineering is better than anything
I've seen out there right now. And
the scoring pipeline generation is
probably one of the bigger pluses for
me. These features alone have
provided us with a true competitive
edge in agile manufacturing. It's a
massive time saver.”
15. What is Data Science?
Problem Formulation
• Identify a data task or prediction problem
• Collect relevant data
Data Processing
• Clean, transform, filter, aggregate, impute
• Convert into X and Y
Machine Learning
• Train models
• Evaluate models
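The data processing step (impute missing values, then convert into X and Y) can be sketched in a few lines of standard-library Python. The rows, column names, and the choice of mean imputation are hypothetical illustrations, not taken from the slides.

```python
from statistics import mean

# Hypothetical raw rows; None marks a missing value.
rows = [
    {"age": 34, "limit": 20000, "default": 1},
    {"age": None, "limit": 50000, "default": 0},
    {"age": 51, "limit": None, "default": 0},
    {"age": 29, "limit": 30000, "default": 1},
]
feature_names = ["age", "limit"]
target_name = "default"


def impute_mean(rows, column):
    """Replace missing values in one column with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


clean = rows
for name in feature_names:
    clean = impute_mean(clean, name)

# Convert into X (feature matrix) and y (target vector).
X = [[r[name] for name in feature_names] for r in clean]
y = [r[target_name] for r in clean]
```

After this step every row has a value for every feature, and the target is held in a separate vector, which is the shape most training routines expect.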
18. What is Automatic Machine Learning
“the automated process of algorithm selection,
feature generation, hyperparameter tuning,
iterative modeling, and model assessment.”
Enabled by advances in computing power at lower cost that
make it possible for machines to try thousands of possible
combinations to find the best one.
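As a rough illustration of that definition, the sketch below tries several candidate algorithms and hyperparameter values, scores each on held-out data, and keeps a leaderboard. The toy data, the two candidate models (a mean baseline and a k-nearest-neighbour regressor), and all function names are assumptions made for this example; real AutoML tools search far larger spaces.

```python
from statistics import mean

# Toy 1-D regression data (hypothetical): y is roughly 2 * x.
train = [(x, 2.0 * x + (0.5 if x % 2 else -0.5)) for x in range(20)]
valid = [(x + 0.5, 2.0 * (x + 0.5)) for x in range(0, 20, 4)]


def fit_mean(train, **params):
    """Baseline model: always predict the mean training target."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_knn(train, k):
    """k-nearest-neighbour regressor on a single feature."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict


def mae(model, data):
    """Mean absolute error of a model on (x, y) pairs."""
    return mean(abs(model(x) - y) for x, y in data)


# Algorithm selection and hyperparameter tuning as one search space.
candidates = [("mean", fit_mean, {})]
candidates += [("knn", fit_knn, {"k": k}) for k in (1, 3, 5)]

# Model assessment: score every candidate on validation data, best first.
leaderboard = sorted(
    ((mae(fit(train, **params), valid), name, params)
     for name, fit, params in candidates),
    key=lambda row: row[0],
)
best_score, best_name, best_params = leaderboard[0]
```

The same loop structure scales to more algorithms and parameters; the cost is simply more fits, which is why cheaper compute makes this kind of search practical.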
19. The Evolving Space of Automatic Machine Learning
First Gen (2014-15)
Open source model showdown with feature encoding, automatic hyper-parameter tuning, ensembles and model leader board
Second Gen (2017-18)
HPC powered evolutionary model development with advanced feature engineering, extensive model explainability
20. Challenges in AI Model Development
Basic Encoding
Feature Generation
Advanced Encoding
Feature Engineering
Algorithm Selection
Parameter Tuning
Model Building
Model Ensembles
Pipeline Generation
Model Explainability
Model Deployment
Model Documentation
• Time consuming
• Requires advanced
skillset
• Creating new feature
combinations requires
advanced skill
• Time consuming
• Requires advanced
knowledge of
algorithms and
parameters
• Creating ensembles is
an advanced skill
• Time consuming
• Requires different set of skills to
deploy models
• Explaining how models make
decisions is critical to building trust
with business stakeholders and
regulators
The entire process is highly iterative and can take weeks or months to develop a single production-ready model.
22. Different Flavors of AutoML
https://www.h2o.ai/blog/the-different-flavors-of-automl
23. The Challenges of Enterprise AI Adoption
Time to Insights Slow
Weeks to
Months
Lack of AI Talent
~100
Data Science
“Grandmasters” in the World
Time for a data scientist
to build a model
Lack of Trust in AI
Black box models
“US alone faces a shortage of 190,000 people with analytical expertise.”
26. Why Next Generation
Automatic Machine Learning
for the Enterprise
Time to Insight
Months down
to Hours
7 Kaggle Grandmasters
Top 10
Data Science Experts
Automated
GPU-accelerated ML
with IBM AC922
Explainability & Transparency
Trust
In AI
28. Problems Addressed by Driverless AI
• Supervised Learning
• Regression
• Classification
• Tabular Structured Data
• Numeric
• Categorical
• Time / Date
• Text
• Missing Values
• Identically and Independently Distributed
(iid) rows
• Time-series
• Single time-series
• Grouped time-series
• e.g. Store - Department - Item
• Time-series with gaps between
training and test set to account for time
to deploy
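The last bullet, leaving a gap between training and test data to account for time to deploy, can be sketched as a time-ordered split. The function name, its parameters, and the daily sales rows are hypothetical and only illustrate the idea.

```python
from datetime import date, timedelta


def split_with_gap(rows, time_key, test_days, gap_days):
    """Split time-ordered rows into train and test, leaving a gap between them."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    last_day = ordered[-1][time_key]
    # The test window is the final `test_days` days of the data.
    test_start = last_day - timedelta(days=test_days - 1)
    # Training stops `gap_days` full days before the test window begins.
    train_end = test_start - timedelta(days=gap_days + 1)
    train = [r for r in ordered if r[time_key] <= train_end]
    test = [r for r in ordered if r[time_key] >= test_start]
    return train, test


# Hypothetical daily data: 30 consecutive days starting 2019-01-01.
rows = [
    {"day": date(2019, 1, 1) + timedelta(days=i), "sales": 100 + i}
    for i in range(30)
]
train, test = split_with_gap(rows, "day", test_days=7, gap_days=3)
```

With these settings the model trains on the first 20 days, skips 3 days, and is evaluated on the final 7, which mimics forecasting for a period that starts only after the model has been deployed.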
29. H2O Driverless AI – Simple, Fast, Accurate, Interpretable
Easy Deployment for
Low Latency Models
• Stand-alone scoring pipeline
that is easy for IT to deploy
and manage
• Easy to update when a new
model version is available
• Streamlined scoring code to
deploy on any device: on the
edge, mobile, …
• Very fast (milliseconds) to
satisfy today’s real-time apps
Fast and
Accurate Results
• “Data Scientist in a Box”
• Simple interface
• Automatic feature engineering
to increase accuracy
• Automatic recipes for solving
wide variety of use-cases
• Automatic tuning to
find and tune the right
ensemble of models
Industry Leading
Interpretability
• Trusted results with
explainability and
transparency
• Interpretability for debugging,
not just for regulators
• Get reason codes and model
interpretability in plain English
• K-Lime, LOCO, partial
dependence and more
Automatic Data
Visualization
• Automatic generation of
visualizations and graphs to
explore your data before the
model-building process
• Most relevant graphs shown
for the given data set
• Identify outliers and
missing values
31. H2O Driverless AI Delivers Value in Every Industry
Matched 10 years of
machine learning expertise
Financial Services
+6%
Accuracy
Increased customer
satisfaction
Healthcare
Near
perfect
scores
Outperforms alternative
digital marketing
Marketing
2.5x
performance
Accurately predicting supplies
& materials for future orders
Manufacturing
25%
time savings
“Driverless AI is giving
amazing results in terms of
feature and model
performance”
“Driverless AI powers our data
science team to operate at
scale. We have the opportunity
to impact care at large.”
“Driverless AI helped us gain
an edge for our clients. AI to
do AI, truly is improving our
system on a daily basis.”
“H2O Driverless AI feature
engineering is better than
anything I've seen out there
right now.”
Venkatesh Ramanathan
Sr. Data Scientist, PayPal
Martin Stein
Chief Product Officer, G5
Bharath Sudarshan
Dir. of Data Science, ArmadaHealth
Robert Coop
Sr. Data Scientist, SB&D
33. Financial Fraud Detection
“Driverless AI is giving
amazing results in terms of
feature and model
performance”
Venkatesh Ramanathan
Senior Data Scientist, PayPal
• Driverless AI matched
10 years of expert
feature engineering
• Increased accuracy
from 0.89 to 0.947 (6%)
in detecting fraudulent
activity
• 6X speed up when
running on an IBM Power
GPU-based server
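The slide reports the improvement as "(6%)" without saying whether that is an absolute or a relative gain, and it does not name the underlying metric. A quick check of both readings, using only the two figures from the slide:

```python
baseline = 0.89   # score before Driverless AI, as quoted on the slide
improved = 0.947  # score after, as quoted on the slide

absolute_gain = improved - baseline        # difference in score
relative_gain = absolute_gain / baseline   # improvement relative to the baseline

print(f"absolute gain: {absolute_gain:.3f}")
print(f"relative gain: {relative_gain:.1%}")
```

The absolute gain is about 0.057 (5.7 points) and the relative gain is about 6.4%, so the rounded "6%" is consistent with either reading.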
34. Connecting Patients to
Specialists for Better Healthcare
• Companies have seen
“skyrocketing” net promoter
scores and “near perfect”
customer satisfaction rates
• Customer loyalty and
premium retention rates
have increased
• Reduces costs, while
patients receive care faster
“Driverless AI powers our data
science team to operate
efficiently and experiment at
scale. With this latest innovation,
we have the opportunity to
impact care at large.”
Bharath Sudarshan
Director of Data Science and Innovation
Armada Health
35. Marketing Optimization
for the Real Estate Market
“Driverless AI helped us gain
an edge with our Intelligent
Marketing Cloud for our clients.
AI to do AI, truly is improving our
system on a daily basis.”
Martin Stein
Chief Product Officer
• Outperforms other real
estate digital marketing
solutions by 2.5X
• A G5 client saved $500K
annual digital spend while
increasing web traffic 3X
• 10X faster model creation
36. Improve Manufacturing
Sales and Forecasting
“H2O Driverless AI feature
engineering is better than anything
I've seen out there right now. And
the scoring pipeline generation is
probably one of the bigger pluses
for me. It's a massive time saver.”
Robert Coop
Sr. Data Scientist
Stanley Black & Decker
• Time savings of 25%
with 1 data scientist
• Saved 1 month of time in
model tuning and training
for industrial product line
• Accurately predicted
supplies and materials
for a future client order
increasing forecast
accuracy
37. IBM & H2O Driverless AI
Simplifying and Accelerating
Enterprise AI Initiatives
38. H2O Driverless AI Benefits from the
Power Systems Advantage
High Speed Data Transfer
9.5x
Big Data Scale
2.6x
More RAM Max I/O bandwidth
GPU Accelerated ML
Integrated Systems Approach
Faster on GPUs
30x
39. H2O Driverless AI on IBM Power Systems
A Winning Combination
High Speed Data Transfer
1.5x
Big Data Scale
2x
Data Ingest
Faster Feature
Engineering
GPU Accelerated ML
Time Series
5x
Integrated Systems Approach
40. PowerAI
Deep Learning Impact
(DLI) Module
Data & Model
Management, ETL,
Visualize, Advise
IBM Spectrum Conductor with Spark
Cluster Virtualization,
Auto Hyper-Parameter Optimization
PowerAI: Open Source ML Frameworks
Large Model Support (LMS)
Distributed Deep Learning
(DDL)
Auto ML
PowerAI
Enterprise
PowerAI Vision
Auto-DL for Images & Video
Label Train Deploy
Accelerated
Infrastructure
Accelerated Servers Storage
AI for
Data Scientists and
non-Data Scientists
H2O Driverless AI
Auto-ML for Text & Numeric Data, NLP
Import Experiment Deploy
41. H2O Driverless AI Complements IBM PowerAI Vision
Sensors
Log
Transactional
IBM PowerAI delivers
Deep Learning for Images
H2O Driverless AI is
Automatic Machine Learning
NLP
44. H2O Driverless AI: How it Works
1. Drag and drop data
Ingest data from cloud, big data and desktop systems
• Local
• Amazon S3
• HDFS
• Google BigQuery
• Azure Blob Storage
• Snowflake
2. Automatic Visualization
Understand the data shape, outliers, missing values, etc.
3. Automatic Machine Learning
Use best practice model recipes and the power of high performance computing to iterate across thousands of possible models including advanced feature engineering and parameter tuning
Modelling Dataset (X, Y)
Model Recipes:
• IID data
• Time-series
• More on the way
Advanced Feature Engineering + Algorithm + Model Tuning
Survival of the Fittest
Powered by GPU Acceleration
4. Automatic Scoring Pipelines
Deploy ultra-low latency Python or Java Automatic Scoring Pipelines that include feature transformations and models.
Machine learning Interpretability
Model Documentation
Deploy Low-latency Scoring to Production
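The "Survival of the Fittest" idea on this slide can be illustrated with a generic evolutionary search: keep the best candidates each generation and mutate them to produce new ones. The sketch below searches over feature subsets. The feature names, the stand-in fitness function, and all parameters are hypothetical, and this is not a description of how Driverless AI implements its search.

```python
import random

FEATURES = ["age", "limit", "pay_delay", "noise_a", "noise_b"]
# Hypothetical amount of signal each feature contributes to a validation score.
USEFUL = {"limit": 0.2, "pay_delay": 0.5, "age": 0.1}


def fitness(subset):
    """Stand-in for a validation score: reward useful features, penalize size."""
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.05 * len(subset)


def mutate(subset, rng):
    """Flip the membership of one randomly chosen feature."""
    flipped = rng.choice(FEATURES)
    return frozenset(subset ^ {flipped})


def evolve(generations=30, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [frozenset(rng.sample(FEATURES, 2)) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: only the fitter half of the population survives.
        survivors = sorted(population, key=fitness, reverse=True)[: population_size // 2]
        # Variation: refill the population with mutated copies of survivors.
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
```

Because survivors are carried over unchanged, the best score never gets worse from one generation to the next; in a real system the fitness call would be a full model fit and validation, which is where GPU acceleration pays off.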
45. The Driverless AI Experience
1. Import Data
2. Review Auto-Visualizations
3. Start Experiment
4. Review Winning Model
5. Review Model Interpretations
6. Deploy Model
48. The Driverless AI Experience
1. Import Data
2. Review Auto-Visualizations
3. Start Experiment
4. Review Winning Model
5. Review Model Interpretations
6. Deploy Model
Quickly start an experiment and benefit from built-in automation:
• Feature Engineering
• Model Tuning
• Model Selection
49. The Driverless AI Experience 3. Start Experiment
Feature Engineering
Model Tuning
Quickly Start Experiment
Model Selection
50. It’s Easy to Start an Experiment
These are the only required settings – all others are optional depending on the scenario
• Dataset being used to train the models
• What column are we trying to predict?
• Should certain rows of data have a higher weight?
• Data used to calculate metrics for the final model; not used during training
• Is this a time-series forecasting exercise?
• Columns to exclude from experiment
• Data used for parameter tuning
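The settings described on this slide can be pictured as a small configuration record. The class and field names below are hypothetical and are not the Driverless AI API. The slide does not say which settings are the required ones, so treating the training dataset and target column as required is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the experiment settings named on the slide."""
    train_dataset: str                        # dataset used to train the models
    target_column: str                        # column we are trying to predict
    weight_column: Optional[str] = None       # gives certain rows a higher weight
    validation_dataset: Optional[str] = None  # data used for parameter tuning
    test_dataset: Optional[str] = None        # metrics only; not used in training
    time_column: Optional[str] = None         # set for time-series forecasting
    dropped_columns: Tuple[str, ...] = ()     # columns excluded from the experiment

    def problems(self):
        """Return a list of obvious inconsistencies in the setup."""
        issues = []
        if self.target_column in self.dropped_columns:
            issues.append("target column cannot be dropped")
        if self.weight_column is not None and self.weight_column == self.target_column:
            issues.append("weight column must differ from target column")
        return issues


# Example values are illustrative; "ID" is a hypothetical column name.
setup = ExperimentSetup(
    train_dataset="CreditCard-train.csv",
    target_column="default payment next month",
    test_dataset="CreditCard-test.csv",
    dropped_columns=("ID",),
)
```

Keeping the optional settings as explicit `None` defaults makes it visible which choices were left to the tool and which were made by the user.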
51. Experiment Settings
Time
• Relative time for completing the experiment
• Higher settings mean:
  • More iterations are performed to find the best set of features
  • Longer “early stopping” threshold
Accuracy
• Relative accuracy – higher values should lead to higher confidence in model performance (accuracy)
• Impacts things such as level of data sampling, how many models are used in the final ensemble, parameter tuning level, among others
Interpretability
• Relative interpretability – higher values favor more interpretable models
• The higher the interpretability setting, the lower the complexity of the engineered features and of the final model(s).
52. Auto Feature Generation
Kaggle Grandmaster Out of the Box
• Automatic Text Handling
• Frequency Encoding
• Cross Validation Target Encoding
• Truncated Singular Value Decomposition
• Clustering and more
Feature Transformations
Examples of
Original Features
Examples of
Generated
Features
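Two of the transformations named on this slide, frequency encoding and cross-validated target encoding, can be sketched in plain Python. This is a generic illustration under simple assumptions (interleaved folds, fallback to the global mean for unseen categories), not the Driverless AI implementation; the `education` and `default` values are hypothetical.

```python
from collections import Counter, defaultdict


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def out_of_fold_target_encode(values, targets, n_folds=2):
    """Encode each row with its category's target mean computed on the other folds.

    Using only other folds keeps a row's own target out of its encoding,
    which limits target leakage.
    """
    global_mean = sum(targets) / len(targets)
    encoded = [0.0] * len(values)
    for fold in range(n_folds):
        sums, counts = defaultdict(float), defaultdict(int)
        # Accumulate target statistics from rows outside the current fold.
        for i, (v, t) in enumerate(zip(values, targets)):
            if i % n_folds != fold:
                sums[v] += t
                counts[v] += 1
        # Encode rows inside the current fold; unseen categories get the global mean.
        for i, v in enumerate(values):
            if i % n_folds == fold:
                encoded[i] = sums[v] / counts[v] if counts[v] else global_mean
    return encoded


education = ["grad", "grad", "high", "grad", "high", "other"]
default = [1, 0, 1, 0, 1, 0]

freq = frequency_encode(education)
target_enc = out_of_fold_target_encode(education, default)
```

Both functions turn a categorical column into numbers a model can use directly; the out-of-fold variant trades a little noise for protection against overfitting to the target.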
65. H2O Driverless AI on the Cloud
• Easy setup on any cloud or on premise.
Support for Azure, AWS and Google Cloud
with marketplace offerings.
• Develop more models using H2O Driverless
AI automatic machine learning using high-
performance computing and evolutionary
algorithms to perform time-consuming
data science tasks like feature engineering
and model hyperparameter tuning.
• Leverage your existing ML workbench to
create and deploy streamlined production
models based on insights from Driverless
AI
66. H2O Driverless AI Delivers Automatic ML for the Enterprise
21-day free trial for Driverless AI
• Performs the function of an expert data
scientist
• Create models quickly with GPUs and
Machine Learning automation
• Delivers insights and interpretability
• Created and supported by world
renowned AI experts from H2O.ai
• Award-winning software
67. Getting Started
• Get the 21-day free trial for Driverless AI
• Don’t have the hardware? Try the Qwiklab cloud training environment
• Go to your favorite cloud: AWS, Azure, Google
• Try the video tutorial or follow the booklet
• Learn how Driverless AI delivers Trust & Explainable AI
• Learn more about NLP and Time-Series in Driverless AI
• Watch Replays from H2O World London 2018
• Watch “Democratizing Intelligence” by Sri Ambati, CEO & Founder
• Learn how PayPal is solving fraud with Driverless AI
• Docs
• H2O Community Slack
74. Credit Card Example
• Dataset:
• information on default payments, demographic factors, credit data, history of payment, etc.
• Source: www.kaggle.com/uciml/default-of-credit-card-clients-dataset
• File System:
• CreditCard-train.csv (for training models)
• CreditCard-test.csv (for making new predictions)
• Our Goal:
• Predict whether someone will default on their credit card payment.
• Tutorial:
• http://docs.h2o.ai/driverless-ai/latest-stable/docs/booklets/DriverlessAIBooklet.pdf
• http://docs.h2o.ai/driverless-ai/latest-stable/docs/booklets/MLIBooklet.pdf
77. Learning from Credit Card Data
Features: Education, Marriage, Age, Sex, Repayment Status, Limit Balance ...
Target: Default Payment Next Month (Binary)
Learn the Pattern
Predictions: Probability (0...1)
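To make the features, binary target, and probability output concrete, here is a minimal logistic regression trained by stochastic gradient descent. It is a stand-in for whatever model an AutoML tool would select, and the two features and eight rows are hypothetical toy values, not the Kaggle dataset.

```python
import math


def sigmoid(z):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Predicted probability of default for one row of features."""
    return sigmoid(bias + sum(w * f for w, f in zip(weights, features)))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            bias -= lr * error
            weights = [w - lr * error * f for w, f in zip(weights, features)]
    return weights, bias


# Hypothetical toy rows: [months of delayed repayment, limit balance in 100k units]
X = [[0, 2.0], [0, 1.5], [1, 1.0], [2, 0.5], [3, 0.5], [0, 3.0], [2, 1.0], [4, 0.2]]
# Binary target: 1 = default payment next month, 0 = no default.
y = [0, 0, 0, 1, 1, 0, 1, 1]

weights, bias = train_logistic(X, y)
p_low_risk = predict_proba(weights, bias, [0, 3.0])
p_high_risk = predict_proba(weights, bias, [4, 0.2])
```

The model outputs a probability rather than a hard label, so a business rule (for example a threshold on the probability) still has to decide which accounts to act on.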