Complex Models for Big Data

Max Welling (http://www.ics.uci.edu/~welling/) describes how big data, massive simulation and advanced models go together to help us start solving challenging problems. He also describes his links to other computer science disciplines within the DSRC.

Data Science Research Center (DSRC)

Complex Models for Big Data

Max Welling
UvA

The Four Paradigms

We have added big data to computer simulation, experiment and theory.

Not replaced them…

Big Simulation

Computer simulations have become increasingly complex (e.g. weather, earthquake models).

The Computational Wall: If a model has hundreds of parameters, how can we:

1) Find the parameter values that match the observations best?
2) Determine if we underfit (model too simple) or overfit (model too complex)?
3) Compare two models?

Parameter Inference

[Diagram labels: Parameters, Simulation, Observations, Parameter Update]

Challenge I

The “posterior probability” cannot be computed in closed form.

Solution: Markov Chain Monte Carlo Sampling (MCMC)
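
To make the MCMC idea concrete, here is a minimal random-walk Metropolis sketch. It is not taken from the talk: the toy model (a standard normal prior and a single unit-variance Gaussian observation, so the exact posterior is N(0.5, 0.5)) and the names `log_posterior` and `metropolis` are assumptions chosen for illustration. The point is that the sampler only needs the posterior up to a constant, never its closed form.

```python
import math
import random


def log_posterior(theta, y=1.0):
    """Unnormalised log posterior for a toy model (illustrative assumption):
    prior theta ~ N(0, 1), one observation y ~ N(theta, 1)."""
    log_prior = -0.5 * theta ** 2
    log_likelihood = -0.5 * (y - theta) ** 2
    return log_prior + log_likelihood


def metropolis(log_p, n_samples, step=1.0, theta0=0.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D target given by log_p."""
    rng = random.Random(seed)
    theta, lp = theta0, log_p(theta0)
    samples = []
    for _ in range(n_samples):
        # Propose a Gaussian perturbation of the current parameter value.
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        samples.append(theta)
    return samples
```

Calling `metropolis(log_posterior, 20000)` and discarding the first few thousand draws as burn-in should give samples whose mean is close to the exact posterior mean of 0.5 for this toy model.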

Challenge II

We cannot run MCMC because the likelihood is not given in closed form (but rather as a simulation).

Solution: Likelihood Free MCMC (or Approximate Bayesian Computation)

Run many simulations and compare the simulated samples with the observations.

Source: Csilléry, Katalin, et al. "Approximate Bayesian computation (ABC) in practice." Trends in Ecology & Evolution 25.7 (2010): 410-418.
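
The "simulate and compare" step can be sketched with rejection ABC, the simplest likelihood-free scheme (the slide names likelihood-free MCMC; rejection sampling is swapped in here to keep the example short). Everything specific is an assumption for illustration: `simulate` is a hypothetical stand-in for an expensive simulator, the prior is uniform on [-5, 5], and the summary statistic is a sample mean.

```python
import random


def simulate(theta, rng, n=20):
    """Hypothetical stand-in for a simulator: returns a summary statistic
    (the mean of n noisy draws centred on theta)."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_rejection(observed, n_draws, eps, seed=0):
    """Keep prior draws whose simulated summary lands within eps of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a parameter from the prior
        if abs(simulate(theta, rng) - observed) < eps:
            accepted.append(theta)  # simulation matched the observation
    return accepted
```

The accepted values approximate the posterior without ever evaluating a likelihood; a smaller `eps` gives a better approximation at the cost of more rejected simulations.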

Challenge III

We need thousands of simulations to infer the posterior (infeasible if every simulation takes a day or so).

Ted Meeds

[Figure: surrogate of log(P)]

If surrogate ~ log(P) with high confidence, then use the surrogate to draw a sample.
If not: simulate until there is enough confidence.

Solution: Learn log(P) using Gaussian Process Surrogate functions (GPS)
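
The "use the surrogate if confident, otherwise simulate" rule can be sketched with a small 1-D Gaussian process in plain Python. This is a simplified illustration, not the method from the talk: the RBF kernel, the zero prior mean, the fixed threshold `tol` on the predictive standard deviation, and the class name `GPSurrogate` are all assumptions.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel with unit variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x


def solve_upper_t(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


class GPSurrogate:
    """Caches expensive log(P) evaluations and answers from a GP when confident."""

    def __init__(self, expensive_log_p, tol=0.05, jitter=1e-6):
        self.f, self.tol, self.jitter = expensive_log_p, tol, jitter
        self.xs, self.ys = [], []
        self.n_sim = 0  # number of real simulator calls so far

    def predict(self, x):
        """GP posterior mean and variance at x given the simulations so far."""
        if not self.xs:
            return 0.0, rbf(x, x)  # prior mean and prior variance
        K = [[rbf(a, b) + (self.jitter if i == j else 0.0)
              for j, b in enumerate(self.xs)] for i, a in enumerate(self.xs)]
        L = cholesky(K)
        k_star = [rbf(a, x) for a in self.xs]
        alpha = solve_upper_t(L, solve_lower(L, self.ys))  # K^{-1} y
        v = solve_lower(L, k_star)
        mean = sum(k * a for k, a in zip(k_star, alpha))
        var = max(rbf(x, x) - sum(vi * vi for vi in v), 0.0)
        return mean, var

    def log_p(self, x):
        mean, var = self.predict(x)
        if math.sqrt(var) < self.tol:
            return mean  # surrogate is confident: skip the simulation
        y = self.f(x)  # not confident: run the simulator and remember the result
        self.n_sim += 1
        self.xs.append(x)
        self.ys.append(y)
        return y
```

Wrapping an expensive `log(P)` in `GPSurrogate` and calling `log_p` from a sampler means the simulator runs only where the surrogate is still uncertain; `n_sim` reports how many real simulations were needed.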

Two Kinds of Complex Model

[Diagram along a "Model Capacity" axis]
-Computational Science: “Let the model speak”
-Machine Learning: “Let the data speak”

3x Exponential Growth in Machine Learning

-Computer Power
-Data Volume
-Model Capacity

Growth in Model Capacity

Model Capacity over Time

-1943: First NN (N = +/- 10)
-1988: NetTalk (N = +/- 20K)
-2009: Hinton’s Deep Belief Net (N = +/- 10M)
-2013: Google/Y! (N = +/- 10B)
-2020-2050: Human Brain (N = +/- 100T)?
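
As a rough arithmetic check on the growth figures on this slide, the sketch below computes the doubling time implied by the 1943 and 2013 data points and extrapolates to the human-brain figure. It assumes a single constant exponential rate between those two points, which the slide does not claim; the variable and function names are illustrative.

```python
import math

# (year, approximate parameter count N) as read from the slide.
points = [(1943, 10), (1988, 20e3), (2009, 10e6), (2013, 10e9)]
brain_n = 100e12  # human-brain figure from the slide (N = +/- 100T)


def doubling_time(p0, p1):
    """Years per doubling of N between two (year, N) points."""
    (y0, n0), (y1, n1) = p0, p1
    return (y1 - y0) / math.log2(n1 / n0)


# Average doubling time over the whole 1943-2013 span.
overall = doubling_time(points[0], points[-1])

# Year at which N would reach the brain figure if that average rate continued.
last_year, last_n = points[-1]
brain_year = last_year + overall * math.log2(brain_n / last_n)

print(f"doubling time: {overall:.2f} years, brain-scale year: {brain_year:.0f}")
```

Under this constant-rate assumption the average doubling time is a little over two years and the extrapolated year falls inside the slide's 2020-2050 window.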

Deep Learning: Neural Nets Strike Back (again)

-1943: NN invented (McCulloch & Pitts)
-1970: NN discredited (Minsky & Papert)
-1986: Backpropagation (Rumelhart, Hinton & Williams)
-1995: SVM (Vapnik)
-2009: Deep Learning (Hinton)

[Timeline labels for network depth: 2 layers, 3 layers, many layers]

-Model Size: 10B parameters
-Used by: Yahoo!, Google, Microsoft, Baidu, IBM, Scyfer

Paradox

Why does model capacity grow exponentially?

-Raw Information: O(N)
-Predictive Information: log(N)
-Noise?

Big Challenges from Industry

Scyfer connects industry to academia:
-inspire academia with relevant problems
-deliver ML products to industry
-host student projects
-provide employment for our students
= VALORISATION

[Diagram labels: what industry needs; what academics are interested in]

Intelligent Autonomous Systems Lab - UvA

Group members:
-Max Welling (Machine Learning)
-Shimon Whiteson (Reinforcement Learning & Planning)
-Leo Dorst (Geometric Algebra)
-Joris Mooij (Causality)
-Ben Kröse (Ambient Robotics)
-Dariu Gavrila (Human-aware Intelligent Systems)

[Diagram labels: Data; Store and process; Analyze and model; Understand and decide]
[Research areas shown: Large Scale Databases, Distributed Processing, Software Eng., System / Network Eng., Machine Learning, Information Retrieval, Multimedia Retrieval, Modeling and simulation, Knowledge representation, Reasoning, Decision Theory, Visual Analytics, Business Analytics]

Our Future Need

[Same diagram and group members as on the previous slide]

Questions?
