Introduction to Mahout given at Twin Cities HUG
MapR Technologies
Introduction to Mahout and how to build a recommender
1.
Introduction to Mahout and How to Build a Recommender
©MapR Technologies 2013 - Confidential
2.
Me, Us
Ted Dunning, Chief Application Architect, MapR
  – Committer, PMC member: Mahout, Zookeeper, Drill
  – Bought the beer at the first HUG
MapR
  – Distributes more open source components for Hadoop
  – Adds major technology for performance, HA, industry standard APIs
Tonight
  – Hash tag: #tchug
  – See also: @ApacheMahout, @ApacheDrill, @ted_dunning and @mapR
3.
Sidebar on Drill
Apache Drill
  – SQL on Hadoop (and other things)
  – Intended to solve problems for 1-5 years from now, not the problems from 1-10 years ago
  – Multiple levels of API supported
      • SQL-2003
      • Logical plan language (DAG in JSON)
      • Physical plan language (DAG with push-down, exchange markers)
      • Execution plan language (many DAGs)
Current state
  – SQL 2003 support in place
  – Logical plan interpreter useful for testing
  – Value vectors near completion
  – High performance RPC working
4.
More on Drill
Just completed OSCON workshop
Workshop materials available shortly
  – Extracted technology demonstrators
  – Sample queries
Send me email or tweet for more info
5.
What’s Up?
What is Mahout?
  – Math library
  – Clustering, classifiers, other stuff
Recommendation
  – Generalities
  – Algorithm specifics
  – System design
  – Important things never mentioned
Final thoughts
6.
What is Mahout?
“Scalable machine learning”
  – not just Hadoop-oriented machine learning
  – not entirely, that is. Just mostly.
Components
  – math library
  – clustering
  – classification
  – decompositions
  – recommendations
8.
Mahout Math
9.
Mahout Math
Goals are
  – basic linear algebra,
  – and statistical sampling,
  – and good clustering,
  – decent speed,
  – extensibility,
  – especially for sparse data
But not
  – totally badass speed
  – comprehensive set of algorithms
  – optimization, root finders, quadrature
10.
Matrices and Vectors
At the core:
  – DenseVector, RandomAccessSparseVector
  – DenseMatrix, SparseRowMatrix
Highly composable API
Important ideas:
  – view*, assign and aggregate
  – iteration
    m.viewDiagonal().assign(v)
11.
Assign? View?
Why assign?
  – Copying is the major cost for naïve matrix packages
  – In-place operations critical to reasonable performance
  – Many kinds of updates required, so functional style very helpful
Why view?
  – In-place operations often required for blocks, rows, columns or diagonals
  – With views, we need #assign + #views methods
  – Without views, we need #assign × #views methods
Synergies
  – With both views and assign, many loops become single line
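The "#assign + #views" argument can be sketched in a few lines of plain Java. This is an illustration only: `Vec`, `viewRow` and `viewDiagonal` below are hypothetical names written for this sketch, not Mahout's classes, and the backing store is assumed to be a simple `double[][]`. The point is that `assign` and `zSum` are written once and then work on every kind of view, and that writes through a view change the underlying matrix without copying.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical mini-API for illustration only; these are not Mahout's classes.
class ViewAssignSketch {

    /** A mutable window onto doubles that are stored somewhere else. */
    interface Vec {
        int size();
        double get(int i);
        void set(int i, double value);

        // Written once, works for every kind of view.
        default Vec assign(DoubleUnaryOperator f) {
            for (int i = 0; i < size(); i++) {
                set(i, f.applyAsDouble(get(i)));
            }
            return this;
        }

        // Sum of all elements seen through the view.
        default double zSum() {
            double sum = 0;
            for (int i = 0; i < size(); i++) {
                sum += get(i);
            }
            return sum;
        }
    }

    /** View of row r; writes go straight through to the backing matrix. */
    static Vec viewRow(double[][] m, int r) {
        return new Vec() {
            public int size() { return m[r].length; }
            public double get(int i) { return m[r][i]; }
            public void set(int i, double v) { m[r][i] = v; }
        };
    }

    /** View of the main diagonal of a square matrix. */
    static Vec viewDiagonal(double[][] m) {
        return new Vec() {
            public int size() { return m.length; }
            public double get(int i) { return m[i][i]; }
            public void set(int i, double v) { m[i][i] = v; }
        };
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        System.out.println(viewDiagonal(m).zSum());  // trace of m
        viewDiagonal(m).assign(x -> 0.0);            // zero the diagonal in place
        viewRow(m, 1).assign(x -> 2 * x);            // double row 1 in place
        System.out.println(Arrays.deepToString(m));
    }
}
```

Adding a new kind of view (say, a column or a block) costs one small factory method, and every existing operation applies to it immediately.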
12.
Assign
Matrices
    Matrix assign(double value);
    Matrix assign(double[][] values);
    Matrix assign(Matrix other);
    Matrix assign(DoubleFunction f);
    Matrix assign(Matrix other, DoubleDoubleFunction f);
Vectors
    Vector assign(double value);
    Vector assign(double[] values);
    Vector assign(Vector other);
    Vector assign(DoubleFunction f);
    Vector assign(Vector other, DoubleDoubleFunction f);
    Vector assign(DoubleDoubleFunction f, double y);
13.
Views
Matrices
    Matrix viewPart(int[] offset, int[] size);
    Matrix viewPart(int row, int rlen, int col, int clen);
    Vector viewRow(int row);
    Vector viewColumn(int column);
    Vector viewDiagonal();
Vectors
    Vector viewPart(int offset, int length);
14.
Aggregates
Matrices
    double zSum();
    Vector aggregateRows(VectorFunction f);
    Vector aggregateColumns(VectorFunction f);
    double aggregate(DoubleDoubleFunction combiner, DoubleFunction mapper);
Vectors
    double zSum();
    double aggregate(DoubleDoubleFunction reduce, DoubleFunction map);
    double aggregate(Vector other, DoubleDoubleFunction aggregator, DoubleDoubleFunction combiner);
15.
Predefined Functions
Many handy functions
    ABS, ACOS, ASIN, ATAN, CEIL, COS, EXP, FLOOR, IDENTITY, INV, LOGARITHM,
    LOG2, NEGATE, RINT, SIGN, SIN, SQRT, SQUARE, SIGMOID, SIGMOIDGRADIENT, TAN
16.
Examples
A = α
    double alpha;
    a.assign(alpha);
A = αB + β
    a.assign(b, Functions.chain(
        Functions.plus(beta),
        Functions.times(alpha)));
17.
Sparse Optimizations
DoubleDoubleFunction abstract properties
    public boolean isLikeRightPlus();
    public boolean isLikeLeftMult();
    public boolean isLikeRightMult();
    public boolean isLikeMult();
    public boolean isCommutative();
    public boolean isAssociative();
    public boolean isAssociativeAndCommutative();
    public boolean isDensifying();
And Vector properties
    public boolean isDense();
    public boolean isSequentialAccess();
    public double getLookupCost();
    public double getIteratorAdvanceCost();
    public boolean isAddConstantTime();
21.
Examples
The trace of a matrix
    m.viewDiagonal().zSum()
Set diagonal to zero
    m.viewDiagonal().assign(0)
Set diagonal to negative of row sums excluding the diagonal
    Vector diag = m.viewDiagonal().assign(0);
    diag.assign(m.rowSums().assign(Functions.NEGATE));
22.
Iteration
Matrices are Iterable in Mahout
    // compute both row and column sums in one pass
    for (MatrixSlice row : m) {
        rSums.set(row.index(), row.zSum());
        cSums.assign(row, Functions.PLUS);
    }
Vectors are densely or sparsely iterable
    // entropy is -sum(p * log(p)) over the non-zero entries
    double entropy = 0;
    for (Vector.Element e : v.nonZeroes()) {
        entropy -= e.get() * Math.log(e.get());
    }
23.
Random Sampling
Samples from some type
    public interface Sampler<T> {
        T sample();
    }
Lots of kinds
    ChineseRestaurant, Missing, Normal, Empirical, Multinomial, PoissonSampler, IndianBuffet, MultiNormal, Sampler
    public abstract class AbstractSamplerFunction extends DoubleFunction implements Sampler<Double>
24.
Clustering and Such
Streaming k-means and ball k-means
  – streaming reduces very large data to a cluster sketch
  – ball k-means is a high quality k-means implementation
  – the cluster sketch is also usable for other applications
  – single machine threaded and map-reduce versions available
SVD and friends
  – stochastic SVD has in-memory, single machine out-of-core and map-reduce versions
  – good for reducing very large sparse matrices to tall skinny dense ones
Spectral clustering
  – based on SVD, allows massive dimensional clustering
25.
Mahout Math Summary
Matrices, Vectors
  – views
  – in-place assignment
  – aggregations
  – iterations
Functions
  – lots built-in
  – cooperate with sparse vector optimizations
Sampling
  – abstract samplers
  – samplers as functions
Other stuff … clustering, SVD
26.
Recommenders
27.
Recommendations
Often known as collaborative filtering
Actors interact with items
  – observe successful interaction
We want to suggest additional successful interactions
Observations inherently very sparse
28.
The Big Ideas
Cooccurrence is the core operation (and it is pretty simple)
Cooccurrence can be extended to handle important new capabilities
Recommendation systems can be deployed ideally using search technology
29.
Examples of Recommendations
Customers buying books (Linden et al)
Web visitors rating music (Shardanand and Maes) or movies (Riedl, et al), (Netflix)
Internet radio listeners not skipping songs (Musicmatch)
Internet video watchers watching >30 s (Veoh)
Visibility in a map UI (new Google maps)
30.
A simple recommendation architecture
Look at the history of interactions
Find significant item cooccurrence in user histories
Use these cooccurring items as “indicators”
For all indicators in user history, accumulate scores for related items
31.
Recommendation Basics
History:
User  Thing
1     3
2     4
3     4
2     3
3     2
1     1
2     1
32.
Recommendation Basics
History as matrix:
     t1  t2  t3  t4
u1   1   0   1   0
u2   1   0   1   1
u3   0   1   0   1
(t1, t3) cooccur 2 times, (t1, t4) once, (t2, t4) once, (t3, t4) once
33.
A Quick Simplification
Users who do h: \(A h\)
Also do r: \(A^T (A h)\) (user-centric recommendations)
Regrouped: \((A^T A)\, h\) (item-centric recommendations)
34.
Recommendation Basics
Cooccurrence
     t1  t2  t3  t4
t1   2   0   2   1
t2   0   1   0   1
t3   2   0   2   1
t4   1   1   1   2
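The step from the history list to the cooccurrence table can be sketched as follows. This is a minimal dense-array illustration, assuming the seven (user, thing) pairs from the history slide; the class and method names are hypothetical, and a real deployment would use sparse matrices (for instance via Mahout) rather than int[][].

```java
import java.util.Arrays;

/** Sketch: build the binary history matrix A and the cooccurrence matrix A^T A. */
final class CooccurrenceSketch {

    /** Builds a binary users x items matrix from 1-based (user, item) pairs. */
    static int[][] historyMatrix(int[][] pairs, int users, int items) {
        int[][] a = new int[users][items];
        for (int[] p : pairs) {
            a[p[0] - 1][p[1] - 1] = 1;
        }
        return a;
    }

    /** Computes A^T A: entry (i, j) counts users who interacted with both item i and item j. */
    static int[][] cooccurrence(int[][] a) {
        int items = a[0].length;
        int[][] c = new int[items][items];
        for (int[] row : a) {
            for (int i = 0; i < items; i++) {
                if (row[i] == 0) {
                    continue; // this user never touched item i
                }
                for (int j = 0; j < items; j++) {
                    c[i][j] += row[i] * row[j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // (user, thing) pairs as listed on the history slide.
        int[][] pairs = {{1, 3}, {2, 4}, {3, 4}, {2, 3}, {3, 2}, {1, 1}, {2, 1}};
        int[][] a = historyMatrix(pairs, 3, 4);
        for (int[] row : cooccurrence(a)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Each diagonal entry of the result is simply the number of users who interacted with that item, so only the off-diagonal entries carry cooccurrence information.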
35.
Problems with Raw Cooccurrence
Very popular items co-occur with everything
  – Welcome document
  – Elevator music
That isn’t interesting
  – We want anomalous cooccurrence
36.
Recommendation Basics
Cooccurrence
     t1  t2  t3  t4
t1   2   0   2   1
t2   0   1   0   1
t3   2   0   2   1
t4   1   1   1   2

         t3  not t3
t1       2   1
not t1   1   1
37.
Spot the Anomaly
Root LLR is roughly like standard deviations

        A      not A
B       13     1000
not B   1000   100,000
0.44

        A      not A
B       1      0
not B   0      2
0.98

        A      not A
B       1      0
not B   0      10,000
2.26

        A      not A
B       10     0
not B   0      100,000
7.15
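A score of this kind can be sketched with the standard G² log-likelihood ratio for a 2x2 contingency table. This is an illustration under that assumption, not necessarily the exact convention behind the slide: with this definition the scores for the four tables come out at about twice the figures quoted (roughly 0.90, 1.95, 4.52 and 14.3), in the same order. The class and method names are hypothetical.

```java
/** Sketch: G^2 log-likelihood ratio and its signed square root for a 2x2 table. */
final class LlrSketch {

    /** G^2 = 2 * sum over cells of k_ij * ln(k_ij * N / (rowSum_i * colSum_j)). */
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double n = k11 + k12 + k21 + k22;
        double[] rows = {k11 + k12, k21 + k22};
        double[] cols = {k11 + k21, k12 + k22};
        long[][] k = {{k11, k12}, {k21, k22}};
        double sum = 0;
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                if (k[i][j] > 0) { // empty cells contribute nothing
                    sum += k[i][j] * Math.log(k[i][j] * n / (rows[i] * cols[j]));
                }
            }
        }
        return Math.max(0.0, 2.0 * sum); // guard against tiny negative round-off
    }

    /** Signed square root: negative when A and B occur together less often than expected. */
    static double rootLogLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double root = Math.sqrt(logLikelihoodRatio(k11, k12, k21, k22));
        double expected11 = (double) (k11 + k12) * (k11 + k21) / (k11 + k12 + k21 + k22);
        return k11 < expected11 ? -root : root;
    }

    public static void main(String[] args) {
        System.out.println(rootLogLikelihoodRatio(13, 1000, 1000, 100_000));
        System.out.println(rootLogLikelihoodRatio(1, 0, 0, 2));
        System.out.println(rootLogLikelihoodRatio(1, 0, 0, 10_000));
        System.out.println(rootLogLikelihoodRatio(10, 0, 0, 100_000));
    }
}
```

The point of the comparison survives either scaling: a single cooccurrence against a large background is already more surprising than 13 cooccurrences of two very common items.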
38.
Threshold by Score
Cooccurrence
     t1  t2  t3  t4
t1   2   0   2   1
t2   0   1   0   1
t3   2   0   2   1
t4   1   1   1   2
39.
Threshold by Score
Significant cooccurrence => Indicators
     t1  t2  t3  t4
t1   1   0   0   1
t2   0   1   0   1
t3   0   0   1   1
t4   1   0   0   1
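The two steps, thresholding scores into indicators and then accumulating scores from a user's history, might look like the sketch below. It assumes that each row of the indicator table describes one item and each column one indicator, which the slide does not state; the cutoff value and the sample history are made up for illustration, and the names are hypothetical.

```java
import java.util.Arrays;

/** Sketch: turn scores into 0/1 indicators, then score items against a user history. */
final class IndicatorScoringSketch {

    /** Keeps only cells whose score reaches minScore, giving a 0/1 indicator matrix. */
    static int[][] threshold(double[][] scores, double minScore) {
        int[][] indicators = new int[scores.length][];
        for (int i = 0; i < scores.length; i++) {
            indicators[i] = new int[scores[i].length];
            for (int j = 0; j < scores[i].length; j++) {
                indicators[i][j] = scores[i][j] >= minScore ? 1 : 0;
            }
        }
        return indicators;
    }

    /** For each item, counts how many of its indicators appear in the user's history. */
    static int[] score(int[][] indicators, int[] history) {
        int[] result = new int[indicators.length];
        for (int item = 0; item < indicators.length; item++) {
            for (int j = 0; j < history.length; j++) {
                result[item] += indicators[item][j] * history[j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Indicator matrix as shown on the slide (rows assumed to be items).
        int[][] indicators = {{1, 0, 0, 1}, {0, 1, 0, 1}, {0, 0, 1, 1}, {1, 0, 0, 1}};
        int[] history = {1, 0, 0, 1}; // hypothetical user who interacted with t1 and t4
        System.out.println(Arrays.toString(score(indicators, history)));
    }
}
```

The scoring loop is exactly what a text search engine does when the indicators are stored as a document field and the history is sent as the query.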
40.
So Far, So Good
Classic recommendation systems based on these approaches
  – Musicmatch (ca. 2000)
  – Veoh Networks (ca. 2005)
Currently available in Mahout
  – See RowSimilarityJob
Very simple to deploy
  – Compute indicators
  – Store in search engine
  – Works very well with enough data
41.
What’s right about this?
42.
Virtues of Current State of the Art
Lots of well-publicized history
  – Musicmatch, Veoh, Netflix, Amazon, Overstock
Lots of support
  – Mahout, commercial offerings like Myrrix
Lots of existing code
  – Mahout, commercial codes
Proven track record
Well-socialized solution
43.
What’s wrong about this?
44.
Problems for Recommenders
Cold start
Disjoint populations
Long tail
Multiple kinds of evidence (multi-modal recommendations)
  – unstructured add-on data
  – other transaction streams
  – textual descriptions
45.
What is this multi-modal stuff?
But people don’t just do one thing
One kind of behavior is useful for predicting other kinds
Having a complete picture is important for accuracy
What has the user said, viewed, clicked, closed, bought lately?
46.
Example Multi-modal Inputs
Overlap in restaurant visits is useful
Big spender cues
Cuisine as an indicator
Review text as an indicator
47.
Too Limited
People do more than one kind of thing
Different kinds of behaviors give different quality, quantity and kind of information
We don’t have to do co-occurrence
We can do cross-occurrence
Result is cross-recommendation
48.
Heh?
49.
For example
Users enter queries (A)
  – (actor = user, item = query)
Users view videos (B)
  – (actor = user, item = video)
\(A^T A\) gives query recommendation
  – “did you mean to ask for”
\(B^T B\) gives video recommendation
  – “you might like these videos”
50.
The punch-line
\(B^T A\) recommends videos in response to a query
  – (isn’t that a search engine?)
  – (not quite, it doesn’t look at content or meta-data)
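As a rough illustration of cross-occurrence, the sketch below multiplies a users x videos matrix B (transposed) by a users x queries matrix A and then looks up the most frequently co-occurring video for a query. The tiny matrices are invented for the example, the names are hypothetical, and raw counts are used where a production system would first filter by an anomaly score such as LLR.

```java
import java.util.Arrays;

/** Sketch: cross-occurrence B^T A between viewed videos and entered queries. */
final class CrossOccurrenceSketch {

    /** Entry (v, q) counts the users who both viewed video v and entered query q. */
    static int[][] crossOccurrence(int[][] b, int[][] a) {
        int videos = b[0].length;
        int queries = a[0].length;
        int[][] c = new int[videos][queries];
        for (int u = 0; u < a.length; u++) { // both matrices share the same user rows
            for (int v = 0; v < videos; v++) {
                for (int q = 0; q < queries; q++) {
                    c[v][q] += b[u][v] * a[u][q];
                }
            }
        }
        return c;
    }

    /** Returns the index of the video with the largest count for the query (first wins ties). */
    static int bestVideoFor(int[][] c, int query) {
        int best = 0;
        for (int v = 1; v < c.length; v++) {
            if (c[v][query] > c[best][query]) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 0}, {1, 0}, {0, 1}};          // hypothetical users x queries
        int[][] b = {{1, 0, 1}, {1, 0, 0}, {0, 1, 0}}; // hypothetical users x videos
        int[][] c = crossOccurrence(b, a);
        System.out.println(Arrays.deepToString(c));
        System.out.println("best video for query 0: " + bestVideoFor(c, 0));
    }
}
```

Nothing here inspects the query text or the video meta-data; the link between the two comes entirely from user behavior.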
51.
Real-life example
Query: “Paco de Lucia”
Conventional meta-data search results:
  – “hombres del paco” times 400
  – not much else
Recommendation-based search:
  – Flamenco guitar and dancers
  – Spanish and classical guitar
  – Van Halen doing a classical/flamenco riff
52.
Real-life example
53.
Hypothetical Example
Want a navigational ontology?
Just put labels on a web page with traffic
  – This gives A = users x label clicks
Remember viewing history
  – This gives B = users x items
Cross recommend
  – \(B^T A\) = label to item mapping
After several users click, results are whatever users think they should be
54.
55.
Nice. But we can do better?
56.
[Figure: matrix A, with rows labeled users and columns labeled things]
57.
[Figure: block matrix \(\begin{bmatrix} A_1 & A_2 \end{bmatrix}\), with rows labeled users and column blocks labeled thing type 1 and thing type 2]
58.
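\[
\begin{bmatrix} A_1 & A_2 \end{bmatrix}^T \begin{bmatrix} A_1 & A_2 \end{bmatrix}
= \begin{bmatrix} A_1^T \\ A_2^T \end{bmatrix} \begin{bmatrix} A_1 & A_2 \end{bmatrix}
= \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix}
\]
\[
\begin{bmatrix} r_1 \\ r_2 \end{bmatrix}
= \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \\ A_2^T A_1 & A_2^T A_2 \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
\]
\[
r_1 = \begin{bmatrix} A_1^T A_1 & A_1^T A_2 \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}
\]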
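The last equation expands to r1 = A1^T A1 h1 + A1^T A2 h2, which can be sketched directly. The matrices and histories below are invented toy data and the names are hypothetical; the only point is that the second behavior contributes to recommendations for the first kind of thing through the cross term.

```java
import java.util.Arrays;

/** Sketch: r1 = A1^T A1 h1 + A1^T A2 h2 for two kinds of behavior over the same users. */
final class MultiModalSketch {

    /** Computes X^T (Y h): weight each user by overlap of Y with h, then sum their rows of X. */
    static int[] transposeTimes(int[][] x, int[][] y, int[] h) {
        int[] userWeights = new int[y.length];
        for (int u = 0; u < y.length; u++) {
            for (int j = 0; j < h.length; j++) {
                userWeights[u] += y[u][j] * h[j];
            }
        }
        int[] r = new int[x[0].length];
        for (int u = 0; u < x.length; u++) {
            for (int i = 0; i < r.length; i++) {
                r[i] += x[u][i] * userWeights[u];
            }
        }
        return r;
    }

    /** Adds the same-behavior term and the cross-behavior term. */
    static int[] recommend(int[][] a1, int[][] a2, int[] h1, int[] h2) {
        int[] same = transposeTimes(a1, a1, h1);
        int[] cross = transposeTimes(a1, a2, h2);
        int[] r1 = new int[same.length];
        for (int i = 0; i < r1.length; i++) {
            r1[i] = same[i] + cross[i];
        }
        return r1;
    }

    public static void main(String[] args) {
        int[][] a1 = {{1, 0}, {1, 1}, {0, 1}}; // hypothetical users x things of type 1
        int[][] a2 = {{1, 0}, {0, 1}, {0, 1}}; // hypothetical users x things of type 2
        int[] h1 = {1, 0};                      // current user's type-1 history
        int[] h2 = {1, 0};                      // current user's type-2 history
        System.out.println(Arrays.toString(recommend(a1, a2, h1, h2)));
    }
}
```

Setting h1 to all zeros leaves only the cross term, which is the cross-recommendation case from the earlier query and video example.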
59.
Summary
Input: Multiple kinds of behavior on one set of things
Output: Recommendations for one kind of behavior with a different set of things
Cross recommendation is a special case
60.
Now again, without the scary math
61.
Input Data
User transactions
  – user id, merchant id
  – SIC code, amount
  – Descriptions, cuisine, …
Offer transactions
  – user id, offer id
  – vendor id, merchant id’s,
  – offers, views, accepts
62.
Input Data
User transactions
  – user id, merchant id
  – SIC code, amount
  – Descriptions, cuisine, …
Offer transactions
  – user id, offer id
  – vendor id, merchant id’s,
  – offers, views, accepts
Derived user data
  – merchant id’s
  – anomalous descriptor terms
  – offer & vendor id’s
Derived merchant data
  – local top40
  – SIC code
  – vendor code
  – amount distribution
63.
Cross-recommendation
Per merchant indicators
  – merchant id’s
  – chain id’s
  – SIC codes
  – indicator terms from text
  – offer vendor id’s
Computed by finding anomalous (indicator => merchant) rates
64.
How can we deploy this?
65.
Search-based Recommendations
Sample document
  – Merchant Id
  – Field for text description
  – Phone
  – Address
  – Location
66.
Search-based Recommendations
Sample document
  – Merchant Id
  – Field for text description
  – Phone
  – Address
  – Location
  – Indicator merchant id’s
  – Indicator industry (SIC) id’s
  – Indicator offers
  – Indicator text
  – Local top40
67.
Search-based Recommendations
Sample document
  – Merchant Id
  – Field for text description
  – Phone
  – Address
  – Location
  – Indicator merchant id’s
  – Indicator industry (SIC) id’s
  – Indicator offers
  – Indicator text
  – Local top40
Sample query
  – Current location
  – Recent merchant descriptions
  – Recent merchant id’s
  – Recent SIC codes
  – Recent accepted offers
  – Local top40
68.
Search-based Recommendations
Sample document
  Original data and meta-data:
  – Merchant Id
  – Field for text description
  – Phone
  – Address
  – Location
  Derived from cooccurrence and cross-occurrence analysis:
  – Indicator merchant id’s
  – Indicator industry (SIC) id’s
  – Indicator offers
  – Indicator text
  – Local top40
Sample query (the recommendation query)
  – Current location
  – Recent merchant descriptions
  – Recent merchant id’s
  – Recent SIC codes
  – Recent accepted offers
  – Local top40
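One way to picture the recommendation query is as an ordinary fielded search whose terms come from the user's recent history. The sketch below only builds a query string in the common Lucene/Solr field:(term term) form; the field names and sample values are hypothetical, values are assumed to be plain tokens that need no escaping, and location handling is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Sketch: turn recent user history into a fielded search query over indicator fields. */
final class RecommendationQuerySketch {

    /** Builds one clause per non-empty field, in the form field:(v1 v2 ...). */
    static String buildQuery(Map<String, List<String>> recentByField) {
        StringJoiner query = new StringJoiner(" ");
        for (Map.Entry<String, List<String>> e : recentByField.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // nothing recent for this kind of behavior
            }
            query.add(e.getKey() + ":(" + String.join(" ", e.getValue()) + ")");
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Field names below are assumptions for illustration, not a real schema.
        Map<String, List<String>> recent = new LinkedHashMap<>();
        recent.put("indicator_merchant_ids", Arrays.asList("m17", "m42"));
        recent.put("indicator_sic_ids", Arrays.asList("5812"));
        recent.put("indicator_offers", Arrays.asList("o9"));
        System.out.println(buildQuery(recent));
    }
}
```

The search engine's relevance ranking then does the score accumulation, so merchants whose indicator fields overlap most with the recent history rise to the top.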
69.
Analyze with Map-Reduce
[Diagram components: complete history, cooccurrence (Mahout), item meta-data, Solr indexers, Solr indexing, index shards]
70.
Deploy with Conventional Search System
[Diagram components: user history, web tier, item meta-data, Solr indexers, Solr search, index shards]
71.
Objective Results
At a very large credit card company
History is all transactions
Development time to minimal viable product about 4 months
General release 2-3 months later
Search-based recs at or equal in quality to other techniques
72.
Contact:
  – tdunning@maprtech.com
  – @ted_dunning
  – @apachemahout
  – user-subscribe@mahout.apache.org
Slides and such: http://www.slideshare.net/tdunning
Hash tags: #mapr #apachemahout #recommendations