Introduction to Machine Learning




     Lior Rokach
     Department of Information Systems Engineering
     Ben-Gurion University of the Negev
About Me
Prof. Lior Rokach
Department of Information Systems Engineering
Faculty of Engineering Sciences
Head of the Machine Learning Lab
Ben-Gurion University of the Negev

Email: liorrk@bgu.ac.il
http://www.ise.bgu.ac.il/faculty/liorr/

PhD (2004) from Tel Aviv University
Machine Learning
• Herbert Alexander Simon: “Learning is any process by which a
  system improves performance from experience.”
• “Machine Learning is concerned with computer programs that
  automatically improve their performance through experience.”

  Herbert Simon – Turing Award 1975, Nobel Prize in Economics 1978
Why Machine Learning?
• Develop systems that can automatically adapt and customize
  themselves to individual users.
   – Personalized news or mail filter
• Discover new knowledge from large databases (data mining).
   – Market basket analysis (e.g. diapers and beer)
• Ability to mimic humans and replace them in certain monotonous tasks
  that require some intelligence.
       • like recognizing handwritten characters
• Develop systems that are too difficult/expensive to construct
  manually because they require specific detailed skills or
  knowledge tuned to a specific task (knowledge engineering
  bottleneck).


                                        4
Why now?
• Flood of available data (especially with the
  advent of the Internet)
• Increasing computational power
• Growing progress in available algorithms and
  theory developed by researchers
• Increasing support from industries




                       5
ML Applications
The concept of learning in a ML system
• Learning = Improving with experience at some
  task
  – Improve over task T,
  – With respect to performance measure, P
  – Based on experience, E.




                         7
Motivating Example
            Learning to Filter Spam
Example: Spam Filtering
Spam is all email the user does not
want to receive and has not asked to
receive
  T: Identify Spam Emails
  P:
      % of spam emails that were filtered
      % of ham/ (non-spam) emails that
      were incorrectly filtered-out
  E: a database of emails that were
  labelled by users
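
To make the performance measure P concrete, here is a minimal sketch (my addition, not from the slides) that computes the two rates above from a list of true labels and a filter's predictions; the label strings and helper function are illustrative assumptions.

```python
def spam_filter_metrics(true_labels, predicted_labels):
    """Compute the two performance measures from the slide.

    true_labels, predicted_labels: sequences of "spam" / "ham" strings.
    """
    spam_total = sum(1 for t in true_labels if t == "spam")
    ham_total = sum(1 for t in true_labels if t == "ham")

    # % of spam emails that were filtered (caught by the filter)
    spam_caught = sum(1 for t, p in zip(true_labels, predicted_labels)
                      if t == "spam" and p == "spam")
    # % of ham (non-spam) emails that were incorrectly filtered out
    ham_filtered = sum(1 for t, p in zip(true_labels, predicted_labels)
                       if t == "ham" and p == "spam")

    return spam_caught / spam_total, ham_filtered / ham_total

# Example: 3 spam and 3 ham emails, one mistake of each kind
truth = ["spam", "spam", "spam", "ham", "ham", "ham"]
preds = ["spam", "spam", "ham",  "ham", "spam", "ham"]
print(spam_filter_metrics(truth, preds))  # (0.666..., 0.333...)
```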
The Learning Process

[Diagram: Model Learning → Model Testing]

The Learning Process in our Example

[Diagram: the Email Server feeds emails into feature extraction, whose output goes to Model Learning and then Model Testing. Features extracted from each email include:]
                   ● Number of recipients
                   ● Size of message
                   ● Number of attachments
                   ● Number of "re's" in the subject line
                   …
Data Set

Each row is an instance; the first four columns are the input attributes and the last column is the target attribute.

  Number of new   Email Length   Country   Customer   Email Type
  Recipients      (K)            (IP)      Type
  0               2              Germany   Gold       Ham
  1               4              Germany   Silver     Ham
  5               2              Nigeria   Bronze     Spam
  2               4              Russia    Bronze     Spam
  3               4              Germany   Bronze     Ham
  0               1              USA       Silver     Ham
  4               2              USA       Silver     Spam

Attribute types: Number of new Recipients and Email Length are numeric, Country is nominal, Customer Type is ordinal.
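
A minimal sketch (not from the slides) of this toy data set as a pandas DataFrame; the column names simply mirror the table above.

```python
import pandas as pd

# Toy training set from the slide: four input attributes and one target attribute
emails = pd.DataFrame({
    "new_recipients": [0, 1, 5, 2, 3, 0, 4],                         # numeric
    "email_length_k": [2, 4, 2, 4, 4, 1, 2],                         # numeric
    "country":        ["Germany", "Germany", "Nigeria",
                       "Russia", "Germany", "USA", "USA"],           # nominal
    "customer_type":  ["Gold", "Silver", "Bronze",
                       "Bronze", "Bronze", "Silver", "Silver"],      # ordinal
    "email_type":     ["Ham", "Ham", "Spam", "Spam",
                       "Ham", "Ham", "Spam"],                        # target
})

X = emails.drop(columns="email_type")   # input attributes
y = emails["email_type"]                # target attribute
print(X.shape, y.value_counts().to_dict())
```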
Step 4: Model Learning

[Diagram: Training Set (Database) → Induction Algorithm (Learner / Inducer) → Classifier (Classification Model)]

Step 5: Model Testing

[Diagram: the Classifier (Classification Model) produced by the Induction Algorithm is applied to further instances in order to test it]
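
A minimal sketch of the learn-then-apply pattern of steps 4 and 5 with scikit-learn, using the numeric attributes of the toy DataFrame sketched above; the choice of a decision tree here is arbitrary.

```python
from sklearn.tree import DecisionTreeClassifier

# Step 4: Model Learning – induce a classifier from the training set
X_train = emails[["new_recipients", "email_length_k"]]
y_train = emails["email_type"]
learner = DecisionTreeClassifier(max_depth=2)
classifier = learner.fit(X_train, y_train)

# Step 5: Model Testing – apply the induced classifier to unseen emails
new_emails = [[6, 2], [0, 3]]   # (new recipients, email length in K)
print(classifier.predict(new_emails))
```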
Learning Algorithms
Linear Classifiers

[Scatter plot: New Recipients (x-axis) vs. Email Length (y-axis), with spam and ham emails as points]

How would you classify this data?
Linear Classifiers

[The same scatter plot with one candidate linear decision boundary drawn through it]

How would you classify this data?
When a new email is sent
1. We first place the new email in the space
2. Classify it according to the subspace in which it resides

[Scatter plot: the new email is plotted by its Email Length and New Recipients values and labeled by the side of the boundary on which it falls]
Linear Classifiers

[Three further slides show different candidate linear decision boundaries separating the same data]

How would you classify this data?
Linear Classifiers

[Several candidate boundaries shown together; all separate the training data]

Any of these would be fine..
..but which is best?
Classifier Margin

Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.

[Scatter plot: Email Length vs. New Recipients with the margin drawn as a band around the decision boundary]
Maximum Margin

The maximum margin linear classifier is the linear classifier with the maximum margin. This is the simplest kind of SVM (called an LSVM).

[Scatter plot: the maximum-margin boundary (Linear SVM) between the two classes]
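
A minimal sketch (my addition) of fitting a linear SVM on the two numeric features; `X_train` and `y_train` are assumed to be the arrays from the earlier sketches.

```python
from sklearn.svm import LinearSVC

# A linear SVM searches for the separating line with the maximum margin
svm = LinearSVC(C=1.0)          # C controls how soft the margin is
svm.fit(X_train, y_train)

print(svm.coef_, svm.intercept_)   # the learned linear boundary
print(svm.predict([[6, 2]]))       # classify a new email
```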
No Linear Classifier can cover all instances

[Scatter plot: Email Length vs. New Recipients where no single straight line separates spam from ham]

How would you classify this data?

• Ideally, the best decision boundary should be the one which provides an optimal performance such as in the following figure

No Linear Classifier can cover all instances

[The same scatter plot with a curved decision boundary that separates the two classes perfectly]
• However, our satisfaction is premature because the central aim of
  designing a classifier is to correctly classify novel input

              Issue of generalization!

Which one?

[Two candidate boundaries on the training data:]
   2 Errors – Simple model          0 Errors – Complicated model
Evaluating What’s Been Learned
1.   We randomly select a portion of the data to be used for training (the
     training set)
2.   Train the model on the training set.
3.   Once the model is trained, we run the model on the remaining instances
     (the test set) to see how it performs

[Scatter plot of the test instances (Email Length vs. New Recipients) alongside the resulting confusion matrix]

                         Confusion Matrix
                                      Classified As
                                      Blue    Red
                       Actual  Blue     7      1
                               Red      0      5
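
A minimal sketch of this evaluation protocol with scikit-learn; the estimator choice is illustrative and `emails` is the toy DataFrame sketched earlier.

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

# Use the two numeric attributes from the toy data set
X_num = emails[["new_recipients", "email_length_k"]]
y = emails["email_type"]

# 1. Randomly split the data into a training set and a test set
X_tr, X_te, y_tr, y_te = train_test_split(X_num, y, test_size=0.3, random_state=0)

# 2. Train the model on the training set
model = DecisionTreeClassifier().fit(X_tr, y_tr)

# 3. Run the trained model on the test set and summarize its errors
print(confusion_matrix(y_te, model.predict(X_te)))
```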
The Non-linearly separable case

[Four slides show the same scatter plot (Email Length vs. New Recipients) in which no straight line separates the classes; increasingly complex, non-linear boundaries are drawn]
Lazy Learners
• Generalization beyond the training data is
  delayed until a new instance is provided to the
  system




[Diagram: Training Set → Learner → Classifier]

Lazy Learners
       Instance-based learning

[Diagram: the Training Set itself is stored as the model; generalization is deferred until a query instance arrives]
Lazy Learner: k-Nearest Neighbors
                                • What should be k?
                                • Which distance measure should be used?

[Scatter plot: a query point is classified by the majority label of its K=3 nearest neighbors in the Email Length / New Recipients space]
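
A minimal sketch of a k-nearest-neighbors classifier with k=3 and Euclidean distance, the two design choices the questions above refer to; `X_train` and `y_train` are the toy arrays from the earlier sketches.

```python
from sklearn.neighbors import KNeighborsClassifier

# k and the distance measure are the two key design choices
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X_train, y_train)          # "training" merely stores the instances

# Generalization happens only now, when a new email arrives
print(knn.predict([[2, 3]]))
```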
Decision tree
–   A flow-chart-like tree structure
–   Internal node denotes a test on an attribute
–   Branch represents an outcome of the test
–   Leaf nodes represent class labels or class distribution
Top Down Induction of Decision Trees

[Scatter plot: the data is split at Email Length = 1.8]

                               Email Len
                                 <1.8 → Spam   (1 error)
                                 ≥1.8 → Ham    (8 errors)

                                A single level decision tree is also known as a
                                               Decision Stump
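
A minimal sketch of a decision stump (a depth-one tree) on the two numeric features of the earlier toy data; note that scikit-learn chooses its own best threshold, which need not equal the 1.8 shown on the slide.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# A decision stump is simply a decision tree limited to a single split
stump = DecisionTreeClassifier(max_depth=1)
stump.fit(X_train, y_train)

# Print the induced rule, e.g. "email_length_k <= ..."
print(export_text(stump, feature_names=["new_recipients", "email_length_k"]))
```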
Top Down Induction of Decision Trees

Comparing candidate root splits (shown over two slides):

   Email Len                          New Recip
     <1.8 → Spam   (1 error)            <2.3 → Spam   (4 errors)
     ≥1.8 → Ham    (8 errors)           ≥2.3 → Ham    (11 errors)
Top Down Induction of Decision Trees

   Email Len
     <1.8 → Spam   (1 error)
     ≥1.8 → Email Len
              <4 → Ham    (3 errors)
              ≥4 → Spam   (1 error)
Top Down Induction of Decision Trees

   Email Len
     <1.8 → Spam   (1 error)
     ≥1.8 → Email Len
              <4 → New Recip
                     <1 → Spam   (0 errors)
                     ≥1 → Ham    (1 error)
              ≥4 → Spam   (1 error)
Which One?
Overfitting and underfitting

[Plot: training error and test (generalization) error as a function of tree size]

  Overfitting (overtraining): the model learns the training set too well – it fits the
  training set so closely that it performs poorly on the test set.
  Underfitting: the model is too simple, so both training and test errors are large.
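
A minimal sketch (illustrative, not from the slides) of how such a curve can be traced: grow trees of increasing size on synthetic data and compare training and test error; the test error typically rises again once the tree overfits.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, slightly noisy two-class data (illustrative)
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(400, 2))
y_syn = (X_syn[:, 0] + X_syn[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.3, random_state=0)

for depth in (1, 2, 4, 8, 16, None):          # proxy for "tree size"
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    train_err = 1 - tree.score(X_tr, y_tr)
    test_err = 1 - tree.score(X_te, y_te)
    print(depth, round(train_err, 2), round(test_err, 2))
```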
Neural Network Model

[Diagram: inputs Age = 34, Gender = 2, Stage = 4 (independent variables) are multiplied by weights, passed through a hidden layer of sigmoid units, and combined by a second set of weights into a sigmoid output of 0.6 – the “Probability of being alive” (the dependent variable / prediction).]
“Combined logistic models”

[The same diagram, highlighting one path through the network: the output is a logistic (sigmoid) function of weighted hidden-unit outputs, each of which is itself a logistic function of the weighted inputs.]
[Two further slides highlight the remaining paths through the hidden units, and a final slide shows the complete network again: inputs Age, Gender, Stage → weights → sigmoid hidden layer → weights → sigmoid output, the “Probability of being alive” (0.6).]
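
A minimal numpy sketch of such a network's forward pass; the weight values are made up for illustration and are not the (partly unreadable) numbers on the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([34.0, 2.0, 4.0])        # inputs: Age, Gender, Stage

W1 = np.array([[0.6, 0.1, 0.7],       # weights into hidden unit 1 (illustrative)
               [0.2, 0.3, 0.2]])      # weights into hidden unit 2 (illustrative)
w2 = np.array([0.5, 0.8])             # weights from hidden layer to output

hidden = sigmoid(W1 @ x)              # hidden layer of "combined logistic models"
output = sigmoid(w2 @ hidden)         # the "probability of being alive"
print(output)
```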
Ensemble Learning
• The idea is to use multiple models to obtain
  better predictive performance than could be
  obtained from any of the constituent models.

• Boosting involves incrementally building an
  ensemble by training each new model instance to
  emphasize the training instances that previous
  models misclassified.
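
A minimal sketch of boosting with scikit-learn's AdaBoost, which builds the ensemble incrementally and re-weights the instances the previous models misclassified; `X_train` and `y_train` are the toy arrays from the earlier sketches.

```python
from sklearn.ensemble import AdaBoostClassifier

# AdaBoost's default weak learner is a decision stump (a depth-1 tree);
# each new stump is trained with higher weight on previously misclassified instances
boosted = AdaBoostClassifier(n_estimators=50)
boosted.fit(X_train, y_train)
print(boosted.predict([[2, 3]]))
```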
Example of Ensemble of Weak Classifiers

[Figure: several weak classifiers fitted on re-weighted versions of the training data, and the combined classifier produced by their weighted vote]
Main Principles
Occam's razor
                   (14th-century)


• In Latin “lex parsimoniae”, translating to “law
  of parsimony”
• The explanation of any phenomenon should
  make as few assumptions as
  possible, eliminating those that make no
  difference in the observable predictions of
  the explanatory hypothesis or theory
• The Occam Dilemma: Unfortunately, in
  ML, accuracy and simplicity (interpretability)
  are in conflict.
No Free Lunch Theorem in Machine
         Learning (Wolpert, 2001)
• “For any two learning algorithms, there are just as many situations
  (appropriately weighted) in which algorithm one is superior to
  algorithm two as vice versa, according to any of the measures of
  ‘superiority’.”
So why develop new algorithms?
• Practitioners are mostly concerned with choosing the most
  appropriate algorithm for the problem at hand
• This requires some a priori knowledge – data distribution,
  prior probabilities, complexity of the problem, the physics of
  the underlying phenomenon, etc.
• The No Free Lunch theorem tells us that – unless we have
  some a priori knowledge – simple classifiers (or complex ones
  for that matter) are not necessarily better than others.
  However, given some a priori information, certain classifiers
  may better MATCH the characteristics of certain types of
  problems.
• The main challenge of the practitioner is then to identify the
  correct match between the problem and the classifier!
  …which is yet another reason to arm yourself with a diverse
  arsenal of learners!
Less is More
The Curse of Dimensionality
     (Bellman, 1961)
Less is More
        The Curse of Dimensionality
• Learning from a high-dimensional feature space requires an
  enormous amount of training data to ensure that there are
  several samples with each combination of values.
• With a fixed number of training instances, the predictive
  power reduces as the dimensionality increases.
• As a counter-measure, many dimensionality reduction
  techniques have been proposed, and it has been shown
  that when done properly, the properties or structures of
  the objects can be well preserved even in the lower
  dimensions.
• Nevertheless, naively applying dimensionality reduction
  can lead to pathological results.
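
A minimal sketch of one standard dimensionality-reduction technique, PCA, applied to synthetic high-dimensional data; everything here is illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_high = rng.normal(size=(200, 50))      # 200 samples in a 50-dimensional feature space

# Project onto the 2 directions of largest variance
pca = PCA(n_components=2)
X_low = pca.fit_transform(X_high)

print(X_low.shape)                       # (200, 2)
print(pca.explained_variance_ratio_)     # how much structure the projection preserves
```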
While dimensionality reduction is an important tool in machine learning/data mining, we
must always be aware that it can distort the data in misleading ways.

Above is a two-dimensional projection of an intrinsically three-dimensional world…
Original photographer unknown
See also www.cs.gmu.edu/~jessica/DimReducDanger.htm   (c) eamonn keogh
Screen dumps of a short video from www.cs.gmu.edu/~jessica/DimReducDanger.htm –
I recommend you embed the original video instead

A cloud of points in 3D can be projected into 2D: XY, XZ or YZ.
   In 2D XZ we see a triangle.
   In 2D YZ we see a square.
   In 2D XY we see a circle.
Less is More?
• In the past the published advice was that high
  dimensionality is dangerous.
• But reducing dimensionality reduces the amount
  of information available for prediction.
• Today: try going in the opposite direction – instead
  of reducing dimensionality, increase it by adding
  many functions of the predictor variables.
• The higher the dimensionality of the set of
  features, the more likely it is that separation
  occurs.
Meaningfulness of Answers

     • A big data-mining risk is that you will
       “discover” patterns that are meaningless.
     • Statisticians call it Bonferroni’s principle:
       (roughly) if you look in more places for
       interesting patterns than your amount of
       data will support, you are bound to find
       crap.

69
Examples of Bonferroni’s Principle
1. Track Terrorists
2. The Rhine Paradox: a great example of how
   not to conduct scientific research.




70
Why Tracking Terrorists Is (Almost)
               Impossible!
• Suppose we believe that certain groups of evil-
  doers are meeting occasionally in hotels to plot
  doing evil.
• We want to find (unrelated) people who at
  least twice have stayed at the same hotel on
  the same day.



71
The Details
• 10^9 people being tracked.
• 1,000 days.
• Each person stays in a hotel 1% of the time
  (10 days out of 1,000).
• Hotels hold 100 people (so 10^5 hotels).
• If everyone behaves randomly (i.e., no evil-
  doers) will the data mining detect anything
  suspicious?
72
Calculations – (1)

[Diagram: p at some hotel, q at some hotel, same hotel]

• Probability that given persons p and q will be at the same hotel on a given day d:
   – 1/100 × 1/100 × 10^−5 = 10^−9.
• Probability that p and q will be at the same hotel on given days d1 and d2:
   – 10^−9 × 10^−9 = 10^−18.
• Pairs of days:
   – ≈ 5 × 10^5.
73
Calculations – (2)

• Probability that p and q will be at the same hotel on some two days:
   – 5 × 10^5 × 10^−18 = 5 × 10^−13.
• Pairs of people:
   – ≈ 5 × 10^17.
• Expected number of “suspicious” pairs of people:
   – 5 × 10^17 × 5 × 10^−13 = 250,000.
74
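
A minimal sketch reproducing the arithmetic above under the slide's independence assumptions.

```python
from math import comb

people = 10**9
days = 1000
hotels = 10**5
p_in_hotel = 0.01                      # each person is in some hotel 1% of days

# P(two given people are in the same hotel on a given day)
p_same_day = p_in_hotel * p_in_hotel / hotels          # 1e-9
# P(they coincide on two given days)
p_two_days = p_same_day ** 2                            # 1e-18

pairs_of_days = comb(days, 2)          # ≈ 5e5
pairs_of_people = comb(people, 2)      # ≈ 5e17

expected_suspicious = pairs_of_people * pairs_of_days * p_two_days
print(expected_suspicious)             # ≈ 250,000
```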
Conclusion
• Suppose there are (say) 10 pairs of evil-doers
  who definitely stayed at the same hotel twice.
• Analysts have to sift through 250,010
  candidates to find the 10 real cases.
     – Not gonna happen.
     – But how can we improve the scheme?




75
Moral
• When looking for a property (e.g., “two
  people stayed at the same hotel twice”), make
  sure that the property does not allow so many
  possibilities that random data will surely
  produce facts “of interest.”




76
Rhine Paradox – (1)

• Joseph Rhine was a parapsychologist in the
  1950’s who hypothesized that some people
  had Extra-Sensory Perception.
• He devised (something like) an experiment
  where subjects were asked to guess 10 hidden
  cards – red or blue.
• He discovered that almost 1 in 1000 had ESP –
  they were able to get all 10 right!
 77
Rhine Paradox – (2)
• He told these people they had ESP and called
  them in for another test of the same type.
• Alas, he discovered that almost all of them
  had lost their ESP.
• What did he conclude?
     – Answer on next slide.




78
Rhine Paradox – (3)
• He concluded that you shouldn’t tell people
  they have ESP; it causes them to lose it.




79
Moral
• Understanding Bonferroni’s Principle will help
  you look a little less stupid than a
  parapsychologist.




80
Instability and the
                    Rashomon Effect
• Rashomon is a Japanese movie in which four
  people, from different vantage points, witness
  a criminal incident. When they come to testify
  in court, they all report the same facts, but
  their stories of what happened are very
  different.
• The Rashomon effect is the effect of the
  subjectivity of perception on recollection.
• The Rashomon Effect in ML is that there is
  often a multitude of classifiers giving about
  the same minimum error rate.
• For example, in decision trees, if the training set
  is perturbed only slightly, one can get a tree quite
  different from the original but with almost the
  same test set error.
The Wisdom of Crowds
Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes

               Business, Economies, Societies and Nations




              • Under certain controlled conditions, the
                aggregation of information in groups
                results in decisions that are often
                superior to those that could have been
                made by any single member – even an expert.
              • This imitates our second nature of seeking
                several opinions before making any
                crucial decision: we weigh the individual
                opinions and combine them to reach a
                final decision.
Committees of Experts
   – “ … a medical school that has the objective that all
     students, given a problem, come up with an identical
     solution”

• There is not much point in setting up a committee of
  experts from such a group - such a committee will not
  improve on the judgment of an individual.

• Consider:
   – There needs to be disagreement for the committee to have
     the potential to be better than an individual.
Other Learning Tasks
Supervised Learning - Multi Class

[Scatter plot: Email Length vs. New Recipients with instances belonging to more than two classes]
Supervised Learning - Multi Label
Multi-label learning refers to the classification problem where each example can be
assigned to multiple class labels simultaneously
Supervised Learning - Regression
Find a relationship between a numeric dependent variable and one or more
independent variables

[Scatter plot with a fitted regression line: Email Length as a function of New Recipients]
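
A minimal sketch of fitting such a relationship with ordinary least squares; the data here is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
new_recipients = rng.integers(0, 10, size=50).reshape(-1, 1)               # independent variable
email_length = 1.5 * new_recipients.ravel() + rng.normal(scale=0.5, size=50)  # dependent variable

reg = LinearRegression().fit(new_recipients, email_length)
print(reg.coef_, reg.intercept_)       # the fitted linear relationship
print(reg.predict([[4]]))              # predicted email length for 4 new recipients
```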
Unsupervised Learning - Clustering
  Clustering is the assignment of a set of observations into subsets (called
  clusters) so that observations in the same cluster are similar in some sense

[Scatter plot: the unlabeled emails form a few visible groups in the Email Length / New Recipients space]
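
A minimal k-means sketch; the number of clusters and the data are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Unlabeled points: two features per email (new recipients, email length)
X_unlabeled = np.vstack([rng.normal([2, 2], 0.5, size=(30, 2)),
                         rng.normal([7, 6], 0.5, size=(30, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_unlabeled)
print(kmeans.labels_[:10])        # cluster assignment of the first 10 points
print(kmeans.cluster_centers_)
```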
Unsupervised Learning – Anomaly Detection
   Detecting patterns in a given data set that do not conform to an established
   normal behavior.

[Scatter plot: most emails fall in a dense region; a few isolated points are flagged as anomalies]
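
A minimal sketch with Isolation Forest, one common anomaly detector; the data and the contamination rate are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal([3, 3], 0.5, size=(100, 2))     # "normal behavior"
outliers = np.array([[9.0, 1.0], [0.0, 9.0]])       # points that do not conform
X_all = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.02, random_state=0).fit(X_all)
print(detector.predict(X_all)[-5:])   # -1 marks detected anomalies, +1 normal points
```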
Source of Training Data
• Provided random examples outside of the learner’s control.
    – Passive Learning
    – Negative examples available or only positive? Semi-Supervised Learning
    – Imbalanced
• Good training examples selected by a “benevolent teacher.”
    – “Near miss” examples
• Learner can query an oracle about class of an unlabeled example in the
  environment.
    – Active Learning
• Learner can construct an arbitrary example and query an oracle for its label.
• Learner can run directly in the environment without any human guidance and
  obtain feedback.
    – Reinforcement Learning
• There is no existing class concept
    – A form of discovery
    – Unsupervised Learning
         • Clustering
         • Association Rules




                                             90
Other Learning Tasks
• Other Supervised Learning Settings
    –   Multi-Class Classification
    –   Multi-Label Classification
    –   Semi-supervised classification – make use of labeled and unlabeled data
    –   One Class Classification – only instances from one label are given
• Ranking and Preference Learning
• Sequence labeling
• Cost-sensitive Learning
• Online learning and Incremental Learning- Learns one instance at a
  time.
• Concept Drift
• Multi-Task and Transfer Learning
• Collective classification – When instances are dependent!
Software
Want to Learn More?

• Thomas Mitchell (1997), Machine Learning, McGraw-Hill.

• R. Duda, P. Hart, and D. Stork (2000), Pattern
  Classification, Second Edition.

• Ian H. Witten, Eibe Frank, Mark A. Hall (2011), Data
  Mining: Practical Machine Learning Tools and
  Techniques, Third Edition, Morgan Kaufmann.

• Oded Maimon, Lior Rokach (2010), Data Mining and
  Knowledge Discovery Handbook, Second Edition, Springer.

Contenu connexe

Tendances

Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)SwatiTripathi44
 
Machine Learning
Machine LearningMachine Learning
Machine LearningKumar P
 
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete DeckAI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete DeckSlideTeam
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.butest
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine LearningScaleway
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Suraj Aavula
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Marina Santini
 
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...SlideTeam
 
neural network
neural networkneural network
neural networkSTUDENT
 
Machine Learning
Machine LearningMachine Learning
Machine LearningRahul Kumar
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 
Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine LearningSamra Shahzadi
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision treesKnoldus Inc.
 

Tendances (20)

Machine learning
Machine learning Machine learning
Machine learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
Machine Can Think
Machine Can ThinkMachine Can Think
Machine Can Think
 
Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete DeckAI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck
AI Vs ML Vs DL PowerPoint Presentation Slide Templates Complete Deck
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
Fraud detection with Machine Learning
Fraud detection with Machine LearningFraud detection with Machine Learning
Fraud detection with Machine Learning
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?
 
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...
Artificial Intelligence And Machine Learning PowerPoint Presentation Slides C...
 
Deep learning
Deep learningDeep learning
Deep learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
neural network
neural networkneural network
neural network
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine Learning
 
Machine Learning with Decision trees
Machine Learning with Decision treesMachine Learning with Decision trees
Machine Learning with Decision trees
 

En vedette

Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.pptbutest
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock marketsNikola Milosevic
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial IntelligenceAbbas Hashmi
 
Is SIEM really Dead ? OR Can it evolve into a Platform ?
Is SIEM really Dead ? OR Can it evolve into a Platform ?Is SIEM really Dead ? OR Can it evolve into a Platform ?
Is SIEM really Dead ? OR Can it evolve into a Platform ?Aujas
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learningbutest
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksBICA Labs
 
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ..."Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...Edge AI and Vision Alliance
 
Introduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABIntroduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABRay Phan
 
Agile Software Development Scrum Vs Lean
Agile Software Development Scrum Vs LeanAgile Software Development Scrum Vs Lean
Agile Software Development Scrum Vs LeanAbdul Wahid
 
Image Processing Basics
Image Processing BasicsImage Processing Basics
Image Processing BasicsNam Le
 
Directions towards a cool consumer review platform using machine learning (ml...
Directions towards a cool consumer review platform using machine learning (ml...Directions towards a cool consumer review platform using machine learning (ml...
Directions towards a cool consumer review platform using machine learning (ml...Dhwaj Raj
 
.Net development with Azure Machine Learning (AzureML) Nov 2014
.Net development with Azure Machine Learning (AzureML) Nov 2014.Net development with Azure Machine Learning (AzureML) Nov 2014
.Net development with Azure Machine Learning (AzureML) Nov 2014Mark Tabladillo
 
Lessons learned
Lessons learnedLessons learned
Lessons learnedhexgnu
 
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction setSaumitra Rukmangad
 
Is Machine learning for your business? - Girls in Tech Luxembourg
Is Machine learning for your business? - Girls in Tech LuxembourgIs Machine learning for your business? - Girls in Tech Luxembourg
Is Machine learning for your business? - Girls in Tech LuxembourgMarie-Adélaïde Gervis
 
Assignment of arbitrarily distributed random samples to the fixed probability...
Assignment of arbitrarily distributed random samples to the fixed probability...Assignment of arbitrarily distributed random samples to the fixed probability...
Assignment of arbitrarily distributed random samples to the fixed probability...Denis Dus
 
Reproducibility and automation of machine learning process
Reproducibility and automation of machine learning processReproducibility and automation of machine learning process
Reproducibility and automation of machine learning processDenis Dus
 
Machine Learning on Big Data
Machine Learning on Big DataMachine Learning on Big Data
Machine Learning on Big DataMax Lin
 

En vedette (20)

Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 
MachineLearning.ppt
MachineLearning.pptMachineLearning.ppt
MachineLearning.ppt
 
Machine Learning for Dummies
Machine Learning for DummiesMachine Learning for Dummies
Machine Learning for Dummies
 
Machine learning prediction of stock markets
Machine learning prediction of stock marketsMachine learning prediction of stock markets
Machine learning prediction of stock markets
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Is SIEM really Dead ? OR Can it evolve into a Platform ?
Is SIEM really Dead ? OR Can it evolve into a Platform ?Is SIEM really Dead ? OR Can it evolve into a Platform ?
Is SIEM really Dead ? OR Can it evolve into a Platform ?
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learning
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ..."Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded ...
 
Introduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABIntroduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLAB
 
Agile Software Development Scrum Vs Lean
Agile Software Development Scrum Vs LeanAgile Software Development Scrum Vs Lean
Agile Software Development Scrum Vs Lean
 
Image Processing Basics
Image Processing BasicsImage Processing Basics
Image Processing Basics
 
Directions towards a cool consumer review platform using machine learning (ml...
Directions towards a cool consumer review platform using machine learning (ml...Directions towards a cool consumer review platform using machine learning (ml...
Directions towards a cool consumer review platform using machine learning (ml...
 
.Net development with Azure Machine Learning (AzureML) Nov 2014
.Net development with Azure Machine Learning (AzureML) Nov 2014.Net development with Azure Machine Learning (AzureML) Nov 2014
.Net development with Azure Machine Learning (AzureML) Nov 2014
 
Lessons learned
Lessons learnedLessons learned
Lessons learned
 
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
 
Is Machine learning for your business? - Girls in Tech Luxembourg
Is Machine learning for your business? - Girls in Tech LuxembourgIs Machine learning for your business? - Girls in Tech Luxembourg
Is Machine learning for your business? - Girls in Tech Luxembourg
 
Assignment of arbitrarily distributed random samples to the fixed probability...
Assignment of arbitrarily distributed random samples to the fixed probability...Assignment of arbitrarily distributed random samples to the fixed probability...
Assignment of arbitrarily distributed random samples to the fixed probability...
 
Reproducibility and automation of machine learning process
Reproducibility and automation of machine learning processReproducibility and automation of machine learning process
Reproducibility and automation of machine learning process
 
Machine Learning on Big Data
Machine Learning on Big DataMachine Learning on Big Data
Machine Learning on Big Data
 

Similaire à Introduction to Machine Learning

Mlintro 120730222641-phpapp01-210624192524
Mlintro 120730222641-phpapp01-210624192524Mlintro 120730222641-phpapp01-210624192524
Mlintro 120730222641-phpapp01-210624192524Scott Domes
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMIRJET Journal
 
miniproject.ppt.pptx
miniproject.ppt.pptxminiproject.ppt.pptx
miniproject.ppt.pptxAnush90
 
final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxinfotowards
 
Classifying Text using CNN
Classifying Text using CNNClassifying Text using CNN
Classifying Text using CNNSomnath Banerjee
 
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx0901CS211114SOURAVDI
 
Final spam-e-mail-detection
Final  spam-e-mail-detectionFinal  spam-e-mail-detection
Final spam-e-mail-detectionPartnered Health
 
Email Classification
Email ClassificationEmail Classification
Email ClassificationXi Chen
 
machine learning project with advanced technology and am uploading it for my ...
machine learning project with advanced technology and am uploading it for my ...machine learning project with advanced technology and am uploading it for my ...
machine learning project with advanced technology and am uploading it for my ...vamsikuntumalla
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsSanghamitra Deb
 
Icacci presentation-isi-ransomware
Icacci presentation-isi-ransomwareIcacci presentation-isi-ransomware
Icacci presentation-isi-ransomwarevinaykumar R
 
HBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBaseHBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBaseHBaseCon
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Venkat Projects
 
Deep belief networks for spam filtering
Deep belief networks for spam filteringDeep belief networks for spam filtering
Deep belief networks for spam filteringSOYEON KIM
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & MetricsSanghamitra Deb
 
Study of Various Techniques to Filter Spam Emails
Study of Various Techniques to Filter Spam EmailsStudy of Various Techniques to Filter Spam Emails
Study of Various Techniques to Filter Spam EmailsIRJET Journal
 
mailfilter.ppt
mailfilter.pptmailfilter.ppt
mailfilter.pptbutest
 
Guiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineGuiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineMichael Gerke
 

Similaire à Introduction to Machine Learning (20)

Mlintro 120730222641-phpapp01-210624192524
Mlintro 120730222641-phpapp01-210624192524Mlintro 120730222641-phpapp01-210624192524
Mlintro 120730222641-phpapp01-210624192524
 
A Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVMA Survey on Spam Filtering Methods and Mapreduce with SVM
A Survey on Spam Filtering Methods and Mapreduce with SVM
 
Sms spam classification
Sms spam classificationSms spam classification
Sms spam classification
 
miniproject.ppt.pptx
miniproject.ppt.pptxminiproject.ppt.pptx
miniproject.ppt.pptx
 
final-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptxfinal-spam-e-mail-detection-180125111231.pptx
final-spam-e-mail-detection-180125111231.pptx
 
Classifying Text using CNN
Classifying Text using CNNClassifying Text using CNN
Classifying Text using CNN
 
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx
671gdhfhfghhfhfghfghfghfgh163663-Project-2-PPT.pptx
 
Final spam-e-mail-detection
Final  spam-e-mail-detectionFinal  spam-e-mail-detection
Final spam-e-mail-detection
 
Email Classification
Email ClassificationEmail Classification
Email Classification
 
Haicku submission
Haicku submissionHaicku submission
Haicku submission
 
machine learning project with advanced technology and am uploading it for my ...
machine learning project with advanced technology and am uploading it for my ...machine learning project with advanced technology and am uploading it for my ...
machine learning project with advanced technology and am uploading it for my ...
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Icacci presentation-isi-ransomware
Icacci presentation-isi-ransomwareIcacci presentation-isi-ransomware
Icacci presentation-isi-ransomware
 
HBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBaseHBaseCon 2015: Running ML Infrastructure on HBase
HBaseCon 2015: Running ML Infrastructure on HBase
 
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
Emailphishing(deep anti phishnet applying deep neural networks for phishing e...
 
Deep belief networks for spam filtering
Deep belief networks for spam filteringDeep belief networks for spam filtering
Deep belief networks for spam filtering
 
NLP Classifier Models & Metrics
NLP Classifier Models & MetricsNLP Classifier Models & Metrics
NLP Classifier Models & Metrics
 
Study of Various Techniques to Filter Spam Emails
Study of Various Techniques to Filter Spam EmailsStudy of Various Techniques to Filter Spam Emails
Study of Various Techniques to Filter Spam Emails
 
mailfilter.ppt
mailfilter.pptmailfilter.ppt
mailfilter.ppt
 
Guiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineGuiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning Pipeline
 

Dernier

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
Introduction to Machine Learning

  • 1. Introduction to Machine Learning Lior Rokach Department of Information Systems Engineering Ben-Gurion University of the Negev
  • 2. About Me Prof. Lior Rokach Department of Information Systems Engineering Faculty of Engineering Sciences Head of the Machine Learning Lab Ben-Gurion University of the Negev Email: liorrk@bgu.ac.il http://www.ise.bgu.ac.il/faculty/liorr/ PhD (2004) from Tel Aviv University
  • 3. Machine Learning • Herbert Alexander Simon: “Learning is any process by which a system improves performance from experience.” • “Machine Learning is concerned with computer programs that automatically improve their performance through experience.” (Herbert Simon, Turing Award 1975, Nobel Prize in Economics 1978)
  • 4. Why Machine Learning? • Develop systems that can automatically adapt and customize themselves to individual users. – Personalized news or mail filter • Discover new knowledge from large databases (data mining). – Market basket analysis (e.g. diapers and beer) • Ability to mimic human and replace certain monotonous tasks - which require some intelligence. • like recognizing handwritten characters • Develop systems that are too difficult/expensive to construct manually because they require specific detailed skills or knowledge tuned to a specific task (knowledge engineering bottleneck). 4
  • 5. Why now? • Flood of available data (especially with the advent of the Internet) • Increasing computational power • Growing progress in available algorithms and theory developed by researchers • Increasing support from industries 5
  • 7. The concept of learning in a ML system • Learning = Improving with experience at some task – Improve over task T, – With respect to performance measure, P – Based on experience, E. 7
  • 8. Motivating Example Learning to Filter Spam Example: Spam Filtering Spam - is all email the user does not want to receive and has not asked to receive T: Identify Spam Emails P: % of spam emails that were filtered % of ham/ (non-spam) emails that were incorrectly filtered-out E: a database of emails that were labelled by users
  • 10. The Learning Process Model Learning Model Testing
  • 11. The Learning Process in our Example Model Learning Model Testing ● Number of recipients ● Size of message ● Number of attachments ● Number of "re's" in the subject line Email Server …
  • 12. Data Set – input attributes and target attribute:

      Number of new Recipients | Email Length (K) | Country (IP) | Customer Type | Email Type
      0                        | 2                | Germany      | Gold          | Ham
      1                        | 4                | Germany      | Silver        | Ham
      5                        | 2                | Nigeria      | Bronze        | Spam
      2                        | 4                | Russia       | Bronze        | Spam
      3                        | 4                | Germany      | Bronze        | Ham
      0                        | 1                | USA          | Silver        | Ham
      4                        | 2                | USA          | Silver        | Spam

    Each row is an instance. Number of new Recipients and Email Length are numeric, Country (IP) is nominal, Customer Type is ordinal; Email Type is the target attribute.
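To make the toy data set concrete, here is a minimal sketch that loads the same seven instances into a DataFrame. It assumes Python with pandas installed; the column and variable names are illustrative choices, not from the original slides.

```python
import pandas as pd

# Toy spam data set from the slide: four input attributes and one target attribute.
emails = pd.DataFrame(
    {
        "new_recipients": [0, 1, 5, 2, 3, 0, 4],                         # numeric
        "email_length_k": [2, 4, 2, 4, 4, 1, 2],                         # numeric
        "country_ip":     ["Germany", "Germany", "Nigeria", "Russia",
                           "Germany", "USA", "USA"],                      # nominal
        "customer_type":  ["Gold", "Silver", "Bronze", "Bronze",
                           "Bronze", "Silver", "Silver"],                 # ordinal
        "email_type":     ["Ham", "Ham", "Spam", "Spam",
                           "Ham", "Ham", "Spam"],                         # target attribute
    }
)

X = emails.drop(columns="email_type")   # input attributes
y = emails["email_type"]                # target attribute
print(X.shape, y.value_counts().to_dict())
```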
  • 13. Step 4: Model Learning – a training set drawn from the email database is fed to an induction algorithm (the learner, or inducer), which outputs a classification model (the classifier).
  • 14. Step 5: Model Testing – the same pipeline (database → training set → induction algorithm → classification model) is shown again; the induced classifier is now applied to further emails to test it.
  • 16.
  • 17.
  • 18. Error
  • 19.
  • 20.
  • 21.
  • 22. Linear Classifiers Email Length How would you classify this data? New Recipients
  • 23. Linear Classifiers Email Length How would you classify this data? New Recipients
  • 24. When a new email is sent 1. We first place the new email in the space 2. Classify it according to the subspace in which it resides Email Length New Recipients
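As a concrete illustration of classifying by subspace, the sketch below uses a made-up weight vector and bias (not taken from the slides) to place a new email in the two-feature space and check which side of a linear boundary it falls on:

```python
import numpy as np

# A linear boundary w·x + b = 0 over the two features
# (new recipients, email length); the weights here are invented for illustration.
w = np.array([0.8, -0.5])
b = -1.0

def classify(new_recipients, email_length):
    """Return 'Spam' if the point lies on the positive side of the boundary."""
    x = np.array([new_recipients, email_length])
    score = w @ x + b
    return "Spam" if score > 0 else "Ham"

print(classify(new_recipients=5, email_length=2))   # which subspace does it fall in?
print(classify(new_recipients=0, email_length=4))
```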
  • 25. Linear Classifiers Email Length How would you classify this data? New Recipients
  • 26. Linear Classifiers Email Length How would you classify this data? New Recipients
  • 27. Linear Classifiers Email Length How would you classify this data? New Recipients
  • 28. Linear Classifiers Email Length Any of these would be fine.. ..but which is best? New Recipients
  • 29. Classifier Margin – define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.
  • 30. Maximum Margin – the maximum margin linear classifier is the linear classifier with the maximum margin. This is the simplest kind of SVM (called an LSVM): a Linear SVM.
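A maximum-margin linear classifier can be fit with an off-the-shelf SVM; the sketch below assumes scikit-learn and a small synthetic two-feature data set rather than the slides' data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two roughly separable clouds in the (new recipients, email length) plane.
ham  = rng.normal(loc=[2.0, 6.0], scale=1.0, size=(40, 2))
spam = rng.normal(loc=[7.0, 2.0], scale=1.0, size=(40, 2))
X = np.vstack([ham, spam])
y = np.array([0] * 40 + [1] * 40)          # 0 = ham, 1 = spam

# A linear-kernel SVM chooses the separating hyperplane with the largest margin.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print("weights:", clf.coef_, "bias:", clf.intercept_)
print("prediction for a new email:", clf.predict([[6.0, 1.5]]))
```

The fitted `coef_` and `intercept_` play the role of the w and b in the earlier hand-written decision rule, except that here they are chosen to maximize the margin.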
  • 31. No Linear Classifier can cover all instances Email Length How would you classify this data? New Recipients
  • 32. Ideally, the best decision boundary should be the one that provides optimal performance, such as in the following figure.
  • 33. No Linear Classifier can cover all instances Email Length New Recipients
  • 34. However, our satisfaction is premature, because the central aim of designing a classifier is to correctly classify novel input: the issue of generalization!
  • 35. Which one? The simple model makes 2 errors on the training data; the complicated model makes 0 errors.
  • 36. Evaluating What’s Been Learned
    1. We randomly select a portion of the data to be used for training (the training set).
    2. Train the model on the training set.
    3. Once the model is trained, we run the model on the remaining instances (the test set) to see how it performs.

    Confusion Matrix:
                   Classified Blue | Classified Red
    Actual Blue           7        |       1
    Actual Red            0        |       5
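A minimal sketch of this train/test evaluation protocol, assuming scikit-learn and synthetic two-class data (the class sizes and resulting counts are illustrative, not the slide's 7/1/0/5 matrix):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(1)
X = np.vstack([rng.normal([2, 6], 1.0, (50, 2)),    # "blue" class
               rng.normal([7, 2], 1.0, (50, 2))])   # "red" class
y = np.array([0] * 50 + [1] * 50)

# Steps 1-2: hold out part of the data for testing, train on the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = SVC(kernel="linear").fit(X_train, y_train)

# Step 3: evaluate on the unseen test set.
print(confusion_matrix(y_test, model.predict(X_test)))
```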
  • 37. The Non-linearly separable case Email Length New Recipients
  • 38. The Non-linearly separable case Email Length New Recipients
  • 39. The Non-linearly separable case Email Length New Recipients
  • 40. The Non-linearly separable case Email Length New Recipients
  • 41. Lazy Learners • Generalization beyond the training data is delayed until a new instance is provided to the system Training Set Learner Classifier
  • 42. Lazy Learners Instance-based learning Training Set
  • 43. Lazy Learner: k-Nearest Neighbors • What should be k? • Which distance measure should be used? (figure: classification with K = 3 in the Email Length / New Recipients plane)
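A k-nearest-neighbour classifier with k = 3 can be sketched with scikit-learn; the synthetic data below and the explicit choice of Euclidean distance are assumptions for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
X = np.vstack([rng.normal([2, 6], 1.0, (30, 2)),
               rng.normal([7, 2], 1.0, (30, 2))])
y = np.array(["Ham"] * 30 + ["Spam"] * 30)

# Lazy learner: fit() only stores the training set; the real work happens at
# prediction time, when the 3 nearest training instances (Euclidean distance)
# vote on the label of the new point.
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean").fit(X, y)
print(knn.predict([[5.0, 3.0]]))
```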
  • 44. Decision tree – A flow-chart-like tree structure – Internal node denotes a test on an attribute – Branch represents an outcome of the test – Leaf nodes represent class labels or class distribution
  • 45. Top Down Induction of Decision Trees – start with a single split on Email Len at 1.8: one leaf predicts Spam (1 error), the other predicts Ham (8 errors). A single-level decision tree like this is also known as a Decision Stump.
  • 46.–47. For comparison, a stump splitting on New Recip at 2.3 gives leaves with 4 and 11 errors, so Email Len is the better attribute for the root split.
  • 48. The tree is grown top-down: the Email Len ≥ 1.8 branch is split again on Email Len at 4, giving leaves Ham (3 errors) and Spam (1 error).
  • 49. A further split on New Recip at 1 refines that branch into leaves Spam (0 errors) and Ham (1 error), leaving only a few training errors in the whole tree.
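Top-down induction of a stump and of a deeper tree can be sketched with scikit-learn; the synthetic labels below only roughly mimic the slides' spam rule, and the thresholds the algorithm finds will differ from the slides':

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(3)
X = np.column_stack([rng.uniform(0, 6, 100),     # email length
                     rng.integers(0, 6, 100)])   # new recipients
y = np.where((X[:, 0] < 1.8) | (X[:, 1] < 1), "Spam", "Ham")   # made-up rule

stump = DecisionTreeClassifier(max_depth=1).fit(X, y)   # a decision stump
tree  = DecisionTreeClassifier(max_depth=3).fit(X, y)   # a deeper tree
print(export_text(stump, feature_names=["email_len", "new_recip"]))
print(export_text(tree,  feature_names=["email_len", "new_recip"]))
```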
  • 51. Overfitting and underfitting (figure: training and test error versus tree size). Overfitting (overtraining) means that the model learns the training set too well: it fits the training set so closely that it performs poorly on the test set. Underfitting occurs when the model is too simple, so both training and test errors are large.
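One way to see this trade-off is to track training and test error as the tree is allowed to grow deeper; a minimal sketch, assuming scikit-learn and synthetic data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=300) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for depth in (1, 2, 4, 8, 16, None):   # None lets the tree grow until leaves are pure
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(depth, 1 - clf.score(X_tr, y_tr), 1 - clf.score(X_te, y_te))
# Training error keeps falling with depth, while test error eventually rises again:
# the deep trees overfit, the depth-1 stump underfits.
```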
  • 52. Neural Network Model – inputs (Age, Gender, Stage) are multiplied by weights and fed into a hidden layer of sigmoid units; a second layer of weights combines the hidden units into a single output, the “Probability of being Alive” (0.6 for the example inputs Age 34, Gender 2, Stage 4). Independent variables → weights → hidden layer → weights → dependent variable (prediction). (Machine Learning, Dr. Lior Rokach, Ben-Gurion University)
  • 53.–56. “Combined logistic models” – the same network viewed path by path: each hidden unit is a logistic model of the weighted inputs, and the output unit combines the hidden activations with its own weights, so the whole network is a combination of logistic models.
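A small feed-forward network of this shape (three inputs, one hidden layer of logistic units, one output) can be sketched with scikit-learn's MLPClassifier; the patient data and the made-up survival rule below are purely illustrative:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(5)
# Synthetic patients: age, gender (0/1), disease stage (1-4); label 1 = alive.
X = np.column_stack([rng.integers(20, 90, 200),
                     rng.integers(0, 2, 200),
                     rng.integers(1, 5, 200)])
y = (X[:, 0] < 60).astype(int)          # a made-up survival rule, for illustration only

# One hidden layer of 3 logistic (sigmoid) units combined into a single output.
net = MLPClassifier(hidden_layer_sizes=(3,), activation="logistic",
                    max_iter=2000, random_state=0).fit(X, y)
print("Probability of being alive:", net.predict_proba([[34, 1, 4]])[0, 1])
```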
  • 57. Ensemble Learning • The idea is to use multiple models to obtain better predictive performance than could be obtained from any of the constituent models. • Boosting involves incrementally building an ensemble by training each new model instance to emphasize the training instances that previous models misclassified.
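A boosted ensemble of weak classifiers can be sketched with AdaBoost in scikit-learn, whose default weak learner is a depth-1 decision stump; the synthetic, non-linearly separable data below is an assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(6)
X = rng.normal(size=(400, 2))
y = ((X[:, 0] ** 2 + X[:, 1] ** 2) < 1.0).astype(int)   # not linearly separable

# AdaBoost's default weak learner is a depth-1 decision stump; each new stump is
# trained with higher weight on the instances the previous stumps misclassified,
# and the final prediction is a weighted vote of all stumps.
ens = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print("training accuracy:", ens.score(X, y))
```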
  • 58. Example of Ensemble of Weak Classifiers Training Combined classifier
  • 60. Occam's razor (14th-century) • In Latin “lex parsimoniae”, translating to “law of parsimony” • The explanation of any phenomenon should make as few assumptions as possible, eliminating those that make no difference in the observable predictions of the explanatory hypothesis or theory • The Occam Dilemma: Unfortunately, in ML, accuracy and simplicity (interpretability) are in conflict.
  • 61. No Free Lunch Theorem in Machine Learning (Wolpert, 2001) • “For any two learning algorithms, there are just as many situations (appropriately weighted) in which algorithm one is superior to algorithm two as vice versa, according to any of the measures of ‘superiority’.”
  • 62. So why develop new algorithms? • Practitioners are mostly concerned with choosing the most appropriate algorithm for the problem at hand • This requires some a priori knowledge – data distribution, prior probabilities, complexity of the problem, the physics of the underlying phenomenon, etc. • The No Free Lunch theorem tells us that – unless we have some a priori knowledge – simple classifiers (or complex ones, for that matter) are not necessarily better than others. However, given some a priori information, certain classifiers may better MATCH the characteristics of certain types of problems. • The main challenge of the practitioner is then to identify the correct match between the problem and the classifier! …which is yet another reason to arm yourself with a diverse arsenal of learners!
  • 63. Less is More The Curse of Dimensionality (Bellman, 1961)
  • 64. Less is More The Curse of Dimensionality • Learning from a high-dimensional feature space requires an enormous amount of training to ensure that there are several samples with each combination of values. • With a fixed number of training instances, the predictive power reduces as the dimensionality increases. • As a counter-measure, many dimensionality reduction techniques have been proposed, and it has been shown that when done properly, the properties or structures of the objects can be well preserved even in the lower dimensions. • Nevertheless, naively applying dimensionality reduction can lead to pathological results.
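As one common dimensionality-reduction counter-measure, here is a PCA sketch assuming scikit-learn; the 50-dimensional synthetic data and the choice of two components are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
# 100 instances in a 50-dimensional feature space whose variation
# really lives on a few underlying directions plus a little noise.
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 50)) + 0.05 * rng.normal(size=(100, 50))

pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)
print(X_2d.shape, pca.explained_variance_ratio_)
# Done properly, the projection preserves much of the structure, but (as the
# next slides warn) a careless projection can also hide or distort structure.
```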
  • 65. While dimensionality reduction is an important tool in machine learning/data mining, we must always be aware that it can distort the data in misleading ways. Above is a two dimensional projection of an intrinsically three dimensional world….
  • 66. Original photographer unknown See also www.cs.gmu.edu/~jessica/DimReducDanger.htm (c) eamonn keogh
  • 67. Screen dumps of a short video from www.cs.gmu.edu/~jessica/DimReducDanger.htm (I recommend you embed the original video instead). A cloud of points in 3D can be projected into 2D as XY, XZ or YZ: in the XZ projection we see a triangle, in YZ a square, and in XY a circle.
  • 68. Less is More? • In the past the published advice was that high dimensionality is dangerous. • But reducing dimensionality reduces the amount of information available for prediction. • Today: try going in the opposite direction: instead of reducing dimensionality, increase it by adding many functions of the predictor variables. • The higher the dimensionality of the feature set, the more likely it is that separation occurs.
  • 69. Meaningfulness of Answers • A big data-mining risk is that you will “discover” patterns that are meaningless. • Statisticians call it Bonferroni’s principle: (roughly) if you look in more places for interesting patterns than your amount of data will support, you are bound to find crap. 69
  • 70. Examples of Bonferroni’s Principle 1. Track Terrorists 2. The Rhine Paradox: a great example of how not to conduct scientific research. 70
  • 71. Why Tracking Terrorists Is (Almost) Impossible! • Suppose we believe that certain groups of evil- doers are meeting occasionally in hotels to plot doing evil. • We want to find (unrelated) people who at least twice have stayed at the same hotel on the same day. 71
  • 72. The Details • 10^9 people being tracked. • 1,000 days. • Each person stays in a hotel 1% of the time (10 days out of 1,000). • Hotels hold 100 people (so 10^5 hotels). • If everyone behaves randomly (i.e., no evil-doers) will the data mining detect anything suspicious?
  • 73. Calculations – (1) • Probability that given persons p and q will be at the same hotel on a given day d: 1/100 × 1/100 × 10^-5 = 10^-9. • Probability that p and q will be at the same hotel on given days d1 and d2: 10^-9 × 10^-9 = 10^-18. • Pairs of days: 5×10^5.
  • 74. Calculations – (2) • Probability that p and q will be at the same hotel on some two days: 5×10^5 × 10^-18 = 5×10^-13. • Pairs of people: 5×10^17. • Expected number of “suspicious” pairs of people: 5×10^17 × 5×10^-13 = 250,000.
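These back-of-the-envelope numbers can be reproduced directly in a few lines of Python, under the same assumptions as the slides:

```python
from math import comb

people, days, hotels = 10**9, 1000, 10**5
p_in_hotel = 0.01                      # each person is in some hotel 1% of days

# Probability two given people are in the same hotel on one given day.
p_same_day = p_in_hotel * p_in_hotel / hotels             # 1e-9
# Probability they coincide on two specific days, then on some pair of days.
p_two_days = p_same_day ** 2                              # 1e-18
day_pairs = comb(days, 2)                                 # ~5e5
p_pair_suspicious = day_pairs * p_two_days                # ~5e-13

people_pairs = comb(people, 2)                            # ~5e17
print("expected suspicious pairs:", people_pairs * p_pair_suspicious)  # ~250,000
```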
  • 75. Conclusion • Suppose there are (say) 10 pairs of evil-doers who definitely stayed at the same hotel twice. • Analysts have to sift through 250,010 candidates to find the 10 real cases. – Not gonna happen. – But how can we improve the scheme? 75
  • 76. Moral • When looking for a property (e.g., “two people stayed at the same hotel twice”), make sure that the property does not allow so many possibilities that random data will surely produce facts “of interest.” 76
  • 77. Rhine Paradox – (1) • Joseph Rhine was a parapsychologist in the 1950s who hypothesized that some people had Extra-Sensory Perception. • He devised (something like) an experiment where subjects were asked to guess 10 hidden cards – red or blue. • He discovered that almost 1 in 1000 had ESP – they were able to get all 10 right! (By chance alone, the probability of guessing all 10 is 2^-10 ≈ 1/1024, i.e. about 1 in 1000.)
  • 78. Rhine Paradox – (2) • He told these people they had ESP and called them in for another test of the same type. • Alas, he discovered that almost all of them had lost their ESP. • What did he conclude? – Answer on next slide. 78
  • 79. Rhine Paradox – (3) • He concluded that you shouldn’t tell people they have ESP; it causes them to lose it. 79
  • 80. Moral • Understanding Bonferroni’s Principle will help you look a little less stupid than a parapsychologist. 80
  • 81. Instability and the Rashomon Effect • Rashomon is a Japanese movie in which four people, from different vantage points, witness a criminal incident. When they come to testify in court, they all report the same facts, but their stories of what happened are very different. • The Rashomon effect is the effect of the subjectivity of perception on recollection. • The Rashomon Effect in ML is that there is often a multitude of classifiers giving about the same minimum error rate. • For example, in decision trees, if the training set is perturbed only slightly, I can get a tree quite different from the original but with almost the same test set error.
  • 82. The Wisdom of Crowds – Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations • Under certain controlled conditions, the aggregation of information in groups results in decisions that are often superior to those that could have been made by any single member – even an expert. • It imitates our second nature to seek several opinions before making any crucial decision: we weigh the individual opinions and combine them to reach a final decision.
  • 83. Committees of Experts – “ … a medical school that has the objective that all students, given a problem, come up with an identical solution” • There is not much point in setting up a committee of experts from such a group - such a committee will not improve on the judgment of an individual. • Consider: – There needs to be disagreement for the committee to have the potential to be better than an individual.
  • 85. Supervised Learning - Multi Class Email Length New Recipients
  • 86. Supervised Learning - Multi Label Multi-label learning refers to the classification problem where each example can be assigned to multiple class labels simultaneously
  • 87. Supervised Learning - Regression Find a relationship between a numeric dependent variable and one or more independent variables Email Length New Recipients
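A regression sketch with scikit-learn, fitting a numeric target from two predictors; the synthetic data and coefficients are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(8)
X = rng.uniform(0, 10, size=(100, 2))          # e.g. new recipients, email length
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=100)

# Fit a linear relationship between the numeric dependent variable and the predictors.
reg = LinearRegression().fit(X, y)
print("coefficients:", reg.coef_, "intercept:", reg.intercept_)
print("prediction:", reg.predict([[4.0, 2.0]]))
```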
  • 88. Unsupervised Learning - Clustering Clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense Email Length New Recipients
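A clustering sketch with k-means in scikit-learn; the unlabeled synthetic data and the choice of two clusters are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
X = np.vstack([rng.normal([2, 6], 0.8, (50, 2)),
               rng.normal([7, 2], 0.8, (50, 2))])   # observations, no labels given

# Assign the observations to 2 clusters so that points in a cluster are similar.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centres:", km.cluster_centers_)
print("cluster of a new point:", km.predict([[6.5, 2.5]]))
```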
  • 89. Unsupervised Learning–Anomaly Detection Detecting patterns in a given data set that do not conform to an established normal behavior. Email Length New Recipients
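An anomaly-detection sketch with an isolation forest in scikit-learn; the synthetic "normal" cloud and the contamination level are assumptions:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(10)
normal = rng.normal([3, 3], 0.5, (200, 2))       # established "normal" behaviour
outliers = rng.uniform(-4, 10, (5, 2))           # a few points that do not conform
X = np.vstack([normal, outliers])

# The forest isolates points; those that are easy to isolate are flagged as anomalies.
iso = IsolationForest(contamination=0.03, random_state=0).fit(X)
print(iso.predict(X[-5:]))     # -1 marks predicted anomalies, +1 normal points
```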
  • 90. Source of Training Data
    • Provided random examples outside of the learner’s control – Passive Learning
      – Negative examples available or only positive? Semi-Supervised Learning
      – Imbalanced
    • Good training examples selected by a “benevolent teacher” – “Near miss” examples
    • Learner can query an oracle about the class of an unlabeled example in the environment – Active Learning
    • Learner can construct an arbitrary example and query an oracle for its label
    • Learner can run directly in the environment without any human guidance and obtain feedback – Reinforcement Learning
    • There is no existing class concept – a form of discovery – Unsupervised Learning: Clustering, Association Rules
  • 91. Other Learning Tasks • Other Supervised Learning Settings – Multi-Class Classification – Multi-Label Classification – Semi-supervised classification – make use of labeled and unlabeled data – One Class Classification – only instances from one label are given • Ranking and Preference Learning • Sequence labeling • Cost-sensitive Learning • Online learning and Incremental Learning- Learns one instance at a time. • Concept Drift • Multi-Task and Transfer Learning • Collective classification – When instances are dependent!
  • 93. Want to Learn More?
  • 94. Want to Learn More? • Thomas Mitchell (1997), Machine Learning, Mcgraw-Hill. • R. Duda, P. Hart, and D. Stork (2000), Pattern Classification, Second Edition. • Ian H. Witten, Eibe Frank, Mark A. Hall (2011), Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, The Morgan Kaufmann • Oded Maimon, Lior Rokach (2010), Data Mining and Knowledge Discovery Handbook, Second Edition, Springer.