Experiments with Machine Learning - GDG Lviv

Yuriy Guts

Lviv GDG Meetup presentation. From zero to human speech recognition.

Experiments with Machine Learning
Yuriy Guts
Solutions Architect

First Things First
What Is Machine Learning?

“A computer program is said to learn from experience E
with respect to some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P,
improves with experience E.”
— Tom M. Mitchell

Categories of Machine Learning
1. Supervised Learning.
2. Unsupervised Learning.
3. Reinforcement Learning.

Regression
Predict a continuous dependent variable
based on independent predictors

Linear Regression
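
A minimal Python sketch of regression with a single predictor, fitted by ordinary least squares. The function name `fit_line` and the data points are made up for illustration; the deck itself does not show this code.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)

# Predict the continuous dependent variable for an unseen predictor value
prediction = intercept + slope * 4.0
```

Because the toy points are noise-free, the fit recovers the intercept 1 and slope 2 that generated them.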

Classification
Assign an observation to some category
from a known discrete list of categories

Logistic Regression
hypothesis = 1 / (1 + exp(-theta' * x));
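
The same hypothesis can be sketched in Python for a single observation. The values of `theta` and `x` below are hypothetical, and `x[0] = 1` is assumed to be a bias term.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """Estimated probability that x belongs to class 1: sigmoid(theta' * x)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


# Hypothetical parameters and one observation; x[0] = 1 is the bias term.
theta = [-1.0, 2.0]
x = [1.0, 0.5]
p = hypothesis(theta, x)  # z = -1 + 2 * 0.5 = 0, so p = 0.5
```

An output of 0.5 or more is usually read as class 1, and anything lower as class 0.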

Logistic Regression: Cost Function
hypotheses = sigmoid(X * theta);
cost = (1 / m) * (-y' * log(hypotheses) - (1 - y)' * log(1 - hypotheses));
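
As a rough Python equivalent of the vectorized expression above, the cost can be written as a loop over the m training examples. The tiny design matrix `X` and labels `y` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(theta, X, y):
    """Average cross-entropy cost over all m training examples."""
    m = len(y)
    total = 0.0
    for row, label in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, row)))
        # -y * log(h) - (1 - y) * log(1 - h), one term per example
        total += -label * math.log(h) - (1 - label) * math.log(1 - h)
    return total / m


# Hypothetical data: a bias column plus one feature
X = [[1.0, 0.0], [1.0, 1.0]]
y = [0, 1]
c0 = cost([0.0, 0.0], X, y)  # every hypothesis is 0.5, so the cost is log(2)
```

With all parameters at zero the model is maximally unsure, which gives the baseline cost log(2) ≈ 0.693; training should push the cost below that value.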

Let’s classify human speech!
Decide whether a spoken phrase contains the word ‘Google’ or not

‘Google’ Detector: Feature Mapping
Input: Audio file (WAV, 16-bit mono, 44.1 kHz)
Output: 1 if it contains the word ‘Google’, otherwise 0
Options for building X[ ]:
1. Use the raw waveform as a feature vector.
But: a 1.5-second file will have 66150 features (44100 samples per second × 1.5 s).
Kinda scary, and easy to overfit.
2. Use Mel-Frequency Cepstral Coefficients (MFCC).
Believed to be closer to human auditory response.
Depending on parameters, can give about 80 features per file.
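
To make option 1 concrete, here is a small Python sketch that turns a 16-bit mono WAV into a raw-waveform feature vector and confirms the feature count. It is an illustration only: the audio is a synthetic 440 Hz tone built in memory rather than a real recording, and scaling samples to [-1, 1) is an assumed preprocessing choice.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # Hz, as on the slide
DURATION = 1.5       # seconds

# Build a synthetic 16-bit mono WAV in memory (a tone stands in for speech).
n_samples = int(SAMPLE_RATE * DURATION)
samples = [int(10000 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE))
           for i in range(n_samples)]
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_out:
    wav_out.setnchannels(1)           # mono
    wav_out.setsampwidth(2)           # 2 bytes = 16 bit
    wav_out.setframerate(SAMPLE_RATE)
    wav_out.writeframes(struct.pack("<%dh" % n_samples, *samples))

# Read the WAV back and use every sample as one feature.
buffer.seek(0)
with wave.open(buffer, "rb") as wav_in:
    raw = wav_in.readframes(wav_in.getnframes())
pcm = struct.unpack("<%dh" % (len(raw) // 2), raw)
features = [s / 32768.0 for s in pcm]  # scale 16-bit integers to [-1, 1)

feature_count = len(features)  # 44100 * 1.5 = 66150
```

A model with one parameter per sample needs far more training recordings than a small demo is likely to have, which is why this option overfits easily.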

[cepstra, aSpectrum, pSpectrum] = MFCC(waveform);
x = [cepstra(1); cepstra(2); ...; cepstra(n)];
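
The flattening step can be sketched in Python as follows. The shape of `cepstra` is an assumption (rows as cepstral coefficients, columns as time frames), since the slide does not state what `MFCC` returns. The values are made up, and the column-by-column order mimics MATLAB's linear indexing.

```python
# Hypothetical MFCC output: rows are cepstral coefficients, columns are time frames.
cepstra = [
    [1.0, 2.0, 3.0],  # coefficient 1 across frames 1..3
    [4.0, 5.0, 6.0],  # coefficient 2 across frames 1..3
]


def flatten_column_major(matrix):
    """Stack the columns of a matrix into one list, like MATLAB's matrix(:)."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


x = flatten_column_major(cepstra)  # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
```

Every recording must produce the same number of coefficients and frames, so that each position in `x` means the same thing across files.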

Let’s code it up
MATLAB, logistic regression with conjugate gradient optimization
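
A compact Python sketch of the training loop, under two stated assumptions: plain batch gradient descent is swapped in for the conjugate gradient optimizer mentioned on the slide, to keep the example short, and the toy data stands in for MFCC feature vectors. The function names and hyperparameters are illustrative and are not taken from the talk's repository.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(theta, X, y):
    """Gradient of the logistic cost: (1/m) * X' * (hypotheses - y)."""
    m = len(y)
    grad = [0.0] * len(theta)
    for row, label in zip(X, y):
        error = sigmoid(sum(t * v for t, v in zip(theta, row))) - label
        for j, v in enumerate(row):
            grad[j] += error * v / m
    return grad


def train(X, y, learning_rate=0.5, iterations=5000):
    """Batch gradient descent (a stand-in for conjugate gradient)."""
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        grad = gradient(theta, X, y)
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


def predict(theta, row):
    """Return 1 if the estimated probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(t * v for t, v in zip(theta, row))) >= 0.5 else 0


# Hypothetical toy data: bias column plus one feature; label 1 for large values.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [0, 0, 1, 1]
theta = train(X, y)
predictions = [predict(theta, row) for row in X]
```

Conjugate gradient usually needs far fewer iterations and no hand-tuned learning rate, which matters once there are dozens of features per file.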

yuriy.guts@gmail.com
linkedin.com/in/yuriyguts
github.com/YuriyGuts/gdg-speech-classifier
Q & A
