Developing Korean Chatbot 101
Jaemin Cho
Hello!
I am Jaemin Cho
● B.S. in Industrial Engineering @ SNU
● Former NLP Researcher @
● Interests:
○ ML / DL / RL
○ Sequence modeling
■ NLP / Dialogue
■ Sourcecode
■ Music / Dance
What is a Chatbot?
Human-level General Conversation
General Conversation
Super Smart Home API
Smart Home API
Human Customer Service
Customer Service
Different Goals,
Single task!
Sequence to Sequence mapping
Chatbot as Sequence to Sequence mapping
◎ Just like translation
○ Hello (Eng.) => 안녕하세요! (Kor.)
◎ Question (+ Context) => Answer
Deep Learning
Doing a great job in many fields!
RNN Encoder-Decoder (+ attention + augmented memory)
That looks Coooool!
Where is my J.A.R.V.I.S.?
Of course,
You can make deep learning bots.
However, purely generative bots say random words,
because they don’t understand what they are talking about.
Words and understanding
◎ Words
○ Words / characters are symbols
○ A Language is already a function
◉ f : thought/concept -> word
○ Words are already the result of representation learning
◉ Not like RGB image channels
○ Element of Natural Language Graph Model
f1 = Korean: concept -> 사과
f2 = English: concept -> Apple
Words and understanding
◎ When learning a new word
○ Mimic others’ usage
◉ Indirectly learn by examples
○ Grammar / Dictionaries
◉ Directly learn Knowledge Structure
◉ Transfer learning
Words and understanding
◎ We use languages
○ To communicate
○ To successfully express information/idea
◉ Requires representing prior knowledge
◉ Ex. Ontology (Entity - Properties - Relationships)
Words and understanding
◎ Understanding a new concept requires
○ Prior knowledge
◉ Relationships between existing concepts
○ Operations
◉ Scoring / Comparing similarities
◉ Identifying nearest concept
◉ Updating existing information
◉ Creating / Deleting concepts / connections
Human Brain
# of synapses > 10^14
Human vs Neural Networks
Neural Networks
# of synapses < 10^10
To maintain human-level conversations,
AI should understand the meaning of a sentence.
Memory structure / DB management
Human Brain >>>>>>>>>> insurmountable gap (넘사벽) >>>>>>>>> Neural Networks
Deep Learning cannot understand what you mean
Even state-of-the-art models are still not structured enough to
successfully represent languages and prior knowledge
If you still want to build your own
Deep Learning chatbots..!
◎ WildML(Denny Britz)’s Blog Post
◉ RNN Retrieval model
◉ Dual Encoder LSTM
◉ Trained on Ubuntu Q&A Corpus
◉ Sourcecode provided
◎ Jungkyu Shin’s 미소녀봇
◉ RNN Generative model
◉ Trained on Japanese anime subtitles
◉ Good Explanation of overall architecture of bot
◉ no sourcecode provided
So.. now what?
Why do you want to build bots?
To make money! ( ͡° ͜ʖ ͡°)
Business
Topic
- narrow
Tasks
- Domain-specific
- Relatively Small in number
Important
- To provide information
- And NOT to make mistakes
Bots for business / Conversational AI
Friend
Topic
- broad
Tasks
- General and abstract
- Numerous
Important
- To maintain natural dialogue
- And make it pleasant
Today, I’ll talk about
Bots for business!
Again, for making money... ( ͡° ͜ʖ ͡°)
More specifically..
◎ Intent Schema / Architecture
◎ Corpus
◎ Feature engineering
◎ NLP / NLU Tools
◎ Classification / Generation algorithms
◎ And some more! (DM, OOV …)
Focus on a few intents!
Divide-and-Conquer
Intent Schema
◎ For Business bots, some questions are
more important than others
○ Don’t need to deal with everyday conversations
○ Focus on a small number of topics and tasks, which
are more important in business
◎ Hierarchical Intent schema
○ 1) Classify questions into intents
◉ Business / Non-Business
○ 2) Generate responses differently at each intent
◉ Focus more on important intents
○ Easier to debug / monitor
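The two-step schema above (classify first, then generate per intent) can be sketched roughly as follows. This is only an illustration: the keyword rules stand in for trained classifiers, and every intent name, keyword, and response string is a hypothetical placeholder, not something from the talk.

```python
from typing import Callable, Dict

# Hypothetical keyword rules standing in for trained classifiers.
BUSINESS_WORDS = ("loan", "refund", "delivery")


def level1_classifier(query: str) -> str:
    """Level 1: decide between business and non-business questions."""
    q = query.lower()
    return "business" if any(w in q for w in BUSINESS_WORDS) else "non_business"


def business_classifier(query: str) -> str:
    """Level 2 (business branch): pick a fine-grained business intent."""
    return "refund" if "refund" in query.lower() else "general_business"


def non_business_classifier(query: str) -> str:
    """Level 2 (non-business branch): pick a fine-grained non-business intent."""
    return "greeting" if "hello" in query.lower() else "chitchat"


LEVEL2: Dict[str, Callable[[str], str]] = {
    "business": business_classifier,
    "non_business": non_business_classifier,
}

# One generation module per leaf intent.
GENERATORS: Dict[str, Callable[[str], str]] = {
    "refund": lambda q: "I can help with your refund. What is your order number?",
    "general_business": lambda q: "Let me look that up for you.",
    "greeting": lambda q: "Hello! How can I help you?",
    "chitchat": lambda q: "I can mostly help with orders and refunds.",
}


def respond(query: str) -> str:
    """Route a query through both classifier levels, then generate a response."""
    top_level = level1_classifier(query)
    intent = LEVEL2[top_level](query)
    return GENERATORS[intent](query)
```

Because each level is a separate function, a misrouted query can be traced to the exact classifier that made the wrong call, which is the debugging benefit the slide mentions.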
Hierarchical Intent Schema
[Diagram] Sentence -> Level-1 Classifier -> Business Intent / Non-Business
Business Intent -> Level-2 Classifier 1 -> Business Intent 1 / Business Intent 2
Non-Business -> Level-2 Classifier 2 -> Non-Business Intent 1 / Non-Business Intent 2
Each of the four intents -> its own Generation Module (1-4) -> Response
Architecture
End-to-End vs Modularization
Architecture
◎ End-to-end model is (academically) fancier
◎ However, Deep Learning is Black Box
○ Hard to understand reasoning pattern
◎ Modularization gives you
○ Easier debugging
○ Flexibility
○ Accountability
Architecture
◎ Core modules
○ Sentence vectorizer
○ Intent classifier
○ Response generator
◎ Optional
○ Tone generation
○ Error correction
What data can / should we
use?
“Among leading AI teams, many can likely replicate others’ software in, at most, 1–2 years. But it
is exceedingly difficult to get access to someone else’s data. Thus data, rather than software, is
the defensible barrier for many businesses.”
Andrew Ng, “What Artificial Intelligence Can and Can’t Do Right Now”, Harvard Business Review
Corpus
◎ Open Corpora
○ General topics
○ Old, mostly written language
○ Sejong / KAIST Corpus
○ Namu Wiki dump / Wikipedia dump
○ Naver sentiment movie corpus
◎ Web scraping
○ You can configure what you scrape
◉ General or domain specific
○ colloquial language, newly coined words
○ SNS - Facebook, Twitter
○ Online forums, blogs, cafes
Corpus
◎ None of these provides Q&A data
that fits your domain perfectly
◎ Make sure that
you have (or will have) enough chat data
before you start a bot business
How to vectorize a sentence?
Sentence Vectorizer
Sentence vectorization
Sentence => one vector made of three concatenated parts
Word Embeddings: 0.25, 0.5, -0.41, 0.30, -0.12, 0.65, …
Keywords: 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, …
Custom Features: 0.24, 0.35, 0, 1, 1, 1
Feature Engineering
◎ Sentence as sequence of words
○ Get word embeddings
◉ CBOW / Skip-grams
◉ Gensim / fastText
○ How to combine words?
◉ Sum / Average
◉ Concatenate
● padding required for fixed-length vector
◉ RNTN / Tree-LSTM
● robust for long sentences / Parser required
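As a rough sketch of the "Sum / Average" and "Concatenate" options above, assuming toy 3-dimensional embeddings (real ones would come from CBOW / skip-gram training). The table `EMBEDDINGS` and the choice to map unknown words to a zero vector are assumptions made only for this illustration.

```python
from typing import Dict, List

Vector = List[float]

# Toy embeddings for illustration only.
EMBEDDINGS: Dict[str, Vector] = {
    "refund": [1.0, 0.0, 2.0],
    "my": [0.0, 1.0, 0.0],
    "order": [2.0, 2.0, 1.0],
}
DIM = 3
PAD: Vector = [0.0] * DIM  # also used for unknown words in this sketch


def lookup(tokens: List[str]) -> List[Vector]:
    """Map each token to its embedding (zero vector if unknown)."""
    return [EMBEDDINGS.get(t, PAD) for t in tokens]


def average(tokens: List[str]) -> Vector:
    """Order-insensitive sentence vector: element-wise mean of word vectors."""
    vectors = lookup(tokens)
    if not vectors:
        return list(PAD)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concatenate(tokens: List[str], max_len: int) -> Vector:
    """Order-preserving sentence vector: truncate or pad to max_len words."""
    vectors = lookup(tokens)[:max_len]
    vectors += [PAD] * (max_len - len(vectors))
    return [x for vector in vectors for x in vector]
```

Averaging always yields a vector of length `DIM`, while concatenation yields `max_len * DIM` values, which is why padding is required to keep the length fixed.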
RNTN / Tree-LSTM
Feature Engineering
◎ Character-level embedding
○ Information loss during word normalization
◉ Tense, singular/plural, gender ...
◉ Even meaning can be affected
○ C2W
◉ Char embedding + cached word embedding
◎ Directly generate sentence vector
○ Doc2Vec (paragraph vec)
○ Skip-thought vectors
C2W / Doc2Vec / Skip-thoughts
Feature Engineering
◎ Word sense disambiguation (WSD)
○ homonyms and polysemous words
○ POS embedding
◉ Get embedding after auto-tagging the corpus
◉ Ex. v(사과/Noun) ≠ v(사과/Verb)
◎ Space information
○ Sentence = words + spaces
○ Space information loss during tokenization
○ Prefix, suffix padding with special character
○ Space as a word
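The POS-embedding and "space as a word" ideas above can be sketched as simple preprocessing steps. The token `<sp>` and the `word/Tag` key format are hypothetical choices for this sketch, and the (word, tag) pairs are written by hand here; in practice they would come from a POS tagger.

```python
from typing import List, Tuple

SPACE = "<sp>"  # hypothetical special token that marks a space


def with_space_tokens(sentence: str) -> List[str]:
    """Keep spacing information by inserting a space token between chunks."""
    tokens: List[str] = []
    for i, chunk in enumerate(sentence.split()):
        if i > 0:
            tokens.append(SPACE)
        tokens.append(chunk)
    return tokens


def pos_keys(tagged: List[Tuple[str, str]]) -> List[str]:
    """Turn (word, tag) pairs into 'word/Tag' keys.

    Training embeddings on these keys gives homonyms such as
    사과/Noun and 사과/Verb separate vectors.
    """
    return [f"{word}/{tag}" for word, tag in tagged]
```

For example, `with_space_tokens("사과 먹자")` keeps the boundary between the two chunks as its own token, so a later model can still see where the user typed a space.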
Feature Engineering
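◎ Co-occurrence is not a silver bullet
○ Mostly captures how words are used (syntax / context)
○ Can’t reliably distinguish meaning
○ Words used in similar contexts get similar vectors even when they mean different things
○ Ex1. v(Football) ≒ v(Baseball)
○ Ex2. v(Loan) ≒ v(Investment)
◎ Need something more than co-occurrence!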
Feature Engineering
◎ Keyword Occurrences
○ Top K most frequent words from your own data
○ Keyword Occurrence vector of length K
◎ And some more...
○ POS Tagger, Parser, NE Tagger
○ Word n-grams, Character n-grams (subwords)
○ Reverse word order (≒ Bi-RNN)
○ Length of query
○ Non-language data
◉ Location / Time
◉ Private info.
● Purchase history / Customer type / etc.
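A minimal sketch of two of the features listed above: the keyword occurrence vector of length K and character n-grams. Function names are illustrative, and the tie-breaking rule (alphabetical order among equally frequent words) is an assumption added to make the output deterministic.

```python
from collections import Counter
from typing import List


def top_k_keywords(corpus: List[List[str]], k: int) -> List[str]:
    """Return the K most frequent tokens in a tokenized corpus."""
    counts = Counter(token for sentence in corpus for token in sentence)
    # Sort by frequency (descending), then alphabetically for stable ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def keyword_vector(tokens: List[str], keywords: List[str]) -> List[int]:
    """Count how often each keyword occurs in one query (vector of length K)."""
    counts = Counter(tokens)
    return [counts[word] for word in keywords]


def char_ngrams(text: str, n: int) -> List[str]:
    """Return all contiguous character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

The keyword list would be computed once from your own chat data and then reused for every incoming query, so the vector positions stay stable between training and serving.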
NLP/NLU Tools
◎ Goal
○ Information gain in sentence vectorization
○ If accuracy decreases => Not worth it!
◎ Existing tools (Ex. Taggers of KoNLPy)
○ Trained with general, written language (Sejong /
Wikipedia)
○ Cannot process
◉ Colloquial styles
◉ newly-coined words
◉ domain-specific expression
○ Train your tool with your own corpus!
NLP/NLU Tools
◎ POS tagger
○ Postpositional particles (조사) help semantic role labeling (SRL)
◉ Nominative particle (주격조사) => subject (주어), accusative particle (목적격조사) => object (목적어)
○ Word Normalization
○ Mecab-ko, Twitter, Komoran (3.0)..
○ Rouzeta (FST)
◎ Parser
○ Head information, Phrase tag
○ Korean vs English
◉ Dependency parser might work better for Korean
○ dparser / SyntaxNet
NLP/NLU Tools
◎ NE tagger
○ annie (CRF + SVM)
◉ Not the best, but the only open-source Korean NE tagger
○ Tagger (Bi-LSTM + CRF / Theano)
◉ Trained with English
◉ IOB format
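○ 2016 National Institute of Korean Language (국립국어원) Korean Information Processing Competition (국어정보경진대회) - NER
◎ National Institute of Korean Language (국립국어원) Korean Information Processing Competition (국어정보경진대회)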
○ The only annual competition for Korean NLP
NLP/NLU Tools
◎ Helpful for those who don’t have enough
time to develop their own tools!
◎ Make sure you understand how they work!
○ Again, they are trained with general corpora
○ Maybe enough for toy academic usage
○ But not enough for business
○ You should be able to
◉ Train with your own data
◉ Tweak parameters (and model itself)!
NLP/NLU Tools
◎ Sequence Labeling
○ POS-Tagging, Parsing, NE-Tagging, Spacer
◎ Data Format
○ IOB
○ PTB
○ CoNLL-U
○ Sejong
◎ Algorithms
○ PGM: CRF
○ Neural Networks: RNN
○ Hybrid: LSTM-CRF
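To make the IOB format concrete, here is a small sketch that converts entity spans into per-token IOB tags (B- for the first token of an entity, I- for the following tokens, O for everything else). The span representation `(start, end_exclusive, label)` is an assumption chosen for this example, not a standard input format.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, label)


def spans_to_iob(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert token-index entity spans into one IOB tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags
```

For the tokens `["Jaemin", "Cho", "studied", "at", "SNU"]` with a person span over the first two tokens and an organization span over the last one, the tags come out as `B-PER I-PER O O B-ORG`. Sequence labelers such as CRF or LSTM-CRF are then trained to predict one such tag per token.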
IOB
PTB
CoNLL-U (Universal Dependencies)
Sejong Treebank
Classification / Generation algorithms
◎ Classification
○ SVM
◉ Scikit-Learn
○ Decision Trees (Random Forest / Gradient Boosting)
◉ Scikit-Learn / Xgboost / LightGBM
○ Linear Models
◉ fastText
○ Neural Networks (CNN / RNN)
◉ TensorFlow / Theano
◉ Try simple implementation first! (tf.contrib /
Keras)
◉ likejazz’s cnn-text-classification-tf
◉ Requires HUGE data
Classification / Generation algorithms
◎ Generation
○ Predefined answers
◉ Randomly select a response from ‘response list’
◉ Slot filling
● response = "Hello {customer_name}!".format(customer_name=customer_name)
○ Neural models
◉ Seq2Seq + attention + augmented memory
◉ Copying + Two step (Latent Predictor Networks)
◉ Dual Encoder, HRED
◉ Beam Search
◉ Easy seq2seq / OpenNMT
◉ Need Huge data
◉ Check out QA competitions
● SQuAD leaderboards
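The "predefined answers" option above (random selection from a response list plus slot filling) might look like the sketch below. The `RESPONSES` table and intent names are hypothetical; the optional `rng` argument is only there so the random choice can be made reproducible in tests.

```python
import random
from typing import Dict, List, Optional

# Hypothetical response lists, one per intent.
RESPONSES: Dict[str, List[str]] = {
    "greeting": [
        "Hello {customer_name}!",
        "Hi {customer_name}, how can I help?",
    ],
    "goodbye": ["Thank you, {customer_name}. Have a nice day!"],
}


def generate(intent: str, slots: Dict[str, str],
             rng: Optional[random.Random] = None) -> str:
    """Pick a random template for the intent and fill its slots."""
    chooser = rng if rng is not None else random
    template = chooser.choice(RESPONSES[intent])
    # str.format raises KeyError if a slot used by the template is missing.
    return template.format(**slots)
```

Unlike a neural generator, this approach cannot produce an ungrammatical or off-topic sentence, which fits the "do not make mistakes" requirement for business bots.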
SQuAD Leaderboards
Classification / Generation algorithms
◎ Executed every time a query is processed
◎ Critical to response time
○ These can take more than 1 sec
◉ import tensorflow as tf
◉ load('./model.pkl')
○ Pre-load
○ Caching
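One way to apply the pre-load and caching advice above, as a sketch: load the model once at start-up rather than per query, and memoize results for repeated queries with the standard library's `functools.lru_cache`. The `load_model` function and its dictionary "model" are stand-ins for a real slow import or unpickling step.

```python
import time
from functools import lru_cache
from typing import Dict


def load_model() -> Dict[str, str]:
    """Stand-in for a slow step such as importing a framework or unpickling."""
    time.sleep(0.05)  # simulate load latency
    return {"refund": "business", "hello": "non_business"}


# Pre-load: pay the loading cost once at server start-up, not on every query.
MODEL = load_model()


@lru_cache(maxsize=1024)
def classify(query: str) -> str:
    """Classify a query; identical queries are answered from the cache."""
    lowered = query.lower()
    for keyword, label in MODEL.items():
        if keyword in lowered:
            return label
    return "unknown"
```

`classify.cache_info()` reports hits and misses, which is a quick way to check whether caching actually helps for your traffic. Caching only makes sense when the response does not depend on per-user state.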
ML modules to train
◎ Sentence Vectorizer
○ Word/Character/POS embedding
○ Word vector concatenating operator
○ extra features to capture meaning
◎ Intent Classifier
◎ Response Generator
◎ POS tagger / Parser / NE Tagger
◎ (Optional)
○ Tone generator
○ Error Corrector
◉ Typo / Grammar / Space (띄어쓰기)
Non-ML modules to prepare
◎ Predefined answers
○ List of answers to be randomly selected
○ Answers with unique entity slots to be filled
◎ DB Integration
○ Update chat history to training data
◎ Web Scraper
○ HTML / XML / JSON parsing
◎ Format converter
○ Open source data have different formats
○ PTB / CoNLL / IOB …
◎ Server
Optional, but highly recommended
◎ Data Admin / Input panel
○ Easy Overview / Edit
○ Mechanical Turk
◎ Custom Dictionary
○ Domain-specific expressions
○ Integration with existing tools / DB
◎ Scorer for each module
○ One Click cross validation / test
◉ Crucial with small data / complicated architecture
◎ Visualization
○ Performance overview
○ Confusion matrix
○ T-SNE for sentence vectors
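A scorer for the intent classifier can start as small as the sketch below, which computes accuracy and a confusion matrix from gold and predicted labels. The dictionary-of-pairs representation is a choice made for this illustration; libraries such as Scikit-Learn provide fuller versions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def confusion_matrix(gold: List[str],
                     predicted: List[str]) -> Dict[Tuple[str, str], int]:
    """Count (gold label, predicted label) pairs."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return dict(Counter(zip(gold, predicted)))


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of examples whose predicted label equals the gold label."""
    if not gold:
        raise ValueError("cannot score an empty label list")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)
```

Off-diagonal entries such as `("refund", "greeting")` show which intents get confused with each other, which tells you where to add training data or features.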
Two tricky problems: DM and OOV
Let’s go a little further!
Dialogue Management
◎ Finite State scenario
◎ Markov Decision Process
Dialogue Management - Finite State-based Scenarios
◎ Hand-crafted by dialogue experts
◎ Predetermined Scenario
◎ Pros.
○ Simple model
○ Natural way to deal with well-structured tasks
○ Information exchange is tractable
◎ Cons.
○ Inflexible
◉ Customers should follow predefined flow
○ Low maintainability
◉ The number of scenarios grows as the system gets bigger
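A finite-state scenario can be sketched as a hand-written transition table. The ordering scenario, state names, and utterances below are hypothetical; the point is that the customer can only move along predefined transitions.

```python
from typing import Dict, Tuple

# (current state, user intent) -> (next state, bot utterance)
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("start", "order"): ("ask_size", "Which size would you like?"),
    ("ask_size", "give_size"): ("confirm", "Shall I place the order?"),
    ("confirm", "yes"): ("done", "Your order has been placed."),
    ("confirm", "no"): ("start", "Okay, cancelled. What would you like to do?"),
}

FALLBACK = "Sorry, I did not understand. Please follow the current step."


def step(state: str, intent: str) -> Tuple[str, str]:
    """Advance the dialogue; stay in the same state on an unexpected intent."""
    if (state, intent) not in TRANSITIONS:
        return state, FALLBACK
    return TRANSITIONS[(state, intent)]
```

The table makes both the pros and the cons visible: every path is easy to inspect, but each new scenario adds rows by hand, and any input outside the table falls through to the fallback.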
Dialogue Management - Markov Decision Process
◎ State transition problem
○ State: high level context
○ Action: To choose next context
○ Agent: Bot
◎ Deep RL
○ Imitation / Forward Prediction / HRED
◎ Not suitable for business yet
○ No universal reward function / evaluation metric
○ Requires huge labeled dialogue data
○ Top papers are still solving toy problems
◉ accuracy < 50% or # of actions < 10
Dialogue Management - Markov Decision Process
◎ Very Interesting & maybe right way to go
○ But cannot cover in 2 mins ㅜㅜ
○ NLP / DL / RL + a
◎ Reading list
○ Spoken Dialogue Management Using Probabilistic Reasoning (2000)
○ Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (2000)
○ A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion (2015)
○ Strategic Dialogue Management via Deep Reinforcement Learning (2015)
○ Continuously Learning Neural Dialogue Management (2016)
○ How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (2017)
○ Dialogue Learning with Human-In-The-Loop (2017)
○ End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager (2017)
Out-of-Vocabulary Words
◎ Replace with the most similar word
○ Dictionary / WordNet
○ Web search
◉ Naver / Wikipedia / Namuwiki
◉ Select top k articles
◉ POS-Tagging and get the most frequent word
◎ Get word embedding with subword information
○ C2W
○ fastText
◉ Not compatible with Gensim at the time of this talk (later Gensim releases added a FastText model)
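To illustrate why subword information helps with OOV words, the sketch below replaces an unseen word with the in-vocabulary word that shares the most character n-grams. This is a surface-similarity stand-in, not fastText's learned subword vectors: the Jaccard score and the helper names are assumptions for this example. The `<` and `>` boundary markers follow the convention fastText uses for its n-grams.

```python
from typing import List, Optional, Set


def subwords(word: str, n: int = 3) -> Set[str]:
    """Character n-grams of a word wrapped in boundary markers."""
    padded = f"<{word}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two n-gram sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_in_vocab(word: str, vocab: List[str], n: int = 3) -> Optional[str]:
    """Return the vocabulary word whose n-grams overlap most with `word`."""
    if not vocab:
        return None
    target = subwords(word, n)
    return max(vocab, key=lambda v: jaccard(target, subwords(v, n)))
```

With a vocabulary of `["refund", "order", "delivery"]`, the unseen word `"refunds"` maps to `"refund"` because they share most of their trigrams. String overlap does not guarantee similar meaning, so results like this are best checked against your own data.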
Should we really develop
all of these?
There are 100+ bot builders...
Bot builders
◎ Bot builders provide many tools
○ NLU engines
○ DB management
○ GUI Interface
○ Serving with different platforms
◎ You have to pay for the service
◎ You cannot customize modules / architectures
More importantly,
Are bots worth developing?
Can they actually replace
human workers / websites / apps?
Bots are too hyped!
◎ Inefficient compared to existing platforms
○ # of inputs / response time
○ Many big companies develop bots for
◉ Promotion / Branding
◉ Part of long-term AI Research
◎ Assistance instead of replacement
○ Handle simple queries only
◉ Pass dialogue to human if confidence is low
○ GUI customer service advisor
◉ Like Smart Reply
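The "pass dialogue to human if confidence is low" rule above can be sketched as a threshold on the classifier's top score. The threshold value 0.7 and the `scores` dictionary format are assumptions for illustration; a suitable threshold would have to be tuned on real data.

```python
from typing import Dict, Tuple


def route(scores: Dict[str, float], threshold: float = 0.7) -> Tuple[str, str]:
    """Decide who answers, given intent -> confidence scores.

    Returns ("bot", intent) when the top score reaches the threshold,
    otherwise ("human", intent) so an agent can take over.
    """
    intent, confidence = max(scores.items(), key=lambda item: item[1])
    handler = "bot" if confidence >= threshold else "human"
    return handler, intent
```

Returning the best-guess intent even on handoff lets the human agent see what the bot suspected, which matches the "assistance instead of replacement" idea.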
Let’s share our knowledge
◎ Let’s not reinvent wheels!
○ Tons of datasets / algorithms have been published in
journals, but not open-sourced
◎ Sharing data / algorithms will help the Korean
NLP ecosystem flourish
Let’s share our knowledge
Data & Ada Hiring
Alexa Prize
Thanks!
Any questions?
You can find me at:
● heythisischo@gmail.com
● j-min
● J-min Cho
● Jaemin Cho

🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
NO1 Certified Black Magic Specialist Expert Amil baba in Uae Dubai Abu Dhabi ...
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Industrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIESIndustrial Safety Unit-I SAFETY TERMINOLOGIES
Industrial Safety Unit-I SAFETY TERMINOLOGIES
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm System
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 

Developing Korean Chatbot 101

  • 1. Developing Korean Chatbot 101 Jaemin Cho
  • 2. Hello! I am Jaemin Cho ● B.S. in Industrial Engineering @ SNU ● Former NLP Researcher @ ● Interests: ○ ML / DL / RL ○ Sequence modeling ■ NLP / Dialogue ■ Sourcecode ■ Music / Dance
  • 11. Chatbot as Sequence to Sequence mapping ◎ Just like translation ○ Hello (Eng.) => 안녕하세요! (Kor.) ◎ Question (+ Context) => Answer
  • 14. RNN Encoder-Decoder (+ attention + augmented memory)
  • 15. That looks Coooool! Where is my J.A.R.V.I.S.?
  • 16. Of course, you can make deep learning bots. However, purely generative bots say random words, because they don’t understand what they are talking about.
  • 17. Words and understanding ◎ Words ○ Words / characters are symbols ○ A language is already a function ◉ f : thought/concept -> word ○ Words are already the result of representation learning ◉ Not like RGB image channels ○ Element of Natural Language Graph Model ◉ Ex. the same concept maps to 사과 under f1 = Korean and to Apple under f2 = English
  • 18. Words and understanding ◎ When learning a new word ○ Mimic others’ usage ◉ Indirectly learn by examples ○ Grammar / Dictionaries ◉ Directly learn Knowledge Structure ◉ Transfer learning
  • 20. Words and understanding ◎ We use languages ○ To communicate ○ To successfully express information / ideas ◉ Requires representing prior knowledge ◉ Ex. Ontology (Entity - Properties - Relationships)
  • 21. Words and understanding ◎ Understanding a new concept requires ○ Prior knowledge ◉ Relationships between existing concepts ○ Operations ◉ Scoring / Comparing similarities ◉ Identifying the nearest concept ◉ Updating existing information ◉ Creating / Deleting concepts / connections
  • 22. Human vs Neural Networks ◎ Human Brain: # of synapses > 10^14 ◎ Neural Networks: # of synapses < 10^10 ◎ To maintain human-level conversations, AI should understand the meaning of a sentence. ○ Memory structure / DB management ◎ Human Brain >>>>>>>>>> 넘사벽 (an insurmountable gap) >>>>>>>>> Neural Networks
  • 23. Deep Learning cannot understand what you mean. Even state-of-the-art models are still not structured enough to successfully represent languages and prior knowledge.
  • 24. If you still want to build your own Deep Learning chatbots..! ◎ WildML(Denny Britz)’s Blog Post ◉ RNN Retrieval model ◉ Dual Encoder LSTM ◉ Trained on Ubuntu Q&A Corpus ◉ Sourcecode provided ◎ Jungkyu Shin’s 미소녀봇 ◉ RNN Generative model ◉ Trained on Japanese anime subtitles ◉ Good Explanation of overall architecture of bot ◉ no sourcecode provided
  • 26. Why do you want to build bots? To make money! ( ͡° ͜ʖ ͡°)
  • 27. Bots for business / Conversational AI ◎ Business ○ Topic - narrow ○ Tasks - Domain-specific - Relatively small in number ○ Important - To provide information - And NOT to make mistakes ◎ Friend ○ Topic - broad ○ Tasks - General and abstract - Numerous ○ Important - To maintain natural dialogue - And make it pleasant
  • 28. Today, I’ll talk about Bots for business! Again, for making money... ( ͡° ͜ʖ ͡°)
  • 29. More specifically.. ◎ Intent Schema / Architecture ◎ Corpus ◎ Feature engineering ◎ NLP / NLU Tools ◎ Classification / Generation algorithms ◎ And some more! (DM, OOV …)
  • 30. Focus on a few intents! Divide-and-Conquer
  • 31. Intent Schema ◎ For business bots, some questions are more important than others ○ No need to deal with everyday conversations ○ Focus on a small number of topics and tasks, which are more important in business ◎ Hierarchical intent schema ○ 1) Classify questions into intents ◉ Business / Non-Business ○ 2) Generate responses differently for each intent ◉ Focus more on important intents ○ Easier to debug / monitor
  • 32. Hierarchical Intent Schema [Diagram] ◎ Level-1 Classifier => Business Intent / Non-Business ◎ Level-2 Classifier 1 => Business Intent 1 / Business Intent 2 ◎ Level-2 Classifier 2 => Non-Business Intent 1 / Non-Business Intent 2 ◎ Each intent => Generation Module 1 / 2 / 3 / 4 => Response Sentence
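The routing in the hierarchical intent schema can be sketched in a few lines of Python. This is only an illustration: the keyword rules stand in for trained classifiers, and every name here (`level1_classifier`, `LEVEL2`, `GENERATORS`, the intent labels) is hypothetical rather than taken from the slides.

```python
from typing import Callable, Dict

# Hypothetical keyword rules standing in for trained classifiers.
BUSINESS_WORDS = {"loan", "refund", "price", "order"}


def level1_classifier(query: str) -> str:
    """Level 1: business vs. non-business."""
    tokens = set(query.lower().split())
    return "business" if tokens & BUSINESS_WORDS else "non_business"


def business_classifier(query: str) -> str:
    """Level 2 (business branch)."""
    return "refund" if "refund" in query.lower() else "product_info"


def non_business_classifier(query: str) -> str:
    """Level 2 (non-business branch)."""
    return "greeting" if "hello" in query.lower() else "small_talk"


LEVEL2: Dict[str, Callable[[str], str]] = {
    "business": business_classifier,
    "non_business": non_business_classifier,
}

# One generation module per leaf intent.
GENERATORS: Dict[str, Callable[[str], str]] = {
    "refund": lambda q: "I can help with your refund.",
    "product_info": lambda q: "Here is the product information.",
    "greeting": lambda q: "Hello!",
    "small_talk": lambda q: "Sorry, I can only help with business questions.",
}


def respond(query: str) -> str:
    """Route a query through both classifier levels, then generate."""
    top = level1_classifier(query)
    intent = LEVEL2[top](query)
    return GENERATORS[intent](query)
```

Because each level is a separate function, a misrouted query can be traced to the exact classifier that made the mistake, which is the debugging benefit the schema is meant to give.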
  • 34. Architecture ◎ An end-to-end model is (academically) fancier ◎ However, deep learning is a black box ○ Hard to understand its reasoning pattern ◎ Modularization gives you ○ Easier debugging ○ Flexibility ○ Accountability
  • 35. Architecture ◎ Core modules ○ Sentence vectorizer ○ Intent classifier ○ Response generator ◎ Optional ○ Tone generation ○ Error correction
  • 36. What data can / should we use? “Among leading AI teams, many can likely replicate others’ software in, at most, 1–2 years. But it is exceedingly difficult to get access to someone else’s data. Thus data, rather than software, is the defensible barrier for many businesses.” Andrew Ng, “What Artificial Intelligence Can and Can’t Do Right Now”, Harvard Business Review
  • 37. Corpus ◎ Open Corpora ○ General topics ○ Old, mostly written language ○ Sejong / KAIST Corpus ○ Namu Wiki dump / Wikipedia dump ○ Naver sentiment movie corpus ◎ Web scraping ○ You can configure what you scrape ◉ General or domain-specific ○ Colloquial language, newly coined words ○ SNS - Facebook, Twitter ○ Online forums, blogs, cafes
  • 38. Corpus ◎ None of these provides domain-specific Q&A that fits perfectly ◎ Make sure that you have (or will have) enough chat data before you start a bot business
  • 39. How to vectorize a sentence?
  • 41. Hierarchical Intent Schema [Diagram, with the Sentence Vectorizer added] ◎ Sentence Vectorizer ◎ Level-1 Classifier => Business Intent / Non-Business ◎ Level-2 Classifier 1 => Business Intent 1 / Business Intent 2 ◎ Level-2 Classifier 2 => Non-Business Intent 1 / Non-Business Intent 2 ◎ Each intent => Generation Module 1 / 2 / 3 / 4 => Response Sentence
  • 42. Sentence vectorization ◎ Sentence => one vector made of three parts ○ Word Embeddings: 0.25, 0.5, -0.41, 0.30, -0.12, 0.65, … ○ Keywords: 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, … ○ Custom Features: 0.24, 0.35, 0, 1, 1, 1
  • 43. Feature Engineering ◎ Sentence as sequence of words ○ Get word embeddings ◉ CBOW / Skip-grams ◉ Gensim / fastText ○ How to combine words? ◉ Sum / Average ◉ Concatenate ● padding required for fixed-length vector ◉ RNTN / Tree-LSTM ● robust for long sentences / Parser required
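A rough sketch of two of the combination strategies listed above, averaging and concatenation with padding. The 3-dimensional `EMBEDDINGS` table is a toy assumption; in practice the vectors would come from a trained model (for example one built with Gensim or fastText).

```python
from typing import Dict, List

DIM = 3
# Toy embeddings, invented for illustration only.
EMBEDDINGS: Dict[str, List[float]] = {
    "refund": [1.0, 0.0, 0.0],
    "please": [0.0, 1.0, 0.0],
}


def lookup(tokens: List[str]) -> List[List[float]]:
    """Map each token to its vector; unknown tokens become zero vectors."""
    return [EMBEDDINGS.get(t, [0.0] * DIM) for t in tokens]


def average(tokens: List[str]) -> List[float]:
    """Fixed-length sentence vector: element-wise mean of the word vectors."""
    vectors = lookup(tokens)
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def concatenate(tokens: List[str], max_len: int) -> List[float]:
    """Fixed-length sentence vector: truncate or zero-pad to max_len words."""
    vectors = lookup(tokens)[:max_len]
    vectors += [[0.0] * DIM for _ in range(max_len - len(vectors))]
    return [value for vector in vectors for value in vector]
```

Averaging discards word order but always yields `DIM` values; concatenation keeps order at the cost of a `max_len * DIM` vector and the padding the slide mentions.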
  • 45. Feature Engineering ◎ Character-level embedding ○ Information loss during word normalization ◉ Tense, singular/plural, gender ... ◉ Even meaning can be affected ○ C2W ◉ Char embedding + cached word embedding ◎ Directly generate sentence vector ○ Doc2Vec (paragraph vec) ○ Skip-thought vectors
  • 46. C2W / Doc2Vec / Skip-thoughts
  • 47. Feature Engineering ◎ Word sense disambiguation (WSD) ○ homonyms and polysemous words ○ POS embedding ◉ Get embedding after auto-tagging the corpus ◉ Ex. v(사과/Noun) ≠ v(사과/Verb) ◎ Space information ○ Sentence = words + spaces ○ Space information loss during tokenization ○ Prefix, suffix padding with special character ○ Space as a word
  • 48. Feature Engineering ◎ Co-occurrence is not almighty ○ Only captures syntax ○ Can’t capture meaning ○ Ex1. v(Football) ≒ v(Baseball) ○ Ex2. v(Loan) ≒ v(Investment) ◎ Need something more than co-occurrence!
  • 49. Feature Engineering ◎ Keyword Occurrences ○ Top K most frequent words from your own data ○ Keyword Occurrence vector of length K ◎ And some more... ○ POS Tagger, Parser, NE Tagger ○ Word n-grams, Character n-grams (subwords) ○ Reverse word order (≒ Bi-RNN) ○ Length of query ○ Non-language data ◉ Location / Time ◉ Private info. ● Purchase history / Customer type / etc.
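The keyword occurrence vector described above can be sketched with the standard library alone. Whitespace tokenization is an assumption made for brevity; for Korean, a morphological analyzer would normally produce the tokens.

```python
from collections import Counter
from typing import List


def top_k_keywords(corpus: List[str], k: int) -> List[str]:
    """Return the k most frequent tokens in your own corpus."""
    counts = Counter(token for sentence in corpus for token in sentence.split())
    return [word for word, _ in counts.most_common(k)]


def keyword_vector(sentence: str, keywords: List[str]) -> List[int]:
    """Count how often each keyword occurs; the result has length K."""
    counts = Counter(sentence.split())
    return [counts[word] for word in keywords]
```

The resulting vector can be appended to the embedding-based part of the sentence vector, so that domain keywords stay visible even when their embeddings are close to unrelated words.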
  • 50. NLP/NLU Tools ◎ Goal ○ Information gain in sentence vectorization ○ If accuracy decreases => Not worth it! ◎ Existing tools (Ex. Taggers of KoNLPy) ○ Trained with general, written language (Sejong / Wikipedia) ○ Cannot process ◉ Colloquial styles ◉ newly-coined words ◉ domain-specific expression ○ Train your tool with your own corpus!
  • 51. NLP/NLU Tools ◎ POS tagger ○ Postpositional particles (조사) help semantic role labeling (SRL) ◉ Nominative particle (주격조사) => subject (주어), accusative particle (목적격조사) => object (목적어) ○ Word Normalization ○ Mecab-ko, Twitter, Komoran (3.0).. ○ Rouzeta (FST) ◎ Parser ○ Head information, Phrase tag ○ Korean vs English ◉ Dependency parser might work better for Korean ○ dparser / SyntaxNet
  • 53. NLP/NLU Tools ◎ NE tagger ○ annie (CRF + SVM) ◉ Not the best, but the only open-source Korean NE tagger ○ Tagger (Bi-LSTM + CRF / Theano) ◉ Trained with English ◉ IOB format ○ 2016 국립국어원 국어정보경진대회 - NER ◎ 국립국어원 국어정보경진대회 (Korean language information competition held by the National Institute of Korean Language) ○ The only annual competition for Korean NLP
  • 54. NLP/NLU Tools ◎ Helpful for those who don’t have enough time to develop their own tools! ◎ Make sure you understand how they work! ○ Again, they are trained with general corpora ○ Maybe enough for toy academic usage ○ But not enough for business ○ You should be able to ◉ Train with your own data ◉ Tweak parameters (and the model itself)!
  • 55. NLP/NLU Tools ◎ Sequence Labeling ○ POS-Tagging, Parsing, NE-Tagging, Spacer ◎ Data Format ○ IOB ○ PTB ○ CoNLL-U ○ Sejong ◎ Algorithms ○ PGM: CRF ○ Neural Networks: RNN ○ Hybrid: LSTM-CRF
  • 56. IOB
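As a reminder of how IOB-tagged data is read, here is a small decoder that turns token/tag pairs into entity spans. The example sentence and the `PER` / `ORG` labels are illustrative assumptions, not data from the slides.

```python
from typing import List, Tuple


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group B-X / I-X tagged tokens into (entity text, entity type) pairs."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = ""
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity.
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # An I- tag continues the entity that is currently open.
            current.append(token)
        else:
            # "O" (or a stray I- tag with no open entity) closes any open entity.
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], ""
    if current:
        spans.append((" ".join(current), current_type))
    return spans
```

For instance, the tokens `Jaemin Cho studied at SNU` tagged `B-PER I-PER O O B-ORG` decode to one person span and one organization span.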
  • 57. PTB
  • 60. Classification / Generation algorithms ◎ Classification ○ SVM ◉ Scikit-Learn ○ Decision Trees (Random Forest / Gradient Boosting) ◉ Scikit-Learn / Xgboost / LightGBM ○ Linear Models ◉ fastText ○ Neural Networks (CNN / RNN) ◉ TensorFlow / Theano ◉ Try simple implementation first! (tf.contrib / Keras) ◉ likejazz’s cnn-text-classification-tf ◉ Requires HUGE data
  • 61. Classification / Generation algorithms ◎ Generation ○ Predefined answers ◉ Randomly select a response from a ‘response list’ ◉ Slot filling ● response = "Hello {customer_name}!".format(customer_name=customer_name) ○ Neural models ◉ Seq2Seq + attention + augmented memory ◉ Copying + Two step (Latent Predictor Networks) ◉ Dual Encoder, HRED ◉ Beam Search ◉ Easy seq2seq / OpenNMT ◉ Need huge data ◉ Check out QA competitions ● SQuAD leaderboards
  • 63. Classification / Generation algorithms ◎ Executed every time a query is processed ◎ Critical to response time ○ These can take > 1 sec ◉ import tensorflow as tf ◉ load('./model.pkl') ○ Pre-load ○ Caching
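The predefined-answer approach (random selection plus slot filling) might look like the following. The `RESPONSES` table and the `greeting` intent are made-up examples; only the `str.format` slot-filling pattern comes from the slide.

```python
import random
from typing import Dict, List, Optional

# Hypothetical response lists, one per intent.
RESPONSES: Dict[str, List[str]] = {
    "greeting": [
        "Hello {customer_name}!",
        "Hi {customer_name}, how can I help?",
    ],
}


def generate(intent: str, slots: Dict[str, str],
             rng: Optional[random.Random] = None) -> str:
    """Pick a random template for the intent and fill its slots."""
    chooser = random if rng is None else rng
    template = chooser.choice(RESPONSES[intent])
    return template.format(**slots)
```

Passing a seeded `random.Random` makes the choice reproducible in tests, while production code can omit it.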
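Pre-loading and caching can be sketched as below. `load_model` is a hypothetical stand-in for a slow step such as importing a framework or unpickling a model; the dictionary "model" exists only to keep the example self-contained.

```python
import functools
import time
from typing import Dict


def load_model() -> Dict[str, str]:
    """Stand-in for a slow load (framework import, model unpickling)."""
    time.sleep(0.01)
    return {"refund": "business", "hello": "non_business"}


# Pre-load: pay the loading cost once at server start-up, not per query.
MODEL = load_model()


@functools.lru_cache(maxsize=1024)
def classify(query: str) -> str:
    """Caching: identical queries are answered from memory."""
    for token in query.lower().split():
        if token in MODEL:
            return MODEL[token]
    return "unknown"
```

`functools.lru_cache` only helps when exact queries repeat, so normalizing the query (lowercasing, trimming whitespace) before the cached call usually raises the hit rate.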
  • 64. ML modules to train ◎ Sentence Vectorizer ○ Word/Character/POS embedding ○ Word vector concatenating operator ○ extra features to capture meaning ◎ Intent Classifier ◎ Response Generator ◎ POS tagger / Parser / NE Tagger ◎ (Optional) ○ Tone generator ○ Error Corrector ◉ Typo / Grammar / Space (띄어쓰기)
  • 65. Non-ML modules to prepare ◎ Predefined answers ○ List of answers to be randomly selected ○ Answers with unique entity slots to be filled ◎ DB Integration ○ Add chat history to the training data ◎ Web Scraper ○ HTML / XML / JSON parsing ◎ Format converter ○ Open-source data come in different formats ○ PTB / CoNLL / IOB … ◎ Server
  • 66. Optional, but highly recommended to equip ◎ Data Admin / Input panel ○ Easy Overview / Edit ○ Mechanical Turk ◎ Custom Dictionary ○ Domain-specific expressions ○ Integration with existing tools / DB ◎ Scorer for each module ○ One Click cross validation / test ◉ Crucial with small data / complicated architecture ◎ Visualization ○ Performance overview ○ Confusion matrix ○ T-SNE for sentence vectors
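A per-module scorer can start as small as this: accuracy plus a confusion matrix keyed by (gold, predicted) pairs. The function names are illustrative; a library such as scikit-learn provides fuller versions of both.

```python
from collections import Counter
from typing import Dict, List, Tuple


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of examples where the prediction matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def confusion_matrix(gold: List[str],
                     predicted: List[str]) -> Dict[Tuple[str, str], int]:
    """Count each (gold label, predicted label) pair."""
    return Counter(zip(gold, predicted))
```

Off-diagonal entries such as `("refund", "greeting")` show which intents the classifier confuses, which is more actionable than a single accuracy number when the data set is small.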
  • 67. Two tricky problems: DM and OOV Let’s go a little further!
  • 68. Dialogue Management ◎ Finite State scenario ◎ Markov Decision Process
  • 69. Dialogue Management - Finite State-based Scenarios ◎ Hand-crafted by dialogue experts ◎ Predetermined scenario ◎ Pros. ○ Simple model ○ Natural way to deal with well-structured tasks ○ Information exchange is tractable ◎ Cons. ○ Inflexible ◉ Customers must follow the predefined flow ○ Low maintainability ◉ The number of scenarios grows as the system gets bigger
  • 70. Dialogue Management - Finite State-based Scenarios
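A finite-state scenario reduces to a transition table. The states and intents below describe a made-up refund flow, chosen only to show the mechanics.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-crafted scenario: (current state, user intent) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("start", "greet"): "ask_order_id",
    ("ask_order_id", "give_order_id"): "confirm_refund",
    ("confirm_refund", "yes"): "done",
    ("confirm_refund", "no"): "start",
}


def step(state: str, user_intent: str) -> str:
    """Advance the dialogue; intents outside the scenario leave the state unchanged."""
    return TRANSITIONS.get((state, user_intent), state)


def run(intents: List[str], state: str = "start") -> str:
    """Feed a sequence of user intents through the scenario."""
    for intent in intents:
        state = step(state, intent)
    return state
```

The inflexibility listed under Cons is visible here: a customer who gives an order ID before greeting is simply ignored, and every new flow adds rows to the table.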
  • 71. Dialogue Management - Markov Decision Process ◎ State transition problem ○ State: high-level context ○ Action: choose the next context ○ Agent: bot ◎ Deep RL ○ Imitation / Forward Prediction / HRED ◎ Not suitable for business yet ○ No universal reward function / evaluation metric ○ Requires huge labeled dialogue data ○ Top papers are still solving toy problems ◉ accuracy < 50% or # of actions < 10
  • 72. Dialogue Management - Markov Decision Process
  • 73. Dialogue Management - Markov Decision Process ◎ Very interesting & maybe the right way to go ○ But cannot cover in 2 mins ㅜㅜ ○ NLP / DL / RL + α ◎ Reading list ○ Spoken Dialogue Management Using Probabilistic Reasoning (2000) ○ Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (2000) ○ A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion (2015) ○ Strategic Dialogue Management via Deep Reinforcement Learning (2015) ○ Continuously Learning Neural Dialogue Management (2016) ○ How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (2017) ○ Dialogue Learning with Human-In-The-Loop (2017) ○ End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager (2017)
  • 74. Out-of-Vocabulary Words ◎ Replace with the most similar word ○ Dictionary / WordNet ○ Web search ◉ Naver / Wikipedia / Namuwiki ◉ Select top k articles ◉ POS-Tagging and get the most frequent word ◎ Get word embedding with subword information ○ C2W ○ fastText ◉ Not compatible with Gensim
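One cheap way to combine the two ideas above (replace with the most similar word, use subword information) is to compare character n-grams. Note that this is a simplified stand-in: it uses Jaccard overlap of n-gram sets rather than learned subword embeddings as in C2W or fastText, and the vocabulary is invented for the example.

```python
from typing import Iterable, Set


def char_ngrams(word: str, n: int = 3) -> Set[str]:
    """Character n-grams of the word, with < and > marking its boundaries."""
    padded = f"<{word}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def nearest_known(word: str, vocabulary: Iterable[str], n: int = 3) -> str:
    """Return the in-vocabulary word sharing the most n-grams with `word`."""
    grams = char_ngrams(word, n)

    def jaccard(candidate: str) -> float:
        other = char_ngrams(candidate, n)
        return len(grams & other) / len(grams | other)

    return max(vocabulary, key=jaccard)
```

An unseen form such as `refunds` then falls back to the embedding of `refund`. For Korean, the same idea is often applied to syllables or decomposed jamo instead of Latin characters.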
  • 75. Should we really develop all of these? There are 100+ bot builders...
  • 77. Bot builders ◎ Bot builders provide many tools ○ NLU engines ○ DB management ○ GUI Interface ○ Serving with different platforms ◎ You have to pay for the service ◎ You cannot customize modules / architectures
  • 78. More importantly, are bots worth developing? Can they actually replace human workers / websites / apps?
  • 79. Bots are too hyped! ◎ Inefficient compared to existing platforms ○ # of inputs / response time ○ Many big companies develop bots for ◉ Promotion / Branding ◉ Part of long-term AI research ◎ Assistance instead of replacement ○ Handle simple queries only ◉ Pass the dialogue to a human if confidence is low ○ GUI customer service advisor ◉ Like Smart Reply
  • 80. Let’s share our knowledge ◎ Let’s not reinvent the wheel! ○ Tons of datasets / algorithms have been published in journals, but not open-sourced ◎ Data / algorithm sharing will help the Korean NLP ecosystem flourish
  • 81. Let’s share our knowledge
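Passing the dialogue to a human when confidence is low amounts to a threshold check on the classifier scores. The 0.7 threshold and the `human_agent` label are arbitrary assumptions for this sketch and would need tuning on real data.

```python
from typing import Dict

HUMAN = "human_agent"  # hypothetical label for the hand-off route


def route(scores: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the top intent, or hand off to a human if confidence is low."""
    intent, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence < threshold:
        return HUMAN
    return intent
```

With this in place the bot answers only the simple, high-confidence queries, which matches the "assistance instead of replacement" position on the slide.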
  • 82. Data & Ada Hiring
  • 84. Thanks! Any questions? You can find me at: ● heythisischo@gmail.com ● j-min ● J-min Cho ● Jaemin Cho