Deep Learning, an interactive introduction for NLP-ers
@graphific
Roelof Pieters
Introduction to Deep Learning for NLP
22 January 2015
Stockholm Natural Language Processing Meetup
FEEDA
Slides at: http://www.slideshare.net/roelofp/220115dlmeetup
1
Deep Learning ???
2
A couple of headlines… [all November ’14]
3
(source: Google Trends)
4
Machine Learning ??
- Audience Check -
5
• “Brain” inspired / simulations:
• vision: make learning algorithms better and easier to use
• goal: revolutions in (practical) advances for machine learning and AI
• Deep Learning = subfield of Machine Learning
Deep Learning ??
6
Biological Inspiration
7
Deep Learning ??
8
DL: Impact
9
Speech Recognition
DL: Impact
10
Deep Learning for the win!
a few examples:
• IJCNN 2011 Traffic Sign Recognition Competition
• ISBI 2012 Segmentation of neuronal structures in EM stacks challenge
• ICDAR 2011 Chinese handwriting recognition
Machine Learning ??
• Deals with “construction and study of systems that can learn from data”
A computer program is said to learn from experience (E) with respect to some class of tasks (T) and performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.
— T. Mitchell 1997
11
Machine Learning ??
Traditional Programming:
Data
Program
Output
Data
Program
Output
Machine Learning:
12
Supervised (inductive) learning
• Training data includes desired outputs
Unsupervised learning
• Training data does not include desired outputs
Semi-supervised learning
• Training data includes a few desired outputs
Reinforcement learning
• Rewards from sequence of actions
Types of Learning
13
ML: Traditional Approach
1. Gather as much LABELED data as you can get
2. Throw some algorithms at it (mainly put in an SVM and keep it at that)
3. If you actually have tried more algos: Pick the best
4. Spend hours hand engineering some features / feature selection / dimensionality reduction (PCA, SVD, etc)
5. Repeat…
For each new problem/question:
14
Machine Learning for NLP
Data
Classic Approach: Data is fed into a learning algorithm:
Learning Algorithm
15
Machine Learning for NLP
some of the (many) treebank datasets
source: http://www-nlp.stanford.edu/links/statnlp.html#Treebanks
16
Penn Treebank
That’s a lot of “manual” work:
17
• the students went to class
DT NN VB P NN
• plays well with others
VB ADV P NN
NN NN P DT
• fruit flies like a banana
NN NN VB DT NN
NN VB P DT NN
NN NN P DT NN
NN VB VB DT NN
With a lot of issues:
Penn Treebank
18
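Taggers trained on such treebanks now produce these tags automatically, picking one reading per sentence. A quick way to see that, as a minimal sketch (NLTK and its pretrained tagger models are an assumption here, not part of the deck):

import nltk
# first run only: nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')
print(nltk.pos_tag(nltk.word_tokenize("fruit flies like a banana")))
# one reading gets chosen, e.g. [('fruit', 'NN'), ('flies', 'NNS'), ('like', 'IN'), ('a', 'DT'), ('banana', 'NN')]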
Machine Learning for NLP
Learning Algorithm
Data
“Features”
Prediction
Prediction/Classifier
train set
test set
19
Machine Learning for NLP
Learning Algorithm
“Features”
Prediction
Prediction/Classifier
train set
test set
20
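As a concrete sketch of this train/test pipeline (scikit-learn is an assumption, not something the deck uses; the data is hypothetical):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# hypothetical train/test sets of raw sentences and labels
train_texts = ["the students went to class", "fruit flies like a banana"]
train_labels = [0, 1]
test_texts = ["plays well with others"]

# hand-engineered "Features" (bag of n-grams) + learning algorithm = classifier
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)  # learn from the train set
print(clf.predict(test_texts))      # predictions on the test set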
Machine Learning for NLP
• Until the early 1990’s, NLP systems were built manually with hand-crafted dictionaries and rules.
• As large electronic text corpora became increasingly available, researchers began using machine learning techniques to automatically build NLP systems.
• Today, the vast majority of NLP systems use machine learning.
21
2. Neural Networks

and a short history lesson
22
Perceptron (1957)
Frank Rosenblatt (1928-1971)
Original Perceptron
Simplified model:
(From Perceptrons by M. L. Minsky and S. Papert, 1969, Cambridge, MA: MIT Press. Copyright 1969 by MIT Press.)
23
Perceptron (1957)
Perceptron Research, YouTube clip: https://www.youtube.com/watch?v=cNxadbrN_aI&feature=youtu.be&t=12
24
Perceptron (1957)
25
Multilayer Perceptron (1986)
inputs
weights
bias
activation
26
Neuron Model
All you need to know:
27
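In symbols (a reconstruction of the slide's picture, not a quote from it): a neuron computes a weighted sum of its inputs plus a bias and passes it through an activation function, a = f(w·x + b). A minimal numpy sketch:

import numpy as np

def neuron(x, w, b, f=np.tanh):
    # weighted sum of inputs plus bias, squashed by activation f
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])  # inputs
w = np.array([0.1, 0.4, -0.2])  # weights
b = 0.3                         # bias
print(neuron(x, w, b))          # activation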
Activation functions
28
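The slide's plot is not reproduced here; assuming it showed the usual suspects, these are the logistic sigmoid, tanh, and the rectified linear unit:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes to (0, 1)

def relu(z):
    return np.maximum(0.0, z)        # zero below 0, identity above

# np.tanh squashes to (-1, 1) and is the default in the other sketches here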
Backpropagation (1974/1986)
1974: Paul Werbos invents the backpropagation algorithm for NNs
1986: Backprop popularized by Rumelhart, Hinton, Williams
1990: Renewed Interest in NN’s
29
Backprop Renaissance
Forward Propagation
• Sum inputs, produce activation, feed-forward
30
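That is, each layer repeats the neuron computation on the previous layer's activations. A minimal numpy sketch of the forward pass (the weight matrices Ws and biases bs are hypothetical):

import numpy as np

def forward(x, Ws, bs, f=np.tanh):
    # sum inputs, produce activation, feed forward to the next layer
    a = x
    for W, b in zip(Ws, bs):
        a = f(W @ a + b)
    return a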
Backprop Renaissance
Back Propagation (of error)
• Calculate total error at the top
• Calculate contributions to error at each step going backwards
31
• Compute gradient of example-wise loss wrt parameters
• Simply applying the derivative chain rule wisely
• If computing the loss (example, parameters) is O(n) computation, then so is computing the gradient
Backpropagation
32
Simple Chain Rule
33
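Reconstructed in LaTeX (consistent with the slide's title, not copied from it): with y = g(x) and z = f(y),

\[ \frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} \, \frac{\partial y}{\partial x} \]

Backpropagation applies this rule repeatedly, from the loss at the output back through each layer to every weight.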
Training procedure
• Initialize randomly
• Sequentially give it data.
• See what the difference is between network output and actual output.
• Update the weights according to this error.
• End result: give a model input, and it produces a proper output.
Quest for the weights. The weights are the model!
To reiterate:
34
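The same procedure as a minimal numpy sketch: one linear neuron trained with per-example gradient steps on a squared error (all data and settings here are hypothetical):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # input data
y = X @ np.array([1.0, -2.0, 0.5])   # targets from a hidden "true" model

w = rng.normal(size=3)               # initialize randomly
lr = 0.1
for x_i, y_i in zip(X, y):           # sequentially give it data
    err = w @ x_i - y_i              # difference between output and actual output
    w -= lr * err * x_i              # update the weights according to this error
print(w)                             # the weights are the model: close to [1, -2, 0.5]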
So why only now?
• Inspired by the architectural depth of the brain, researchers wanted for decades to train deep multi-layer neural networks.
• No successful attempts were reported before 2006… Exception: convolutional neural networks, LeCun 1998
• SVM: Vapnik and his co-workers developed the Support Vector Machine (1993) (a shallow architecture).
• Breakthrough in 2006!
35
2006 Breakthrough
• More data
• Faster hardware: GPU’s, multi-core CPU’s
• Working ideas on how to train deep architectures
36
2006 Breakthrough
• More data
• Faster hardware: GPU’s, multi-core CPU’s
• Working ideas on how to train deep architectures
37
2006 Breakthrough
38
2006 Breakthrough
• More data
• Faster hardware: GPU’s, multi-core CPU’s
• Working ideas on how to train deep architectures
39
2006 Breakthrough
40
2006 Breakthrough
• More data
• Faster hardware: GPU’s, multi-core CPU’s
• Working ideas on how to train deep
architectures
41
2006 Breakthrough
Stacked Restricted Boltzmann Machines* (RBM)
Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554.
Stacked Autoencoders (AE)
Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks. Advances in Neural Information Processing Systems 19.
* called Deep Belief Networks (DBN)
42
3. Deep Learning

onwards we go…
43
44
Why go Deep ?
Hierarchies
Efficient
Generalization
Distributed
Sharing
Unsupervised*
Black Box
Training Time
Major PWNAGE!
Much Data
45
No More Handcrafted Features !
46
Deep Learning: Why?
“I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning”
— Andrew Ng
47
Biological Justification
Deep Learning = Brain “inspired”

Audio/Visual Cortex has multiple stages == Hierarchical
• Computational Biology • CVAP
• Jorge Dávila-Chacón
• “that guy”
“Brainiacs” vs “Pragmatists”
48
Different Levels of Abstraction
49
Hierarchical Learning
• Natural progression from low level to high level structure as seen in natural complexity
Different Levels of Abstraction
Feature Representation
50
Hierarchical Learning
• Natural progression from low level to high level structure as seen in natural complexity
• Easier to monitor what is being learnt and to guide the machine to better subspaces
Different Levels of Abstraction
Feature Representation
51
Hierarchical Learning
• Natural progression from low level to high level structure as seen in natural complexity
• Easier to monitor what is being learnt and to guide the machine to better subspaces
• A good lower level representation can be used for many distinct tasks
Different Levels of Abstraction
Feature Representation
52
Hierarchical Learning
• Natural progression from low level to high level structure as seen in natural complexity
• Easier to monitor what is being learnt and to guide the machine to better subspaces
• A good lower level representation can be used for many distinct tasks
Different Levels of Abstraction
Feature Representation
53
• Shared Low Level Representations
• Multi-Task Learning
• Unsupervised Training
Generalizable Learning
54
• Shared Low Level Representations
• Multi-Task Learning
• Unsupervised Training
• Partial Feature Sharing
• Mixed Mode Learning
• Composition of Functions
Generalizable Learning
55
Classic Deep Architecture
Input layer
Hidden layers
Output layer
56
Modern Deep Architecture
Input layer
Hidden layers
Output layer
57
Deep Learning: Why? (again)
Beat state of the art in many areas:
• Language Modeling (2012, Mikolov et al)
• Image Recognition (Krizhevsky won 2012 ImageNet competition)
• Sentiment Classification (2011, Socher et al)
• Speech Recognition (2010, Dahl et al)
• MNIST hand-written digit recognition (Ciresan et al, 2010)
58
Deep Learning: Why for NLP ?
One Model rules them all ?
DL approaches have been successfully applied to:
Automatic summarization
Coreference resolution
Discourse analysis
Machine translation
Morphological segmentation
Named entity recognition (NER)
Natural language generation
Natural language understanding
Optical character recognition (OCR)
Part-of-speech tagging
Parsing
Question answering
Relationship extraction
Sentence boundary disambiguation
Sentiment analysis
Speech recognition
Speech segmentation
Topic segmentation and recognition
Word segmentation
Word sense disambiguation
Information retrieval (IR)
Information extraction (IE)
Speech processing
59
- COFFEE BREAK -
after the break we return with: CODE
Download the code samples already now from:
https://github.com/graphific/DL-Meetup-intro
shortened url: http://goo.gl/abX1E2
60
1. MLP
• Deep Neural Network
• Multilayer Perceptron (MLP) or Artificial Neural Network (ANN)
Logistic regression
Training regime: Stochastic Gradient Descent (SGD) with minibatches
MNIST dataset
Simple hidden layer
61
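The repository linked above uses Theano; purely as a stand-in sketch of the training regime named on the slide, here is softmax (logistic) regression trained with minibatch SGD in numpy (the arrays X and one-hot Y are hypothetical):

import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_sgd(X, Y, lr=0.1, batch=128, epochs=10):
    # X: (n, d) inputs, Y: (n, k) one-hot labels
    n, d = X.shape
    W, b = np.zeros((d, Y.shape[1])), np.zeros(Y.shape[1])
    for _ in range(epochs):
        order = np.random.permutation(n)
        for s in range(0, n, batch):
            i = order[s:s + batch]
            P = softmax(X[i] @ W + b)   # forward pass: class probabilities
            G = (P - Y[i]) / len(i)     # gradient of cross-entropy wrt logits
            W -= lr * X[i].T @ G        # minibatch gradient step
            b -= lr * G.sum(axis=0)
    return W, b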
2. Convolutional Neural Network
62
from: Krizhevsky, Sutskever, Hinton. (2012). ImageNet Classification with Deep Convolutional Neural Networks
[breakthrough in object recognition, Imagenet 2012]
Convolutional Neural Network
http://ufldl.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
movie time:
http://www.cs.toronto.edu/~hinton/adi/index.htm
63
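Feature extraction by convolution, as in the UFLDL tutorial linked above, amounts to sliding a small filter over the image. A minimal numpy sketch ('valid' mode; like most CNN code this is actually cross-correlation, since the kernel is not flipped):

import numpy as np

def conv2d_valid(img, kernel):
    # dot product of the kernel with every kernel-sized patch of the image
    H, W = img.shape
    h, w = kernel.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * kernel)
    return out

edge = np.array([[1.0, -1.0]])        # a tiny horizontal-edge filter (hypothetical)
img = np.random.rand(28, 28)          # an MNIST-sized image
print(conv2d_valid(img, edge).shape)  # (28, 27) feature map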
That's it, no more code! (for now)
64
Deep Learning: Future Developments
Currently an explosion of developments
• Hessian-Free networks (2010)
• Long Short Term Memory (2011)
• Large Convolutional nets, max-pooling (2011)
• Nesterov’s Gradient Descent (2013)
Currently state of the art but...
• No way of doing logical inference (extrapolation)
• No easy integration of abstract knowledge
• Hypothesis space bias might not conform with reality
65
Deep Learning: Future Challenges
66
Szegedy, C., Wojciech, Z., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R. (2013). Intriguing properties of neural networks.
L: correctly identified, Center: added noise x10, R: “Ostrich”
• cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580)
https://code.google.com/p/cuda-convnet2/
• Caffe (Berkeley) (Cuda/OpenCL, Theano, Python)
http://caffe.berkeleyvision.org/
• OverFeat (NYU)
http://cilvr.nyu.edu/doku.php?id=code:start
Wanna Play ?
• Theano - CPU/GPU symbolic expression compiler in python (from LISA lab at University of Montreal). http://deeplearning.net/software/theano/
• Pylearn2 - library designed to make machine learning research easy. http://deeplearning.net/software/pylearn2/
• Torch - Matlab-like environment for state-of-the-art machine learning algorithms in lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/
• more info: http://deeplearning.net/software_links/
Wanna Play ?
Academic/Research (as PhD candidate KTH/CSC):
“Always interested in discussing Machine Learning, Deep Architectures, Graphs, and Language Technology”
In touch!
roelof@kth.se
www.csc.kth.se/~roelof/
Internship / Entrepreneurship (as CIO/CTO Feeda):
“Always looking for additions to our brand new R&D team”
[Internships upcoming on KTH exjobb website…]
roelof@feeda.com
www.feeda.com
Feeda
69
We're Hiring!
roelof@feeda.com
www.feeda.com
Feeda
• Dev Ops
• Software Developers
• Data Scientists
70
Thanks for listening
Mingling time!
71
72
Can’t get enough?
Come to my talk Tomorrow (Friday)
Description on KTH website
Visual-Semantic Embeddings: 

some thoughts on Language
Roelof Pieters TCS/CSC
Friday Jan 23, 13:30.
Room 304, Teknikringen 14 level 3
Addendum
Some of the exciting recent developments in NLP, especially Distributed Semantics
73
Word Embeddings: Turian (2010)
Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning
code & info: http://metaoptimize.com/projects/wordreprs/
74
Word Embeddings: Turian (2010)
Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning
code & info: http://metaoptimize.com/projects/wordreprs/
75
Word Embeddings: Collobert & Weston (2011)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011). Natural Language Processing (almost) from Scratch
76
Multi-embeddings: Stanford (2012)
Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng. Improving Word Representations via Global Context and Multiple Word Prototypes
77
Linguistic Regularities: Mikolov (2013)
code & info: https://code.google.com/p/word2vec/
Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations
78
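The regularity in question is vector arithmetic: v(king) - v(man) + v(woman) lands near v(queen). A minimal sketch of the lookup by cosine similarity (the embedding dict here holds random toy vectors, not real word2vec output):

import numpy as np

vecs = {w: np.random.rand(50) for w in ["king", "man", "woman", "queen", "apple"]}

def analogy(a, b, c, vecs):
    target = vecs[a] - vecs[b] + vecs[c]
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    # nearest word to the target vector, excluding the query words
    return max((w for w in vecs if w not in (a, b, c)),
               key=lambda w: cos(vecs[w], target))

print(analogy("king", "man", "woman", vecs))  # "queen" with real embeddings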
Word Embeddings for MT: Mikolov (2013)
Mikolov, T., Le, Q. V., Sutskever, I. (2013). Exploiting Similarities among Languages for Machine Translation
79
Recursive Deep Models & Sentiment: Socher (2013)
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Chris Manning, Andrew Ng and Chris Potts. 2013. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013
code & demo: http://nlp.stanford.edu/sentiment/index.html
80
Paragraph Vectors: Le & Mikolov (2014)
Le, Q., Mikolov, T. (2014). Distributed Representations of Sentences and Documents
81
• add context (sentence, paragraph, document) to word vectors during training
Results on the Stanford Sentiment Treebank dataset: [results table in the original slides]
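A minimal sketch of training such paragraph vectors with gensim's Doc2Vec (gensim is an assumption, not part of the deck; the documents are toys):

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument("deep learning for nlp".split(), tags=["d0"]),
        TaggedDocument("shallow parsing of text".split(), tags=["d1"])]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)
# infer a vector for an unseen paragraph
print(model.infer_vector("learning representations of text".split()))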
Global Vectors, GloVe: Stanford (2014)
Pennington, J., Socher, R., Manning, C. D. (2014). GloVe: Global Vectors for Word Representation
code & demo: http://nlp.stanford.edu/projects/glove/
GloVe vs word2vec: results on the word analogy task show “similar accuracy”
82
Dependency-based Embeddings: Levy & Goldberg (2014)
Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings
code & demo: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/
- Syntactic Dependency Context
Australian scientist discovers star with telescope
- Bag of Words (BoW) Context
[precision vs recall plot comparing bag-of-words and dependency-based contexts]
“Dependency-based embeddings have more functional similarities”
83

  • 83. Dependency-based Embeddings: Levy & Goldberg (2014) Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings code & demo: https://levyomer.wordpress.com/2014/04/25/ dependency-based-word-embeddings/ - Syntactic Dependency Context Australian scientist discovers star with telescope - Bag of Words (BoW) Context 0.3$ 0.4$ 0.5$ 0.6$ 0.7$ 0.8$ 0.9$ 1$ 0$ 0.1$ 0.2$ 0.3$ 0.4$ 0.5$ 0.6$ 0.7$ 0.8$ 0.9$ 1$ Precision$ Recall$ “Dependency-based embeddings have more functional similarities” 83