Deep Learning & NLP 
Graphs to the Rescue! (or not yet…) 
Roelof Pieters, KTH/CSC, Graph Technologies R&D 
roelof@kth.se 
www.csc.kth.se/~roelof/ 
Twitter: @graphific 
Stockholm, SICS, October 21, 2014
Definitions 
Machine Learning 
Improving some task T based on experience E with respect to performance measure P. 
– T. Mitchell (1997) 
Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time. 
– H. Simon (1983)
Definitions 
Representation learning 
Attempts to automatically learn good features or representations 
Deep learning 
Attempts to learn multiple levels of representation of increasing complexity/abstraction
Overview 
1. From Machine Learning to Deep Learning 
2. Natural Language Processing 
3. Graph-Based Approaches to DL+NLP 
1. from 
Machine Learning 
to Deep Learning 
Perceptron 
• Rosenblatt 1957 
• Minsky & Papert 1969 
The world believed Minsky & Papert…
2nd-gen Perceptron 
• Quest to make it non-linear 
• no result… 
Until finally… 
• Rumelhart, Hinton & Williams, 1986 
• Multi-Layered Perceptrons (MLPs)!!! 
• Backpropagation (Bryson & Ho 1969) 
(Rumelhart, Hinton & Williams, 1986)
• Forward Propagation: 
• Sum inputs, produce activation, feed forward
• Back-Propagation of Error 
• Calculate total error at the top 
• Calculate contributions to the error at each step going backwards
Phase 1: Propagation 
Each propagation involves the following steps: 
1. Forward propagation of a training pattern's input through the neural network, generating the propagation's output activations. 
2. Backward propagation of the output activations through the neural network, using the training pattern's target, to generate the deltas of all output and hidden neurons. 
Phase 2: Weight update 
For each weight, follow these steps: 
1. Multiply the weight's output delta and input activation to get the gradient of the weight. 
2. Subtract a ratio (percentage) of the gradient from the weight.
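To make the two phases concrete, here is a minimal NumPy sketch of forward propagation and backpropagation for a two-layer perceptron, assuming sigmoid activations and a squared-error loss; the network sizes, data, and learning rate are made up for illustration, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((4, 3))            # 4 training patterns, 3 inputs each
T = rng.random((4, 2))            # target outputs in (0, 1)
W1 = rng.normal(scale=0.5, size=(3, 5))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(5, 2))   # hidden -> output weights
lr = 0.5                          # the "ratio (percentage)" of the gradient

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(1000):
    # Phase 1, forward: sum inputs, produce activations, feed forward
    H = sigmoid(X @ W1)
    Y = sigmoid(H @ W2)
    # Phase 1, backward: total error at the top, then deltas going backwards
    delta_out = (Y - T) * Y * (1 - Y)             # output-layer deltas
    delta_hid = (delta_out @ W2.T) * H * (1 - H)  # hidden-layer deltas
    # Phase 2: gradient = output delta x input activation; subtract a ratio of it
    W2 -= lr * H.T @ delta_out
    W1 -= lr * X.T @ delta_hid

print("final squared error:", float(((Y - T) ** 2).sum()))
```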
Perceptron Network: SVM 
• Vapnik et al. 1992; 1995 
• Cortes & Vapnik 1995 
Kernel SVM 
Source: Cortes & Vapnik 1995
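As a hedged illustration of the kernel-SVM idea (not the original 1992/1995 experiments), a few lines of scikit-learn fit a non-linear decision boundary on toy data via the RBF kernel:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# toy data that is not linearly separable
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# the kernel trick: an RBF kernel yields a non-linear decision boundary
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```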
“2006” 
• Faster machines (GPUs!) 
• More data 
• New methods for unsupervised pre-training
“2006” 
• New methods for unsupervised pre-training 
• Stacked RBMs (Deep Belief Networks [DBNs]) 
• Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527–1554. 
• Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504–507, 28 July 2006. 
• (Stacked) Autoencoders 
• Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks. Advances in Neural Information Processing Systems 19.
Pretraining: Stacked RBMs 
• Iterative pre-training construction of a Deep Belief Network (DBN) (Hinton et al., 2006) 
from: Larochelle et al. (2007). An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation.
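A minimal sketch of the building block that gets stacked here: one contrastive-divergence (CD-1) update for a binary RBM, with toy sizes and a single training vector. This is illustrative only; real DBN training loops over a dataset and stacks several such layers greedily:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 4, 0.1
W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
b_v = np.zeros(n_visible)    # visible biases
b_h = np.zeros(n_hidden)     # hidden biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

v0 = rng.integers(0, 2, size=n_visible).astype(float)  # one binary training vector

# positive phase: hidden probabilities/samples given the data
p_h0 = sigmoid(v0 @ W + b_h)
h0 = (rng.random(n_hidden) < p_h0).astype(float)
# negative phase: one Gibbs step (reconstruct visibles, re-infer hiddens)
p_v1 = sigmoid(h0 @ W.T + b_v)
p_h1 = sigmoid(p_v1 @ W + b_h)
# CD-1 update: positive minus negative statistics
W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
b_v += lr * (v0 - p_v1)
b_h += lr * (p_h0 - p_h1)
```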
Pretraining: Stacked Denoising Auto-encoder 
• Stacking Auto-Encoders 
from: Bengio ICML 2009 
Pretraining: Stacked Denoising Auto-encoder 
• (Vincent et al., 2008) 
• Good vs. corrupted context 
Raw input → Corrupted input → Hidden code (representation) → Reconstruction 
KL(reconstruction | raw input) 
from: Vincent et al. 2010
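A minimal sketch of one denoising-autoencoder training step, assuming masking noise, tied weights, and a squared-error criterion in place of the paper's KL/cross-entropy loss; sizes and data are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, lr, noise = 8, 4, 0.1, 0.3
W = rng.normal(scale=0.1, size=(n_in, n_hid))
b, c = np.zeros(n_hid), np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = rng.random(n_in)                         # raw input
x_tilde = x * (rng.random(n_in) > noise)     # corrupted input (masking noise)

h = sigmoid(x_tilde @ W + b)                 # hidden code (representation)
x_hat = sigmoid(h @ W.T + c)                 # reconstruction (tied weights)

# the reconstruction is penalized against the *uncorrupted* raw input
delta_out = (x_hat - x) * x_hat * (1 - x_hat)
delta_hid = (delta_out @ W) * h * (1 - h)
W -= lr * (np.outer(x_tilde, delta_hid) + np.outer(delta_out, h))
c -= lr * delta_out
b -= lr * delta_hid
```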
Convolutional Neural Networks (CNNs) 
• Fukushima 1980; LeCun et al. 1998; Behnke 2003; Simard et al. 2003… 
• Hinton et al. 2006; Bengio et al. 2007; Ranzato et al. 2007 
• Sparse connectivity 
• Shared weights 
• MaxPooling 
(Figures from http://deeplearning.net/tutorial/lenet.html)
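A minimal sketch of those three ingredients, assuming a single-channel toy image and one 3×3 filter: the same weights are applied at every position (shared weights), each output unit sees only a small patch (sparse connectivity), and a 2×2 max pooling follows:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))            # toy single-channel "image"
kernel = rng.normal(size=(3, 3))    # one filter: the same weights everywhere

# valid convolution (cross-correlation, as in most CNN libraries):
# each output unit is connected only to a 3x3 patch of the input
out = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        out[i, j] = np.sum(img[i:i+3, j:j+3] * kernel)

# 2x2 max pooling with stride 2
pooled = out.reshape(3, 2, 3, 2).max(axis=(1, 3))
print(pooled.shape)  # (3, 3)
```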
Pretraining 
• Why does pretraining work so well? (Erhan et al. 2010) 
• Better generalisation 
without unsupervised pretraining vs. with unsupervised pretraining 
Figures from Erhan et al. 2010
Pretraining 
Figures from Erhan et al. 2010 
“I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning.” 
– Andrew Ng
The (god)fathers of DL
DL: (Every)where? 
• Language Modeling (2012, Mikolov et al) 
• Image Recognition (Krizhevsky won 2012 ImageNet competition) 
• Sentiment Classification (2011, Socher et al) 
• Speech Recognition (2010, Dahl et al) 
• MNIST hand-written digit recognition (Ciresan et al, 2010)
So: Why Deep? 
Deep architectures can be representationally efficient 
• Fewer computational units for the same function 
Deep representations might allow for a hierarchy of representations 
• Allows non-local generalisation 
• Comprehensibility 
Multiple levels of latent variables allow combinatorial sharing of statistical strength
So: Why Deep? 
Generalizing better to new tasks & domains 
Can learn good intermediate representations shared across tasks 
Distributed representations 
Unsupervised Learning 
Multiple levels of representation
Diff. Levels of Abstraction 
• Hierarchical learning 
• Natural progression from low-level to high-level structure, as seen in natural complexity 
• Easier to monitor what is being learnt and to guide the machine to better subspaces 
• A good lower-level representation can be used for many distinct tasks
Generalizable Learning 
• Shared Low-Level Representations 
• Multi-Task Learning 
• Unsupervised Training 
• Partial Feature Sharing 
• Mixed-Mode Learning 
• Composition of Functions
No More Handcrafted Features ! 
2. Natural Language 
Processing 
DL + NLP 
• Language Modeling 
• Bengio et al. (2000, 2003): via neural network 
• Mnih and Hinton (2007): via RBMs 
• POS tagging, Chunking, NER, SRL 
• Collobert and Weston 2008 
• Socher et al 2011; Socher 2014
Language Modeling 
• Word embeddings (Bengio et al, 2001; Bengio et al, 2003), based on the idea of distributed representations for symbols (Hinton 1986) 
• Neural word embeddings (Turian et al 2010; Collobert et al. 2011)
Word Embeddings 
• Collobert & Weston 2008; Collobert et al. 2011 
• similar to word-vector learning, but uses a Softmax/Maxent classifier instead of a single scalar score 
word embeddings are read in from a lookup table. From Collobert et al. 2011 
Figure from Socher et al. Tutorial ACL 2012.
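A minimal sketch of the lookup-table idea, with a made-up five-word vocabulary: each word indexes a row of an embedding matrix, and a context window is scored by concatenating its word vectors. For simplicity this uses a single scalar score in the C&W ranking style; the classifier variant would put a softmax layer on top:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}   # toy vocabulary
dim, win = 4, 3
E = rng.normal(scale=0.1, size=(len(vocab), dim))   # the lookup table
w = rng.normal(scale=0.1, size=win * dim)           # scoring weights

def score(window_words):
    ids = [vocab[t] for t in window_words]
    x = E[ids].reshape(-1)       # concatenate the window's word embeddings
    return float(w @ x)          # one scalar score for the window

print(score(["the", "cat", "sat"]))   # a plausible window
print(score(["the", "mat", "the"]))   # a corrupted window (should score lower after training)
```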
• window approach 
• sentence approach 
source: Collobert & Weston, Deep Learning for Natural Language Processing. NIPS 2009 Tutorial
• Multi-task learning 
source: Collobert & Weston, Deep Learning for Natural Language Processing. NIPS 2009 Tutorial
General Deep Architecture for NLP 
Basic features 
Embeddings 
Convolution 
Max pooling 
“Supervised” learning 
source: Collobert & Weston, Deep Learning for Natural Language Processing. NIPS 2009 Tutorial
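A minimal sketch of this pipeline (sentence approach) with untrained random weights and toy sizes: word ids → embedding lookup → convolution over time → max pooling over the sentence → a "supervised" linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, dim, width, n_filt, n_tags = 7, 4, 3, 6, 2
sent = rng.integers(0, 100, size=n_words)                # word ids (basic features)
E = rng.normal(scale=0.1, size=(100, dim))               # embeddings (lookup table)
C = rng.normal(scale=0.1, size=(n_filt, width * dim))    # convolution filters
U = rng.normal(scale=0.1, size=(n_tags, n_filt))         # "supervised" output layer

X = E[sent]                                              # (n_words, dim)
windows = np.stack([X[i:i + width].reshape(-1)           # sliding windows over time
                    for i in range(n_words - width + 1)])
conv = np.tanh(windows @ C.T)                            # convolution over the sentence
pooled = conv.max(axis=0)                                # max pooling over time
print(U @ pooled)                                        # per-tag scores
```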
Word Embeddings 
• Unsupervised Word Representations (Turian et al 2010) 
• evaluates Brown clusters, C&W (Collobert and Weston 2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words → Brown clusters win out by a small margin on both NER and chunking 
• more info: http://metaoptimize.com/projects/wordreprs/
t-SNE visualizations of word embeddings. Left: number region; right: jobs region. From Turian et al. 2011
http://metaoptimize.com/projects/wordreprs/ 
Word Embeddings 
• Collobert & Weston 2008; Collobert et al. 2011 
• Propose a unified neural network architecture for many NLP tasks: 
• part-of-speech tagging, chunking, named entity recognition, and semantic role labeling 
• no hand-made input features 
• learns internal representations on the basis of vast amounts of mostly unlabeled training data
Word Embeddings 
• Recurrent Neural Network (Mikolov et al. 2010; Mikolov et al. 2013a) 
W(“woman”) − W(“man”) ≃ W(“aunt”) − W(“uncle”) 
W(“woman”) − W(“man”) ≃ W(“queen”) − W(“king”) 
Figures from Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations
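A minimal sketch of the vector-offset trick with hand-made toy vectors (real regularities only emerge from trained embeddings): answer "man is to king as woman is to ?" by taking the nearest cosine neighbour of king − man + woman:

```python
import numpy as np

# hand-made toy vectors standing in for trained embeddings
W = {
    "man":   np.array([1.0, 0.0, 0.1]),
    "woman": np.array([1.0, 1.0, 0.1]),
    "king":  np.array([0.0, 0.0, 1.0]),
    "queen": np.array([0.0, 1.0, 1.0]),
    "apple": np.array([0.3, 0.2, 0.0]),
}

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

target = W["king"] - W["man"] + W["woman"]
best = max((w for w in W if w not in ("king", "man", "woman")),
           key=lambda w: cos(W[w], target))
print(best)  # -> queen
```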
• Mikolov et al. 2013b 
Figures from Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b). Efficient Estimation of Word Representations in Vector Space
Word Embeddings 
• Recursive (Tensor) Network (Socher et al. 2011; Socher 2014)
Vector Space Model 
3. Graph-Based Approaches to DL+NLP 
• A) NLP “naturally encoded” 
• B) Genetic Finite State Machine 
• C) Neural net within Graph
Graph-Based NLP 
• Graphs have a “natural affinity” with NLP [feel free to quote me on that ;)] 
• relation-oriented 
• index-free adjacency
What’s in a Graph? 
Figure from Buerli & Obispo (2012).
What’s in a Graph? 
• Graph databases: Neo4j, OrientDB, InfoGrid, Titan, FlockDB, ArangoDB, InfiniteGraph, AllegroGraph, DEX, GraphBase, and HyperGraphDB 
• Distributed graph-processing toolkits (based on MapReduce, HDFS, and custom BSP engines): Bagel, Hama, Giraph, PEGASUS, Faunus, Flink 
• In-memory graph packages designed for massive shared memory: NetworkX, Gephi, MTGL, Boost, uRika, and STINGER
A. NLP “naturally encoded” 
Natural affinity, say what? 
• e.g. graph-based opinion summarization (Ganesan et al. 2010; Ganesan 2013) 
• Captures: 
• Redundancies 
• Gapped Subsequences 
• Collapsible Structures 
From Ganesan 2013
Summarization Graph 
From Ganesan 2013
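A minimal sketch of such a summarization graph using networkx, with made-up review sentences: words become nodes, adjacent words become weighted edges, so redundant phrasings pile weight onto shared paths (the real Opinosis graph also keeps positional and POS information on each node):

```python
import networkx as nx

reviews = [                       # made-up, highly redundant opinions
    "the battery life is very good",
    "battery life is good",
    "the battery life is good",
]

G = nx.DiGraph()
for sentence in reviews:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        # redundancy accumulates as edge weight on shared paths
        w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# heavily weighted paths are candidate summary phrases
print(sorted(G.edges(data="weight"), key=lambda e: -e[2])[:3])
```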
Natural Affinity? 
• Demo time! 
B. Finite State Graph 
• Bastani 2014a; 2014b; 2014c 
• Probabilistic feature hierarchy 
• Grammatical inference by genetic algorithms 
more info: https://github.com/kbastani/graphify 
Figure from Bastani 2014a
Finite State Graph 
• Bastani 2014 
• training phase: 
all figures from Bastani 2014b
• sentiment analysis 
• error: 0.3 
Figure from Bastani 2014c
Conceptual Hierarchical 
Graph 
• Demo time! 
C. Factor Graph 
• Factor graph in which the factors themselves contain a deep neural net. 
• Factor graph: 
• a bipartite graph representing the factorization of a function (Kschischang et al. 2001; Frey 2002) 
• can combine Bayesian networks (BNs) and Markov random fields (MRFs) 
Figure from Frey 2002
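A minimal sketch of a factor graph as a bipartite structure, with two hand-made factors over three binary variables: the global function is the product of the local factors, and a brute-force sum gives a marginal (real systems use the sum-product algorithm instead of enumeration):

```python
import itertools

# variable nodes on one side, factor nodes (local functions) on the other
variables = ["a", "b", "c"]
factors = {
    "f1": (("a", "b"), lambda a, b: 0.9 if a == b else 0.1),
    "f2": (("b", "c"), lambda b, c: 0.7 if b != c else 0.3),
}

def joint(assign):
    """The global function factorizes as the product of the local factors."""
    p = 1.0
    for scope, f in factors.values():
        p *= f(*(assign[v] for v in scope))
    return p

# brute-force (unnormalized) marginal of "a" by summing out b and c
for val in (0, 1):
    m = sum(joint(dict(zip(variables, vals)))
            for vals in itertools.product((0, 1), repeat=3)
            if vals[0] == val)
    print(f"unnormalized P(a={val}) = {m:.3f}")
```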
Factor Graph 
• Factor graph with “deep factors” (Mirowski & LeCun 2009) 
• Dynamic Time Series modeling 
Energy-Based Graph 
• LeCun et al. 1998, handwriting-recognition system 
• “Graph Transformer Networks” 
• Instead of a normalised HMM, an energy-based factor graph (without normalization) 
• LeCun et al. 2006 
• Energy-Based Learning
And finally… 
What you’ve all been waiting for… 
Which net is currently the Biggest? 
the Deepest? 
the most Bad-ass?
Winners of: 
Large Scale Visual Recognition Challenge 2014 (ILSVRC2014) 
19 September 2014 
GoogLeNet 
Convolution 
Pooling 
Softmax 
Other 
source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014
Inception 
Width of inception modules ranges from 256 filters (in early modules) to 1024 in top inception modules. 
Can remove fully connected layers on top completely. 
Number of parameters is reduced to 5 million. 
Computational cost is increased by less than 2× compared to Krizhevsky’s network (<1.5Bn operations/evaluation). 
source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014
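A minimal sketch of the Inception-module idea under toy assumptions (single-channel input, one filter per branch, scipy assumed for the convolutions): parallel 1×1, 3×3 and 5×5 convolutions plus a pooled branch, stacked along a filter axis. The real module uses many filters per branch and 1×1 bottleneck convolutions to keep computation down:

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
x = rng.random((8, 8))                       # toy single-channel input

def conv_branch(k):
    f = rng.normal(size=(k, k))              # one k x k filter
    return correlate2d(x, f, mode="same")    # 'same' padding keeps 8 x 8

# parallel branches, all computed on the same input
branches = [conv_branch(1), conv_branch(3), conv_branch(5)]
# 3x3 max-pooling branch (stride 1, zero padding)
windows = np.lib.stride_tricks.sliding_window_view(np.pad(x, 1), (3, 3))
branches.append(windows.max(axis=(2, 3)))

out = np.stack(branches)                     # concatenate along the filter axis
print(out.shape)                             # (4, 8, 8)
```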
Classification results on ImageNet 2012 

Team         Year  Place  Error (top-5)  Uses external data 
SuperVision  2012  -      16.4%          no 
SuperVision  2012  1st    15.3%          ImageNet 22k 
Clarifai     2013  -      11.7%          no 
Clarifai     2013  1st    11.2%          ImageNet 22k 
MSRA         2014  3rd    7.35%          no 
VGG          2014  2nd    7.32%          no 
GoogLeNet    2014  1st    6.67%          no 

Final Detection Results 

Team             Year  Place  mAP    External data                           Ensemble  Contextual model  Approach 
UvA-Euvision     2013  1st    22.6%  none                                    ?         yes               Fisher vectors 
Deep Insight     2014  3rd    40.5%  ILSVRC12 Classification + Localization  3 models  yes               ConvNet 
CUHK DeepID-Net  2014  2nd    40.7%  ILSVRC12 Classification + Localization  ?         no                ConvNet 
GoogLeNet        2014  1st    43.9%  ILSVRC12 Classification                 6 models  no                ConvNet 

source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014
Wanna Play? 
• cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580) 
https://code.google.com/p/cuda-convnet2/ 
• Caffe (Berkeley) (CUDA/OpenCL, Theano, Python) 
http://caffe.berkeleyvision.org/ 
• OverFeat (NYU) 
http://cilvr.nyu.edu/doku.php?id=code:start
Wanna Play? 
• Theano - CPU/GPU symbolic expression compiler in Python (from the LISA lab at University of Montreal). http://deeplearning.net/software/theano/ 
• Pylearn2 - a library designed to make machine learning research easy. http://deeplearning.net/software/pylearn2/ 
• Torch - provides a Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu). http://torch.ch/ 
• more info: http://deeplearning.net/software_links/ 
(slide partially stolen from: J. Sullivan, Convolutional Neural Networks & Computer Vision, Machine Learning meetup at Spotify, Stockholm, June 9 2014)
Fin. 
Questions / Discussion … ? 
Bibliography: Definitions 
• Mitchell, T. M. (1997). Machine Learning (1st ed.). New York, NY, 
USA: McGraw-Hill, Inc. 
• Simon, H.A. (1983). Why should machines learn? in: Machine 
Learning: An Artificial Intelligence Approach, (R. Michalski, J. 
Carbonell, T. Mitchell, eds) Tioga Press, 25-38. 
Bibliography: History 
• Rosenblatt, Frank (1957), The Perceptron--a perceiving and recognizing automaton. Report 
85-460-1, Cornell Aeronautical Laboratory. 
• Minsky & Papert (1969), Perceptrons: an introduction to computational geometry. 
• Bryson, A.E.; W.F. Denham; S.E. Dreyfus (1963) Optimal programming problems with inequality 
constraints. I: Necessary conditions for extremal solutions. AIAA J. 1, 11 2544-2550. 
• Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations 
by back-propagating errors". Nature 323 (6088): 533–536. 
• Boser, B. E., Guyon, I., and Vapnik, V. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152. ACM Press.
• Cortes, C. and Vapnik, V. (1995), Support-vector network. Machine Learning, 20:273–297. 
• Larochelle, H., Erhan, D., Courville, A., Bergstra, J., & Bengio, Y. (2007). An Empirical 
Evaluation of Deep Architectures on Problems with Many Factors of Variation. In Proceedings of 
the 24th International Conference on Machine Learning (pp. 473–480). New York, NY, USA: 
ACM. 
• Vincent, P., Larochelle, H., & Lajoie, I. (2010), Stacked denoising autoencoders: Learning useful 
representations in a deep network with a local denoising criterion. Journal of Machine Learning 
Research, 11, 3371–3408. 
Bibliography: History - CNN’s 
• Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position". Biological Cybernetics 36 (4): 193–202. doi:10.1007/BF00344251. PMID 7370364.
• LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner (1998). "Gradient-based learning 
applied to document recognition". Proceedings of the IEEE 86 (11): 2278–2324. 
• S. Behnke. Hierarchical Neural Networks for Image Interpretation, volume 2766 of Lecture 
Notes in Computer Science. Springer, 2003. 
• Simard, Patrice, David Steinkraus, and John C. Platt. "Best Practices for Convolutional Neural 
Networks Applied to Visual Document Analysis." In ICDAR, vol. 3, pp. 958-962. 2003. 
• Hinton, GE; Osindero, S; Teh, YW (Jul 2006). "A fast learning algorithm for deep belief nets.". 
Neural computation 18 (7): 1527–54. 
• Bengio, Yoshua; Lamblin, Pascal; Popovici, Dan; Larochelle, Hugo (2007). "Greedy Layer-Wise 
Training of Deep Networks". Advances in Neural Information Processing Systems: 153–160. 
• Ranzato, MarcAurelio; Poultney, Christopher; Chopra, Sumit; LeCun, Yann (2007). "Efficient 
Learning of Sparse Representations with an Energy-Based Model". Advances in Neural 
Information Processing Systems. 
Bibliography: DL 
• Bengio, Y., Ducharme, R., & Vincent, P. (2001). A Neural Probabilistic Language Model. 
In T. K. Leen & T. G. Dietterich (Eds.), Advances in Neural Information Processing 
Systems 13 (NIPS’00). MIT Press. 
• Bengio, Y., Ducharme, R., Vincent, P., & Janvin, C. (2003). A Neural Probabilistic 
Language Model. The Journal of Machine Learning Research, 3, 1137–1155. 
• Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training 
of Deep Networks, Advances in Neural Information Processing Systems 19 
• Hinton, G. E. (1986). Learning distributed representations of concepts. In Proceedings 
of the eighth annual conference of the cognitive science society (Vol. 1, p. 12). 
• Hinton, G. E. and Salakhutdinov, R. R, (2006) Reducing the dimensionality of data with 
neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 28 July 2006. 
• Hinton, G. E, Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep 
belief nets. Neural Computation, 18:1527-1554. 
• Erhan, D., Bengio, Y., & Courville, A. (2010). Why does unsupervised pre-training help 
deep learning? Journal of Machine Learning Research, 11, 625–660. 
Bibliography: DL 
• Vincent, P., Larochelle, H., Bengio, Y. and Manzagol, P. A. (2008). Extracting and composing robust features with denoising autoencoders. In ICML. 
• Vincent, P., Larochelle, H., & Lajoie, I. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11, 3371–3408.
• Krizhevsky, A., Sutskever, I. and Hinton, G. E. (2012) Imagenet classification with 
deep convolutional neural networks. In NIPS. 
• Socher, Richard, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. (2011). Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP).
• Dahl, G. E., Ranzato, M. A., Mohamed, A. and Hinton, G. E. (2010) Phone 
recognition with the mean-covariance restricted Boltzmann machine. In NIPS. 
• Ciresan, D. C., Meier, U., Gambardella, L. M., & Schmidhuber, J. (2010). Deep Big 
Simple Neural Nets Excel on Handwritten Digit Recognition. CoRR. 
• Szegedy et al. (2014) Going deeper with convolutions (GoogLeNet ), ILSVRC2014, 
19 Sep 2014 
Bibliography: NLP 
• Turian, J., Ratinov, L., & Bengio, Y. (2010). Word Representations: A Simple and 
General Method for Semi-supervised Learning. In Proceedings of the 48th Annual 
Meeting of the Association for Computational Linguistics (pp. 384–394). 
Stroudsburg, PA, USA: Association for Computational Linguistics. 
• Collobert, R., & Weston, J. (2008). A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning (ICML).
• Collobert, R., Weston, J., & Bottou, L. (2011). Natural language processing (almost) 
from scratch. The Journal of Machine Learning Research, 12:2493-2537. 
• Collobert & Weston, Deep Learning for Natural Language Processing (2009) Nips 
Tutorial 
• Mikolov, T., Yih, W., & Zweig, G. (2013a). Linguistic Regularities in Continuous 
Space Word Representations. HLT-NAACL, (June), 746–751. 
• Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781.
Bibliography: NLP 
• Bengio, Y. and Bengio, S. (2000). Modeling high-dimensional discrete data with multi-layer neural networks. In Proceedings of NIPS 12.
• Mnih, A. and Hinton, G. E. (2007) Three New Graphical Models for 
Statistical Language Modelling. International Conference on 
Machine Learning, Corvallis, Oregon. 
• Socher, R., Bengio, Y., & Manning, C. (2012). Deep Learning for NLP (without Magic). Tutorial Abstracts of ACL 2012. 
• Socher, R. (2014). Recursive Deep Learning for Natural Language Processing and Computer Vision. PhD Dissertation.
Bibliography: Graph-Based Approaches 
• Frey, B. (2002). Extending factor graphs so as to unify directed and 
undirected graphical models. Proceedings of the Nineteenth Conference on 
Uncertainty in Artificial Intelligence 19 (UAI 03), Morgan Kaufmann, CA, 
Acapulco, Mexico, 257–264. 
• Kschischang, F. R., Frey, B. J., & Loeliger, H.-A. (2001). Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2), 498–519.
• Mirowski, P., & LeCun, Y. (2009). Dynamic factor graphs for time series 
modeling. Machine Learning and Knowledge Discovery. 
• LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning 
applied to document recognition. Proceedings of the IEEE November 1998. 
• LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M. A., & Huang, F. J. (2006). A Tutorial on Energy-Based Learning.
Bibliography: Graph-Based Approaches 
• Buerli, M., & Obispo, C. (2012). The current state of graph databases. 
Department of Computer Science, Cal Poly San Luis Obispo 
• Ganesan, K., Zhai, C., & Han, J. (2010). Opinosis: a graph-based approach to 
abstractive summarization of highly redundant opinions. Proceedings of the 
23rd International Conference on Computational Linguistics (Coling 2010), 
(August), 340–348. 
• Ganesan, K. (2013). Opinion Driven Decision Support System. PhD 
Dissertation, University of Illinois. 
• Bastani, K. 2014a, Hierarchical Pattern Recognition, Blog: Meaning Of, June 
17, 2014 
• Bastani, K. 2014b, Using a Graph Database for Deep Learning Text 
Classification, Blog: Meaning Of, August 26, 2014 
• Bastani, K. 2014c, Deep Learning Sentiment Analysis for Movie Reviews using 
Neo4j, Blog: Meaning Of, September 15, 2014 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptxogubuikealex
 
Internship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SEInternship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SESaleh Ibne Omar
 
proposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerproposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerkumenegertelayegrama
 

Dernier (19)

DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
 
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptxEngaging Eid Ul Fitr Presentation for Kindergartners.pptx
Engaging Eid Ul Fitr Presentation for Kindergartners.pptx
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism Presentation
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this period
 
Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
Application of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptxApplication of GIS in Landslide Disaster Response.pptx
Application of GIS in Landslide Disaster Response.pptx
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
PAG-UNLAD NG EKONOMIYA na dapat isaalang alang sa pag-aaral.
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
CHROMATOGRAPHY and its types with procedure,diagrams,flow charts,advantages a...
CHROMATOGRAPHY and its types with procedure,diagrams,flow charts,advantages a...CHROMATOGRAPHY and its types with procedure,diagrams,flow charts,advantages a...
CHROMATOGRAPHY and its types with procedure,diagrams,flow charts,advantages a...
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
Internship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SEInternship Presentation | PPT | CSE | SE
Internship Presentation | PPT | CSE | SE
 
proposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeegerproposal kumeneger edited.docx A kumeeger
proposal kumeneger edited.docx A kumeeger
 

Deep Learning & NLP: Graphs to the Rescue!

  • 1. Deep Learning & NLP Graphs to the Rescue! (or not yet…) Roelof Pieters, KTH/CSC, Graph Technologies R&D roelof@kth.se www.csc.kth.se/~roelof/ Twitter: @graphific Stockholm, Sics, October 21 2014
  • 2. Definitions Machine Learning Improving some task T based on experience E with respect to performance measure P. - T. Mitchell (1997) Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time. - H. Simon (1983) 2
  • 3. Definitions Representation learning Attempts to automatically learn good features or representations Deep learning Attempt to learn multiple levels of representation of increasing complexity/abstraction 3
  • 4. Overview 1. From Machine Learning to Deep Learning 2. Natural Language Processing 3. Graph-Based Approaches to DL+NLP 4
  • 5. 1. from Machine Learning to Deep Learning 5
  • 7. Perceptron 6 • Rosenblatt 1957
  • 8. Perceptron • Rosenblatt 1957 • Minsky & Papert 1969 6
  • 9. Perceptron • Rosenblatt 1957 • Minsky & Papert 1969 The world believed Minsky & Papert… 6
  • 10. 2nd gen Perceptron • Quest to make it non-linear • no result… 7 Until finally… • Rumelhart, Hinton & Williams, 1986 • Multi-Layered Perceptrons (MLP) !!! • Backpropagation (Bryson & Ho 1969) (Rumelhart, Hinton & Williams, 1986)
  • 11. • Forward Propagation : • Sum inputs, produce activation, feed-forward 8
  • 12. • Back Propagation of Error • Calculate total error at the top • Calculate contributions to error at each step going backwards 9
  • 13. Phase 1: Propagation Each propagation involves the following steps: 1. Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations. 2. Backward propagation of the propagation's output activations through the neural network using the training pattern target in order to generate the deltas of all output and hidden neurons. Phase 2: Weight update For each weight-synapse follow the following steps: 1. Multiply its output delta and input activation to get the gradient of the weight. 2. Subtract a ratio (percentage) of the gradient from the weight. 10
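The two backpropagation phases above fit in a few lines of NumPy. A minimal sketch, assuming a tiny sigmoid network on made-up XOR-style data (all sizes and the learning rate are invented for illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])   # XOR of the first two inputs

    rng = np.random.RandomState(0)
    W1 = rng.randn(3, 4) * 0.5    # input -> hidden weights
    W2 = rng.randn(4, 1) * 0.5    # hidden -> output weights
    lr = 0.5                      # the "ratio (percentage)" of the gradient

    for epoch in range(10000):
        # Phase 1, forward: sum inputs, produce activations, feed forward
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)
        # Phase 1, backward: deltas of the output and hidden neurons
        delta_out = (out - y) * out * (1 - out)
        delta_h = (delta_out @ W2.T) * h * (1 - h)
        # Phase 2: gradient = input activation times output delta; subtract it
        W2 -= lr * (h.T @ delta_out)
        W1 -= lr * (X.T @ delta_h)

    # after training, `out` should move toward the XOR targets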
  • 14. Perceptron Network: SVM • Vapnik et al. 1992; 1995. 11 • Cortes & Vapnik 1995 Source: Cortes & Vapnik 1995
  • 15. Perceptron Network: SVM • Vapnik et al. 1992; 1995. Kernel SVM 11 • Cortes & Vapnik 1995 Source: Cortes & Vapnik 1995
  • 17. “2006” • Faster machines (GPU’s!) 12
  • 18. “2006” • Faster machines (GPU’s!) • More data 12
  • 19. “2006” • Faster machines (GPU’s!) • More data • New methods for unsupervised pre-training 12
  • 20. “2006” • New methods for unsupervised pre-training • Stacked RBM’s (Deep Belief Networks [DBN’s] ) • Hinton, G. E, Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554. • Hinton, G. E. and Salakhutdinov, R. R, Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 28 July 2006. 13 • (Stacked) Autoencoders • Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19
  • 21. Pretraining: Stacked RBM’s • Iterative pre-training construction of Deep Belief Network (DBN) (Hinton et al., 2006) from: Larochelle et al. (2007). An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation. 14
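A schematic of that greedy layer-wise construction, assuming Bernoulli units and one-step contrastive divergence (CD-1); biases are omitted and all sizes are made up, so this is a sketch of the idea rather than a faithful reimplementation:

    import numpy as np

    rng = np.random.RandomState(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_rbm(data, n_hidden, lr=0.1, epochs=50):
        # one Bernoulli RBM trained with CD-1; biases omitted for brevity
        W = rng.randn(data.shape[1], n_hidden) * 0.01
        for _ in range(epochs):
            # positive phase: hidden activations driven by the data
            h_pos = sigmoid(data @ W)
            h_sample = (rng.rand(*h_pos.shape) < h_pos).astype(float)
            # negative phase: one Gibbs step back down and up again
            v_neg = sigmoid(h_sample @ W.T)
            h_neg = sigmoid(v_neg @ W)
            # CD-1 update: <v h>_data - <v h>_reconstruction
            W += lr * (data.T @ h_pos - v_neg.T @ h_neg) / len(data)
        return W

    X = (rng.rand(100, 20) < 0.3).astype(float)   # made-up binary data
    inputs, weights = X, []
    for n_hidden in (15, 10):                     # greedy layer-wise stacking
        W = train_rbm(inputs, n_hidden)
        weights.append(W)
        inputs = sigmoid(inputs @ W)              # activations feed the next RBM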
  • 22. Pretraining: Stacked Denoising Auto-encoder • Stacking Auto-Encoders from: Bengio ICML 2009 15
  • 23. Pretraining: Stacked Denoising Auto-encoder 16 • (Vincent et al, 2008) • Good vs Corrupted context from: Vincent et al 2010
  • 24. Pretraining: Stacked Denoising Auto-encoder 16 • (Vincent et al, 2008) • Good vs Corrupted context Raw input from: Vincent et al 2010
  • 25. Pretraining: Stacked Denoising Auto-encoder 16 • (Vincent et al, 2008) • Good vs Corrupted context Corrupted input Raw input from: Vincent et al 2010
  • 26. Pretraining: Stacked Denoising Auto-encoder 16 • (Vincent et al, 2008) • Good vs Corrupted context Hidden code (representation) Corrupted input Raw input from: Vincent et al 2010
  • 27. Pretraining: Stacked Denoising Auto-encoder Corrupted input Raw input reconstruction 16 • (Vincent et al, 2008) • Good vs Corrupted context Hidden code (representation) from: Vincent et al 2010
  • 28. Pretraining: Stacked Denoising Auto-encoder KL(reconstruction | raw input) Corrupted input Raw input reconstruction 16 • (Vincent et al, 2008) • Good vs Corrupted context Hidden code (representation) from: Vincent et al 2010
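The corrupt/encode/reconstruct loop sketched above, in toy NumPy form with masking noise and tied weights (sizes, noise level, and learning rate are invented); note the loss compares the reconstruction against the raw, uncorrupted input:

    import numpy as np

    rng = np.random.RandomState(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    X = (rng.rand(200, 30) < 0.5).astype(float)   # made-up binary inputs
    n_hidden, lr, noise = 10, 0.1, 0.3
    W = rng.randn(30, n_hidden) * 0.1             # tied weights: decoder is W.T

    for epoch in range(500):
        # corrupt the raw input by zeroing a random fraction of entries
        X_tilde = X * (rng.rand(*X.shape) > noise)
        h = sigmoid(X_tilde @ W)                  # hidden code (representation)
        recon = sigmoid(h @ W.T)                  # reconstruction
        # gradient of cross-entropy + sigmoid, against the *raw* input
        g_out = recon - X
        g_W = g_out.T @ h + X_tilde.T @ ((g_out @ W) * h * (1 - h))
        W -= lr * g_W / len(X)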
  • 30. Convolutional Neural Networks (CNNs) • Fukushima 1980; LeCun et al. 1998; Behnke 2003; Simard et al. 2003… • Hinton et al. 2006; Bengio et al. 2007; Ranzato et al. 2007 • Sparse connectivity: 18 • MaxPooling • Shared weights: (Figures from http://deeplearning.net/tutorial/lenet.html)
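A minimal sketch of the three CNN ingredients just listed (sparse local connectivity, shared weights, max pooling), with an arbitrary 8x8 input and 3x3 filter:

    import numpy as np

    def conv2d_valid(image, kernel):
        # shared weights: one kernel slides over every position of the image
        kh, kw = kernel.shape
        H, W = image.shape
        out = np.zeros((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # sparse connectivity: each unit sees only a local patch
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    def maxpool2d(x, size=2):
        H, W = x.shape
        h2, w2 = H // size, W // size
        return x[:h2 * size, :w2 * size].reshape(h2, size, w2, size).max(axis=(1, 3))

    rng = np.random.RandomState(0)
    image = rng.rand(8, 8)                        # made-up grey-scale input
    kernel = rng.randn(3, 3)                      # one feature map's filter
    fmap = np.tanh(conv2d_valid(image, kernel))   # 6x6 feature map
    pooled = maxpool2d(fmap)                      # 3x3 after 2x2 max pooling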
  • 31. Pretraining • Why does Pretraining work so well? (Erhan et al. 2010) • Better Generalisation (without unsupervised pretraining vs. with unsupervised pretraining) Figures from Erhan et al. 2010 19
  • 32. Pretraining Figures from Erhan et al. 2010 20
  • 33. “I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning” –Andrew Ng 21
  • 38. DL: (Every)where ? • Language Modeling (2012, Mikolov et al) 23
  • 39. DL: (Every)where ? • Language Modeling (2012, Mikolov et al) • Image Recognition (Krizhevsky won 2012 ImageNet competition) 23
  • 40. DL: (Every)where ? • Language Modeling (2012, Mikolov et al) • Image Recognition (Krizhevsky won 2012 ImageNet competition) • Sentiment Classification (2011, Socher et al) 23
  • 41. DL: (Every)where ? • Language Modeling (2012, Mikolov et al) • Image Recognition (Krizhevsky won 2012 ImageNet competition) • Sentiment Classification (2011, Socher et al) • Speech Recognition (2010, Dahl et al) 23
  • 42. DL: (Every)where ? • Language Modeling (2012, Mikolov et al) • Image Recognition (Krizhevsky won 2012 ImageNet competition) • Sentiment Classification (2011, Socher et al) • Speech Recognition (2010, Dahl et al) • MNIST hand-written digit recognition (Ciresan et al, 2010) 23
  • 44. So: Why Deep? Deep Architectures can be representationally efficient • Fewer computational units for the same function Deep Representations might allow for a hierarchy of representations • Allows non-local generalisation • Comprehensibility Multiple levels of latent variables allow combinatorial sharing of statistical strength 25
  • 45. So: Why Deep? Generalizing better to new tasks & domains Can learn good intermediate representations shared across tasks Distributed representations Unsupervised Learning Multiple levels of representation 26
  • 46. Diff Levels of Abstraction • Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity • Easier to monitor what is being learnt and to guide the machine to better subspaces • A good lower level representation can be used for many distinct tasks 27
  • 47. Generalizable Learning • Shared Low Level Representations • Multi-Task Learning • Unsupervised Training 28
  • 48. Generalizable Learning • Shared Low Level Representations • Multi-Task Learning • Unsupervised Training 28 • Partial Feature Sharing • Mixed Mode Learning • Composition of Functions
  • 49. No More Handcrafted Features ! 29
  • 50. 2. Natural Language Processing 30
  • 51. DL + NLP • Language Modeling • Bengio et al. (2000, 2003): via neural network • Mnih and Hinton (2007): via RBMs • POS, Chunking, NER, SRL • Collobert and Weston 2008 • Socher et al 2011; Socher 2014 31
  • 52. Language Modeling • Word Embeddings (Bengio et al, 2001; Bengio et al, 2003) based on the idea of distributed representations for symbols (Hinton 1986) • Neural Word embeddings (Turian et al 2010; Collobert et al. 2011) 32
  • 53. Word Embeddings • Collobert & Weston 2008; Collobert et al. 2011 • similar to word vector learning, but instead of a single scalar score it uses a Softmax/Maxent classifier; word embeddings come from a lookup table. From Collobert et al. 2011 33
  • 54. Word Embeddings • Collobert & Weston 2008; Collobert et al. 2011 • similar to word vector learning, but instead of a single scalar score it uses a Softmax/Maxent classifier Figure from Socher et al. Tutorial ACL 2012. 34
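A rough sketch of such a window-based tagger: embeddings come from a lookup table, the window's vectors are concatenated, and a hidden layer feeds a softmax over tags. All dimensions here are invented, and biases and training are omitted:

    import numpy as np

    rng = np.random.RandomState(0)
    V, d, win, n_tags = 5000, 50, 5, 10      # made-up vocabulary/embedding sizes
    lookup = rng.randn(V, d) * 0.1           # the word-embedding lookup table
    W1 = rng.randn(win * d, 100) * 0.1       # window features -> hidden layer
    W2 = rng.randn(100, n_tags) * 0.1        # hidden layer -> tag scores

    def tag_probs(window_ids):
        # window_ids: indices of the `win` words centred on the word to tag
        x = lookup[window_ids].reshape(-1)   # concatenate the window's embeddings
        h = np.tanh(x @ W1)
        logits = h @ W2
        e = np.exp(logits - logits.max())
        return e / e.sum()                   # softmax over the tag set

    probs = tag_probs(np.array([12, 430, 7, 981, 3]))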
  • 55. Figure from Socher et al. Tutorial ACL 2012. 35
  • 56. • window approach • sentence approach source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips 36
  • 57. • Multi-task learning 37 source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 58. 38 General Deep Architecture for NLP source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 59. 38 General Deep Architecture for NLP Basic features source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 60. 38 General Deep Architecture for NLP Basic features Embeddings source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 61. 38 General Deep Architecture for NLP Basic features Embeddings Convolution source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 62. 38 General Deep Architecture for NLP Basic features Embeddings Convolution Max pooling source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 63. 38 General Deep Architecture for NLP Basic features Embeddings Convolution Max pooling “Supervised” learning source: Collobert & Weston, Deep Learning for Natural Language Processing. 2009 Nips
  • 64. Word Embeddings • Unsupervised Word Representations (Turian et al 2010) • evaluates Brown clusters, C&W (Collobert and Weston 2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words → Brown clusters win out by a small margin on both NER and chunking. • more info: http://metaoptimize.com/projects/wordreprs/ 39
  • 65. 40 t-SNE visualizations of word embeddings. Left: Number Region; Right: Jobs Region. From Turian et al. 2010
  • 67. Word Embeddings • Collobert & Weston 2008; Collobert et al. 2011 • Propose a unified neural network architecture for many NLP tasks: • part-of-speech tagging, chunking, named entity recognition, and semantic role labeling • no hand-made input features • learns internal representations on the basis of vast amounts of mostly unlabeled training data. 42
  • 68. Word Embeddings • Recurrent Neural Network (Mikolov et al. 2010; Mikolov et al. 2013a) W("woman") − W("man") ≃ W("aunt") − W("uncle") W("woman") − W("man") ≃ W("queen") − W("king") Figures from Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations 43
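Such analogies are typically resolved with cosine similarity in the embedding space. A small sketch, assuming E is a {word: vector} dict produced by such a model:

    import numpy as np

    def analogy(E, a, b, c):
        # return the word x maximising cos(W(b) - W(a) + W(c), W(x))
        target = E[b] - E[a] + E[c]
        best, best_sim = None, -1.0
        for word, vec in E.items():
            if word in (a, b, c):
                continue
            sim = vec @ target / (np.linalg.norm(vec) * np.linalg.norm(target) + 1e-9)
            if sim > best_sim:
                best, best_sim = word, sim
        return best

    # e.g. analogy(E, "man", "woman", "king") should ideally return "queen"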
  • 69. • Mikolov et al. 2013b Figures from Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b). Efficient Estimation of Word Representations in Vector Space 44
  • 70. Word Embeddings • Recursive (Tensor) Network (Socher et al. 2011; Socher 2014) 45
  • 79. 3. Graph-Based Approaches to DL+NLP • A) NLP “naturally encoded” • B) Genetic Finite State Machine • C) Neural net within Graph 54
  • 80. Graph-Based NLP • Graphs have a “natural affinity” with NLP [ feel free to quote me on that ;) ] • relation-oriented • index-free adjacency 55
  • 81. What's in a Graph? Figure from Buerli & Obispo (2012). 56
  • 82. What's in a Graph? • Graph Databases: Neo4j, OrientDB, InfoGrid, Titan, FlockDB, ArangoDB, InfiniteGraph, AllegroGraph, DEX, GraphBase, and HyperGraphDB • Distributed graph processing toolkits (based on MapReduce, HDFS, and custom BSP engines): Bagel, Hama, Giraph, PEGASUS, Faunus, Flink • in-memory graph packages designed for massive shared memory (NetworkX, Gephi, MTGL, Boost, uRika, and STINGER) 57
  • 83. A. NLP "naturally encoded" • i.e. graph-based opinion summarization (Ganesan et al. 2010; Ganesan 2013) 58 • Captures: • Redundancies • Gapped Subsequences • Collapsible Structures From Ganesan 2013 Natural Affinity, Say what?
  • 84. Summarization Graph 59 From Ganesan 2013
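A toy version of such a summarization graph, using NetworkX (one of the in-memory packages listed earlier): words become nodes, adjacency becomes weighted directed edges, and repeated phrasings pile weight onto shared paths, which is how redundancy gets captured. The example reviews are invented:

    import networkx as nx

    def build_word_graph(sentences):
        G = nx.DiGraph()
        for sid, sent in enumerate(sentences):
            words = sent.lower().split()
            for i, w in enumerate(words):
                G.add_node(w)
                # remember where each word occurred (sentence id, position)
                G.nodes[w].setdefault("positions", []).append((sid, i))
                if i > 0:
                    # adjacency edge; repeated phrasings add weight to it
                    if G.has_edge(words[i - 1], w):
                        G[words[i - 1]][w]["weight"] += 1
                    else:
                        G.add_edge(words[i - 1], w, weight=1)
        return G

    reviews = ["the battery life is very good",
               "battery life is good",
               "very good battery life"]
    G = build_word_graph(reviews)
    # heavily weighted paths (e.g. battery -> life) are summary candidates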
  • 85. Natural Affinity? • Demo time! 60
  • 86. B. Finite State Graph • Bastani 2014a; 2014b; 2014c • Probabilistic feature hierarchy • Grammatical inference by genetic algorithms more info: https://github.com/kbastani/graphify 61 Figure from Bastani 2014a
  • 87. Finite State Graph 62 • Bastani 2014 • training phase: all figures from Bastani 2014b
  • 88. Finite State Graph 62 • Bastani 2014 • training phase: all figures from Bastani 2014b
  • 89. Finite State Graph 62 • Bastani 2014 • training phase: all figures from Bastani 2014b
  • 90. Finite State Graph 62 • Bastani 2014 • training phase: all figures from Bastani 2014b
  • 91. • sentiment analysis • error: 0.3 Figure from Bastani 2014c 63
  • 92. Conceptual Hierarchical Graph • Demo time! 64
  • 93. C. Factor Graph • Factor graph in which the factors themselves contain a deep neural net. • Factor graph: • bipartite graph representing the factorization of a function (Kschischang et al. 2001; Frey 2002) • can combine Bayesian networks (BNs) and Markov random fields (MRFs). Figure from Frey 2002 65
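As a worked toy example of that factorization: a function of three variables that splits into two factors yields a bipartite graph with variable nodes x1, x2, x3 and factor nodes f1, f2, and sum-product messages (Kschischang et al. 2001) flow along its edges. In LaTeX:

    g(x_1, x_2, x_3) = f_1(x_1, x_2)\, f_2(x_2, x_3)

    \mu_{f_1 \to x_2}(x_2) = \sum_{x_1} f_1(x_1, x_2)\, \mu_{x_1 \to f_1}(x_1),
    \qquad \mu_{x_1 \to f_1}(x_1) = 1 \ \text{(leaf variable)}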
  • 94. Factor Graph • Factor graph with “deep factors” (Mirowski & LeCun 2009) • Dynamic Time Series modeling 66
  • 95. Energy-Based Graph • LeCun et al. 1998, handwriting recognition system • "Graph Transformer Networks" • Instead of a normalised HMM, an energy-based factor graph (without normalization) • LeCun et al. 2006. • Energy-Based Learning 67
  • 96. And finally… What you’ve all been waiting for… 68
  • 97. And finally… What you’ve all been waiting for… Which Net is currently the Biggest ? 68
  • 98. And finally… What you’ve all been waiting for… Which Net is currently the Biggest ? 68 the Deepest
  • 99. And finally… What you’ve all been waiting for… Which Net is currently the Biggest ? 68 the Deepest The most Bad-ass ?
  • 100. Winners of: Large Scale Visual Recognition Challenge 2014 (ILSVRC2014) 19 September 2014 source: Szegedy et al. Going deeper with convolutions (GoogLeNet ), ILSVRC2014, 19 Sep 2014 69
  • 101. Winners of: Large Scale Visual Recognition Challenge 2014 (ILSVRC2014) 19 September 2014 GoogLeNet Convolution Pooling Softmax Other source: Szegedy et al. Going deeper with convolutions (GoogLeNet ), ILSVRC2014, 19 Sep 2014 69
  • 102. Winners of: Large Scale Visual Recognition Challenge 2014 (ILSVRC2014) 19 September 2014 GoogLeNet Convolution Pooling Softmax Other source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014 69
  • 103. Inception 256 480 480 512 512 512 832 832 1024 Width of inception modules ranges from 256 filters (in early modules) to 1024 in top inception modules. Can remove fully connected layers on top completely. Number of parameters is reduced to 5 million. Computational cost is increased by less than 2X compared to Krizhevsky’s network. (<1.5Bn operations/evaluation) source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014 70
  • 104. 71 Classification results on ImageNet 2012:
  Team         Year  Place  Error (top-5)  Uses external data
  SuperVision  2012  -      16.4%          no
  SuperVision  2012  1st    15.3%          ImageNet 22k
  Clarifai     2013  -      11.7%          no
  Clarifai     2013  1st    11.2%          ImageNet 22k
  MSRA         2014  3rd    7.35%          no
  VGG          2014  2nd    7.32%          no
  GoogLeNet    2014  1st    6.67%          no
  Final detection results:
  Team             Year  Place  mAP    External data                           Ensemble  Contextual model  Approach
  UvA-Euvision     2013  1st    22.6%  none                                    ?         yes               Fisher vectors
  Deep Insight     2014  3rd    40.5%  ILSVRC12 Classification + Localization  3 models  yes               ConvNet
  CUHK DeepID-Net  2014  2nd    40.7%  ILSVRC12 Classification + Localization  ?         no                ConvNet
  GoogLeNet        2014  1st    43.9%  ILSVRC12 Classification                 6 models  no                ConvNet
  source: Szegedy et al. Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014
  • 105. Wanna Play? • cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580) https://code.google.com/p/cuda-convnet2/ • Caffe (Berkeley) (CUDA/OpenCL, Theano, Python) http://caffe.berkeleyvision.org/ • OverFeat (NYU) http://cilvr.nyu.edu/doku.php?id=code:start 72
  • 106. Wanna Play? • Theano - CPU/GPU symbolic expression compiler in Python (from the LISA lab at University of Montreal). http://deeplearning.net/software/theano/ • Pylearn2 - a library designed to make machine learning research easy. http://deeplearning.net/software/pylearn2/ • Torch - provides a Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/ • more info: http://deeplearning.net/software_links/ (slide partially stolen from: J. Sullivan, Convolutional Neural Networks & Computer Vision, Machine Learning meetup at Spotify, Stockholm, June 9 2014) 73
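For a taste of the Theano listed above, a minimal symbolic logistic-regression training step, written against the Theano 0.x-era API current when this deck was given (data and sizes are made up):

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.matrix("x")                      # symbolic inputs (floatX, here float64)
    y = T.vector("y")                      # symbolic binary targets
    w = theano.shared(np.zeros(3), "w")    # learned weights

    p = T.nnet.sigmoid(T.dot(x, w))
    loss = T.nnet.binary_crossentropy(p, y).mean()
    gw = T.grad(loss, w)                   # symbolic gradient
    train = theano.function([x, y], loss, updates=[(w, w - 0.1 * gw)])

    data = np.random.rand(20, 3)
    labels = (data.sum(axis=1) > 1.5).astype("float64")
    for _ in range(100):
        train(data, labels)                # compiled update step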
  • 107. Fin. Questions / Discussion … ? 74
  • 108. Bibliography: Definitions • Mitchell, T. M. (1997). Machine Learning (1st ed.). New York, NY, USA: McGraw-Hill, Inc. • Simon, H.A. (1983). Why should machines learn? in: Machine Learning: An Artificial Intelligence Approach, (R. Michalski, J. Carbonell, T. Mitchell, eds) Tioga Press, 25-38. 75
  • 109. Bibliography: History • Rosenblatt, Frank (1957), The Perceptron--a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory. • Minsky & Papert (1969), Perceptrons: an introduction to computational geometry. • Bryson, A.E.; W.F. Denham; S.E. Dreyfus (1963) Optimal programming problems with inequality constraints. I: Necessary conditions for extremal solutions. AIAA Journal, 1(11), 2544–2550. • Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". Nature 323 (6088): 533–536. • Boser, B. E., Guyon, I., and Vapnik, V. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 144–152. ACM Press. • Cortes, C. and Vapnik, V. (1995), Support-vector network. Machine Learning, 20:273–297. • Larochelle, H., Erhan, D., Courville, A., Bergstra, J., & Bengio, Y. (2007). An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation. In Proceedings of the 24th International Conference on Machine Learning (pp. 473–480). New York, NY, USA: ACM. • Vincent, P., Larochelle, H., & Lajoie, I. (2010), Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11, 3371–3408. 76
  • 110. Bibliography: History - CNN’s • Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position". Biological Cybernetics 36 (4): 193–202. doi:10.1007/BF00344251. PMID 7370364. Retrieved 16 November 2013. • LeCun, Yann; Léon Bottou; Yoshua Bengio; Patrick Haffner (1998). "Gradient-based learning applied to document recognition". Proceedings of the IEEE 86 (11): 2278–2324. • S. Behnke. Hierarchical Neural Networks for Image Interpretation, volume 2766 of Lecture Notes in Computer Science. Springer, 2003. • Simard, Patrice, David Steinkraus, and John C. Platt. "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis." In ICDAR, vol. 3, pp. 958-962. 2003. • Hinton, GE; Osindero, S; Teh, YW (Jul 2006). "A fast learning algorithm for deep belief nets.". Neural computation 18 (7): 1527–54. • Bengio, Yoshua; Lamblin, Pascal; Popovici, Dan; Larochelle, Hugo (2007). "Greedy Layer-Wise Training of Deep Networks". Advances in Neural Information Processing Systems: 153–160. • Ranzato, MarcAurelio; Poultney, Christopher; Chopra, Sumit; LeCun, Yann (2007). "Efficient Learning of Sparse Representations with an Energy-Based Model". Advances in Neural Information Processing Systems. 77
  • 111. Bibliography: DL • Bengio, Y., Ducharme, R., & Vincent, P. (2001). A Neural Probabilistic Language Model. In T. K. Leen & T. G. Dietterich (Eds.), Advances in Neural Information Processing Systems 13 (NIPS’00). MIT Press. • Bengio, Y., Ducharme, R., Vincent, P., & Janvin, C. (2003). A Neural Probabilistic Language Model. The Journal of Machine Learning Research, 3, 1137–1155. • Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19 • Hinton, G. E. (1986). Learning distributed representations of concepts. In Proceedings of the eighth annual conference of the cognitive science society (Vol. 1, p. 12). • Hinton, G. E. and Salakhutdinov, R. R, (2006) Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 28 July 2006. • Hinton, G. E, Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554. • Erhan, D., Bengio, Y., & Courville, A. (2010). Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 11, 625–660. 78
  • 112. Bibliography: DL • Vincent, P., Larochelle, H., Bengio, Y. and Manzagol, P. A. (2008) Extracting and composing robust features with denoising autoencoders. In ICML. • Vincent, P., Larochelle, H., & Lajoie, I. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11, 3371–3408. • Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1–127. • Krizhevsky, A., Sutskever, I. and Hinton, G. E. (2012) Imagenet classification with deep convolutional neural networks. In NIPS. • Socher, Richard, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. (2011). Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP). • Dahl, G. E., Ranzato, M. A., Mohamed, A. and Hinton, G. E. (2010) Phone recognition with the mean-covariance restricted Boltzmann machine. In NIPS. • Ciresan, D. C., Meier, U., Gambardella, L. M., & Schmidhuber, J. (2010). Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition. CoRR. • Szegedy et al. (2014) Going deeper with convolutions (GoogLeNet), ILSVRC2014, 19 Sep 2014 79
  • 113. Bibliography: NLP • Turian, J., Ratinov, L., & Bengio, Y. (2010). Word Representations: A Simple and General Method for Semi-supervised Learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 384–394). Stroudsburg, PA, USA: Association for Computational Linguistics. • Collobert, R., & Weston, J. (2008). A unified architecture for natural language processing: Deep neural networks with multitask learning. Proceedings of the 25th International Conference …. • Collobert, R., Weston, J., & Bottou, L. (2011). Natural language processing (almost) from scratch. The Journal of Machine Learning Research, 12:2493-2537. • Collobert & Weston, Deep Learning for Natural Language Processing (2009) Nips Tutorial • Mikolov, T., Yih, W., & Zweig, G. (2013a). Linguistic Regularities in Continuous Space Word Representations. HLT-NAACL, (June), 746–751. • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b). Efficient Estimation of Word Representations in Vector Space, 1–12. Computation and Language. 80
  • 114. Bibliography: NLP • Bengio, Y. and Bengio, S. (2000) Modeling high-dimensional discrete data with multi-layer neural networks. In Proceedings of NIPS 12 • Mnih, A. and Hinton, G. E. (2007) Three New Graphical Models for Statistical Language Modelling. International Conference on Machine Learning, Corvallis, Oregon. • Socher, R., Bengio, Y., & Manning, C. (2012). Deep Learning for NLP (without Magic). Tutorial Abstracts of ACL 2012. • Socher, R. (2014). Recursive Deep Learning for Natural Language Processing and Computer Vision. PhD dissertation, Stanford University. 81
  • 115. Bibliography: Graph-Based Approaches • Frey, B. (2002). Extending factor graphs so as to unify directed and undirected graphical models. Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI 03), Morgan Kaufmann, Acapulco, Mexico, 257–264. • Kschischang, F. R., Frey, B. J., & Loeliger, H.-A. (2001). Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2), 498–519. • Mirowski, P., & LeCun, Y. (2009). Dynamic factor graphs for time series modeling. Machine Learning and Knowledge Discovery. • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, November 1998. • LeCun, Y., Chopra, S., Hadsell, R., Ranzato, M. A., & Huang, F. J. (2006). A Tutorial on Energy-Based Learning, 1–59. 82
  • 116. Bibliography: Graph-Based Approaches • Buerli, M., & Obispo, C. (2012). The current state of graph databases. Department of Computer Science, Cal Poly San Luis Obispo • Ganesan, K., Zhai, C., & Han, J. (2010). Opinosis: a graph-based approach to abstractive summarization of highly redundant opinions. Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), (August), 340–348. • Ganesan, K. (2013). Opinion Driven Decision Support System. PhD Dissertation, University of Illinois. • Bastani, K. 2014a, Hierarchical Pattern Recognition, Blog: Meaning Of, June 17, 2014 • Bastani, K. 2014b, Using a Graph Database for Deep Learning Text Classification, Blog: Meaning Of, August 26, 2014 • Bastani, K. 2014c, Deep Learning Sentiment Analysis for Movie Reviews using Neo4j, Blog: Meaning Of, September 15, 2014 83