Faces in Places: compound query retrieval
Y. Zhong, R. Arandjelovic and A. Zisserman
BMVC 2016
Slides by Eva Mohedano and Andrea Calafell
UPC Computer Vision Reading Group (14/10/2016)
Outline
1. Introduction
2. Hybrid Network
3. The “Celebrity in Places” Dataset
4. Synthetic Training Images
5. Experiments and Results
6. Summary
Introduction
[Diagram: Compound query, System, Large Image Dataset]
Three contributions:
1. A hybrid CNN that produces place descriptors that are aware of faces and their descriptors.
2. A collected and annotated dataset of real images containing celebrities in different places.
3. An image synthesis system that renders high-quality, fully labelled face-and-place images to train the network.
Basic Approach
Hybrid Network
● AlexNet pre-trained on Places205 (FC7)
● VGG-16 trained on the VGG Face Dataset (FC7)
The “Celebrity in Places” Dataset
[Figure: example images from the CIP dataset]
Includes:
● 4611 celebrities
● 16 places
Collection pipeline:
Query texts in Google Image Search (2.5M images) → Duplicate removal (170K images) → Mechanical Turk annotation (38K images)
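As a rough illustration of how strongly each stage of the collection pipeline filters the data, the sketch below computes the fraction of images kept at each step. The counts are the rounded figures from the slide, so the results are approximate; the names `STAGES`, `retention_rates` and `overall_retention` are made up for this example.

```python
# Approximate image counts read off the slide (rounded, not exact).
STAGES = [
    ("Google Image Search results", 2_500_000),
    ("after duplicate removal", 170_000),
    ("after Mechanical Turk annotation", 38_000),
]


def retention_rates(stages):
    """Return (stage name, fraction kept relative to the previous stage)."""
    rates = []
    for (_, before), (name, after) in zip(stages, stages[1:]):
        rates.append((name, after / before))
    return rates


def overall_retention(stages):
    """Fraction of the initially downloaded images that survive every stage."""
    return stages[-1][1] / stages[0][1]


if __name__ == "__main__":
    for name, rate in retention_rates(STAGES):
        print(f"{name}: {rate:.1%} kept")
    print(f"overall: {overall_retention(STAGES):.2%} kept")
    # 4611 is the number of celebrities stated on the slide.
    print(f"mean images per celebrity: {38_000 / 4611:.1f}")
```

With these rounded counts, roughly 7% of the downloaded images survive duplicate removal and about 1.5% survive the whole pipeline, which works out to about eight images per celebrity on average.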
Problems with this approach
● Difficult to obtain high-quality images with image search engines
● Obtained images are highly unbalanced across classes
Synthetic Training Images
178k training images
8.7k validation images
Includes:
● 500 faces
● 16 places
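The slides do not describe how the synthetic images are rendered, so the following is only a minimal sketch of why synthesis gives fully labelled data: when a face cutout is pasted into a place image, both labels are known by construction. Plain alpha compositing is used here as a stand-in and is not claimed to be the authors' rendering method. Images are small grey-scale grids (lists of lists), and `composite`, `synthesize` and the label dictionary are hypothetical names.

```python
import random


def composite(background, face, mask, top, left):
    """Alpha-blend `face` onto a copy of `background` at (top, left).

    `mask` holds per-pixel weights in [0, 1]; 1 keeps the face pixel,
    0 keeps the background pixel.
    """
    out = [row[:] for row in background]
    for i, (face_row, mask_row) in enumerate(zip(face, mask)):
        for j, (f, m) in enumerate(zip(face_row, mask_row)):
            y, x = top + i, left + j
            if 0 <= y < len(out) and 0 <= x < len(out[0]):
                out[y][x] = m * f + (1.0 - m) * out[y][x]
    return out


def synthesize(faces, places, rng):
    """Draw one (face, place) pair and return the image with its labels.

    `faces` maps an identity to a (face pixels, mask) pair and `places`
    maps a place name to a background image (assumed layout).
    """
    face_id = rng.choice(sorted(faces))
    place_id = rng.choice(sorted(places))
    face, mask = faces[face_id]
    background = places[place_id]
    top = rng.randint(0, len(background) - len(face))
    left = rng.randint(0, len(background[0]) - len(face[0]))
    image = composite(background, face, mask, top, left)
    labels = {
        "face": face_id,
        "place": place_id,
        "box": (top, left, len(face), len(face[0])),
    }
    return image, labels
```

Calling `synthesize` repeatedly with a seeded `random.Random` would yield a reproducible stream of labelled images; no manual annotation step is needed because the labels come from the sampling itself.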
Experiments and Results
Comparison with 3 late-fusion baselines:
● FC7 VGG faces + FC7 Places205 + L2 norm
● FC7 VGG faces + FC7 Places205 fine-tuned on 16 places + L2 norm
● FC7 VGG faces + FC7 Places205 fine-tuned on 16 places + Platt
[Table: test set statistics]
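To make the late-fusion idea concrete, here is a minimal sketch in which face and place descriptors are scored independently and the two scores are combined only at the end. It assumes that "L2 norm" means L2-normalising each descriptor before a dot product and that "Platt" refers to Platt scaling (a sigmoid that maps raw scores to probabilities). The combination rules (sum of similarities, product of calibrated scores) and the default sigmoid parameters are illustrative assumptions, not details taken from the slides.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def platt(score, a=-1.0, b=0.0):
    """Platt scaling: p = 1 / (1 + exp(a * score + b)).

    In practice a and b are fitted on held-out data; the defaults here
    are placeholders for illustration only.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def late_fusion_score(q_face, q_place, db_face, db_place, calibrate=False):
    """Score one database image against a compound (face, place) query."""
    s_face = dot(l2_normalize(q_face), l2_normalize(db_face))
    s_place = dot(l2_normalize(q_place), l2_normalize(db_place))
    if calibrate:
        # Assumed rule: multiply the calibrated probabilities.
        return platt(s_face) * platt(s_place)
    # Assumed rule: add the two cosine similarities.
    return s_face + s_place


def rank(query, database, calibrate=False):
    """Return database image names sorted from best to worst match."""
    q_face, q_place = query
    scored = [
        (late_fusion_score(q_face, q_place, f, p, calibrate), name)
        for name, (f, p) in database.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]
```

In a real system the descriptors would be the FC7 activations of the two networks. The key property of late fusion is that neither descriptor is influenced by the other, which is the limitation the hybrid network is meant to address.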
Summary
● They have presented a hybrid network for compound queries, where place descriptors are aware of faces and face descriptors. This network outperforms the baselines.
● They have designed an automatic pipeline to synthesize training images.
● They have collected a new dataset of real images to evaluate their methods.
Questions?
20

Contenu connexe

En vedette

Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Universitat Politècnica de Catalunya
 
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...Universitat Politècnica de Catalunya
 
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...Francisco Zamora-Martinez
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Universitat Politècnica de Catalunya
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_choHyungjoo Cho
 
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowLearning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowAltoros
 
High level-api in tensorflow
High level-api in tensorflowHigh level-api in tensorflow
High level-api in tensorflowHyungjoo Cho
 
2017 tensor flow dev summit
2017 tensor flow dev summit2017 tensor flow dev summit
2017 tensor flow dev summitTae Young Lee
 
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...Universitat Politècnica de Catalunya
 
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasGoogle Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasTaegyun Jeon
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksTaegyun Jeon
 
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트Hyunjoo Kate Lee
 
Visual Information Retrieval: Advances, Challenges and Opportunities
Visual Information Retrieval: Advances, Challenges and OpportunitiesVisual Information Retrieval: Advances, Challenges and Opportunities
Visual Information Retrieval: Advances, Challenges and OpportunitiesOge Marques
 
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...Universitat Politècnica de Catalunya
 
Deep learning and feature extraction for time series forecasting
Deep learning and feature extraction for time series forecastingDeep learning and feature extraction for time series forecasting
Deep learning and feature extraction for time series forecastingPavel Filonov
 

En vedette (18)

Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
Advanced Deep Architectures (D2L6 Deep Learning for Speech and Language UPC 2...
 
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
Speech Recognition with Deep Neural Networks (D3L2 Deep Learning for Speech a...
 
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
Language Model (D3L1 Deep Learning for Speech and Language UPC 2017)
 
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
Speaker ID II (D4L1 Deep Learning for Speech and Language UPC 2017)
 
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model (UP...
 
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
Time-series forecasting of indoor temperature using pre-trained Deep Neural N...
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
 
Image segmentation hj_cho
Image segmentation hj_choImage segmentation hj_cho
Image segmentation hj_cho
 
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlowLearning Financial Market Data with Recurrent Autoencoders and TensorFlow
Learning Financial Market Data with Recurrent Autoencoders and TensorFlow
 
High level-api in tensorflow
High level-api in tensorflowHigh level-api in tensorflow
High level-api in tensorflow
 
2017 tensor flow dev summit
2017 tensor flow dev summit2017 tensor flow dev summit
2017 tensor flow dev summit
 
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
Skin Lesion Detection from Dermoscopic Images using Convolutional Neural Netw...
 
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & KerasGoogle Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
Google Dev Summit Extended Seoul - TensorFlow: Tensorboard & Keras
 
Electricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural NetworksElectricity price forecasting with Recurrent Neural Networks
Electricity price forecasting with Recurrent Neural Networks
 
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트
중국 IT의 현재: 디자이너 시선으로보는 알리바바와 텐센트
 
Visual Information Retrieval: Advances, Challenges and Opportunities
Visual Information Retrieval: Advances, Challenges and OpportunitiesVisual Information Retrieval: Advances, Challenges and Opportunities
Visual Information Retrieval: Advances, Challenges and Opportunities
 
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...
Visual Translation Embedding Network for Visual Relation Detection (UPC Readi...
 
Deep learning and feature extraction for time series forecasting
Deep learning and feature extraction for time series forecastingDeep learning and feature extraction for time series forecasting
Deep learning and feature extraction for time series forecasting
 

Similaire à Faces in Places: Compound Query Retrieval

Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecognIlyas CHAOUA
 
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...YutaSuzuki27
 
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphUsing Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphRui Wang
 
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET Journal
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative AttributesVikas Jain
 
Realtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningRealtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningIJECEIAES
 
Graduation project Book (Self-Driving Car)
Graduation project Book (Self-Driving Car)Graduation project Book (Self-Driving Car)
Graduation project Book (Self-Driving Car)ahmedshehata133
 
IRJET- Prediction of Facial Attribute without Landmark Information
IRJET-  	  Prediction of Facial Attribute without Landmark InformationIRJET-  	  Prediction of Facial Attribute without Landmark Information
IRJET- Prediction of Facial Attribute without Landmark InformationIRJET Journal
 
Principles of Data Visualization
Principles of Data VisualizationPrinciples of Data Visualization
Principles of Data VisualizationEamonn Maguire
 
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...multimediaeval
 
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...grssieee
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)[unofficial] Pyramid Scene Parsing Network (CVPR 2017)
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)Shunta Saito
 
EE-2018-1303261-1.pdf
EE-2018-1303261-1.pdfEE-2018-1303261-1.pdf
EE-2018-1303261-1.pdfUmarDrazKhan2
 
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWFACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWIRJET Journal
 
Extraction of Buildings from Satellite Images
Extraction of Buildings from Satellite ImagesExtraction of Buildings from Satellite Images
Extraction of Buildings from Satellite ImagesAkanksha Prasad
 

Similaire à Faces in Places: Compound Query Retrieval (20)

Report face recognition : ArganRecogn
Report face recognition :  ArganRecognReport face recognition :  ArganRecogn
Report face recognition : ArganRecogn
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
Transfer Learning Model for Image Segmentation by Integrating U-NetPlusPlus a...
 
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed GraphUsing Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
Using Set Cover to Optimize a Large-Scale Low Latency Distributed Graph
 
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
IRJET- A Review on Data Dependent Label Distribution Learning for Age Estimat...
 
Learning with Relative Attributes
Learning with Relative AttributesLearning with Relative Attributes
Learning with Relative Attributes
 
ObjectDetection.pptx
ObjectDetection.pptxObjectDetection.pptx
ObjectDetection.pptx
 
Realtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learningRealtime face matching and gender prediction based on deep learning
Realtime face matching and gender prediction based on deep learning
 
Graduation project Book (Self-Driving Car)
Graduation project Book (Self-Driving Car)Graduation project Book (Self-Driving Car)
Graduation project Book (Self-Driving Car)
 
IRJET- Prediction of Facial Attribute without Landmark Information
IRJET-  	  Prediction of Facial Attribute without Landmark InformationIRJET-  	  Prediction of Facial Attribute without Landmark Information
IRJET- Prediction of Facial Attribute without Landmark Information
 
Principles of Data Visualization
Principles of Data VisualizationPrinciples of Data Visualization
Principles of Data Visualization
 
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
MediaEval 2016 - Placing Images with Refined Language Models and Similarity S...
 
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...
VERIFICATION_&_VALIDATION_OF_A_SEMANTIC_IMAGE_TAGGING_FRAMEWORK_VIA_GENERATIO...
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
 
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)[unofficial] Pyramid Scene Parsing Network (CVPR 2017)
[unofficial] Pyramid Scene Parsing Network (CVPR 2017)
 
EE-2018-1303261-1.pdf
EE-2018-1303261-1.pdfEE-2018-1303261-1.pdf
EE-2018-1303261-1.pdf
 
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEWFACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
FACE PHOTO-SKETCH RECOGNITION USING DEEP LEARNING TECHNIQUES - A REVIEW
 
Comparison of Rendering Processes on 3D Model
Comparison of Rendering Processes on 3D ModelComparison of Rendering Processes on 3D Model
Comparison of Rendering Processes on 3D Model
 
Extraction of Buildings from Satellite Images
Extraction of Buildings from Satellite ImagesExtraction of Buildings from Satellite Images
Extraction of Buildings from Satellite Images
 

Plus de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Plus de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 


Faces in Places: Compound Query Retrieval

  • 1. Faces in Places: compound query retrieval. Y. Zhong, R. Arandjelovic and A. Zisserman, BMVC 2016 (Paper Link). Slides by Eva Mohedano and Andrea Calafell [GDoc], UPC Computer Vision Reading Group (14/10/2016).
  • 2. Outline: 1. Introduction; 2. Hybrid Network; 3. The “Celebrity in Places” Dataset; 4. Synthetic Training Images; 5. Experiments and Results; 6. Summary.
  • 4. Introduction. Three contributions: (1) a hybrid CNN that produces place descriptors that are aware of faces and their descriptors; (2) a collected and annotated dataset of real images containing celebrities in different places; (3) an image synthesis system that renders high-quality, fully labelled face-and-place images to train the network.
  • 8. Hybrid Network. FC7 features from AlexNet pre-trained on Places205, and FC7 features from VGG-16 trained on the VGG Face Dataset.
  • 10. The “Celebrity in Places” Dataset. Example images from the CIP dataset. Includes 4611 celebrities and 16 places. Collection pipeline: query texts in Google Image Search (2.5M images) → duplicate removal (170K images) → Mechanical Turk annotation (38K images).
  • 11. The “Celebrity in Places” Dataset. Problems with this approach: it is difficult to obtain high-quality images with image search engines, and the obtained images are highly unbalanced across classes.
  • 14. Synthetic Training Images. 178k training images and 8.7k validation images. Includes 500 faces and 16 places.
  • 16. Experiments and Results. Comparison with 3 late-fusion baselines: (1) FC7 VGG faces + FC7 Places205 + L2 norm; (2) FC7 VGG faces + FC7 Places205 fine-tuned on 16 places + L2 norm; (3) FC7 VGG faces + FC7 Places205 fine-tuned on 16 places + Platt. Test set statistics.
  • 19. Summary. The authors presented a hybrid network for compound queries, in which place descriptors are aware of faces and face descriptors; this network outperforms the baselines. They designed an automatic pipeline to synthesize training images, and they collected a new dataset of real images to evaluate their methods.
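
The first late-fusion baseline on slide 16 (FC7 VGG faces + FC7 Places205 + L2 norm) might look roughly like the sketch below. The slides do not say how the two scores are combined, so summing the face and place cosine similarities with equal weight is an assumption, as are the function names `l2_normalize` and `late_fusion_rank` and the toy 3-dimensional vectors that stand in for 4096-dimensional FC7 features.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_fusion_rank(query_face, query_place, database):
    """Rank database images by the sum of face and place cosine similarities.

    `database` is a list of (image_id, face_descriptor, place_descriptor) tuples.
    Equal-weight summation of the two scores is an assumption of this sketch.
    """
    qf = l2_normalize(query_face)
    qp = l2_normalize(query_place)
    scored = []
    for image_id, face, place in database:
        # Each modality is scored independently, then fused at the score level.
        score = dot(qf, l2_normalize(face)) + dot(qp, l2_normalize(place))
        scored.append((image_id, score))
    # Highest fused score first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Hypothetical toy descriptors standing in for real FC7 features.
database = [
    ("right_face_wrong_place", [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
    ("wrong_face_wrong_place", [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]),
    ("right_face_right_place", [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
]
ranking = late_fusion_rank([2.0, 0.0, 0.0], [0.0, 3.0, 0.0], database)
for image_id, score in ranking:
    print(image_id, round(score, 3))
```

Because each descriptor is L2-normalized before scoring, neither modality dominates the fused score simply because its raw FC7 activations have a larger magnitude. The face score is still computed without any knowledge of the place, and vice versa, which is the limitation the hybrid network is meant to address.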