6. • Advantages of AI approaches:
• Robust to dynamic objects
• Robust to changes in illumination
• Learning from limited datasets
• Selected Approaches
• PoseNet
• LSTM-Pose
• VLocNet++
• DSAC++
Brachmann et al.: Expert Sample Consensus Applied to Camera Re-Localization
Kendall et al.: PoseNet
Camera Pose estimation/Localisation/Tracking
7. • Requirements
• 3D Model of environment available
• Image data of environment available
• Challenges
• Enough data?
• Accuracy?
• What happens in case of a wrong estimation? Re-localisation?
• Speed issues
Camera Pose estimation/Localisation/Tracking
Brachmann et al.: Expert Sample Consensus Applied to Camera Re-Localization
Kendall et al.: PoseNet
10. • Requirements
• 3D Model of environment available
• Image data of environment available
• Challenges
• Enough data? What needs to be covered by the training dataset?
• Accuracy?
• What happens in case of a wrong estimation? Re-localisation?
• Speed issues
• Hardware requirements
Camera Pose estimation/Localisation/Tracking
11. • Accuracy:
• PoseNet's accuracy is within the meter range
• Training effort
• DSAC++: Training takes 1-2 days per training stage and scene on a Tesla K80 GPU
• VLocNet++ and DSAC++: The entire target area should be included in the training process.
• Processing times:
• PoseNet: 5 ms
• DSAC++: 200 ms per image on a Tesla K80 GPU
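The processing times above can be turned into a rough frame-rate budget. The following is a minimal sketch only: the 30 fps camera rate is an assumption for illustration and does not come from the slides, and the comparison is approximate because the hardware behind the PoseNet figure is not stated.

```python
# Per-image processing times quoted on the slide, in milliseconds.
PROCESSING_MS = {"PoseNet": 5.0, "DSAC++": 200.0}


def max_fps(processing_ms: float) -> float:
    """Upper bound on frames per second if images are processed one after another."""
    return 1000.0 / processing_ms


def fits_frame_budget(processing_ms: float, camera_fps: float) -> bool:
    """True if one image can be processed within a single camera frame interval."""
    return processing_ms <= 1000.0 / camera_fps


if __name__ == "__main__":
    camera_fps = 30.0  # assumed camera frame rate, not taken from the slides
    for name, ms in PROCESSING_MS.items():
        print(
            f"{name}: at most {max_fps(ms):.0f} fps, "
            f"fits a {camera_fps:.0f} fps budget: {fits_frame_budget(ms, camera_fps)}"
        )
```

Under these assumptions, 5 ms per image leaves room for per-frame use, while 200 ms per image corresponds to at most 5 images per second.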
Camera Pose estimation/Localisation/Tracking
13. • Requirements fulfilled?
• Only 2.5D model of environment available
• Image data of environment available
• Challenges
• Accuracy? Previous work by Armagan et al. achieved an accuracy of 3 meters by combining semantic segmentation and a CNN
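Accuracy figures such as "3 meters" are commonly reported as a position error, often together with an orientation error. The sketch below shows one usual way to compute both. It assumes positions in meters and unit quaternions in (w, x, y, z) order; the function names are hypothetical and are not taken from any of the cited works.

```python
import math


def translation_error(p_est, p_gt):
    """Euclidean distance between estimated and ground-truth camera positions."""
    return math.dist(p_est, p_gt)


def rotation_error_deg(q_est, q_gt):
    """Angle in degrees between two unit quaternions given as (w, x, y, z)."""
    # abs() treats q and -q as the same rotation.
    dot = abs(sum(a * b for a, b in zip(q_est, q_gt)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    # Rotation of 90 degrees about the z axis.
    quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
    print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(rotation_error_deg(identity, quarter_turn_z))
```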
Camera Pose estimation/Localisation/Tracking
15. Object detection
• Advantages of AI approaches:
• No additional sensors required
• Using more complex features
• More accurate
• Faster compared to classical object detection
• Using larger datasets
• Selected approaches
• YOLO x
• RCNN
• Mask RCNN
https://www.youtube.com/watch?v=s8Ui_kV9dhw
22. 3D estimation/Depth estimation
• Advantages of AI approaches:
• Dual/Monocular camera support
• Synthesising unknown information
• Selected Approaches
• MegaDepth
• DepthTransfer (Karsch et al.)
• Instant 3D Photography (Hedman/Kopf)
Soccer3D, Rematas et al.
Instant3D, Hedman and Kopf
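To use an estimated depth map for 3D reconstruction, each pixel is typically back-projected through a pinhole camera model. The sketch below illustrates that step under stated assumptions: the intrinsics (fx, fy, cx, cy) are known, depth is measured along the optical axis, and depth is in metric units. Learned monocular depth is often only defined up to scale, so the metric assumption may not hold for the approaches listed above.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with a depth value to a 3D point in camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def depth_map_to_points(depth_map, fx, fy, cx, cy):
    """Convert a depth map (list of rows) into a list of 3D points.

    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth_map):
        for u, d in enumerate(row):
            if d > 0:
                points.append(backproject(u, v, d, fx, fy, cx, cy))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one invalid pixel; intrinsics are made-up values.
    depth_map = [[2.0, 2.0], [0.0, 4.0]]
    for point in depth_map_to_points(depth_map, fx=1.0, fy=1.0, cx=0.5, cy=0.5):
        print(point)
```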
24. • Requirements
• Input video data available
• Challenges
• Accuracy
• Speed requirements? Fast turnaround for replays required.
• Hardware requirements
• Covering different perspectives?
3D estimation/Depth estimation
Soccer3D, Rematas et al.
Instant3D, Hedman and Kopf
26. AI for AR
• Localisation/Pose Estimation
• Object Detection
• 3D estimation/Depth estimation
• Other topics, e.g. illumination/light estimation, inpainting
Some Opportunities for AR
• Accuracy required for AR comes with high processing times
• Possibly suitable for relocalisation
• Training requires target area to be covered
27. AI for AR
• Localisation/Pose Estimation
• Object Detection
• 3D estimation/Depth estimation
• Other topics, e.g. illumination/light estimation, inpainting
Some Opportunities for AR
• Speed still an issue when dealing with high frame rate cameras
• Temporal consistency? Tracking over multiple frames often not given
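Per-frame detectors do not by themselves link a detection in one frame to the same object in the next. One simple way to add some temporal consistency is to associate boxes across consecutive frames by their overlap. The sketch below is a greedy intersection-over-union (IoU) matcher written for illustration; it is not part of YOLO, RCNN or Mask RCNN, and the 0.3 threshold is an arbitrary assumption.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Greedily match current boxes to previous boxes.

    Returns a dict mapping index in curr_boxes to index in prev_boxes.
    Current boxes without a sufficiently overlapping partner stay unmatched.
    """
    matches = {}
    used = set()
    for ci, cb in enumerate(curr_boxes):
        best, best_iou = None, min_iou
        for pi, pb in enumerate(prev_boxes):
            if pi in used:
                continue
            score = iou(cb, pb)
            if score >= best_iou:
                best, best_iou = pi, score
        if best is not None:
            matches[ci] = best
            used.add(best)
    return matches


if __name__ == "__main__":
    prev_frame = [(0, 0, 10, 10), (50, 50, 60, 60)]
    curr_frame = [(51, 50, 61, 60), (1, 0, 11, 10)]
    print(associate(prev_frame, curr_frame))
```

A greedy matcher like this is order-dependent and can fail with fast motion or crowded scenes, which is one reason tracking over multiple frames remains an open issue.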
28. AI for AR
• Localisation/Pose Estimation
• Object Detection
• 3D estimation/Depth estimation
• Other topics, e.g. illumination/light estimation, inpainting
Some Opportunities for AR
• What about difficult user positions?
• Training dataset needs to be very specific
• Will not run on-device but on a server infrastructure
29. What are the benefits for AI?
Benefits for AI
Dataset Collection Human in the loop Additional Interfaces
• Trust
• Reassurances
• Data improvement
• More potential users
• More data
• Continuously/During runtime
30. THANK YOU
1. Kán P, Kaufmann H. DeepLight: light source estimation for augmented reality using deep learning. Visual
Computer. 2019.
2. Brachmann E, Rother C. Expert Sample Consensus Applied to Camera Re-Localization. 2019.
3. Karsch K, Liu C, Bing Kang S. Depth transfer: Depth extraction from videos using nonparametric
sampling. In: Dense Image Correspondences for Computer Vision.; 2015:173–205.
4. Radwan N, Valada A, Burgard W. VLocNet++: Deep Multitask Learning for Semantic Visual Localization
and Odometry. IEEE Robotics and Automation Letters. 2018.
5. Hedman P, Kopf J. Instant 3D photography. ACM Transactions on Graphics. 2018.
6. He K, Gkioxari G, Dollar P, Girshick R. Mask R-CNN. In: Proceedings of the IEEE International Conference
on Computer Vision.; 2017.
7. Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In:
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.; 2016.
8. Rematas K, Kemelmacher-Shlizerman I, Curless B, Seitz S. Soccer on Your Tabletop. In: Proceedings of
the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.; 2018.