3. Definition of Artificial Intelligence
• A machine (mechanical? biological?) that performs tasks as humans do, or the field of study that tries to build one
–No clear consensus.
–Here is a definition of A.I. from the researcher who first coined the term.
8. Machine Learning
• “Field of study that gives computers the ability to learn without being explicitly programmed” – Arthur Samuel
9. Toward Human-level Recognition Performance
• Deep Learning is Driving Recent Major Breakthroughs in Visual and Speech Recognition Tasks
10. Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
–AlphaGo 5:0 vs Fan Hui (Oct. 2015)
–AlphaGo 4:1 vs Lee Sedol (Mar. 2016)
11. Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
TPU Server
used against Lee Sedol
TPU Board
used against Ke Jie
12. Beyond Human-level Performance
• Now, Machines Beat Humans at Tasks Once Considered Impossible
Libratus (Jan 30, 2017); DeepStack (Science, Mar 02, 2017)
14. Explaining Deep Learning in One Sentence
You could think of Deep Learning as the building of
learning machines, say pattern recognition systems or
whatever, by assembling lots of modules or elements
that all train the same way.
IEEE Spectrum, Feb. 2015
Deep learning is a branch of machine learning based
on a set of algorithms that attempt to model high level
abstractions in data by using a deep graph with
multiple processing layers, composed of multiple linear
and non-linear transformations.
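As a minimal illustration of that definition (not from the original slides): a deep network really is just alternating linear and non-linear transformations, stacked into "multiple processing layers". A NumPy sketch with made-up sizes:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Element-wise non-linear transformation
    return np.maximum(x, 0.0)

def deep_forward(x, weights):
    # A "deep graph with multiple processing layers": alternate
    # linear maps (matrix multiplies) with non-linearities.
    for W in weights[:-1]:
        x = relu(x @ W)
    return x @ weights[-1]  # final linear layer (e.g. class logits)

# Three stacked layers mapping an 8-d input to a 2-d output
weights = [rng.normal(size=(8, 16)),
           rng.normal(size=(16, 16)),
           rng.normal(size=(16, 2))]
y = deep_forward(rng.normal(size=(4, 8)), weights)
print(y.shape)  # (4, 2)
```

Each `W` would be learned from data; "all modules train the same way" because the same gradient rule updates every layer.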
21. Feature Engineering vs Feature Learning
From Yann LeCun
Knowledge-driven Feature Engineering Data-driven Feature Learning
• Feature Learning instead of Conventional Feature Engineering Removes Barriers for Multi-modal
Studies and Data-driven Approaches in Medical Data Analysis
22. Feature Engineering vs Feature Learning
• Clinically-defined Features vs Data-driven Features for DILD Quantification in Chest CT
–Learned CNN features improve the classification of lung patches into 6 subtypes of DILD by a significant margin.
–Learned features are more robust in inter-scanner settings, where images are collected from different institutions or scanners.
–Presented at RSNA 2015
23. Feature Engineering vs Feature Learning
• Visualization of Hand-crafted Feature vs Learned Feature in 2D
24. Feature Engineering vs Feature Learning
• Clinically-defined Features vs Data-driven Features for Early Prediction of Arrhythmia using RNN
–The existing method uses multi-level feature extraction after ectopic-beat removal.
–By replacing the hand-crafted feature extraction steps with data-driven feature learning, prediction accuracy is improved by a significant margin.
25. Toward Fully Data-driven Medicine
• End-to-end Data-driven Workflow for Medical Research
http://tcr.amegroups.com/article/view/8705/html
End-to-end
26. Deep Learning for Medicine, Why Now?
Big Data Computational Power Algorithm
SPIE, 1993
Med. Phys. 1995
27. A.I. Medicine in Tech Keynotes
“So imagine that, soon every doctor
around the world just gonna have the
ability to snap a photo and as well as
the best doctors in the world be able
to diagnose your cancer. That’s gonna
save lives !”
- Mark Zuckerberg at F8 2016
“If there is one application where a lot
of very complicated, messy and
unstructured data is available, it is in the
field of medicine. And what better
application for deep learning than to
improve our health, improve life?”
- Jen-Hsun Huang, GTC 2016
Nvidia GTC, March 2016; Facebook F8, April 2016; Google I/O, May 2016
“It’s very very difficult to have highly
trained doctors available in many
parts of the world. Deep learning did
really good at detecting DR. We can
see the promise again, of using
machine learning.”
- Sundar Pichai, Google IO 2016
31. A.I. for Medicine in Healthcare Investment
• Increasing investment of smart money in healthcare, especially A.I.-based imaging & diagnostics
32. Medical Imaging A.I. Startups by Applications
Source: Signify Research (2017)
• Number of Medical Imaging Startups Founded and Funding Volume by Quarter (2014 to 2017)
38. Common Challenges
• Data Collection
–How many images do we need?
–What if we don’t have enough data?
–What if we don’t have enough annotations?
• Model Selection
–Do we really need ‘deep’ models?
–Are there any ‘off-the-shelf’ models?
–How can we incorporate context or prior into the models?
–Are there more trainer-friendly models?
• Result Interpretation
–Can we visually interpret the result?
–Can we obtain human-friendly interpretation?
Data
Model
Result
39. Data
- How many images do we need?
- What if we don’t have enough data?
- What if we don’t have enough annotations?
40. Data - How Much Medical Images Do We Need?
• Exploratory Study Measuring the Effect of Training Data Size on Test Performance
–Predict the necessary training data size by extrapolating the performance vs. training-size curve using nonlinear least squares.
–Not clinically meaningful, but validates the common assumption about the performance-dataset size trade-off.
J. Cho et al., arXiv, 2015
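The extrapolation idea can be sketched with SciPy's nonlinear least squares. The saturating power-law model and every number below are illustrative assumptions, not the paper's data:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical learning-curve model: accuracy saturates as a - b * n^(-c).
def learning_curve(n, a, b, c):
    return a - b * np.power(n, -c)

# Synthetic (training-size, accuracy) points standing in for measured results.
n_train = np.array([100, 200, 500, 1000, 2000, 5000], dtype=float)
acc = learning_curve(n_train, 0.95, 2.0, 0.5)

# Nonlinear least-squares fit, then extrapolate to a larger dataset size.
params, _ = curve_fit(learning_curve, n_train, acc, p0=[0.9, 1.0, 0.5], maxfev=10000)
a, b, c = params
print(round(learning_curve(50000.0, a, b, c), 3))  # ≈ 0.941, approaching a ≈ 0.95
```

The same fit, applied to real measured accuracies, predicts how much more data a target performance would need.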
41. How Much Medical Images Do We Need?
• The Effect of Training Dataset Size and Number of Annotations on Fundus Image Classification
V. Gulshan et al., JAMA, 2016
43. How Much Medical Images Do We Need?
• The Inter-observer Variability or Disagreement is Significant
V. Gulshan et al., JAMA, 2016
44. How Much Medical Images Do We Need?
• Dermatologist-level Classification of Skin Cancer
–Classification of skin cancer
–129,450 skin-lesion images comprising 2,032 different diseases are used for training, and 1,942 biopsy-labelled images for testing.
–Data are collected from the ISIC Dermoscopic Archive, the Edinburgh Dermofit Library and Stanford Hospital.
–Rotations by 0 to 359 degrees and flips are used for data augmentation.
A. Esteva et al., Nature, 2017
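The rotation-and-flip augmentation described above could be sketched as follows; this is illustrative only, with `scipy.ndimage.rotate` standing in for whatever pipeline the paper actually used:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)

def augment(image):
    # Random rotation in [0, 359] degrees plus a random horizontal flip,
    # as described for the skin-lesion training set.
    angle = int(rng.integers(0, 360))
    out = rotate(image, angle, reshape=False, order=1, mode='nearest')
    if rng.random() < 0.5:
        out = out[:, ::-1]  # horizontal flip
    return out

patch = rng.random((64, 64))
aug = augment(patch)
print(aug.shape)  # (64, 64)
```

Because lesions have no canonical orientation, every rotated or flipped copy is an equally valid training example, effectively multiplying the dataset.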
45. How Much Medical Images Do We Need?
• Detection of Cancer Metastases on Pathology Image
–Generated 299x299 patches from 270 slides, each with resolution about 10,000 x 10,000.
–Each slide contains 10,000 to 400,000 patches (median 90,000).
–But each tumor slide contains only 20 to 150,000 tumor patches (median 2,000), giving class ratios from 0.01% to 70% (median 2%).
–Careful sampling strategy to reduce bias toward slides with more patches: 1) select a class (normal or tumor), 2) select a slide uniformly at random, 3) select a patch randomly within that slide.
–To reduce class imbalance, several data augmentations are used: 1) rotations (90 degrees x 4) and horizontal flips, 2) color perturbation (brightness, saturation, hue, contrast), 3) x, y offsets of up to 8 pixels.
–In total, ~10^7 patches plus augmentation.
Y. Liu et al., 2017
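The three-step sampling strategy can be sketched as below; `slide_index` is a made-up stand-in for the real patch index:

```python
import random

random.seed(0)

# Hypothetical patch index: slide id -> {class -> list of patch ids}.
slide_index = {
    "slide_a": {"normal": list(range(9000)), "tumor": list(range(40))},
    "slide_b": {"normal": list(range(300000)), "tumor": list(range(120000))},
}

def sample_patch(index):
    # 1) pick the class first, so batches stay balanced despite the
    #    extreme per-slide class ratios (0.01% to 70% tumor);
    label = random.choice(["normal", "tumor"])
    # 2) pick a slide uniformly, so huge slides do not dominate;
    slides = [s for s, patches in index.items() if patches[label]]
    slide = random.choice(slides)
    # 3) pick a patch uniformly within that slide and class.
    return slide, label, random.choice(index[slide][label])

slide, label, patch_id = sample_patch(slide_index)
print(label in ("normal", "tumor"))  # True
```

Sampling the slide before the patch is the key design choice: a slide with 400,000 patches is no more likely to be drawn than one with 10,000.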
46. Data – What If We Don’t Have Enough Data?
• Data Augmentation for Effective Training Set Expansion
–In many cases, the data augmentation techniques used on natural images (flips, rotations, scale shifts, color shifts) do not make semantic sense for medical images.
–Physically-plausible deformations or morphological transforms can be used in limited cases.
–There are more augmentation choices for texture classification problems.
H. R. Roth et al., MICCAI, 2015
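One common example of a physically-plausible deformation is a smooth elastic warp (a standard technique, not necessarily the exact one used in the cited work); a sketch with SciPy:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

rng = np.random.default_rng(0)

def elastic_deform(image, alpha=15.0, sigma=4.0):
    # Smooth random displacement field: random per-pixel offsets blurred
    # with a Gaussian (sigma controls smoothness, alpha the magnitude).
    dx = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, image.shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing='ij')
    # Resample the image along the displaced coordinates.
    return map_coordinates(image, [ys + dy, xs + dx], order=1, mode='reflect')

img = rng.random((64, 64))
warped = elastic_deform(img)
print(warped.shape)  # (64, 64)
```

Because the displacement field is smooth, the warp mimics anatomically plausible tissue deformation rather than tearing the image apart.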
47. Data – What If We Don’t Have Enough Data?
• Transfer Learning from Other Domains
–Compares off-the-shelf features vs. random initialization vs. initialization from transferred features.
–Initializing deeper networks with transferred features leads to better performance.
–Transferred networks with ‘deep’ fine-tuning show the best results.
–Produced better results on both lymph node detection and polyp detection than networks with random initialization.
H. Shin et al., IEEE Trans. Medical Imaging, 2016; N. Tajbakhsh et al., IEEE Trans. Medical Imaging, 2016
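A toy sketch of "shallow" fine-tuning under these ideas: the transferred layer is frozen and only the task head is trained ("deep" fine-tuning would update the transferred weights too). All sizes and data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Transferred" first-layer weights (stand-in for features pretrained on
# another domain); a randomly initialized task head for the medical task.
W_transfer = rng.normal(size=(10, 32))    # frozen during shallow fine-tuning
W_head = rng.normal(size=(32, 2)) * 0.01  # the only trainable part

x = rng.normal(size=(16, 10))             # toy "medical" inputs
y = rng.integers(0, 2, size=16)           # toy binary labels

for _ in range(300):                      # gradient steps on the head only
    h = np.maximum(x @ W_transfer, 0.0)   # frozen feature extractor
    logits = h @ W_head
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = p.copy()
    grad[np.arange(16), y] -= 1.0         # softmax cross-entropy gradient
    W_head -= 0.1 * (h.T @ grad) / 16     # W_transfer is never updated
print((logits.argmax(axis=1) == y).mean() >= 0.5)  # True: head fits the toy set
```

Deep fine-tuning would simply extend the update to `W_transfer` as well, usually with a smaller learning rate.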
48. Data – What If We Don’t Have Enough Annotations?
• Unsupervised Pre-training and Supervised Fine-tuning
–Stacked denoising auto-encoders are used for unsupervised pre-training on the input images.
–Sparse annotations are then used for supervised fine-tuning to improve prediction performance.
J. Cheng et al., Scientific Reports, 2016; H. Suk et al., MICCAI, 2013
49. Data – What If We Don’t Have Enough Annotations?
• Weakly and Semi-supervised Semantic Segmentation for Lung Disease Detection
–With very limited strong (lesion-level) labels and abundant weak diagnostic labels, a semantic segmentation network is trained.
–By sharing the feature extractor across tasks, disease classification with localized lesions can be obtained.
–But training the network was tricky: we used pre-training and semantic segmentation with a skip-connected ASPP network.
–Slight improvement of segmentation performance by exploiting the weak label (cancer).
S. Hong et al., arXiv:1512.07928, 2015
50. Data – What If We Don’t Have Enough Annotations?
• Medical Image Annotation Tool
–Provide the right tools for higher-quality annotation.
–Quality monitoring and control functionality is crucial for reducing trial and error.
51. Model
- Do we really need ‘deep’ models?
- Are there any ‘off-the-shelf’ models?
- How can we incorporate context or prior into the model?
52. Model – Do We Really Need Deep Models?
• Surpassing Human-level Performance in Medical Imaging
–Detection of diabetic retinopathy in fundoscopy
V. Gulshan et al., JAMA, 2016
sens: 96.7%, spec: 84.0%; sens: 90.7%, spec: 93.8%; AUROC: 97.4%
53. Model – Do We Really Need Deep Models?
• Surpassing Human-level Performance in Medical Imaging Diagnosis
–Classification of skin cancer
A. Esteva et al., Nature, 2017
54. Model – Do We Really Need Deep Models?
• Detection of Cancer Metastases on Pathology Image
–State-of-the-art sensitivity at 8 false positives
Y. Liu et al., 2017
55. Model – Do We Really Need Deep Models?
• Increased Performance with Deeper Networks
–Deeper models learn more discriminative features for better classification performance.
Shin et al. (2016); Jung et al. (2015)
56. Model – Are There Any Off-the-shelf Models?
• U-net for Biomedical Image Segmentation
–Winner of various image segmentation challenges
–Shows stable performance even with few annotated images
O. Ronneberger et al., 2015
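The essential U-net wiring, skip connections that concatenate encoder features into the decoder at matching resolutions, can be sketched with plain NumPy (convolutions are omitted; only the resolution and concatenation bookkeeping is shown):

```python
import numpy as np

def pool2(x):
    # 2x2 average pooling (downsampling on the contracting path)
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2,
                     x.shape[2]).mean(axis=(1, 3))

def up2(x):
    # Nearest-neighbour upsampling (expanding path)
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Toy 3-level "U": encoder features are concatenated with decoder features
# at the same resolution; these skips are what let U-net recover fine
# localization even from few annotated images.
x = np.random.default_rng(0).random((16, 16, 4))  # H x W x channels
e1 = x                                            # encoder level 1: 16 x 16
e2 = pool2(e1)                                    # encoder level 2: 8 x 8
e3 = pool2(e2)                                    # bottleneck: 4 x 4
d2 = np.concatenate([up2(e3), e2], axis=2)        # skip: 8 x 8 x 8
d1 = np.concatenate([up2(d2), e1], axis=2)        # skip: 16 x 16 x 12
print(d1.shape)  # (16, 16, 12)
```

In the real network each arrow also applies learned convolutions, but the U-shape and the skips are exactly this pattern.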
57. Model – Are There Any Off-the-shelf Models?
• V-net for Volumetric Biomedical Image Segmentation
–Extension of U-net to 3D volumetric medical images such as CT and MRI
–The input feature map of each stage is added to its output so that each stage learns a residual function
F. Milletari et al., 2016
58. Model – Are There Any Off-the-shelf Models?
• Inception-V3 Network for Surpassing Human Experts in Multiple Medical Imaging Tasks
–Detection of diabetic retinopathy
–Detection of skin cancer
–Detection of tumor in histopathology image
59. Model – How Can We Incorporate the Context Information?
• Location Sensitive CNN for the Segmentation of White Matter Hyperintensities
–Explicit Spatial Location Features
• (x, y, z) Coordinate
• in-plane distance from (left ventricle, right ventricle, brain cortex, midsagittal brain surface)
• Prior probability of WMH in that location
–Comparison of Single-Scale (SS), Multi-Scale Early Fusion (MSEF), Multi-Scale Late Fusion with Independent Weights (MSIW), and Multi-Scale Late Fusion with Weight Sharing (MSWS)
M. Ghafoorian et al., 2016
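A sketch of how explicit spatial features can be appended to learned patch features. Only the normalized-coordinate part is shown; the landmark distances and the WMH prior would require atlases and are omitted, and all sizes are illustrative:

```python
import numpy as np

def location_features(center, shape):
    # Explicit spatial features for a patch centred at `center` in a volume
    # of size `shape`: normalized (x, y, z) coordinates in [0, 1].  The paper
    # additionally uses distances to anatomical landmarks and a spatial WMH
    # prior probability, computed from registered atlases (omitted here).
    return np.array([c / (s - 1) for c, s in zip(center, shape)])

volume_shape = (181, 217, 181)  # e.g. an MNI-sized brain volume
patch_feats = np.random.default_rng(0).random(64)  # CNN features of the patch
loc = location_features((90, 108, 90), volume_shape)
combined = np.concatenate([patch_feats, loc])  # input to the dense layers
print(combined.shape)  # (67,)
```

Concatenating these few extra numbers lets the classifier condition on where in the brain the patch sits, which pure patch appearance cannot convey.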
61. Model – How Can We Incorporate the Context Information?
• DeepLung for Semantic Lung Segmentation
–A convolutional neural network is trained to semantically segment the parenchymal regions in lung HRCT.
–High-resolution feature maps with ‘atrous’ convolution layers are used to improve segmentation performance.
–Spatial context information is used to better capture the anatomical structure of the lungs and other organs.
62. Model – How Can We Incorporate the Context Information?
• DeepLung for Semantic Lung Segmentation
–Improved Segmentation Performance using Spatial Information and Hi-Res Feature Map
64. Model – How Can We Incorporate the Context Information?
• DeepLung for Semantic Lung Segmentation
–Further improving performance using a fully-connected conditional random field (NIPS 2011).
65. Model – How Can We Incorporate the Context Information?
• DeepLung for Semantic Lung Segmentation
–Clinical validation on totally unseen cases from different scanners and scan parameters: “vendor-agnostic”.
–When spatial context information is used, we get better segmentation results in the lower part of the sequence.
66. Model – Are There More Trainer-friendly Models?
• Brain Lesion Detection using Generative Adversarial Network
–Detects lesions in multi-modal brain images using a patch-wise classifier trained with a GAN.
–The generator produces fake non-lesion patches, while the discriminator distinguishes real patches from fake non-lesion patches.
–At inference, the discriminator is expected to output lower values for lesion patches than for non-lesion patches.
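The inference rule (flag patches that the discriminator scores low) can be sketched as below; the "discriminator" here is a hand-made toy scorer standing in for a trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a discriminator trained on real non-lesion patches vs.
# generated fakes: a toy scorer that outputs high values for patches
# resembling the "normal" training distribution (mean ~0.2), low otherwise.
def discriminator(patch):
    return float(np.exp(-((patch.mean() - 0.2) ** 2) / 0.01))

normal_patch = rng.normal(0.2, 0.02, size=(9, 9))  # in-distribution
lesion_patch = rng.normal(0.6, 0.02, size=(9, 9))  # out-of-distribution

# Inference: patches the discriminator scores below a threshold are
# flagged as lesions, turning the GAN into an anomaly detector.
threshold = 0.5
print(discriminator(normal_patch) > threshold,
      discriminator(lesion_patch) < threshold)  # True True
```

The appeal ("trainer-friendly") is that only normal patches are needed for training; lesions are detected as whatever the discriminator finds unfamiliar.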
69. Model – Are There More Trainer-friendly Models?
• Detection of Aggressive Prostate Cancer
–Detection of prostate cancer using semantic segmentation with a generative adversarial objective.
–Instead of the generator in the original GAN, a segmentor is used to produce pixel-level lesion detections.
–Instead of a pixel-wise cross-entropy loss, the GAN loss from the segmentor and discriminator is used for training.
71. Result
- Can we visually interpret the result?
- Can we obtain human-friendly interpretation?
72. Result – Can We Visually Interpret the Result?
• Class Activation Maps for Visualizing Salient Regions in an Image
B. Zhou et al., CVPR, 2016
(CAM examples shown for both objects and actions)
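Computing a class activation map is essentially one weighted sum once the final convolutional features and the class's fully-connected weights are available; a sketch with random stand-in tensors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Class Activation Map: weight the final conv feature maps by the
# fully-connected weights of the target class, then sum over channels.
feature_maps = rng.random((8, 7, 7))  # C x H x W, input to global average pooling
class_weights = rng.random(8)         # FC weights for one output class

cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))  # H x W
cam = (cam - cam.min()) / (cam.max() - cam.min())  # normalize for display
print(cam.shape)  # (7, 7)
```

Upsampled to the input resolution and overlaid as a heatmap, `cam` shows which regions drove the class score, which is exactly the kind of visual evidence clinicians ask for.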
73. Result – Can We Visually Interpret the Result?
• Evidence Hotspots for Lesion Visualization
–Radiological score prediction with suggestion of evidential pathological regions.
–Jointly learns multiple grading systems and produces evidence for its predictions.
–For training, disc volumes and their corresponding multiple labels are used as input, and a multi-class classification network is trained with a class-balanced loss.
–A ‘saliency map’ approach is used to produce the evidence hotspots.
A. Jamaludin et al., MICCAI, 2016
74. Result – Can We Visually Interpret the Result?
• Bone Age Assessment from Hand-bone X-ray
–Visualization of salient regions in hand-bone X-ray images
H. Lee et al., JDI, 2017
75. Result – Can We Visually Interpret the Result?
• Open-source Visualization Tool
–Picasso (https://github.com/merantix/Picasso)
77. Result – Can We Get Clinician-friendly Interpretation?
• Learning to Read Chest X-ray
–Automated X-ray annotation with a recurrent neural cascade model.
H. Shin et al., CVPR, 2016
78. Result – Generation of Realistic Medical Images
• CT image synthesis from MRI (D. Nie et al., 2016)
• Decomposition of X-ray images (S. Albarqouni et al., 2016)
79. Result – Generation of Realistic Medical Images
• Translation of Image to Image without Paired Dataset
–Unpaired image-to-image translation has great potential for medical imaging such as segmentation, registration,
decomposition, modality shift and so on.
J-Y. Zhu et al., arXiv, 2017
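The core idea that removes the need for paired data is the cycle-consistency loss F(G(x)) ≈ x. A toy 1-D sketch; the "generators" here are trivial functions standing in for networks:

```python
import numpy as np

# Cycle-consistency: with generators G: X->Y and F: Y->X and no paired
# examples, CycleGAN adds the constraint F(G(x)) ~ x (and G(F(y)) ~ y)
# so that the mappings stay mutually consistent.
def G(x):
    return 2.0 * x + 1.0       # toy X -> Y mapping

def F(y):
    return (y - 1.0) / 2.0     # toy Y -> X mapping (here exactly G's inverse)

x = np.linspace(-1, 1, 5)
cycle_loss = np.abs(F(G(x)) - x).mean()  # L1 cycle-consistency loss
print(cycle_loss)  # 0.0 (perfect inversion in this toy case)
```

In training, this loss is minimized alongside the two adversarial losses; for medical imaging it means, e.g., an MRI-to-CT mapping can be learned without ever scanning the same patient twice.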
81. Conclusion
• Deep learning-based medical image analysis has shown promising results for data-driven medicine.
• By adopting recent progress in deep learning, many challenges in data-driven medical image analysis
have been overcome.
• Deep learning has the potential to improve the accuracy and sensitivity of image analysis tools and
will accelerate innovation and new product launches.
82. Future Directions in Medical Imaging
• Further studies to incorporate clinical knowledge into data-driven models.
• More studies on the application of recent advances in unsupervised and reinforcement learning to
medical image analysis.
• Studies on higher-dimensional (3D, 4D or even higher) medical image analysis.
• However, the greatest market impact in the short-term will be from cognitive workflow solutions that
enhance radiologist productivity.
• Diagnostic decision support solutions are close to commercialization, but several market barriers need
to be overcome, e.g. regulatory clearance, legal implications and resistance from clinicians.
• A.I. will “Augment”, not “Replace” Physicians. Radiologists become “Physicians of Physicians”.