SlideShare une entreprise Scribd logo
1  sur  58
Télécharger pour lire hors ligne
Sunghyon Kyeong

Severance Biomedical Science Institute, 

Yonsei University College of Medicine
Topological Data Analysis
-Methods and Examples-
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Contents
• Brief overview of topological data analysis
• About Ayasdi 

Startup company providing solutions for data analytics
• Topological data analysis - Methods
• Applications to medical science data
• Applications to social science data
2
Brief Overview of 

Topological Data Analysis
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Machine Learning
4
•Supervised Machine Learning 

→ Classification of new input data

(LDA, bayesian, support vector machine, neural network, and so on)
•Unsupervised Machine Learning 

→ Clustering of given dataset / Community detection

(k-means clustering, modularity optimisation, ICA, PCA, and so on)
•Topological Data Analysis

→ partial clustering with allowing overlaps among clusters
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
World Interests for TDA
5
Heat map for viewers of my TDA slide at sliceshare (for 2500 viewers during 2015.2.14. - 2014.11.31.)
Data has Shape
Raw Data 

(diabetes related data)
An example
Shape has Meaning
Normal
Type II 

diabetes
Type I 

diabetes
An example
Meaning drives “Values”
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
When to use TDA?
11
• To study complex high-dimensional data

: feature selections are not required in TDA
• Extracting shapes (patterns) of data
• Insights qualitative information is needed.
• Summaries are more valuable than individual
parameter choices.
Geometric
Topology
Algebraic
Betti0: 8
Betti1: 5
Betti0: 10
Betti1: 5
Topology
Betti0: clusters
Betti1: holes
Betti2: voids
mathematically 

defined “holes” in data
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 13
Persistent homology is a spatial type of homology that is useful for data analysis.
Betti numbers, which come from computing homology, reflect the topological
properties of an object.
B
O ㅂ
q b
: the connected components : the number of holes
Algebraic Topology
Quantitive Information
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 14
Homology
homeomorphic to
homeomorphic to
, Betti2 = 1
, Betti2 = 0
Ref) Xiaojin Zhu, IJCAI 2013 presentation slide
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 15
?
Ref) Figures are obtained from Y.P. Lum et al (2013) Scientific Reports | 3: 1236
points cloud data topology
Geometric Topology
Extracting Shapes of Data
Topological Data Analysis
using Mapper
Two input functions

- filter is to collapse high-dimensional data set into a single point

- distance as a measure of distance between data points
Resolution Parameters
- Intervals, overlap, magic fudge
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 17
Filter
Filter Range
Intervals
Overlap
Interval Length
: 0.0~2.0
: 5
: 50%
: 0.4
0 0.4 0.8 1.2 1.6 2.0
0.2 0.6 1.0 1.4 1.8
NumberofNodesinEachFilterBin
0
10
20
30
40
50
Filter Metric
0 1 2single
cluster one or two cluster(s)?
: Divide point clouds into each filter bin
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 18
Filter Function
- Filter function is not necessarily linear projections on a data matrix.
- People often uses functions that depend only on the distance
function itself, such as a measure of centrality.
- Some filter functions may not produce any interesting shapes.
Ref) Figures are obtained from Y.P. Lum et al (2013) Scientific Reports | 3: 1236
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 19
Distance Function
-distance between all pairs of data points.
-both euclidean or geodesic distances could be used.
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
0
4
8
12
1 2 3 4 5
20
Distance & Clustering
• Single linkage dengrogram is used for clustering point clouds based on
distance between two nodes.
0
3
6
9
1 2 3 4 5
Magic Fudge is the number of
bins in the distribution of the
distance obtained from single
linkage dendrogram.
No. of clusters are estimated
from the number of continuous
bins having zero elements.
1 Cluster
2 Clusters
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 21
Nodes, Edges, Colors
1,2,3,

4,5
0,1,2,7
5,6,9
10
Nodes are groups of similar objects
Edges connect similar nodes
Colors let you see values of interest
8,11
3,4,8,12
12 1412,14
1311,13
indices for points cloud: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 22
Topology extraction
Filter Range
Intervals
Overlap
Magic Fudge
: 0 ~ 2
: 5
: 50%
: 5
Filter
Partial Clustering
The size of node represents
the number of point clouds
in each node.
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
23
A) C)
F1
F2
Filter
Extract
position
information
of y-axis
F3
F4
F5
F6
F7
F8
F9
F10
B)
F1
F2
F3
F4
F5
F6
F7
F8
F9
F9
F10
F10
0
10
20
30
1 2 3 4 5
9th Filter bin
1 1
1st Filter bin
0
10
20
1 2 3 4 5
11
Topology of Y-shape points cloud
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 24
interval = 5

overlap = 20%
interval = 5

overlap = 50%
interval = 5

overlap = 80%
interval = 10
overlap = 20%
interval = 10
overlap = 50%
interval = 10
overlap = 80%
interval = 15
overlap = 20%
interval = 15
overlap = 50%
interval = 15

overlap = 80%
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Mathematical details can be found at
25
• Gurjeet Singh et al. (2007), Topological methods for the analysis of
high dimensional data sets and 3D object recognition.
• Gunnar Carlsson (2009), Topology and data, Bull. Amer. Math.
Soc. 46. 255-308.
Gurjeet Singh and Gunnar Carlsson are 

co-founders of Ayasdi
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 27
US Patent by Ayasdi
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 28
Applications to 

Neurobiological/Clinical Data
International Neuroimaging Data-sharing Initiative (INDI)
- URL: http://fcon_1000.projects.nitrc.org/index.html
- Healthy control, ADHD, ASD data sets are available.
- resting state fMRI, diffusion tensor imaging,
- phenotype information such as intelligence scale and ADHD
symptom severity are available.
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Dataset : ADHD Symptoms & IQ
30
1 2 3 4 5
1 0.0 61.5 77.3 51.6 77.0
2 61.5 0.0 62.2 55.9 69.0
3 77.3 62.2 0.0 47.2 11.9
4 51.6 55.9 47.2 0.0 52.4
5 77.0 69.0 11.9 52.4 0.0
Data set:
L2-Distance (or Euclidean distance) for all pairwise subjects:
Distance Matrix:
(or Euclidean)
Filter Function:
f(x) = max d(x, y)
y 2 x
L-infinity eccentricity
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 31
Phenotypic subgroups of ADHD
AIQ-TDC

(Average IQ)
Low High
TDC

(Low IQ)
Distance function: L2-distance Filter function: L-infinity eccentricity
N of subjects = 204 Low resolution
AIQ-ADHD

(Average IQ & 

Borderline Sx )
HIQ-TDC

(High IQ)
LIQ-ADHD

(Low IQ & Ordinary Sx)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 32
30
60
90
ADHD
60
30
60
90
Inattentive
59
30
60
90
Hyper/Impulsivity
59
70
100
130
FSIQ
108
70
100
130
VIQ
109
70
100
130
PIQ
105
HIQ-TDC

(High IQ)
AIQ-TDC 

(Average IQ)
LIQ-ADHD

(Ordinary Sx)
AIQ-ADHD

(Borderline Sx)
All subjects 

(Avg. Sx & Avg. IQ)
13 (13 / 0) 21 (17 / 4) 12 (1 / 11) 19 (2 / 17) 204 (90 / 114)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 33
Functional Modular Architectures
AIQ-TDC 

(TDC with Average IQ)
LIQ-ADHD 

(ADHD with Ordinary Sx.)
AIQ-ADHD 

(ADHD with Borderline Sx.)
DMN module
Occipital module
Central module
Parietofrontal module
BG/THL module
Temporal/Limbic module
HIQ-TDC 

(TDC with High IQ)
※ Only positive weights of the functional connectivity are used for module analysis and visualised.
Applications to
Medical science data
Diabetes Subtypes
based on six quantities:
- age
- relative weight
- fasting plasma glucose
- area under the plasma glucose curve for the three hour
glucose tolerance test (OGTT)
- area under the plasma insulin curve for the OGTT
- steady state plasma glucose (SSPG) response
SSPG
Glucose area Insulin area
Ref) Eurographics Symposium on Point-Based Graphics, Singh G et al. (2007)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 36
Subtypes of diabetes
Type I 

diabetes
Type II 

diabetes
Type I 

diabetes
Type II 

diabetes
Low-resolution High-resolution
Interval: 3

Overlay: 50%
Interval: 4

Overlay: 80%
Type I: adult onset TypeII: juvenile onset
Distance function: L2-distance
Filter function: density kernel with e=130,000
Subtypes of Breast Cancer
using breast cancer microarray gene expression data set
- disease specific genomic analysis (DSGA) transformed data
Application to Biological Data,
PNAS, Monica Nicolau et al. (2010)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 38
Breast Cancer Subtype
ER: Estrogen Receptor
PNAS, Monica Nicolau et al. (2010)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Diabetes Mellitus (DM)
• Commonly referred to as diabetes, a group of metabolic disease.
• If left untreated, diabetes can cause many complications: 

cardiovascular disease, stroke, chronic kidney failure, foot ulcers, and damage to eyes.
• Diabetes is due to either the pancreas not producing enough insulin or the cells of the body
not responding properly to the insulin produced.
• Type 1 DM: results from the pancreas’s failure to produce enough insulin.
• Type 2 DM: begins with insulin resistance, a condition in which cells fail to respond to

insulin properly. heavy weight and not enough exercise are causes of T2D.
• Gestational diabetes, is the third main form and occurs when pregnant women without a
previous history of diabetes develop a high blood-sugar level.
40
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Why subgroups of type 2 DM?
• Risk factors of Type 2 DM are:

obesity, family history of diabetes, physical inactivity, ethnicity, and
advanced age.
• Type 2 DM is heterogenous complex disease affecting 

more than 29 million in American (9.3%). 2 million in Korea.
• Increasing needs for early prevention and clinical management of Type 2
DM.
41
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Methods
• High dimensional EMRs and genotype data from 11,210 individuals from Mount Sinai
Medical Center (MSMC)’s outpatient population (46% Hispanic, 32% African american, 20%
European white, and 2% others).
• Type 2 DM and non-Type 2 DM were defined by an electronic phenotyping algorithm
(eMERGE network) based on ICM-9-CM diagnosis codes, laboratory test, prescribed
medications (RxNorm), physician notes (natural language processing), and family history.
• Form of preprocessed data matrix is n patients by P medical variables. 

Medical variables included 505 clinical variable, 7097 unique ICM-9-CM codes (1 to 218 per
patients). On average, 64 clinical variables per patients. To avoid overfitting, select the
variable with at least 50% of patients who had the variables, resulting in 73 (of 505) variables
to perform the analysis.
42
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
TDA pipeline
• Distance metric: cosine distance metric was used to assess the similarity
of the data points.
• Filter metric: L-infinity centrality and principal metric singular value
decomposition (SVD1). 

L-infinity centrality is defined for each data point y to be the maximum
distance from y to any other data point in the data set. Large values of this
function correspond to points that are far from the centre of data set.
43
3889 patients 7321 patients
44
TDA with only 

2551 Type II DM
762

Subtype 1
617

Subtype 2
1096

Subtype 3
Gender is not an organising
factor in the topology
Reproducibility test (2/3 training,
1/3 test data set, 10 times)
revealed that the overall
accuracy was 96% for a subtype
classification.
45
Applications to 

Social Science Data
- Classification of (basket ball) player types
- Partial clustering of personality using temperamental traits
- Relationship among welfare, civil construction, and suicide rate
Classification of player types
based on their in-game performance such as:
- rates (per minute played) of rebounds, assists,
turnovers, steals, blocked shots, personal fouls, and
points scored (7 performance measures)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 48
Map of Players
low resolution map at 20 intervals high resolution map at 30 intervals
Points Per Game
Low High
Filter function: 

Principal and secondary SVD values
Distance function: 

Variance normalised L2-distance
Traditionally, basket ball players are
categorised into guards, forwards, and
center.
TDA versus K-means clustering
Grouping subjects into
personality types
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 50
Novelty seeking
Harm avoidance
Reward dependance
Persistence
Temperament and character inventory (TCI) scores from 40 normal subjects
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 51
the mean of points in Si
V =
2X
i=1
X
xj 2Si
||xj µi||2 S = {S1, S2}
20 30 40 50 60 70
0
10
20
30
40
50
60
70
Novelty Seeking
HarmAvoidance
Introverts
Extraverts
Centroids
20 30 40 50 60 70
Novelty Seeking
HarmAvoidance
70
60
50
40
30
20
10
0
Subject Grouping
high HA & low NS
low HA & high NS
Centroids×
B. k-means ClusteringA. Input Dataset
The goal of k-means clustering is
to minimise the within-cluster 

sum of squares.
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 52
TDA to extract personality groups
20
30
40
50
60
NS HA RD P
Group1 Group2
Distance function: L2-distance Filter function: L-infinity eccentricity
Low High
Low resolution
(5 intervals, 50% overlap)
Group 1

(n=7)
Group 2

(n=4)
HA,NS,RD,P=(32,65,58,49)
1 1
1
High resolution
(5 intervals, 70% overlap)
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 53
Welfare / civil engineering / suicide ratio
Data Download: http://newstapa.com/news/201411935
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 54
Nation’s public data analysis
Distance function: L2-distance Filter function: L-infinity eccentricity
Input data: ratio of welfare to civil engineering (2009), ratio of welfare to civil engineering (2012), Suicide rate (2012)
부산(남구)

인천(연수구, 남동구, 서구)

광주(동구)

강원(본청)
광주(북구)
전북(본청)
대구(북구)
서울(노원구)
대구(달서구)
대전(서구)
강원(홍천, 양양)
충북(단양)
전북(장수)
전남(함평)
경남(함양)
Group 1
Group 3
Group 2
Group 4
Low High
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 55
0
1
2
3
4
5
6
7
Group1 Group3Group2 Group4
Ratio of Welfare to Civil Eng. (2009) Ratio of Welfare to Civil Eng. (2012)
Group1 Group2 Group3 Group4
Suicide ratio (2012)
70
60
50
40
30
20
10
0
광주(북구)
전북(본청)
대구(북구)
서울(노원구)
대구(달서구)
대전(서구)
강원(홍천, 양양)
충북(단양)
전북(장수)
전남(함평)
경남(함양)
부산(남구)

인천(연수구, 남동구, 서구)

광주(동구)

강원(본청)
Blog posting: http://skyeong.tistory.com/136/
Welfare | Civil Eng. | Suicide ratio
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 56
Topological Data Analysis,

new weapon for 

discovering new insight from data.
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
Conclusion
• TDA can be applied to various dataset and has a coordinate free
characteristic.
• Useful for analysing non-linear dataset.
• Selection and optimisation of distance and filter metric are important
issue.
• TDA will be a powerful weapon for those who want to find a new insight
from data
• It’s possible to make a supervised machine learning system using TDA.
57
Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p
References
1. Gurgeek Singh et al., Topological Methods for the Analysis of High Dimensional Data Sets and
3D Object Recognition, Eurographics Symposium on Point-Based Graphics, 2007.
2. Gunnar Carlsson, Topology and Data, Bull. Amer. Math. Soc. 46(2):255-308, 2009.
3. Monica Nicolau et al., Topology based data analysis identifies a subgroup of breast cancers
with a unique mutational profile and excellent survival, PNAS 108(17):7265-7270, 2011.
4. P. Y. Lum et al., Extracting insights from the shapes of complex data using topology, Nature
Scientific Reports 3:1236, 2013.
5. Li Li et al., Identification of type 2 diabetes subgroups through topological analysis of patients
similarity, Science Translational Medicine 7(311):311ra174, 2015.
6. Jessica L. Nielson et al., Topological data analysis for discovery in preclinical spinal cord injury
and traumatic brain injury, Nature communications 6:8581, 2015.
7. AYASDI, a commercial software for TDA, http://www.ayasdi.com/
58

Contenu connexe

Tendances

Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsMason Porter
 
Visualization using tSNE
Visualization using tSNEVisualization using tSNE
Visualization using tSNEYan Xu
 
The neural tangent link between CNN denoisers and non-local filters
The neural tangent link between CNN denoisers and non-local filtersThe neural tangent link between CNN denoisers and non-local filters
The neural tangent link between CNN denoisers and non-local filtersJulián Tachella
 
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Praxitelis Nikolaos Kouroupetroglou
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAPJakub Bartczuk
 
Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.Rohit Kumar
 
Visualizing Data Using t-SNE
Visualizing Data Using t-SNEVisualizing Data Using t-SNE
Visualizing Data Using t-SNEDavid Khosid
 
Fuzzy c-means clustering for image segmentation
Fuzzy c-means  clustering for image segmentationFuzzy c-means  clustering for image segmentation
Fuzzy c-means clustering for image segmentationDharmesh Patel
 
ベイズ推論とシミュレーション法の基礎
ベイズ推論とシミュレーション法の基礎ベイズ推論とシミュレーション法の基礎
ベイズ推論とシミュレーション法の基礎Tomoshige Nakamura
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsShunta Saito
 
MACHINE LEARNING - GENETIC ALGORITHM
MACHINE LEARNING - GENETIC ALGORITHMMACHINE LEARNING - GENETIC ALGORITHM
MACHINE LEARNING - GENETIC ALGORITHMPuneet Kulyana
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural NetworksNatan Katz
 
[DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...
 [DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se... [DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...
[DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...Deep Learning JP
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks남주 김
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksBICA Labs
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisJaclyn Kokx
 

Tendances (20)

Topological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial SystemsTopological Data Analysis of Complex Spatial Systems
Topological Data Analysis of Complex Spatial Systems
 
Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)Fuzzy Clustering(C-means, K-means)
Fuzzy Clustering(C-means, K-means)
 
Visualization using tSNE
Visualization using tSNEVisualization using tSNE
Visualization using tSNE
 
The neural tangent link between CNN denoisers and non-local filters
The neural tangent link between CNN denoisers and non-local filtersThe neural tangent link between CNN denoisers and non-local filters
The neural tangent link between CNN denoisers and non-local filters
 
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
Presentation - Msc Thesis - Machine Learning Techniques for Short-Term Electr...
 
Dimensionality reduction with UMAP
Dimensionality reduction with UMAPDimensionality reduction with UMAP
Dimensionality reduction with UMAP
 
Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.Pattern recognition and Machine Learning.
Pattern recognition and Machine Learning.
 
Machine learning & Time Series Analysis
Machine learning & Time Series AnalysisMachine learning & Time Series Analysis
Machine learning & Time Series Analysis
 
Visualizing Data Using t-SNE
Visualizing Data Using t-SNEVisualizing Data Using t-SNE
Visualizing Data Using t-SNE
 
Fuzzy c-means clustering for image segmentation
Fuzzy c-means  clustering for image segmentationFuzzy c-means  clustering for image segmentation
Fuzzy c-means clustering for image segmentation
 
ベイズ推論とシミュレーション法の基礎
ベイズ推論とシミュレーション法の基礎ベイズ推論とシミュレーション法の基礎
ベイズ推論とシミュレーション法の基礎
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methods
 
MACHINE LEARNING - GENETIC ALGORITHM
MACHINE LEARNING - GENETIC ALGORITHMMACHINE LEARNING - GENETIC ALGORITHM
MACHINE LEARNING - GENETIC ALGORITHM
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural Networks
 
[DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...
 [DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se... [DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...
[DL輪読会]"CyCADA: Cycle-Consistent Adversarial Domain Adaptation"&"Learning Se...
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
Hierarchical Clustering
Hierarchical ClusteringHierarchical Clustering
Hierarchical Clustering
 
PhD Defense
PhD DefensePhD Defense
PhD Defense
 
Introduction to Linear Discriminant Analysis
Introduction to Linear Discriminant AnalysisIntroduction to Linear Discriminant Analysis
Introduction to Linear Discriminant Analysis
 

Similaire à Topological data analysis

Topological Data Analysis and Persistent Homology
Topological Data Analysis and Persistent HomologyTopological Data Analysis and Persistent Homology
Topological Data Analysis and Persistent HomologyCarla Melia
 
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean MetricsYuxin Su
 
Spatial analysis and modeling
Spatial analysis and modelingSpatial analysis and modeling
Spatial analysis and modelingTolasa_F
 
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...theijes
 
Rank based similarity search reducing the dimensional dependence
Rank based similarity search reducing the dimensional dependenceRank based similarity search reducing the dimensional dependence
Rank based similarity search reducing the dimensional dependenceredpel dot com
 
Data reduction techniques for high dimensional biological data
Data reduction techniques for high dimensional biological dataData reduction techniques for high dimensional biological data
Data reduction techniques for high dimensional biological dataeSAT Journals
 
Winner of EY NextWave Data Science Challenge 2019
Winner of EY NextWave Data Science Challenge 2019Winner of EY NextWave Data Science Challenge 2019
Winner of EY NextWave Data Science Challenge 2019ByungEunJeon
 
1st Place in EY Data Science Challenge
1st Place in EY Data Science Challenge 1st Place in EY Data Science Challenge
1st Place in EY Data Science Challenge Hyunju Shim
 
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...remAYDOAN3
 
Developing Competitive Strategies in Higher Education through Visual Data Mining
Developing Competitive Strategies in Higher Education through Visual Data MiningDeveloping Competitive Strategies in Higher Education through Visual Data Mining
Developing Competitive Strategies in Higher Education through Visual Data MiningGurdal Ertek
 
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...wen luo
 
A vision-based uncut crop edge detection method for automated guidance of hea...
A vision-based uncut crop edge detection method for automated guidance of hea...A vision-based uncut crop edge detection method for automated guidance of hea...
A vision-based uncut crop edge detection method for automated guidance of hea...Institute of Agricultural Machinery, NARO
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-feiTianlu Wang
 
Impact of attendance on students’ academic performance in ict related courses...
Impact of attendance on students’ academic performance in ict related courses...Impact of attendance on students’ academic performance in ict related courses...
Impact of attendance on students’ academic performance in ict related courses...Alexander Decker
 
A framework for outlier detection in
A framework for outlier detection inA framework for outlier detection in
A framework for outlier detection inijfcstjournal
 
Accurate and low complex cell histogram generation by bypass the gradient of ...
Accurate and low complex cell histogram generation by bypass the gradient of ...Accurate and low complex cell histogram generation by bypass the gradient of ...
Accurate and low complex cell histogram generation by bypass the gradient of ...Nothing!
 
Precise identification of objects in a hyperspectral image by characterizing...
Precise identification of objects in a hyperspectral image by  characterizing...Precise identification of objects in a hyperspectral image by  characterizing...
Precise identification of objects in a hyperspectral image by characterizing...IJECEIAES
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsaftab alam
 

Similaire à Topological data analysis (20)

Topological Data Analysis and Persistent Homology
Topological Data Analysis and Persistent HomologyTopological Data Analysis and Persistent Homology
Topological Data Analysis and Persistent Homology
 
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
 
Spatial analysis and modeling
Spatial analysis and modelingSpatial analysis and modeling
Spatial analysis and modeling
 
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
A Novel Feature Selection with Annealing For Computer Vision And Big Data Lea...
 
Rank based similarity search reducing the dimensional dependence
Rank based similarity search reducing the dimensional dependenceRank based similarity search reducing the dimensional dependence
Rank based similarity search reducing the dimensional dependence
 
Data reduction techniques for high dimensional biological data
Data reduction techniques for high dimensional biological dataData reduction techniques for high dimensional biological data
Data reduction techniques for high dimensional biological data
 
Winner of EY NextWave Data Science Challenge 2019
Winner of EY NextWave Data Science Challenge 2019Winner of EY NextWave Data Science Challenge 2019
Winner of EY NextWave Data Science Challenge 2019
 
1st Place in EY Data Science Challenge
1st Place in EY Data Science Challenge 1st Place in EY Data Science Challenge
1st Place in EY Data Science Challenge
 
call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...call for papers, research paper publishing, where to publish research paper, ...
call for papers, research paper publishing, where to publish research paper, ...
 
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
Advanced Intelligent Systems - 2020 - Sha - Artificial Intelligence to Power ...
 
GDRR Opening Workshop - Gradient Boosting Trees for Spatial Data Prediction ...
GDRR Opening Workshop -  Gradient Boosting Trees for Spatial Data Prediction ...GDRR Opening Workshop -  Gradient Boosting Trees for Spatial Data Prediction ...
GDRR Opening Workshop - Gradient Boosting Trees for Spatial Data Prediction ...
 
Developing Competitive Strategies in Higher Education through Visual Data Mining
Developing Competitive Strategies in Higher Education through Visual Data MiningDeveloping Competitive Strategies in Higher Education through Visual Data Mining
Developing Competitive Strategies in Higher Education through Visual Data Mining
 
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...
[GAGIS]A Hierarchical Representation and Computation Scheme of Arbitrary-dime...
 
A vision-based uncut crop edge detection method for automated guidance of hea...
A vision-based uncut crop edge detection method for automated guidance of hea...A vision-based uncut crop edge detection method for automated guidance of hea...
A vision-based uncut crop edge detection method for automated guidance of hea...
 
Lecture7 xing fei-fei
Lecture7 xing fei-feiLecture7 xing fei-fei
Lecture7 xing fei-fei
 
Impact of attendance on students’ academic performance in ict related courses...
Impact of attendance on students’ academic performance in ict related courses...Impact of attendance on students’ academic performance in ict related courses...
Impact of attendance on students’ academic performance in ict related courses...
 
A framework for outlier detection in
A framework for outlier detection inA framework for outlier detection in
A framework for outlier detection in
 
Accurate and low complex cell histogram generation by bypass the gradient of ...
Accurate and low complex cell histogram generation by bypass the gradient of ...Accurate and low complex cell histogram generation by bypass the gradient of ...
Accurate and low complex cell histogram generation by bypass the gradient of ...
 
Precise identification of objects in a hyperspectral image by characterizing...
Precise identification of objects in a hyperspectral image by  characterizing...Precise identification of objects in a hyperspectral image by  characterizing...
Precise identification of objects in a hyperspectral image by characterizing...
 
A Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphsA Graph Summarization: A Survey | Summarizing and understanding large graphs
A Graph Summarization: A Survey | Summarizing and understanding large graphs
 

Dernier

BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx023NiWayanAnggiSriWa
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxNandakishor Bhaurao Deshmukh
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationColumbia Weather Systems
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxmaryFF1
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPirithiRaju
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxkumarsanjai28051
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomyDrAnita Sharma
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRlizamodels9
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxpriyankatabhane
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPirithiRaju
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxEran Akiva Sinbar
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naJASISJULIANOELYNV
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxMurugaveni B
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensorsonawaneprad
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trssuser06f238
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024AyushiRastogi48
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubaikojalkojal131
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 

Dernier (20)

BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Bioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptxBioteknologi kelas 10 kumer smapsa .pptx
Bioteknologi kelas 10 kumer smapsa .pptx
 
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptxTHE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
THE ROLE OF PHARMACOGNOSY IN TRADITIONAL AND MODERN SYSTEM OF MEDICINE.pptx
 
User Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather StationUser Guide: Magellan MX™ Weather Station
User Guide: Magellan MX™ Weather Station
 
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptxECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
ECG Graph Monitoring with AD8232 ECG Sensor & Arduino.pptx
 
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdfPests of soyabean_Binomics_IdentificationDr.UPR.pdf
Pests of soyabean_Binomics_IdentificationDr.UPR.pdf
 
Forensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptxForensic limnology of diatoms by Sanjai.pptx
Forensic limnology of diatoms by Sanjai.pptx
 
basic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomybasic entomology with insect anatomy and taxonomy
basic entomology with insect anatomy and taxonomy
 
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCRCall Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
Call Girls In Nihal Vihar Delhi ❤️8860477959 Looking Escorts In 24/7 Delhi NCR
 
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptxMicrophone- characteristics,carbon microphone, dynamic microphone.pptx
Microphone- characteristics,carbon microphone, dynamic microphone.pptx
 
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdfPests of jatropha_Bionomics_identification_Dr.UPR.pdf
Pests of jatropha_Bionomics_identification_Dr.UPR.pdf
 
The dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptxThe dark energy paradox leads to a new structure of spacetime.pptx
The dark energy paradox leads to a new structure of spacetime.pptx
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Environmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial BiosensorEnvironmental Biotechnology Topic:- Microbial Biosensor
Environmental Biotechnology Topic:- Microbial Biosensor
 
Neurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 trNeurodevelopmental disorders according to the dsm 5 tr
Neurodevelopmental disorders according to the dsm 5 tr
 
Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024Vision and reflection on Mining Software Repositories research in 2024
Vision and reflection on Mining Software Repositories research in 2024
 
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In DubaiDubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
Dubai Calls Girl Lisa O525547819 Lexi Call Girls In Dubai
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 

Topological data analysis

  • 1. Sunghyon Kyeong
 Severance Biomedical Science Institute, 
 Yonsei University College of Medicine Topological Data Analysis -Methods and Examples-
  • 2. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Contents • Brief overview of topological data analysis • About Ayasdi 
 Startup company providing solutions for data analytics • Topological data analysis - Methods • Applications to medical science data • Applications to social science data 2
  • 3. Brief Overview of 
 Topological Data Analysis
  • 4. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Machine Learning 4 •Supervised Machine Learning 
 → Classification of new input data
 (LDA, bayesian, support vector machine, neural network, and so on) •Unsupervised Machine Learning 
 → Clustering of given dataset / Community detection
 (k-means clustering, modularity optimisation, ICA, PCA, and so on) •Topological Data Analysis
 → partial clustering with allowing overlaps among clusters
  • 5. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p World Interests for TDA 5 Heat map for viewers of my TDA slide at sliceshare (for 2500 viewers during 2015.2.14. - 2014.11.31.)
  • 7. Raw Data 
 (diabetes related data) An example
  • 9. Normal Type II 
 diabetes Type I 
 diabetes An example
  • 11. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p When to use TDA? 11 • To study complex high-dimensional data
 : feature selections are not required in TDA • Extracting shapes (patterns) of data • Insights qualitative information is needed. • Summaries are more valuable than individual parameter choices.
  • 12. Geometric Topology Algebraic Betti0: 8 Betti1: 5 Betti0: 10 Betti1: 5 Topology Betti0: clusters Betti1: holes Betti2: voids mathematically 
 defined “holes” in data
  • 13. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 13 Persistent homology is a spatial type of homology that is useful for data analysis. Betti numbers, which come from computing homology, reflect the topological properties of an object. B O ㅂ q b : the connected components : the number of holes Algebraic Topology Quantitive Information
  • 14. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 14 Homology homeomorphic to homeomorphic to , Betti2 = 1 , Betti2 = 0 Ref) Xiaojin Zhu, IJCAI 2013 presentation slide
  • 15. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 15 ? Ref) Figures are obtained from Y.P. Lum et al (2013) Scientific Reports | 3: 1236 points cloud data topology Geometric Topology Extracting Shapes of Data
  • 16. Topological Data Analysis using Mapper Two input functions
 - filter is to collapse high-dimensional data set into a single point
 - distance as a measure of distance between data points Resolution Parameters - Intervals, overlap, magic fudge
  • 17. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 17 Filter Filter Range Intervals Overlap Interval Length : 0.0~2.0 : 5 : 50% : 0.4 0 0.4 0.8 1.2 1.6 2.0 0.2 0.6 1.0 1.4 1.8 NumberofNodesinEachFilterBin 0 10 20 30 40 50 Filter Metric 0 1 2single cluster one or two cluster(s)? : Divide point clouds into each filter bin
  • 18. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 18 Filter Function - Filter function is not necessarily linear projections on a data matrix. - People often uses functions that depend only on the distance function itself, such as a measure of centrality. - Some filter functions may not produce any interesting shapes. Ref) Figures are obtained from Y.P. Lum et al (2013) Scientific Reports | 3: 1236
  • 19. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 19 Distance Function -distance between all pairs of data points. -both euclidean or geodesic distances could be used.
  • 20. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 0 4 8 12 1 2 3 4 5 20 Distance & Clustering • Single linkage dengrogram is used for clustering point clouds based on distance between two nodes. 0 3 6 9 1 2 3 4 5 Magic Fudge is the number of bins in the distribution of the distance obtained from single linkage dendrogram. No. of clusters are estimated from the number of continuous bins having zero elements. 1 Cluster 2 Clusters
  • 21. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 21 Nodes, Edges, Colors 1,2,3,
 4,5 0,1,2,7 5,6,9 10 Nodes are groups of similar objects Edges connect similar nodes Colors let you see values of interest 8,11 3,4,8,12 12 1412,14 1311,13 indices for points cloud: 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
  • 22. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 22 Topology extraction Filter Range Intervals Overlap Magic Fudge : 0 ~ 2 : 5 : 50% : 5 Filter Partial Clustering The size of node represents the number of point clouds in each node.
  • 23. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 23 A) C) F1 F2 Filter Extract position information of y-axis F3 F4 F5 F6 F7 F8 F9 F10 B) F1 F2 F3 F4 F5 F6 F7 F8 F9 F9 F10 F10 0 10 20 30 1 2 3 4 5 9th Filter bin 1 1 1st Filter bin 0 10 20 1 2 3 4 5 11 Topology of Y-shape points cloud
  • 24. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 24 interval = 5
 overlap = 20% interval = 5
 overlap = 50% interval = 5
 overlap = 80% interval = 10 overlap = 20% interval = 10 overlap = 50% interval = 10 overlap = 80% interval = 15 overlap = 20% interval = 15 overlap = 50% interval = 15
 overlap = 80%
  • 25. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Mathematical details can be found at 25 • Gurjeet Singh et al. (2007), Topological methods for the analysis of high dimensional data sets and 3D object recognition. • Gunnar Carlsson (2009), Topology and data, Bull. Amer. Math. Soc. 46. 255-308. Gurjeet Singh and Gunnar Carlsson are 
 co-founders of Ayasdi
  • 26.
  • 27. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 27 US Patent by Ayasdi
  • 28. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 28
  • 29. Applications to 
 Neurobiological/Clinical Data International Neuroimaging Data-sharing Initiative (INDI) - URL: http://fcon_1000.projects.nitrc.org/index.html - Healthy control, ADHD, ASD data sets are available. - resting state fMRI, diffusion tensor imaging, - phenotype information such as intelligence scale and ADHD symptom severity are available.
  • 30. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Dataset : ADHD Symptoms & IQ 30 1 2 3 4 5 1 0.0 61.5 77.3 51.6 77.0 2 61.5 0.0 62.2 55.9 69.0 3 77.3 62.2 0.0 47.2 11.9 4 51.6 55.9 47.2 0.0 52.4 5 77.0 69.0 11.9 52.4 0.0 Data set: L2-Distance (or Euclidean distance) for all pairwise subjects: Distance Matrix: (or Euclidean) Filter Function: f(x) = max d(x, y) y 2 x L-infinity eccentricity
  • 31. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 31 Phenotypic subgroups of ADHD AIQ-TDC
 (Average IQ) Low High TDC
 (Low IQ) Distance function: L2-distance Filter function: L-infinity eccentricity N of subjects = 204 Low resolution AIQ-ADHD
 (Average IQ & 
 Borderline Sx ) HIQ-TDC
 (High IQ) LIQ-ADHD
 (Low IQ & Ordinary Sx)
  • 32. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 32 30 60 90 ADHD 60 30 60 90 Inattentive 59 30 60 90 Hyper/Impulsivity 59 70 100 130 FSIQ 108 70 100 130 VIQ 109 70 100 130 PIQ 105 HIQ-TDC
 (High IQ) AIQ-TDC 
 (Average IQ) LIQ-ADHD
 (Ordinary Sx) AIQ-ADHD
 (Borderline Sx) All subjects 
 (Avg. Sx & Avg. IQ) 13 (13 / 0) 21 (17 / 4) 12 (1 / 11) 19 (2 / 17) 204 (90 / 114)
  • 33. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 33 Functional Modular Architectures AIQ-TDC 
 (TDC with Average IQ) LIQ-ADHD 
 (ADHD with Ordinary Sx.) AIQ-ADHD 
 (ADHD with Borderline Sx.) DMN module Occipital module Central module Parietofrontal module BG/THL module Temporal/Limbic module HIQ-TDC 
 (TDC with High IQ) ※ Only positive weights of the functional connectivity are used for module analysis and visualised.
  • 35. Diabetes Subtypes based on six quantities: - age - relative weight - fasting plasma glucose - area under the plasma glucose curve for the three hour glucose tolerance test (OGTT) - area under the plasma insulin curve for the OGTT - steady state plasma glucose (SSPG) response SSPG Glucose area Insulin area Ref) Eurographics Symposium on Point-Based Graphics, Singh G et al. (2007)
  • 36. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 36 Subtypes of diabetes Type I 
 diabetes Type II 
 diabetes Type I 
 diabetes Type II 
 diabetes Low-resolution High-resolution Interval: 3
 Overlay: 50% Interval: 4
 Overlay: 80% Type I: adult onset TypeII: juvenile onset Distance function: L2-distance Filter function: density kernel with e=130,000
  • 37. Subtypes of Breast Cancer using breast cancer microarray gene expression data set - disease specific genomic analysis (DSGA) transformed data Application to Biological Data, PNAS, Monica Nicolau et al. (2010)
  • 38. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 38 Breast Cancer Subtype ER: Estrogen Receptor PNAS, Monica Nicolau et al. (2010)
  • 39.
  • 40. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Diabetes Mellitus (DM) • Commonly referred to as diabetes, a group of metabolic disease. • If left untreated, diabetes can cause many complications: 
 cardiovascular disease, stroke, chronic kidney failure, foot ulcers, and damage to eyes. • Diabetes is due to either the pancreas not producing enough insulin or the cells of the body not responding properly to the insulin produced. • Type 1 DM: results from the pancreas’s failure to produce enough insulin. • Type 2 DM: begins with insulin resistance, a condition in which cells fail to respond to
 insulin properly. heavy weight and not enough exercise are causes of T2D. • Gestational diabetes, is the third main form and occurs when pregnant women without a previous history of diabetes develop a high blood-sugar level. 40
  • 41. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Why subgroups of type 2 DM? • Risk factors of Type 2 DM are:
 obesity, family history of diabetes, physical inactivity, ethnicity, and advanced age. • Type 2 DM is heterogenous complex disease affecting 
 more than 29 million in American (9.3%). 2 million in Korea. • Increasing needs for early prevention and clinical management of Type 2 DM. 41
  • 42. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Methods • High dimensional EMRs and genotype data from 11,210 individuals from Mount Sinai Medical Center (MSMC)’s outpatient population (46% Hispanic, 32% African american, 20% European white, and 2% others). • Type 2 DM and non-Type 2 DM were defined by an electronic phenotyping algorithm (eMERGE network) based on ICM-9-CM diagnosis codes, laboratory test, prescribed medications (RxNorm), physician notes (natural language processing), and family history. • Form of preprocessed data matrix is n patients by P medical variables. 
 Medical variables included 505 clinical variable, 7097 unique ICM-9-CM codes (1 to 218 per patients). On average, 64 clinical variables per patients. To avoid overfitting, select the variable with at least 50% of patients who had the variables, resulting in 73 (of 505) variables to perform the analysis. 42
  • 43. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p TDA pipeline • Distance metric: cosine distance metric was used to assess the similarity of the data points. • Filter metric: L-infinity centrality and principal metric singular value decomposition (SVD1). 
 L-infinity centrality is defined for each data point y to be the maximum distance from y to any other data point in the data set. Large values of this function correspond to points that are far from the centre of data set. 43
  • 44. 3889 patients 7321 patients 44
  • 45. TDA with only 
 2551 Type II DM 762
 Subtype 1 617
 Subtype 2 1096
 Subtype 3 Gender is not an organising factor in the topology Reproducibility test (2/3 training, 1/3 test data set, 10 times) revealed that the overall accuracy was 96% for a subtype classification. 45
  • 46. Applications to 
 Social Science Data - Classification of (basket ball) player types - Partial clustering of personality using temperamental traits - Relationship among welfare, civil construction, and suicide rate
  • 47. Classification of player types based on their in-game performance such as: - rates (per minute played) of rebounds, assists, turnovers, steals, blocked shots, personal fouls, and points scored (7 performance measures)
  • 48. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 48 Map of Players low resolution map at 20 intervals high resolution map at 30 intervals Points Per Game Low High Filter function: 
 Principal and secondary SVD values Distance function: 
 Variance normalised L2-distance Traditionally, basket ball players are categorised into guards, forwards, and center.
  • 49. TDA versus K-means clustering Grouping subjects into personality types
  • 50. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 50 Novelty seeking Harm avoidance Reward dependance Persistence Temperament and character inventory (TCI) scores from 40 normal subjects
  • 51. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 51 the mean of points in Si V = 2X i=1 X xj 2Si ||xj µi||2 S = {S1, S2} 20 30 40 50 60 70 0 10 20 30 40 50 60 70 Novelty Seeking HarmAvoidance Introverts Extraverts Centroids 20 30 40 50 60 70 Novelty Seeking HarmAvoidance 70 60 50 40 30 20 10 0 Subject Grouping high HA & low NS low HA & high NS Centroids× B. k-means ClusteringA. Input Dataset The goal of k-means clustering is to minimise the within-cluster 
 sum of squares.
  • 52. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 52 TDA to extract personality groups 20 30 40 50 60 NS HA RD P Group1 Group2 Distance function: L2-distance Filter function: L-infinity eccentricity Low High Low resolution (5 intervals, 50% overlap) Group 1
 (n=7) Group 2
 (n=4) HA,NS,RD,P=(32,65,58,49) 1 1 1 High resolution (5 intervals, 70% overlap)
  • 53. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 53 Welfare / civil engineering / suicide ratio Data Download: http://newstapa.com/news/201411935
  • 54. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 54 Nation’s public data analysis Distance function: L2-distance Filter function: L-infinity eccentricity Input data: ratio of welfare to civil engineering (2009), ratio of welfare to civil engineering (2012), Suicide rate (2012) 부산(남구)
 인천(연수구, 남동구, 서구)
 광주(동구)
 강원(본청) 광주(북구) 전북(본청) 대구(북구) 서울(노원구) 대구(달서구) 대전(서구) 강원(홍천, 양양) 충북(단양) 전북(장수) 전남(함평) 경남(함양) Group 1 Group 3 Group 2 Group 4 Low High
  • 55. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 55 0 1 2 3 4 5 6 7 Group1 Group3Group2 Group4 Ratio of Welfare to Civil Eng. (2009) Ratio of Welfare to Civil Eng. (2012) Group1 Group2 Group3 Group4 Suicide ratio (2012) 70 60 50 40 30 20 10 0 광주(북구) 전북(본청) 대구(북구) 서울(노원구) 대구(달서구) 대전(서구) 강원(홍천, 양양) 충북(단양) 전북(장수) 전남(함평) 경남(함양) 부산(남구)
 인천(연수구, 남동구, 서구)
 광주(동구)
 강원(본청) Blog posting: http://skyeong.tistory.com/136/ Welfare | Civil Eng. | Suicide ratio
  • 56. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p 56 Topological Data Analysis,
 new weapon for 
 discovering new insight from data.
  • 57. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p Conclusion • TDA can be applied to various dataset and has a coordinate free characteristic. • Useful for analysing non-linear dataset. • Selection and optimisation of distance and filter metric are important issue. • TDA will be a powerful weapon for those who want to find a new insight from data • It’s possible to make a supervised machine learning system using TDA. 57
  • 58. Sunghyon Kyeong (Yonsei University) | sunghyon.kyeong@gmail.com | Topological Data Analysis: Methods and Examples | p References 1. Gurgeek Singh et al., Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, Eurographics Symposium on Point-Based Graphics, 2007. 2. Gunnar Carlsson, Topology and Data, Bull. Amer. Math. Soc. 46(2):255-308, 2009. 3. Monica Nicolau et al., Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival, PNAS 108(17):7265-7270, 2011. 4. P. Y. Lum et al., Extracting insights from the shapes of complex data using topology, Nature Scientific Reports 3:1236, 2013. 5. Li Li et al., Identification of type 2 diabetes subgroups through topological analysis of patients similarity, Science Translational Medicine 7(311):311ra174, 2015. 6. Jessica L. Nielson et al., Topological data analysis for discovery in preclinical spinal cord injury and traumatic brain injury, Nature communications 6:8581, 2015. 7. AYASDI, a commercial software for TDA, http://www.ayasdi.com/ 58