4. Homophily
▪“Birds of a feather flock together”
▪Similarity begets friendship – Plato
▪People love those who are like
themselves - Aristotle
▪Homophily - we tend to be similar
to our friends
4/20/2021 VANI KANDHASAMY, PSGTECH 4
5. Homophily
▪Links in a social network tend to connect people who are similar to one another
oImmutable characteristics: racial and ethnic groups; ages; etc.
oMutable characteristics: places of residence, occupations, levels of affluence, interests, beliefs,
and opinions; etc.
▪Friendship through common friend – Triadic closure (network structure)
▪Friendship through common school/college/job/interest – Contextual features
6. Homophily Test
▪Consider a random network G = (V, E) where each node is assigned male with
probability p, and female with probability 1 - p
▪Consider any edge (i, j) ∈ E of this random network G
▪Both ends of the edge are male with probability p^2, and both are female with probability (1 - p)^2
▪Let random variable Xij = 1 if it is a cross-edge, and Xij = 0 otherwise
▪Then Xij is a Bernoulli random variable such that P(Xij = 1) = 2p(1 - p)
Homophily Test: If the fraction of cross-gender edges is significantly less
than 2p(1 - p), then there is evidence for homophily
7. Homophily Test
▪p = 2/3 and the fraction of cross-
gender edges is 5/18
▪On the other hand the fraction
of cross-gender edges in the
random network = 2p(1 - p) =
4/9 = 8/18
▪Note that 5/18 < 8/18 showing
evidence of homophily
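The comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the slide's own graph: the six-node network, the node names and the helper functions `expected_cross_fraction` and `observed_cross_fraction` are all hypothetical. It also compares the two fractions directly and does not run a significance test, which a real analysis would need.

```python
from fractions import Fraction


def expected_cross_fraction(p):
    """Cross-edge probability 2p(1 - p) when labels are assigned at random."""
    return 2 * p * (1 - p)


def observed_cross_fraction(edges, label):
    """Fraction of edges whose two endpoints carry different labels."""
    cross = sum(1 for u, v in edges if label[u] != label[v])
    return Fraction(cross, len(edges))


# Hypothetical toy network (an assumption for illustration): 4 male, 2 female nodes.
label = {"a": "M", "b": "M", "c": "M", "d": "M", "e": "F", "f": "F"}
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("e", "f"), ("d", "e")]

# p is estimated as the share of male nodes.
p = Fraction(sum(1 for g in label.values() if g == "M"), len(label))
baseline = expected_cross_fraction(p)
observed = observed_cross_fraction(edges, label)

print("p =", p)
print("expected cross-edge fraction =", baseline)
print("observed cross-edge fraction =", observed)
print("fewer cross edges than the random baseline:", observed < baseline)
```

Exact fractions are used so that the baseline 2p(1 - p) = 4/9 for p = 2/3 matches the slide's arithmetic without rounding.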
8. Homophily
SOCIAL INFLUENCE
▪Influenced by the people we are connected to
▪Socialization – existing links shape people’s
interests
SELECTION
▪We select friends who are similar to us
▪Formation of new links
9. Affiliation Networks
▪Affiliation networks are examples of a class of graphs called bipartite graphs
▪Affiliation networks represent the participation of a set of people in a set of foci
10. Affiliation Networks
Triadic closure: The formation
of the link between B and C
with common friend A
Focal closure (selection): The
formation of the link between
B and C with common interest
A
Membership closure (social
influence): The formation of
the link between person B and focus C,
influenced by a friend A who already participates in C
13. Stochastic Block Models
▪Posterior block modelling: by examining how a network is generated from its
underlying community structure, one can detect the communities in that network
▪Goal: Define a model that can generate networks
oThe model will have a set of “parameters” that we will later use to detect communities
Given a set of nodes, how do communities “generate” edges of the network?
14. Stochastic Block Models
1. Given the communities, generate the network using the model
2. Given a network, find the “best” communities using the model developed
[Figure: two pairs of diagrams related by a “Generative model”, the first pair over nodes A–H and the second over nodes A, B, C, D, H]
15. Community-Affiliation Graph
Generative model B(V, C, M, {pc}) for graphs:
◦ Nodes V, Communities C, Memberships M
◦ Each community c has a single probability pc
◦ Later we fit the model to networks to detect communities
[Figure: Model and Network panels; the model shows Communities C (with probabilities pA, pB) linked to Nodes V by Memberships M]
16. AGM: Generative Process
◦ For each pair of nodes in community 𝑨, we connect them with prob. 𝒑𝑨
◦ The overall edge probability is:
$P(u, v) = 1 - \prod_{c \in M_u \cap M_v} (1 - p_c)$
◦ $M_u$ … set of communities node $u$ belongs to
◦ If $u$, $v$ share no communities: $P(u, v) = \varepsilon$
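The generative process and the edge probability above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names `edge_probability` and `generate_graph`, the example memberships, the community probabilities and the default value of ε are all hypothetical choices, not values from the slides.

```python
import random
from itertools import combinations


def edge_probability(u, v, memberships, p, eps=1e-4):
    """P(u, v) = 1 - prod over shared communities c of (1 - p[c]).

    Falls back to the small background probability eps (an assumed
    default) when u and v share no community.
    """
    shared = memberships[u] & memberships[v]
    if not shared:
        return eps
    prob_no_edge = 1.0
    for c in shared:
        prob_no_edge *= 1.0 - p[c]
    return 1.0 - prob_no_edge


def generate_graph(memberships, p, eps=1e-4, seed=0):
    """Sample an undirected edge set by flipping one biased coin per node pair."""
    rng = random.Random(seed)
    edges = set()
    for u, v in combinations(sorted(memberships), 2):
        if rng.random() < edge_probability(u, v, memberships, p, eps):
            edges.add((u, v))
    return edges


# Hypothetical memberships M and community probabilities p_c.
memberships = {"u": {"A"}, "v": {"A", "B"}, "w": {"A", "B"}, "x": {"B"}}
p = {"A": 0.5, "B": 0.2}

print(edge_probability("u", "v", memberships, p))  # share A only
print(edge_probability("v", "w", memberships, p))  # share A and B
print(edge_probability("u", "x", memberships, p))  # share nothing, so eps
print(sorted(generate_graph(memberships, p)))
```

Nodes that share more communities get a higher edge probability, because each shared community is one more independent chance to create the edge.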
17. Detecting Communities
[Figure: example graph on nodes A–H]
Given a Graph 𝑮(𝑽, 𝑬), find the Model
1. Affiliation graph M
2. Number of communities C
3. Parameters pc
18. Maximum Likelihood Estimation
Given: Data 𝑿
Assumption: Data is generated by some model 𝒇(𝚯)
𝒇 … model
𝚯 … model parameters
Want to estimate: $P_f(X \mid \Theta)$
The probability that our model 𝒇 (with parameters 𝜣) generated the data 𝑿
We need to find the most likely model that could have generated the data:
$\arg\max_{\Theta} P_f(X \mid \Theta)$
19. Example: MLE
Imagine we are given a set of coin flips
Task: Figure out the bias of a coin!
◦ Data: Sequence of coin flips: 𝑿 = [𝟏, 𝟎, 𝟎, 𝟎, 𝟏, 𝟎, 𝟎, 𝟏]
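◦ Model: $f(\Theta)$ = return 1 with prob. $\Theta$, else return 0
◦ What is $P_f(X \mid \Theta)$? Assume the coin flips are independent
◦ So, $P_f(X \mid \Theta) = P_f(1 \mid \Theta) \cdot P_f(0 \mid \Theta) \cdot P_f(0 \mid \Theta) \cdots P_f(1 \mid \Theta)$
◦ Then, $P_f(X \mid \Theta) = \Theta^3 (1 - \Theta)^5$
◦ For example:
◦ $P_f(X \mid \Theta = 0.5) = 0.003906$
◦ $P_f(X \mid \Theta = 3/8) = 0.005029$
◦ Data $X$ was most likely generated by a coin with bias $\Theta = 3/8$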
[Figure: plot of $P_f(X \mid \Theta)$ against $\Theta$, with the maximum marked at $\Theta^* = 3/8$]
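The coin example can be checked numerically. The sketch below evaluates the likelihood on a grid of candidate biases and picks the best one; the grid search and the function name `likelihood` are illustrative choices, since for this model the maximizer is also available in closed form as the fraction of ones in the data.

```python
def likelihood(theta, flips):
    """P_f(X | theta) for independent flips: theta^heads * (1 - theta)^tails."""
    heads = sum(flips)
    tails = len(flips) - heads
    return theta ** heads * (1 - theta) ** tails


flips = [1, 0, 0, 0, 1, 0, 0, 1]  # the data X from the slide

# Candidate biases 0.000, 0.001, ..., 1.000 (grid resolution is an assumption).
grid = [i / 1000 for i in range(1001)]
theta_hat = max(grid, key=lambda t: likelihood(t, flips))

print("likelihood at 0.5 :", likelihood(0.5, flips))
print("likelihood at 3/8 :", likelihood(3 / 8, flips))
print("grid maximizer    :", theta_hat)
print("closed form       :", sum(flips) / len(flips))
```

Both the grid maximizer and the closed form give 0.375, in agreement with the slide's $\Theta^* = 3/8$.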
20. Maximum Likelihood Estimation
Goal: Find $\Theta = B(V, C, M, \{p_c\})$ such that:
$\arg\max_{\Theta} P(G \mid \Theta)$, where $\Theta$ is the AGM
How do we find $B(V, C, M, \{p_c\})$ that maximizes the
likelihood?
Finding B means finding the bipartite affiliation network
21. Maximum Likelihood Estimation
Given graph G(V,E) and Θ, we calculate likelihood that Θ generated G: P(G|Θ)
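Θ = B(V, C, M, {pc}) gives the edge probabilities P(u, v):
0   0.9 0.9 0
0.9 0   0.9 0
0.9 0.9 0   0.9
0   0   0.9 0
G (adjacency matrix):
0 1 1 0
1 0 1 0
1 1 0 1
0 0 1 0
[Figure: graph G with communities A and B]
$P(G \mid \Theta) = \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$
$\arg\max_{B(V, C, M, \{p_c\})} \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$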
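As a worked check, the likelihood of the 4-node example can be computed directly from the two matrices on this slide. The sketch below assumes an undirected graph, so each unordered node pair is counted once; the function name `graph_likelihood` and the variable names are hypothetical.

```python
from itertools import combinations


def graph_likelihood(adjacency, edge_prob):
    """P(G | Theta): product of P(u, v) over edges and 1 - P(u, v) over non-edges.

    Each unordered pair (u, v) with u < v is counted once, which assumes
    the graph is undirected and both matrices are symmetric.
    """
    n = len(adjacency)
    result = 1.0
    for u, v in combinations(range(n), 2):
        if adjacency[u][v]:
            result *= edge_prob[u][v]
        else:
            result *= 1.0 - edge_prob[u][v]
    return result


# Edge probabilities P(u, v) implied by Theta, as given on the slide.
edge_prob = [
    [0.0, 0.9, 0.9, 0.0],
    [0.9, 0.0, 0.9, 0.0],
    [0.9, 0.9, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]

# Adjacency matrix of the observed graph G, as given on the slide.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

print(graph_likelihood(adjacency, edge_prob))
```

The four observed edges each contribute 0.9 and the two non-edges each contribute 1 - 0 = 1, so the result is 0.9^4, roughly 0.656. Fitting the model means searching over B(V, C, M, {pc}) for the parameters that make this value as large as possible.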