Literature Survey to discuss topographical structure of social networks and information propagation

Sathe, Vaibhav
Indian Institute of Management Lucknow
IIM Campus, Prabandh Nagar, Off Sitapur Road, Lucknow, Uttar Pradesh – 226013, INDIA
vaibhav.sathe@iiml.org

I. INTRODUCTION
Facebook's user base, currently 800 million and still growing, and the increasing time users spend on it have attracted a lot of attention from researchers in various fields. Recently, Facebook has been used as a platform for organizing mass protests in countries of the Middle East. In India, events such as the rise of India Against Corruption and its Facebook following of 500,000 people have underscored the rising power of social media. This has resulted in clashes with governments that seek to curtail the power of social networks and their users to spread messages without restrictions. In our research, we want to model this censorship activity. This literature survey is conducted to support that research by establishing the network concepts required for modelling social networks, primarily the structure of the network and how a message spreads.
We review some well-cited papers published in top Information Systems journals to identify the various dimensions required for the modelling exercise.

II. PROBLEM DEFINITION
The objectives of this literature review are as follows.
(1) Structure of Social Networks:
In order to model a social network, we need to determine which model from network science applies to it. The probable options are the small-world, random and scale-free network models. Different social networks may also display different structures because of fundamental differences between them. From the point of view of censorship, we focus more on social networks like Facebook. Facebook clearly holds the greatest interest because its user base, the largest, gives it the capability to influence the behaviour of the actors involved in a censorship-related study.
(2) Information Propagation Pattern:
In order to identify the parameters that model the user interactions on a social network which lead to information diffusion, we need to understand how information spreads on networks and which factors affect it.

III. LITERATURE SEARCH
The literature surveyed is divided into the following sections.

A. Structure of Social Networks
The following articles contribute to the first objective, determining the structure of social networks. Detailed references are included in the References section.

Sr.  Article/Paper                                         Journal/Publisher
1    Measurement and Analysis of Online Social Networks    ACM
2    Linking via Social Similarity: The Emergence of       IEEE
     Community Structure in Scale-free Network
3    A fast algorithm for simulating scale-free networks   ICCTA (IEEE)
4    Social Search in "Small-World" Experiments            World Wide Web Consortium
5    Reciprocity in evolving social networks               Journal of Evolutionary Economics

B. Information Propagation
The following articles contribute to the second objective, determining patterns in information spread. Detailed references are included in the References section.

Sr.  Article/Paper                                         Journal/Publisher
1    Network Effects and Personal Influences: The          Journal of Marketing Research
     Diffusion of an Online Social Network
2    Forward or delete: What drives peer-to-peer message   Journal of Consumer Behaviour
     propagation across social networks?
3    User Interactions in Social Networks and their        EuroSys'09, ACM
     Implications
4    Online organization of offline Protest: From Social   HICSS 2011
     to Traditional Media and Back
5    Information propagation analysis in a social network  IEEE
     site
6    Detecting and Characterizing Social Spam Campaigns    IMC'10, ACM

IV. TERMINOLOGIES
This section describes the terminology required to understand the concepts discussed in this review.

Power Law:
When the frequency of an event varies inversely with a power of its quantifiable size, the relationship is said to follow a power law. One characteristic of such a distribution is a large difference between the mean and the median.

Types of networks:
A. Random Networks
Random networks are unstructured networks with low clustering. They do not occur in nature. They are studied theoretically to provide a baseline for the study of more structured networks such as small-world and scale-free networks.
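As a rough illustration of why random networks serve as a low-clustering baseline, the sketch below builds a random network and measures its average clustering coefficient. It assumes the common G(n, p) construction, in which every possible link exists independently with probability p. The function names, the values n = 200 and p = 0.05, and the seed are illustrative choices and are not taken from the surveyed papers.

```python
import random
from itertools import combinations


def random_network(n, p, seed=0):
    """Build a G(n, p) random network: each possible link exists with probability p."""
    rng = random.Random(seed)
    neighbours = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours


def average_clustering(neighbours):
    """Mean over all nodes of the fraction of a node's neighbour pairs that are linked."""
    total = 0.0
    for nbrs in neighbours.values():
        k = len(nbrs)
        if k < 2:
            continue  # a node with fewer than two neighbours contributes 0
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(neighbours)


# Illustrative parameters (assumptions, not values from the literature).
graph = random_network(n=200, p=0.05)
clustering = average_clustering(graph)
print(f"average clustering: {clustering:.3f}")
```

Under these assumptions the average clustering should come out close to p, which is low compared with the high clustering that this survey attributes to social networks.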
B. Small World Network
Small-world networks are networks that have a small average path length, due to a large number of interconnections, and a high clustering coefficient.

C. Scale-Free Network
Scale-free networks are those whose degree distribution follows a power law, i.e. the network consists of a small number of highly connected users and a large number of less connected users.

Terms related to networks:
(1) Network Diameter: The maximum distance (shortest path length) between any two nodes is called the diameter of the network.
(2) Indegree: The number of inward connections for a given user.
(3) Outdegree: The number of outward connections for a given user. This is a valid measure when networks are directed graphs. Networks like Facebook and Orkut are symmetrical networks, i.e. for any user, indegree and outdegree are equal.
(4) Assortativity: A measure of the likelihood that a node in the network establishes a link with another node that is similar to it on some parameter.

Information shared on Facebook:
The information that is created and shared on Facebook comes from various sources. These are as follows:
(1) Status Messages: Users can share a text message as their status message. This is visible to other users (friends or others) on the user's wall. The message also appears in the news feed of other users who are friends of the user and/or subscribed to the user's updates.
(2) Hyperlink: A hyperlink to some other location on the Internet, typically news of interest, is another source of shared information. Friends can like, share or comment on such links.
(3) Photo: Photographs, typically taken by the user, are frequently shared, liked and commented on.
(4) Community/Group: Facebook has different groups dedicated to various topics. A message posted by or on the community is typically shared by a user so that his subscribers, who may not have access to the community, can view it.
(5) Person: Famous people like Bill Gates have their own personal pages, which are unlike groups. These are used to send personal images and links to thousands of subscribers, in a similar way to how these personalities use Twitter today. This is one-way communication.
(6) Event Invitations: Users can create events and invite people. Users can also forward event invites.

V. DATA EVALUATION
This section is split into the subsections below.

A. Social Networks
Before starting, let us look at what social networks mean and how online social networks differ.
The social network concept applies to naturally formed networks such as communities, family ties and relationships. For example, in a town, people know each other within one residential area. They also know some more people at their workplace. People also tend to want to know more people and try to gain access to more contacts through a person they think is well connected. The information exchange may be intentional or unintentional. The study of social networks focuses on critical issues like the spread of disease, the spread of news, riots, fads and social awareness.
Online social networks demonstrate similar characteristics, with the exception that users are not in physical contact with each other. Examples of online social networks include Facebook, Twitter, Flickr, YouTube and any other site that facilitates interaction between users. The interaction can be one-to-one (Google Talk), one-to-many (Facebook) or many-to-many (forums), depending on the nature of the site.

B. Structure of Social Networks
Which graph structure social networks follow has been a very interesting topic for researchers, as it is a fundamental step in any modelling or simulation of the network.
Mislove et al. [2], in their paper on the measurement and analysis of online social networks, try to identify various characteristics of social networks. In their experiment they collected data from over 11.3 million users of Orkut, YouTube, Flickr and LiveJournal. Network analysis of each of these networks showed that they follow a power law. In addition, the authors identified that these social networks display scale-free and small-world properties. All the networks show high clustering.
The authors identify an interesting parameter: whether consent is required from the second party for the first party to establish a connection. Twitter is an example where anyone can follow you and you need not follow him. On Facebook, on the other hand, somebody who wants to be friends with you needs to send a request, and only when you approve do you both become friends. Twitter is an example of an asymmetric network, in which each user has a different indegree and outdegree. Facebook is an example of a symmetric network, in which each user has identical indegree and outdegree. The characteristics of the network vary with these parameters. Symmetric networks have more connections among users and hence form stronger clusters, thereby reducing the network diameter. Hence, they display the characteristics of a small-world network. Among the examples the authors analyse, we need to focus more on Orkut, as it is most closely related to Facebook.
To understand the limitations, we need to note the complex structure of Facebook. Although friendship is one of the prime ways in which Facebook disseminates information, we need to consider other ways, such as groups and pages to which a user subscribes, thereby creating a directed or asymmetric relationship. Nowadays, Facebook also allows users to subscribe to status updates from other users without the requirement of explicit consent. This has resulted in Facebook forming a hybrid network with different types of nodes.
With regard to cluster formation, the authors state that online social networks score higher on assortativity, in the sense that users of high degree establish relations with other users of high degree while users of low degree establish relations with other users of low degree. This appears to conflict with scale-free properties, under which low-degree users tend to attach to high-degree users in order to form a hub-and-spoke model.
Social networks are examples of very large-scale networks and they are not random. The study by Erdos and Renyi [6] proved that networks like social networks evolve with particular patterns and have a certain structure, rather than being random.
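The tension noted above between assortative mixing by degree (Mislove et al. [2]) and the hub-and-spoke picture of scale-free networks can be made concrete with a small degree-correlation calculation. The sketch below is only an illustration: it assumes assortativity is measured as the Pearson correlation between the degrees at the two ends of each link, and the two toy networks and all names are hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation between the degrees found at the two ends of each link.

    Assumes an undirected network with no self-links and no repeated links.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected link in both directions so the measure is symmetric.
    pairs = [(degree[u], degree[v]) for u, v in edges]
    pairs += [(y, x) for x, y in pairs]
    mean = sum(x for x, _ in pairs) / len(pairs)
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / len(pairs)
    var = sum((x - mean) ** 2 for x, _ in pairs) / len(pairs)
    return cov / var  # undefined (division by zero) if all nodes share one degree


def clique(names):
    """All links of a fully connected group of users."""
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]


# Hub and spoke: one highly connected user linked to five low-degree users.
star = [("hub", f"leaf{i}") for i in range(5)]

# Two separate fully connected groups of different sizes (3 and 5 users).
groups = clique(["a1", "a2", "a3"]) + clique(["b1", "b2", "b3", "b4", "b5"])

print("hub and spoke:", degree_assortativity(star))
print("separate groups:", degree_assortativity(groups))
```

With this measure the pure hub-and-spoke network has a coefficient of -1 (low-degree users attach only to the hub), while the network of separate groups has a coefficient of +1 (every user links only to users of the same degree). Real networks would fall somewhere in between.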
Wei Ren and Jianping Li [4] propose the RX algorithm to simulate scale-free networks, which they claim performs better than the popular Barabasi-Albert (BA) algorithm. The authors state that as the number of nodes increases, the time required by RX is much less than that taken by BA. They conclude that networks that expand continuously exhibit the characteristics of scale-free networks. Since social networks are both very large and continuously expanding, scale-free characteristics apply. The same is true of an online social network like Facebook, which currently has 800 million users and is growing very rapidly both in total users and in the average number of friends.
Yixiao Li et al. [3] make the important observation that the social network model exhibits community structure. This paper correctly establishes a clustering method based on "birds of a feather flock together", stating that users having something in common tend to form clusters or groups with a lot of interconnections among them. This does not agree with the statement in the paper of Mislove et al. [2] that users with high degree tend to connect to other users with high degree and vice versa. The paper further establishes that communities develop into scale-free networks when they keep expanding.
One more factor discussed in the literature is the user's intention. The paper by Goel et al. [7] explains, from the standpoint of a physical social network, the difference between a topological connection and an algorithmic connection (the intention to connect), using the example of the spread of diseases in a social network. The paper distinguishes network structures based on the intention of the user. The next paper discussed below extends this concept by looking at what happens when such intentions evolve, making the network very dynamic.
The paper by Jun and Sethi [8] discusses how social network structure develops in a dynamic and continuously evolving environment. The changes in the network take the form of random rewiring. Also, to a certain extent, some old links are severed over a period of time. In physical as well as online social networks this is due to changes in one's lifestyle in terms of location, community memberships and so on. Changes may also happen in the intention factor, which is taken as conditional cooperation. Over a period of time, a user's reasons to connect can evolve, e.g. looking for a relationship, friendship or professional networking. Another important observation by the authors concerns the increasing degree of the network. With increasing degree, clustering increases, as the neighbours of one node are likely to be neighbours of each other. A social network like Facebook follows the same phenomenon. Hence, the diameter of the network reduces. This paper identifies future research scope in the influence of the behaviour of non-neighbours on a given user. This is also a valid scenario considering the features of Facebook: user A may receive updates about the interaction of a particular friend B with his friend C, who is not a friend of user A. We discuss this propagation in the next section.

C. Information Propagation
Harvey et al. [9], in their paper on viral marketing on the Internet, researched how users forward or delete a particular message on a social network like YouTube. From our research point of view, the observations on this forwarding behaviour are important, as they also apply to user behaviour on a social network like Facebook. The authors identified that the likelihood of a video being forwarded is closely correlated with sender involvement, sender tie strength and the amount of online communication across ties. We explain these factors briefly. Sender involvement, as explained by Norman [10], is the relation of the subject to the person's needs. Sender tie strength means how close the user is to the sender of the message. The third factor is the amount of communication the sender has with the probable recipient to whom he would forward the message. The authors reject the idea that knowledge of how to forward a given message has any correlation with this likelihood.
Skoric et al. [12] discuss the parameter of trust, which is similar to the ties with the sender discussed in the previous paper. The authors say that, in general, users trust their friends over any other person, such as a political leader or an advertiser. This means that when a friend forwards or shares some message, users treat it as a serious message, which improves the likelihood that they forward it. This research also identifies that groups, events and status messages are the tools on Facebook by which users can reach their immediate and extended friends in a fast, easily accessible and cost-effective way. One important contribution of this paper is the identification that the spread of such messages will be limited to individuals who are mostly similar and fall in one category of politically engaged and socially active people. This is typically because such messages spread only through friendship networks, which are formed with intentions other than spreading such messages. Friends generally have a similar thought process and are hence similar on the above parameters.
Katona et al. [1] bring out some critical points based on the sender's influence. First, they discuss that as the number of contacts of a recipient increases, the influence that a particular individual has on him gets diluted accordingly. The second factor is that of brokers. We have already seen that social networks demonstrate the characteristics of scale-free and small-world networks. This means that among different clusters of users there are a few users who are common to them and who form prominent nodes linking these clusters. As proved empirically, since they control a large amount of information, they have higher influential power.
Another very interesting observation is made by Wilson et al. [11]. The authors say that links or connections on a social network like Facebook are not indicators of interaction among the users. This is primarily due to the time constraints that users face, so not all friendships are equally meaningful. The authors therefore come up with the new concept of an interaction graph as a more valid indicator for mapping social connectivity than Facebook updates. An interesting observation they make is that such an interaction graph does not exhibit small-world characteristics. Therefore, the authors believe more in the scale-free network pattern when it comes to the interactions that happen among users.
In the paper by Magnani et al. [13], the authors identify some important dimensions for discussion. The average lifetime of a post or message is the time for which it is available on the news feeds of users. It varies inversely with the number of friends the user has and their frequency of activity on Facebook. Overall, the authors found that the lifetime of a post also follows a power law. Their empirical analysis found that 50% of entries survive for around one hour, 85% for up to a day, and so on. The authors also identified a specific time trend in content generation. Since users in a given cluster have some parameters in common, any temporal factors affecting those parameters will also affect the activity of all those users simultaneously.
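The inverse relationship described above, between the lifetime of a post in a news feed and the number and activity of a user's friends, can be sketched as a simple arithmetic model. This is not the model of Magnani et al. [13]. It is a toy calculation assuming a purely chronological feed with a fixed number of visible entries and friends who all post at the same steady rate. The function name and the parameters `feed_slots` and `posts_per_friend_per_hour` are hypothetical.

```python
def expected_feed_lifetime_hours(num_friends, posts_per_friend_per_hour, feed_slots=50):
    """Rough time a post stays among the first `feed_slots` entries of a feed.

    Assumes a chronological feed: a post drops out once `feed_slots` newer
    entries have arrived from the user's friends.
    """
    arrival_rate = num_friends * posts_per_friend_per_hour  # new entries per hour
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return feed_slots / arrival_rate


# Illustrative values only: each friend posts once every ten hours on average.
for friends in (100, 200, 400):
    hours = expected_feed_lifetime_hours(friends, posts_per_friend_per_hour=0.1)
    print(f"{friends} friends -> about {hours:.2f} hours")
```

In this sketch, doubling either the number of friends or their posting frequency halves the expected lifetime, which is the qualitative behaviour the survey relies on.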
One important issue that needs attention is the increasing quantity of spam. The paper by Gao et al. [14] looks at quantifying and characterizing online spam campaigns launched by online social network accounts. An important observation from this empirical study of 3.5 million Facebook users is that over 97% of the accounts involved are compromised accounts and only the rest are fake accounts. Another observation is that spamming activity is generally higher in the early morning hours of users' local time.

VI. ANALYSIS AND INTERPRETATION
A. Network Structure
Based on the review of articles in the section on network structure above, we find that the article by Mislove et al. [2] develops many of the concepts required for understanding how this structure develops. With the help of the community example from Yixiao Li et al. [3], we can get an idea of how social networks evolve. This helps in understanding why social networks display the characteristics of both small-world networks and scale-free networks.
Initially, a group of individuals with something in common, such as belonging to the same school, come together on a network like Facebook. They add each other as links, thereby establishing a community structure. This is also a cluster of users tightly coupled with each other. It behaves like a small-world network due to its shorter diameter. As time progresses, an individual from one of these clusters may get exposed to a different group or set of users. This particular user now becomes the connection between the two clusters. That way, this individual will have a much higher degree of links than his earlier cluster peers. This develops into a hub-and-spoke model and thereby into a scale-free network. Such networks follow a power law, as there are fewer users connected across clusters, who hence have a higher degree, than users connected only within a cluster, who therefore have a lower degree.
Another parameter that affects the expansion of social networks is how users can search for other users in order to connect to them. Networks like LinkedIn allow users to search only within certain levels of neighbourhood. This limits the capability of less connected users to connect to a large number of users. It further provides an incentive for a user to connect to another user who is highly connected. This simple behaviour contradicts the concept given in the paper of Mislove et al. [2] that users of similar degree are more likely to connect to each other.
The scenario of linking unintentionally is not applicable to an online social network like Facebook, as there is no reason to believe that two users are connected to each other unless they have some intention to be so. At least one user will have some reason to connect to the other, although the second user may approve the request unknowingly. Additionally, it should be noted that the intentions of different users connecting to each other may differ. This means that user A may intend to connect to user B for reason X, while user B wants to connect to user A for reason Y, and they can still establish a connection as long as both users agree. But if there is no reason Y for B to connect to A, then the link will not be established. However, we could not locate any literature modelling the network taking heterogeneous intentions into account.

B. Information Propagation
As the literature explains, several factors define the pattern of propagation of information. However, we need to alter some conditions when we apply these to our research, whose purpose is to understand how a message spreads over a social network like Facebook. This is fundamentally due to several differences between the characteristics of Facebook and those of the social networks considered for empirical research in the literature surveyed.
In contrast to the preferential forwarding discussed in the paper by Harvey et al. [9], on Facebook the user would forward, i.e. share, a message that he likes with all of his friends and those who are subscribed to his updates. Only rarely would he share such a message with one particular Facebook user. However, we need to note that he can preferentially single out some users, based on the relevance he sees, while sharing the message with a larger audience. The ways to do this are tagging a person or posting the link or image on the wall of the intended user.
We also agree with Harvey's finding that the user's knowledge has little to do with forwarding likelihood. Looking at this observation from Facebook's point of view, we cannot logically think of any reason to believe that a Facebook user would not know how to share the message that he or she is reading, should he or she want to do so.
As we have seen in the structure of social networks, users of a similar nature come together and form clusters. This creates strong bonds between similar people and weaker bonds between dissimilar people. Moreover, we saw that while friendship networks are formed based on consent, the user gives such consent based on criteria other than spreading a particular message. This effectively reduces the velocity of message spread, as the message does not reach dissimilar users with equal intensity.
Wilson et al. [11] found that small-world clustering does not exist in their interaction graph, owing to its low degree of connection, which differs from that of the friendship link graph. This is because users regularly interact with only a small portion of their friends. As the degree of links per user from the interaction point of view decreases, the clustering index reduces, and the network thereby becomes more scale-free and less small-world.
As described by Katona et al. [1], the dilution of influence occurs as the number of contacts increases. This is very logical. As the number of friends on Facebook increases, the frequency of updates in the news feed also increases proportionately. As pointed out by Wilson, every user has limited time on Facebook. Hence, the likelihood that a particular update will be visible in the portion of the news feed the user scrolls through at a given time reduces with an increasing number of contacts. This weakens the level of influence and hence the interaction that we are looking for.
The paper by Magnani et al. [13] discusses the lifetime of a post, during which it is active and accessible to friends. Overall it indicates a short lifespan for the message. We also need to note that as clustering increases on Facebook with more user activity and more friends, the average lifespan of a particular message will fall further. This further underlines the point made in Wilson's paper that constrained time makes interaction networks, which are scale-free in nature, more important for modelling than connection networks.
Regarding the spread of spam content, the important factor from the point of view of our study is that compromised accounts contribute 97% of spam and fake accounts only 3%. This further highlights that users trust their friends. A message
coming from an unknown user is identified as spam more easily than one coming from a friend with whom the user has closer ties. Regarding the timing of spam generation, we do not find any relevance to our study of the spread of information.
But the time of content generation has a critical role to play in determining how long a message remains active in the user's news feed. Given the clustering of users, there is significant evidence that most friends are geographically collocated. Hence, if a message is created or shared at peak time for the local user, there will be higher activity in the entire cluster. This further reduces the lifetime of the message in the news feed, but simultaneously increases the likelihood that a user sees the message, because he or she is actively viewing the news feed.
Another important point is that not all content that is frequently shared is genuine. Unfortunately, we could not find any conclusive literature on user behaviour in which users knowingly forward or share spam or incorrect information simply for amusement. This typically includes some random so-called "confidential" information about a political leader, or forged images. If users share this information unknowingly, this behaviour can be considered under trust in ties, which we have just discussed. But many a time the user is completely aware of the fraudulent nature of the content. Still, either for amusement or out of political or ideological conflict with the person or event in question, users feel encouraged to share such material. We could not, however, find any empirical research on this behaviour. It should also be noted that users who are aware of spam do not indulge in such activity if they think it may be harmful to them. But when it comes to pure static spam content, which they are sure will not compromise their profiles, they have no objection to sharing or commenting on it. If we look at censorship proposals from governments, we may find that they are largely interested in controlling such content.

VII. LIMITATIONS
Facebook is continuously updating its features. The literature suggests that new features have a significant impact on user behaviour. The newly introduced timeline feature, which allows users to view important past interactions with ease, has greater significance for user interactivity. But we could not locate any literature discussing the impact of timeline. We also could not find literature conclusively quantifying Facebook events and their impact on social events, nor any literature that explains user bias in knowingly sharing fake information. We understand that the social networking phenomenon is relatively new and hence not enough research has been done on every aspect of the impact of social networks on our real-time interactions.

VIII. CONCLUSION
In this literature survey, we have identified factors that need to be accounted for while modelling information spread on social networks. For simplicity, we have avoided going into the mathematical details supporting the conclusions derived. We have linked the various papers available on this topic to arrive at the following conclusions.
On the network structure side, we conclude that a social network, from the friendship perspective, demonstrates the characteristics of both scale-free and small-world networks. But since interactions between users, which are time constrained, display only scale-free characteristics, we need to model the social network as a scale-free network for the purposes of our research.
We conclude that the following factors, which affect the likelihood and velocity of message spread, should be taken into account by our model.
(1) The number of friends of a user is inversely proportional to the amount of influence a friend has on the user.
(2) The number of friends of a user is inversely proportional to the lifetime for which a message remains active in the user's news feed.
(3) The amount of time a user spends on average on Facebook is directly proportional to the likelihood of spreading a message.
(4) The strength of the bond with the sender is directly proportional to the likelihood of spreading the message further.
(5) The more clustering there is in the user's network, the lower the velocity of message spread, primarily because, owing to duplication of messages, the message remains confined to the same cluster.
(6) A message shared at peak time will have a shorter lifetime on the news feed but a higher likelihood of being replicated, due to high activity in the entire cluster.
(7) If users perceive a particular message as not harmful to them, there is a higher likelihood that it will be spread or shared, irrespective of the user's analysis of the message's authenticity. This is typical of the sharing of such messages for amusement or in political conflicts.

REFERENCES
[1] Katona Z., Zubcsek P., Sarvary M., Network Effects and Personal Influences: The Diffusion of an Online Social Network, Journal of Marketing Research, Vol. XLVIII (June 2011), 425-443, American Marketing Association.
[2] Mislove A., Marcon M., Gummadi K., Druschel P., Bhattacharjee B., Measurement and Analysis of Online Social Networks, proceedings of IMC'07, ACM.
[3] Yixiao Li, Xiaogang Jin, Fansheng Kong and Jiming Li, Linking via Social Similarity: The Emergence of Community Structure in Scale-free Network, IEEE symposium on digital object identifier, 2009.
[4] Wei Ren, Jianping Li, A fast algorithm for simulating scale-free networks, proceedings of ICCTA2009.
[5] Ted G. Lewis, Network Science: Theory and Practice, John Wiley & Sons, Inc., 2009.
[6] P. Erdos, A. Renyi, On the evolution of random graphs, Publ. Math. Inst. Hung. Acad. Sci., vol. 5, pp. 17-60, 1959.
[7] Goel S., Muhamad R., Watts D., Social Search in "Small-World" Experiments, proc. WWW 2009, ACM.
[8] Jun T., Sethi R., Reciprocity in evolving social networks, Journal of Evolutionary Economics, June 2009.
[9] Harvey C., Stewart D., Ewing M., Forward or delete: What drives peer-to-peer message propagation across social networks?, Journal of Consumer Behavior, Vol. 10, 2011, Published by Wiley.
[10] Norman AT, Russell CA. 2006. The Pass-Along Effect: Investigating Word-of-Mouth Effects on Online Survey Procedures. Journal of Computer-Mediated Communication 11(4): 1085–1103.
[11] Wilson C., Boe B., Sala A., Puttaswamy P., Zhao B., User Interactions in Social Networks and their Implications, Proceedings of EuroSys 2009, ACM.
[12] Skoric M., Poor N., Liao Y., Wei S., Online Organization of an Offline Protest: From Social to
Traditional Media and Back, proceedings of HICSS 2011, retrieved from IEEE.
[13] Magnani M., Montesi D., Rossi L., Information propagation analysis in a social network site, proceedings of International Conference on Advances in Social Networks Analysis and Mining, 2010, IEEE.
[14] Gao H., Hu J., Wilson C., Li Z., Chen Y., Zhao B., Detecting and Characterizing Social Spam Campaigns, proceedings of IMC'10, ACM.