Prepared and presented by
Dr. Nisha Soms
Department of CSE
KPR Institute of Engineering and Technology
Coimbatore
Data Management in Social Network Analysis
03-10-2022
U19CSP38 SOCIAL NETWORK ANALYSIS
Outline
1. Data Management
2. Data Transformation Techniques
Data Management
➢ How to format network data for import into a network analysis software package,
➢ How to transform network data to make it suitable for different analyses, and
➢ How to export network data and results for use in other programs, such as statistical packages.
Data Import
➢ Importing the data is one of the most important steps in any network analysis.
➢ For large datasets, a proper database such as Microsoft Access or MySQL is useful.
➢ For most users, Microsoft Excel is recommended as a sort of universal translator.
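As a rough illustration of formatting data for import, the sketch below reads a small edge list (one "source,target" pair per line, the kind of sheet that could be saved from Excel as CSV) and builds a directed adjacency matrix. The column layout, the sample names, and the helper name `edgelist_to_matrix` are assumptions made for illustration; they do not come from the slides or from any particular network analysis package.

```python
import csv
import io

def edgelist_to_matrix(csv_text):
    """Build (sorted node list, directed 0/1 adjacency matrix) from 'source,target' rows."""
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if len(r) >= 2]
    edges = [(r[0].strip(), r[1].strip()) for r in rows]
    nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return nodes, matrix

# Hypothetical sample data: who likes whom.
sample = "Ann,Bob\nBob,Cat\nCat,Ann\nAnn,Cat\n"
nodes, matrix = edgelist_to_matrix(sample)
```

Here `matrix[i][j]` is 1 when the node in row i sends a tie to the node in column j, which is the convention assumed in the later sketches.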
Cleaning network data
➢ Once the data is imported, it is advisable to examine it in some detail:
  ✓ Look for repeated nodes.
  ✓ Look for differences in how the node's name was typed.
  ✓ Look for missing actors.
  ✓ Look for isolates.
  ✓ Run a quick centrality analysis early!
  ✓ Etc.
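Some of the checks above can be sketched in a few lines. The example below flags names that differ only in case or spacing and lists isolates; the normalization rule (lower-casing and collapsing whitespace) and the helper names are assumptions for illustration, and real data may need fuzzier matching.

```python
from collections import defaultdict

def normalize(name):
    """Collapse case and extra whitespace so typing variants compare equal."""
    return " ".join(name.split()).lower()

def find_name_variants(names):
    """Group raw names by normalized key; return only keys with more than one spelling."""
    groups = defaultdict(set)
    for raw in names:
        groups[normalize(raw)].add(raw)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}

def find_isolates(nodes, edges):
    """Return nodes that appear in no edge, either as sender or receiver."""
    connected = {n for edge in edges for n in edge}
    return [n for n in nodes if n not in connected]

# Hypothetical raw data with one name typed two ways.
raw_names = ["Ann", "ann ", "Bob", "Cat", "Dee"]
variants = find_name_variants(raw_names)
clean_nodes = sorted({normalize(n) for n in raw_names})
clean_edges = [("ann", "bob"), ("bob", "cat")]
isolates = find_isolates(clean_nodes, clean_edges)
```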
Methodology
➢ Step 1: Preparation. Identify the problem and what questions should be answered; is data available to answer this question?
➢ Step 2: Data retrieval. Retrieve data (from sources).
➢ Step 3: Data cleaning. Clean data by unifying the format and handling missing data/duplication, and fix errors if possible.
➢ Step 4: Data selection. Use statistical tools to select the significant data, create fields (attributes), keep the important ones, and drop the others.
➢ Step 5: Network representation. Build graph(s) from the preprocessed data.
➢ Step 6: Graph analysis. Process the graph(s). Compute the (strong) components, clusters, and communities. Create new attributes based on these, and add to the ones gained in Step 4.
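Step 6 can be illustrated with a simplified sketch that labels every node with its component and keeps the label as a new node attribute. Note the assumption: the code treats ties as undirected, so it finds weakly connected components rather than the strong components mentioned in Step 6. The function name and sample data are hypothetical.

```python
from collections import deque

def component_labels(nodes, edges):
    """Label each node with the id of its component, ignoring tie direction."""
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    labels = {}
    next_id = 0
    for start in nodes:
        if start in labels:
            continue
        # Breadth-first search from an unlabelled node marks its whole component.
        labels[start] = next_id
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for other in neighbours[current]:
                if other not in labels:
                    labels[other] = next_id
                    queue.append(other)
        next_id += 1
    return labels

nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]
labels = component_labels(nodes, edges)
```

The resulting `labels` dictionary can be stored alongside the attributes created in Step 4.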
Data Transformation
➢ Common data transformations include transposing matrices, symmetrizing, dichotomizing, imputing missing values, combining relations, combining nodes, extracting subgraphs, and many more.
1. Transposing
➢ To transpose a matrix is to interchange its rows with its columns.
➢ This can be helpful in maintaining a consistent interpretation of the ties in a network.
➢ Example: A matrix and its transpose: (a) who likes whom; (b) who is liked by whom.
➢ Stacked datasets can be seen as three-dimensional matrices consisting of rows, columns and layers or slices.
➢ In these matrices, three different transpositions can be done: interchanging rows with columns, rows with layers, and columns with layers.
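The "who likes whom" example above can be written out as a minimal sketch with plain Python lists. The sample matrix, the function names, and the `[layer][row][column]` indexing of the stacked dataset are assumptions for illustration.

```python
def transpose(matrix):
    """Swap rows and columns: result[j][i] == matrix[i][j]."""
    return [list(column) for column in zip(*matrix)]

def transpose_rows_layers(stack):
    """For a stack indexed [layer][row][col], swap the layer and row axes."""
    layers, rows = len(stack), len(stack[0])
    return [[list(stack[k][i]) for k in range(layers)] for i in range(rows)]

# likes[i][j] == 1 means person i likes person j (hypothetical data).
likes = [[0, 1, 0],
         [0, 0, 1],
         [1, 1, 0]]
# liked_by[i][j] == 1 means person i is liked by person j.
liked_by = transpose(likes)
```

Transposing twice returns the original matrix, which is a quick sanity check after any transposition.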
2. Imputing missing data
➢ Missing data can be a problem in full network research designs.
➢ The most common kind of missing data is where a respondent has chosen not to fill out the survey. This creates a row of missing values in the network adjacency matrix.
➢ Solution?
2. Imputing missing data (contd)
➢ When confronted with missing data, researchers often want to handle the missing observations by substituting plausible values for the missing scores. This practice of filling in missing items is called imputation.
➢ It gives the opportunity to use information contained in the observed data in predicting the missing scores, and allows analysis using standard techniques and software on a complete(d) dataset that is the same for all following analyses.
2. Imputing missing data (contd)
➢ The shortcomings of imputation are related to bias and uncertainty. Ad hoc imputations can seriously distort data distributions and relationships, and produce biased estimates.
➢ Solution: Multiple imputation.
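The idea of multiple imputation can be sketched as follows: fill in the missing ties several times, compute the statistic of interest on each completed matrix, and pool the results. The imputation model here (each missing tie drawn as 0 or 1 at the observed density) is a deliberately naive assumption, and the sketch reports only the between-imputation variance rather than a full pooled variance. All names are hypothetical.

```python
import random
from statistics import mean, variance

def impute_once(adj, rng):
    """Fill each missing tie (None) with a 0/1 draw at the observed density."""
    observed = [v for i, row in enumerate(adj) for j, v in enumerate(row)
                if i != j and v is not None]
    density = sum(observed) / len(observed)
    return [[(1 if rng.random() < density else 0) if v is None else v
             for v in row] for row in adj]

def multiple_imputation(adj, statistic, m=20, seed=0):
    """Apply `statistic` to m imputed copies; return (pooled mean, between-imputation variance)."""
    rng = random.Random(seed)
    estimates = [statistic(impute_once(adj, rng)) for _ in range(m)]
    return mean(estimates), variance(estimates)

def tie_count(adj):
    """Total number of ties in a completed 0/1 matrix."""
    return sum(sum(row) for row in adj)

# Node 2 did not respond, so its outgoing ties are unknown (None).
adj = [[0, 1, 1],
       [1, 0, 0],
       [None, None, 0]]
pooled, between = multiple_imputation(adj, tie_count)
```

The spread across imputations gives a sense of how much the conclusion depends on the values that were filled in, which a single ad hoc imputation hides.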
3. Symmetrizing
➢ Symmetrizing refers to creating a new dataset in which all ties are reciprocated.
➢ The reason is that some analytical techniques, such as multidimensional scaling, assume symmetric data.
➢ OR (union) rule: a tie between two nodes is kept if it is present in either direction.
➢ AND (intersection) rule: a tie between two nodes is kept only if it is present in both directions.
➢ The union rule creates networks denser than the original, while the intersection rule makes them sparser.
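The two rules can be compared on a small binary matrix. This is a minimal sketch; the function name, the `rule` parameter, and the sample data are assumptions for illustration.

```python
def symmetrize(adj, rule="union"):
    """Return a symmetric 0/1 matrix using the OR ('union') or AND ('intersection') rule."""
    if rule not in ("union", "intersection"):
        raise ValueError("rule must be 'union' or 'intersection'")
    n = len(adj)
    # For 0/1 data, max acts as logical OR and min as logical AND.
    combine = max if rule == "union" else min
    return [[combine(adj[i][j], adj[j][i]) for j in range(n)] for i in range(n)]

adj = [[0, 1, 0],
       [0, 0, 1],
       [1, 1, 0]]
union = symmetrize(adj, "union")
intersection = symmetrize(adj, "intersection")
```

In this example the union version contains every possible tie, while the intersection version keeps only the one pair whose tie was reciprocated in the original.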
4. Dichotomizing
➢ Dichotomizing refers to converting valued data to binary data.
➢ The reason is that some methods, especially graph-theoretic methods, are only applicable to binary data.
➢ It helps to reduce the density of the network, which is useful in handling large networks.
➢ Dichotomizing at several different cut-off values retains the richness of the data and can reveal insights into the network structure that would not be easy to deduce from techniques designed to deal with valued data directly.
➢ It also gives you an idea of the extent to which your findings are robust across different definitions of ties.
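A minimal sketch of dichotomizing at several cut-offs and watching how density changes is shown below. The "greater than or equal to" rule, the cut-off values, and the sample matrix are assumptions chosen for illustration.

```python
def dichotomize(valued, cutoff):
    """Return a 0/1 matrix with a tie wherever the valued tie is >= cutoff."""
    return [[1 if v >= cutoff else 0 for v in row] for row in valued]

def density(binary):
    """Share of possible directed ties (diagonal excluded) that are present."""
    n = len(binary)
    ties = sum(binary[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))

# Hypothetical valued ties, e.g. frequency of contact.
valued = [[0, 3, 1],
          [2, 0, 5],
          [4, 1, 0]]
densities = {c: density(dichotomize(valued, c)) for c in (1, 3, 5)}
```

Raising the cut-off thins the network out, so repeating an analysis at each level shows whether a finding survives stricter definitions of a tie.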
5. Combining relations
➢ Most network studies collect multiple relations on the same set of nodes.
➢ For some analyses, they are combined into one.
➢ For example, we might take several relations involving friendship, support, liking and so on and combine them to create a category of relations that we might call 'expressive ties'.
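The 'expressive ties' example might be sketched like this, assuming each relation is stored as a 0/1 matrix over the same ordered set of nodes. The two combination modes ("any" for presence in at least one relation, "sum" for the number of relations) and all sample data are assumptions for illustration.

```python
def combine_relations(relations, how="any"):
    """Combine same-sized 0/1 matrices into one relation.

    'any' gives a tie if it is present in at least one relation;
    'sum' counts in how many relations the tie is present.
    """
    n = len(relations[0])
    totals = [[sum(rel[i][j] for rel in relations) for j in range(n)] for i in range(n)]
    if how == "sum":
        return totals
    if how == "any":
        return [[1 if v > 0 else 0 for v in row] for row in totals]
    raise ValueError("how must be 'any' or 'sum'")

friendship = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
support = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
liking = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
expressive = combine_relations([friendship, support, liking])
strength = combine_relations([friendship, support, liking], how="sum")
```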
6. Combining nodes
➢ We might want to aggregate the nodes into departments such that a tie between any two nodes becomes a tie between their departments.
➢ The inter-departmental ties could be defined as a simple count of the individual-level ties, or we could normalize the count to account for the number of people in each department.
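Both definitions of inter-departmental ties can be computed together. In this sketch the normalization divides each count by the number of possible ties between (or within) the departments, which is one reasonable choice rather than the only one; the function name and the sample data are hypothetical.

```python
def aggregate_by_group(adj, groups):
    """Return (group names, tie counts between groups, counts divided by possible ties)."""
    names = sorted(set(groups))
    index = {g: k for k, g in enumerate(names)}
    size = [groups.count(g) for g in names]
    counts = [[0] * len(names) for _ in names]
    for i, row in enumerate(adj):
        for j, tie in enumerate(row):
            if i != j:
                counts[index[groups[i]]][index[groups[j]]] += tie
    normalized = []
    for a in range(len(names)):
        row = []
        for b in range(len(names)):
            # Within a group, a member cannot send a tie to itself.
            possible = size[a] * (size[a] - 1) if a == b else size[a] * size[b]
            row.append(counts[a][b] / possible if possible else 0.0)
        normalized.append(row)
    return names, counts, normalized

adj = [[0, 1, 1, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 1, 1, 0]]
departments = ["Sales", "Sales", "R&D", "R&D"]
names, counts, normalized = aggregate_by_group(adj, departments)
```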
7. Subgraphs
➢ Finally, it may happen that we do not want to analyze the whole network.
➢ We may wish to delete a node or nodes from the network. This may be because they are outliers in some respect, or because we need to match the data to another dataset where some but not all of the same nodes are present.
➢ Or we may wish to combine nodes to form one node that is connected to the same nodes as the individuals were. One reason for combining nodes may be that the data was collected at too fine a level and we need to take a coarser-grained analysis.
➢ Combining nodes in the same departments would be an example of moving up from the individual level to the department level.
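Deleting nodes amounts to keeping the matching rows and columns of the adjacency matrix. The sketch below assumes the network is stored as a node list plus a matrix in the same order; the function name and sample data are hypothetical.

```python
def extract_subgraph(nodes, adj, keep):
    """Return (kept nodes, adjacency matrix restricted to them), preserving original order."""
    keep = set(keep)
    positions = [i for i, name in enumerate(nodes) if name in keep]
    sub_nodes = [nodes[i] for i in positions]
    sub_adj = [[adj[i][j] for j in positions] for i in positions]
    return sub_nodes, sub_adj

nodes = ["Ann", "Bob", "Cat", "Dee"]
adj = [[0, 1, 1, 0],
       [1, 0, 0, 1],
       [0, 1, 0, 1],
       [1, 0, 0, 0]]
# Drop "Cat", e.g. because she is absent from a second dataset we need to match.
sub_nodes, sub_adj = extract_subgraph(nodes, adj, keep=["Ann", "Bob", "Dee"])
```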
Thank you