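Security Challenges of Social Media:
Cognition, Cross-Platform Activity, and Push (Recommendation) Algorithms
Tunghai University, Department of Computer Science & Cloud Innovation School
Chun-Ming Lai, Assistant Professor
<cmlai@thu.edu.tw>
➢ AWS Certified Instructor
➢ AWS SAA & CLF Certificate
➢ Research interests: social network data analytics, applied big data, human-computer interaction
➢ WSET Level I & II
➢ Travel points enthusiast
2021/10/18 2
About Me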
Chun-Ming Lai
AWS Certified Solutions Architect - Associate
Nov 03, 2019
Nov 03, 2022
Validation Number WM1ENMBCGEE1QY51
Validate at: http://aws.amazon.com/verification
Chun-Ming Lai
Sep 21, 2019
Sep 21, 2022
Validation Number MGFL2VQCGE1E1PC5
Validate at: http://aws.amazon.com/verification
Advertising
Recommendation Engines
Public Impersonal Pages
Public Personal Profiles
Private Groups
Private Personal Profiles
Group Messages
1:1
10/18/2021 3
Embrace transparency and restraint in communication behavior
Amplification vs. Privacy Concern
• Confidentiality
• Policy
• Law
➢ Abuse with an Internal Victim
• Cyberbullying
• Doxing (exposing private information)
• Child grooming
• Sextortion (sexual extortion)
• Terrorist recruiting
2021/10/18 4
Security Issues With Targets (1/2)
Abuse with an External Victim
• CSAM (Child Sexual Abuse Material) trading
• Conspiracy theories
• Hate speech
• Anti-vaccine (anti-vax) content
• Disinformation
2021/10/18 5
Security Issues With Targets (2/2)
 Do you try hard to find the News that you
like to receive?
 Or, is there a special “force” to push the
News in front of you?
2021/10/18 6
Ask??
12/06/2019
Media Sources
Social Algorithms
Online Participants
Content
Comments
Reactions
ML is learning how to select the
information you like to read
Addictive Design
A major design change around 2012~2013
12/06/2019 8
$$\sum_{\text{edges } e} u_e \, w_e \, d_e$$
• $u_e$ is the user affinity
• $w_e$ is how the content is weighted
• $d_e$ is a time-based decay parameter
Social Algorithms
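The weighted sum over edges (commonly known as EdgeRank) can be sketched in a few lines. This is only an illustration: the `Edge` fields, the exponential form of the decay, and the 24-hour half-life are assumptions made for the example, not details given on the slide.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    """One interaction (edge) between a viewer and a piece of content."""
    affinity: float   # u_e: how close the viewer is to the edge's creator
    weight: float     # w_e: weight of the interaction type (comment, like, ...)
    age_hours: float  # age of the edge, used to derive d_e


def time_decay(age_hours: float, half_life_hours: float = 24.0) -> float:
    """d_e: assumed exponential decay; the slide only says 'time-based'."""
    return 0.5 ** (age_hours / half_life_hours)


def feed_score(edges, half_life_hours: float = 24.0) -> float:
    """Sum of u_e * w_e * d_e over all edges attached to a story."""
    return sum(
        e.affinity * e.weight * time_decay(e.age_hours, half_life_hours)
        for e in edges
    )
```

Under these assumptions a fresh edge counts at full weight and a one-day-old edge counts half, which is why older stories sink in the feed unless they keep receiving new interactions.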
11/28/2019
Occupy Wall St.
OUR ALGORITHM
Attack
Disinformation cannot spread widely without a push from social algorithms; it is ineffective unless the social algorithm is successfully manipulated
Detect Attack behavior-OUR ALGORITHM
11/28/2019
Media Sources
Manipulated Social Algorithms
Online Participants
Content
Comments
Reactions
Online Participants
12/06/2019
Systematic AI and Security
Media Sources
Social Algorithms
Online Participants
Content
Comments
Reactions
➢ Capitalism is a spy for the Soviet Union and China
➢ Communism is a spy for Europe and the United States
12/06/2019 13
Joke
12/06/2019
Systematic AI and Security
Media Sources
Restricted
Social Algorithms
Online Participants
Cleaned
Content
Comments
Reactions
Suspicious Accounts Detection
Fact Check sites What’s the problem?
12/06/2019
Media Sources
Personalized
Social Algorithms
Participants
Restricted
Social
Informatics
Comments
Reactions
Individual
Participant
Instead of filtering content universally, remove the influence of
suspicious accounts on a personalized, per-participant basis
12/06/2019 16
Temporal/
Spatial
Interaction
Graph
Accounts Credibility
Fact Checker
Research Directions
12/06/2019 17
Our Work in National Defense
Conference
 Giants / Big Brother
 Information Gathering
 Internet Archive / Wayback Machine
 Political Correctness
12/06/2019 18
Challenges
12/06/2019 19
CrowdTangle
12/06/2019 20
Search
2021/10/18 21
Crisis Informatics
➢ Systematic AI covers not just the data, but also the adaptive process and the ecological system around the data
• "Systematic" refers to depth in the domain
➢ Move from collecting all the data universally to selecting the data systematically (or at least knowing what we do not have)
• We need systematic AI to know what to do
• We cannot easily learn the ecological system in the presence of adversaries (so we need to filter them out)
➢ Decentralized (at least virtually) information ownership → better resistance and robustness
12/06/2019 22
Data-Centric Computing
Takeaways
10/18/2021 23
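Reaction
➢ Timeliness of replies
➢ Whether a reply addresses the point; open a case and track it
➢ Life cycle of a post
➢ 1.5 hours on average, affecting people's lives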
10/18/2021 24
Security Threat
Severe Threat
• Phishing
• Malware, drive-by-download
Medium to light Threat
• Advertisement
• Spamming (Fund-raising, porn, canned messages, etc.)
New type Threat
• Rumors, Media manipulation, sign up, vote stuffing, etc.
• Fake News
• Crowdturfing = CrowdSourcing + Astroturfing
10/18/2021 25
Outline
 Suitable Target, Lifecycle Analysis
 Multiple Accounts Detection
 Geolocation Identification
 Personal words
10/18/2021 26
10/18/2021 27
Facebook.com/63811549237/posts/10153038271604238
2014-12-19, 03:06 am GMT
Social Media: Climate Change
10/18/2021 28
GMT+0
10/18/2021 29
Total: 609 comments
Suitable Targets Problem
Given any post thread p on a social media platform, predict whether p
contains at least one malicious comment, using a classifier
c: p → {target, nontarget}
10/18/2021 30
Key idea: Life Cycle of Posts
10/18/2021 31
10 hrs
Definition
➢ Time Series (TS)
• TS_created(post): the time at which the original article is posted
• TS_j: a time period j following the time of the original post
• TS_final: the end of our observation
➢ Accumulated Number of Comments (AccN_comment)
• The number of comments on the post between TS_(i-1) and TS_i
➢ Discussion Atmosphere Vector (DAV)
• The sequence of per-window comment counts from TS_created to TS_final
10/18/2021 32
Example
TS_created(Climate) = 2014-12-19 03:06:42
Suppose j = 5 minutes and final = 120 minutes, giving 120 / 5 = 24 windows
DAV(Climate) = [
  # of comments in 03:06:42 ~ 03:11:42 (1st window),
  # of comments in 03:11:42 ~ 03:16:42 (2nd window),
  …,
  # of comments in 05:01:42 ~ 05:06:42 (24th window)
]
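The windowing in this example can be sketched as follows. The function name and the half-open window convention (a comment exactly on a boundary falls into the later window) are assumptions made for illustration; the slide does not specify how boundaries are handled.

```python
from datetime import datetime, timedelta


def discussion_atmosphere_vector(created, comment_times,
                                 window_minutes=5, final_minutes=120):
    """Count comments per fixed-size window after the post was created.

    Returns a list with final_minutes // window_minutes entries; comments
    before `created` or after the observation period are ignored.
    """
    n_windows = final_minutes // window_minutes
    window = timedelta(minutes=window_minutes)
    dav = [0] * n_windows
    for t in comment_times:
        offset = t - created
        if timedelta(0) <= offset < window * n_windows:
            dav[offset // window] += 1  # timedelta // timedelta gives an int
    return dav
```

With `created = datetime(2014, 12, 19, 3, 6, 42)` and the default 5-minute windows over 120 minutes, the result has 24 entries, matching the example above. Because only timestamps are used, the feature stays context-free and needs no language processing.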
10/18/2021 33
Dataset
Ten major media pages on Facebook, 2011-2014
Total: 42,703,463
10/18/2021 34
Feature Engineering
 # of comments, # of likes, # of shares
 Spanning time (Last comment time – first comment time)
 Temporal Feature with Delta Time window, with a final
observation time
 Context-free, don’t need to address Natural Language
Processing
10/18/2021 35
[Figure: time elapsed until the 1st comment, 1st like, and 1st share]
Results
10/18/2021 36
Near Real Time
Discussion: Do you understand Facebook well enough?
10/18/2021 37
• Attackers' preference
• Selected by Facebook
• Audience reaction
• Bandwagon effect
• The rich get richer
• People are drawn to biased and controversial posts
Life Cycle and Influence Ratio
10/18/2021 38
CNN 2012 all post threads
>70%
mURL
DAV Predict IR (1/2)
10/18/2021 39
DAV Predict IR (2/2)
10/18/2021 40
Accounts Activity within a week around election date
10/18/2021 41
Active = Count(Activities) within 1 week >= threshold
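The activity rule can be written down directly. In this sketch the function name, the half-open one-week interval, and the idea of passing the threshold as a parameter are assumptions; the slide does not state the threshold value.

```python
from datetime import datetime, timedelta


def is_active(activity_times, week_start, threshold):
    """An account is active if it has at least `threshold` activities
    in the 7 days starting at `week_start`."""
    week_end = week_start + timedelta(days=7)
    count = sum(1 for t in activity_times if week_start <= t < week_end)
    return count >= threshold
```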
10/18/2021 42
[Figures: account activity for Clinton (1st week, 2nd week) and Trump (1st week, 2nd week)]
10/18/2021 43
All accounts: periodic
Attacker accounts: random
Conclusion
Suitable targets can be predicted successfully with temporal features
• Attackers: follow or not?
• Defenders: where to deploy resources
Temporal analysis with different variables
• Influence Ratio: will it increase or decrease in the next time window?
• 24-hour pattern: links online and offline behavior
10/18/2021 44
Outline
 Suitable Target, Lifecycle Analysis
 Multiple Accounts Detection
 Geolocation Identification
 Personal words
10/18/2021 45
Semi-Supervised Learning on Graphs
Motivation of detecting multiple accounts on FB
[Diagram: Crawler 1, Crawler 2, and Crawler 3 each call the Facebook API]
When the Facebook API is called, it gives each crawler a different scope ID
for the same user. As a result, the same user appears in the dataset under
several different scope IDs.
100003468896671 高婷婷 https://www.facebook.com/mayuko.sakamoto.503
100004123536871 賴婷婷 https://www.facebook.com/profile.php?id=100004123536871
100003251795795 陳婷婷 https://www.facebook.com/rika.etoh
100000681128139 高婷婷 https://www.facebook.com/vincenzo.muscari.5
100002630019886 陳婷婷 https://www.facebook.com/sven.erkens.98
813243492 高婷婷 https://www.facebook.com/profile.php?id=813243492
Ting-Ting's Family
Number of friends (Facebook allows at most 5000):
100003468896671 高婷婷 45xx
100004123536871 賴婷婷 45xx
100003251795795 陳婷婷 4xxx
Multiple Accounts Detection using
Semi-Supervised Learning on Graphs
When data is crawled from Facebook with multiple crawlers, each crawler receives a scope ID for a user
rather than the user's primary ID.
For example, a user whose primary ID is mohamed.aimane.98 has multiple scope IDs:
1815396745342476, 1815402648675219, 1815411572007660, 1815468805335270,
1815515615330589, 1815482155333935, 1815488781999939, and 1816157185266432.
This implies that his data was crawled by 8 different crawlers.
As a result, our dataset contains many IDs that share the user name mohamed aimane.
Problem: given 2 scope IDs with the same user name, do they belong to the same user
(the same primary ID) or not?
Motivation of detecting multiple accounts on FB
Graph Construction
U = {users}, V = {pages}; an edge (u, v) exists if user u had an activity on page v
Activities
Main Algorithms
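Unsupervised learning using Katz Similarity
$P_{xy}(i) = (x, x_1, x_2, \ldots, y)$ is a path of length $i$ from $x$ to $y$
$u_1$ and $u_2$ are similar if their activity paths are similar
Katz similarity can be computed by:
$$K = \sum_{i=1}^{\infty} \beta^{i} M^{i} = (I - \beta M)^{-1} - I$$
where $M$ is the adjacency matrix of graph $G$, $\beta$ is a scalar smaller than $1/\lVert M \rVert_2$ to
ensure convergence, and $I$ is the identity matrix.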
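A minimal sketch of the computation, assuming the standard Katz definition as a sum over paths of every length, each path of length i weighted by β^i. It uses plain Python lists and truncates the series at `max_len` terms instead of inverting a matrix; the function names and the truncation length are choices made for this example only.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def katz_similarity(adj, beta, max_len=50):
    """Approximate K = sum_{i>=1} beta^i * M^i by truncating at max_len.

    adj  -- adjacency matrix M of the graph (list of rows)
    beta -- attenuation factor; must be below 1 / ||M||_2 for convergence
    """
    n = len(adj)
    katz = [[0.0] * n for _ in range(n)]
    # term holds (beta * M)^i; it starts as the identity matrix (i = 0).
    term = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_len):
        term = [[beta * x for x in row] for row in mat_mul(term, adj)]
        katz = [[katz[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return katz
```

For two nodes joined by a single edge and β = 0.5, the truncated sum agrees with the closed form: 2/3 between the two nodes and 1/3 on the diagonal. For large graphs a sparse linear-algebra library would be the practical choice; the list-based version is only meant to show the definition.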
Main Algorithms
Unsupervised learning using Katz Similarity
Katz matrix is
$$\begin{pmatrix}
1 & 0.9 & 0.2 & 0.3 \\
0.9 & 1 & 0.5 & 0.5 \\
0.2 & 0.6 & 1 & 0.8 \\
0.3 & 0.5 & 0.8 & 1
\end{pmatrix}$$
The threshold we use is 0.8.
The 1st and 2nd nodes (similarity 0.9) therefore belong to the same user, as do the 3rd and 4th
nodes (similarity 0.8); all other pairs do not.
Example of Algorithm 1
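The thresholding step can be sketched as below. Merging pairs transitively with a union-find structure is an assumption made for this illustration (the slide only states the pairwise threshold), and the 4×4 matrix is one possible layout of the example values, with the two high-similarity pairs on the diagonal blocks.

```python
def group_by_threshold(sim, threshold=0.8):
    """Group nodes whose pairwise similarity is at least `threshold`.

    Returns groups of 1-based node numbers; pairs are merged transitively.
    Only the upper triangle of `sim` is read.
    """
    n = len(sim)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i + 1)
    return sorted(groups.values())


# Illustrative layout of the example values (assumed arrangement).
example = [
    [1.0, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.5, 0.5],
    [0.2, 0.6, 1.0, 0.8],
    [0.3, 0.5, 0.8, 1.0],
]
groups = group_by_threshold(example, threshold=0.8)
```

Here `groups` pairs node 1 with node 2 and node 3 with node 4, in line with the conclusion of the example.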
Main Algorithms
Semi-Supervised Method using Graph Embedding
Classical ML Tasks in Networks
• Node Classification: predict the type of a node
• Link Prediction: predict friends
• Community Detection
• Network Similarity: how similar are two networks?
Node2vec(1/4)
Many Possible ways:
• PageRank score, Degree, centrality, # of edges…etc.
Features
Node2vec(2/4)
Mixture of BFS and DFS
BFS: local view (u and S1)
DFS: global view (u and S6)
Node2vec(3/4)
• Two parameters:
  • Return parameter p: controls how likely the walk is to return to the previous node
  • In-out parameter q: controls moving outwards (DFS) vs. inwards (BFS), i.e. the ratio of BFS to DFS
• Biased 2nd-order random walks explore network neighborhoods.
Parameters
Node2vec(4/4)
• Simulate r random walks of length l starting from each node u
• Optimize the node2vec objective using Stochastic Gradient Descent
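A minimal sketch of the walk-simulation step, assuming an unweighted, undirected graph stored as a dictionary of neighbor sets. The unnormalized transition weights follow the published node2vec scheme (1/p to return, 1 to stay at distance one from the previous node, 1/q to move further away); the function names are chosen for this example, and the SGD optimization of the embedding objective is not shown.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """One biased 2nd-order random walk of at most `length` nodes."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break
        if len(walk) == 1:
            # First step: no previous node yet, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in graph[prev]:
                weights.append(1.0)       # stay close to prev (BFS-like)
            else:
                weights.append(1.0 / q)   # move outwards (DFS-like)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def simulate_walks(graph, r, length, p=1.0, q=1.0, seed=0):
    """Simulate r walks of the given length starting from every node."""
    rng = random.Random(seed)
    return [node2vec_walk(graph, u, length, p, q, rng)
            for _ in range(r) for u in sorted(graph)]
```

A small q makes the walks wander far from the start node, while a small p keeps them near it. The resulting walks are then treated like sentences when the embeddings are trained.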
Embedding for node 1: (0.1, 0.3, 0.2, 0.4); embedding for node 2: (0.2, 0.3, 0.2, 0.4)
We sample some ground truth, e.g. node 1 and node 2 belong to the same user.
The labels L look like: ((1,2), 1), ((1,3), 0), ((2,3), 0), ((2,4), -1), ((3,4), -1), …
where 1 means the same user, 0 means different users, and -1 means unlabeled.
The features X come from the embeddings, for example the element-wise absolute difference: ((1,2), (0.1, 0, 0, 0)), …
We then feed X and L into a label spreading model and obtain that the 1st and 2nd nodes
belong to the same user, the 3rd and 4th nodes belong to the same user, and the
others do not.
Example of Algorithm 3
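The construction of X and L can be sketched as follows. It assumes the pair feature is the element-wise absolute difference of the two embeddings (consistent with the (0.1, 0, 0, 0) example) and that -1 marks an unlabeled pair; the function names and the embedding for node 3 are hypothetical.

```python
def pair_feature(emb_a, emb_b):
    """Element-wise absolute difference of two embedding vectors."""
    return tuple(abs(a - b) for a, b in zip(emb_a, emb_b))


def build_pair_dataset(embeddings, known_labels, pairs):
    """Build features X and labels L for candidate account pairs.

    known_labels maps a pair to 1 (same user) or 0 (different users);
    pairs without ground truth receive -1 (unlabeled).
    """
    X, L = [], []
    for pair in pairs:
        i, j = pair
        X.append(pair_feature(embeddings[i], embeddings[j]))
        L.append(known_labels.get(pair, -1))
    return X, L


embeddings = {
    1: (0.1, 0.3, 0.2, 0.4),
    2: (0.2, 0.3, 0.2, 0.4),
    3: (0.9, 0.1, 0.5, 0.2),  # hypothetical embedding for illustration
}
X, L = build_pair_dataset(embeddings, {(1, 2): 1, (1, 3): 0},
                          [(1, 2), (1, 3), (2, 3)])
```

X and L in this shape can be passed to a semi-supervised classifier. scikit-learn's `LabelSpreading`, for instance, also uses -1 for unlabeled samples, though the slides do not say which implementation was used.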
Main Algorithms
Different measurement of Embedding Vectors
Experiments and Evaluation
Comparison among the Three Methods
Two simple datasets: dataset 1 has 188 nodes and 262 activities (links);
dataset 2 has 4188 accounts and 6715 activities (links).
Outline
 Suitable Target, Lifecycle Analysis
 Multiple Account Detection
 Geolocation Identification
 Personal words
10/18/2021 64
Page Information and Page-like Graph
[Figure: page-like graph in which Sports Illustrated, Golden State Warriors, Oakland Museum, and Giving Tuesday are connected by "like" edges]
Field         Example
Page ID       47657117525
Name          Golden State Warriors
Category      Sports Team
Country       United States
Fan Count     11,019,236
Description   The Official Facebook page of the Golden State Warriors
10/18/2021 65
• Facebook public pages are public profiles used by local businesses, companies, organizations, or public figures
• Likes: promoting other pages to community participants
10/18/2021 66
Data Collection
Facebook Graph API version 2.8 was used to collect our
data [1]
• 38,831,367 pages (for this work)
• 2,430,873 US
• 12,685,090 other countries
• 23,715,404 unknown
 [1] https://developers.facebook.com/docs/graph-api/reference/page
10/18/2021 67
Majority Vote Algorithm
10/18/2021
• Location is designated as state information in this scenario
• The location label is determined by the most votes
• Overall accuracy is only 59.4%
• The algorithm works well on the page nationality prediction task, with 90.25% accuracy
68
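A minimal sketch of majority voting, assuming that the voters are the pages a page likes and that only pages with a known location cast a vote. Both points, and all names below, are assumptions; the slide does not say which neighbors vote or how ties are broken.

```python
from collections import Counter


def majority_vote_location(page, likes, known_location):
    """Label `page` with the most common location among the pages it likes.

    likes          -- dict: page -> iterable of liked pages
    known_location -- dict: page -> location label (state or country)
    Returns None when no liked page has a known location.
    """
    votes = Counter(
        known_location[n] for n in likes.get(page, ()) if n in known_location
    )
    if not votes:
        return None
    # On a tie, Counter.most_common keeps the label that was seen first.
    return votes.most_common(1)[0][0]
```

A single popular out-of-state page counts as much as a local one here. That is one plausible reason state-level accuracy stays low while country-level prediction works well.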
Baseline Algorithm
Utilizes the locality of states to find the pages
that belong to each state
• Pick anchor pages with a local property as
multiple seeds from which to start BFS
Target classifier: 51 classes
• 50 classes for the US states and one class for "others" (OT)
State Distance Vector (SDV)
10/18/2021 69
[Figure: State Distance Vector of page P, with one entry per state: Alabama, Alaska, Arkansas, Arizona, …, Wyoming]
IHOP(P, S_Arizona) == 4
OHOP(P, S_Arizona) == 3
31M+ nodes, 600M+ edges
10/18/2021 70
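The hop counts can be obtained with breadth-first search. This sketch assumes that OHOP is the shortest distance from page P to a state's anchor page along outgoing "like" edges, and IHOP the shortest distance from the anchor back to P. That reading of the two names is an assumption, as are the function names and the use of -1 for unreachable anchors.

```python
from collections import deque


def hop_distances(adj, source):
    """BFS hop counts from `source` to every node reachable along `adj`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def reverse_graph(adj):
    """Flip every directed edge u -> v into v -> u."""
    rev = {}
    for u, outs in adj.items():
        rev.setdefault(u, [])
        for v in outs:
            rev.setdefault(v, []).append(u)
    return rev


def state_distance_vector(adj, page, anchors, unreachable=-1):
    """Map each state to (IHOP, OHOP) between `page` and the state's anchor."""
    out_dist = hop_distances(adj, page)                # page -> anchor
    in_dist = hop_distances(reverse_graph(adj), page)  # anchor -> page
    return {
        state: (in_dist.get(a, unreachable), out_dist.get(a, unreachable))
        for state, a in anchors.items()
    }
```

On a graph with tens of millions of nodes, one would more likely run a single BFS from each of the 51 anchors and look up every page's distance, rather than one BFS per page as in this sketch.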
Anchor Page Selection (1/2)
10/18/2021
Effectiveness of BFS-based algorithms
• It depends on the selection of anchor pages
Anchor pages have to be local so that the SDV reflects the true
tendency of a page's locality
Suitable examples (focused on local communities)
• State universities, government, park, or police organizations
Ill-suited examples (popular and thus having global impact)
• NBA, MLB, or NFL sports teams
71
• We adopt all subsidiary pages of "OnlyInYourState.com" as the set of anchor pages
• It has a distinct page for each state
• Each subsidiary page mostly connects local communities
Anchor Page Selection (2/2)
Page Name                     Page ID
Only In Alabama               783744898386760
Only In Alaska                686107314826906
Only In Southern California   1840349052857006
Only In Northern California   856450181102963
Idaho Only                    435099846671531
Only In New York              386608421546055
Only In Virginia              1560515737540492
Only In West Virginia         1509709509286532
Only In Wisconsin             1390297064627420
Only In Wyoming               1724174364476381
10/18/2021 72
51 Anchors
Arizona
Northern
California
10/18/2021 73
Advanced Algorithm
Drawback of the baseline algorithm
• A local page can have a few connections to pages far beyond its own locality
• This connection noise can substantially reduce prediction accuracy
State Neighborhood Probability (SNP)
Both SDV and SNP are used as feature vectors for ML models
• They exploit locality and neighborhood context for better identification
10/18/2021 74
Dataset
California accounts for 20% of all US pages, and nearly half of all
pages (49.49%) are located in the top 5 states
• California, New York, Florida, Illinois, and Texas
10/18/2021 75
Accuracy Summary
Classifier                      Precision  Recall  F1 score
Naive Bayes (Baseline BFS)      0.44       0.27    0.26
AdaBoost (Baseline BFS)         0.46       0.40    0.37
Random Forest (Baseline BFS)    0.69       0.69    0.68
Random Forest (Advanced BFS)    0.89       0.88    0.88
10/18/2021 76
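Types of Alcoholic Beverages
By production method:
• Fermented beverages, e.g. beer, wine, rice wine, Shaoxing wine, Japanese sake… (below 15% ABV)
• Distilled spirits, e.g. kaoliang, brandy, whisky, vodka, rum… (above 40% ABV)
• Compounded beverages, e.g. medicinal liquor, cream liqueur…
Components of Wine
Grape stems → tannin
Grape skins → tannin, color
Grape pulp → sugar, acidity
Grape seeds → bitter oils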
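Factors in Grape Growth
Sunlight, Warmth, Water, Nutrients
Temperature / Sugar level / Alcohol content / Color / Tannin / Acidity
Fermentation
$C_6H_{12}O_6 \rightarrow 2\,C_2H_5OH + 2\,CO_2$
Glucose → Alcohol
Comparison by temperature, sugar level, alcohol content, color, tannin, and acidity:
South Africa
Germany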
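Classification of Wine
1. By color:
• Red wine
• White wine
• Rosé wine
Is red wine made from red grapes and white wine from white grapes?
White wine: Crush → Press → Ferment
Red wine: Crush → Ferment with skins → Press
Rosé: Crush → Ferment with skins (12-36 hrs) → Press
Red grapes can make both red and white wine;
white grapes can only make white wine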
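2. Sparkling wine:
• Champagne (France)
• Asti (Italy)
• Cava (Spain)
Why are there bubbles?
Sparkling wine: secondary fermentation
Fermented wine → add sugar and yeast → ferment in a sealed vessel → yeast autolysis after fermentation
→ filter out the sediment → age in bottle or tank
• Champagne: fermented in the bottle, yeast added to each bottle individually, sediment removed by freezing the bottle neck
• Ordinary sparkling wine: fermented in a tank, yeast added all at once, bottled under high pressure
Only sparkling wine produced in the Champagne region of France in compliance with the Champagne
production regulations may be called Champagne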
3. Sweet wine:
• Ice wine
• Noble rot (botrytis) sweet wine
• Port
• Sherry
Ice wine
Grapes are harvested at around -8 °C
Canada
Austria
Germany
375 ml, 9% ABV
Noble rot (botrytis) sweet wine
• Right weather: morning fog and afternoon sun
• Right location: where two rivers meet
• Representative regions:
  • France: Sauternes
  • Hungary: Tokaji
  • Germany: TBA
Guess how much it costs?
The king of sweet wines: Château d'Yquem
Port and Sherry
                       Port                  Sherry
Country                Portugal              Spain
Grapes                 Red grapes            White grapes
Fortification timing   During fermentation   After fermentation
Taste                  Sweet                 Dry
Wine Q&A
How do you open a bottle?
Common corkscrews
• Wing corkscrew
• Waiter's friend (sommelier knife)
• Ah-So opener, for old bottles
How do you open sparkling wine?
How should wine be stored?
Upright or on its side?
Refrigerator or wine cabinet?
Storage environment: constant temperature and humidity
What if you cannot finish the bottle?
Decant into smaller containers
• Airtight containers
Wine extraction tool:
• Coravin™
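How do you hold a wine glass?
A B C D
Principle: do not directly touch the part of the glass that holds the wine
Types of wine glasses?
How much wine to pour?
No more than 1/3 of the glass, except for sparkling wine
Decanter
Maximize contact with oxygen
Filter out sediment
How long should wine be decanted?
All red wines
Premium white wines (aged in oak barrels)
Follow your own taste!
Does wine always get better with age?
It depends on the grape variety, the region,
and the weather of the vintage year
The storage environment is very important!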
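How can beginners find wine?
• Old World:
➢ e.g. Spain
• New World:
➢ e.g. USA, Chile, NZ, Australia…
• Hypermarkets also carry good wines
• Websites:
➢ e.g. 葡萄酒新手選…
Useful Apps
Vivino
Wine-Searcher
Simple Food and Wine Pairing
• Red wine with red meat, white wine with white meat
• Pairing by terroir
• Dry sparkling wine goes with everything
Classic Food and Wine Pairings
• Chablis white wine + oysters
• Champagne + caviar
• Noble rot sweet wine + blue cheese or foie gras
• Taiwanese cuisine + German Riesling
Champagne goes with everything!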
How to Taste Wine?
Five Tasting Terms
Sweetness
Acidity
Tannin
Fruit
Body
Systematic Tasting: Sight, Smell, Taste
Sight
Color
White wine: lemon → gold → amber (young → old)
Red wine: purple → ruby/garnet → tawny (young → old)
Smell
Intensity: light → medium → pronounced
Aroma: which floral, fruit, spice, or herbal notes, etc.
Taste
Sweetness: dry → off-dry → medium → sweet
Acidity: low → medium → high
Tannin (not assessed for white wine): low → medium → high
Body: light → medium → full
Flavor: which floral, fruit, spice, or herbal notes, etc.
Finish: short → medium → long
What Makes a Good Wine
Balance
Finish
Complexity
Typicity
Intensity
Instead of drinking the best wine, try to drink wine the best.
2021/10/18 174
Partnership
12/06/2019 175
Thank you for listening; comments are welcome

Contenu connexe

Similaire à CML's Presentation at FengChia University

Era ofdataeconomyv4short
Era ofdataeconomyv4shortEra ofdataeconomyv4short
Era ofdataeconomyv4shortJun Miyazaki
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSAPRBETTER
 
Social Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextSocial Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextIRJET Journal
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriFlink Forward
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaborationJulien Pivotto
 
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEEMEMTECHSTUDENTPROJECTS
 
IRJET- Secure E-Documents Storage using Blockchain
IRJET- Secure E-Documents Storage using BlockchainIRJET- Secure E-Documents Storage using Blockchain
IRJET- Secure E-Documents Storage using BlockchainIRJET Journal
 
Visual Cryptography Industrial Training Report
Visual Cryptography Industrial Training ReportVisual Cryptography Industrial Training Report
Visual Cryptography Industrial Training ReportMohit Kumar
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningNeo4j
 
Blockchain based News Application to combat Fake news
Blockchain based News Application to combat Fake newsBlockchain based News Application to combat Fake news
Blockchain based News Application to combat Fake newsIRJET Journal
 
Management and analysis of social media data
Management and analysis of social media dataManagement and analysis of social media data
Management and analysis of social media dataWeining Qian
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SIRJET Journal
 
1 IT 140 A Mini History of Text-Based Games Text
1  IT 140 A Mini History of Text-Based Games  Text1  IT 140 A Mini History of Text-Based Games  Text
1 IT 140 A Mini History of Text-Based Games TextMartineMccracken314
 
1 IT 140 A Mini History of Text-Based Games Text
1  IT 140 A Mini History of Text-Based Games  Text1  IT 140 A Mini History of Text-Based Games  Text
1 IT 140 A Mini History of Text-Based Games TextSilvaGraf83
 
Assessment Worksheet Aligning Risks, Threats, and Vuln.docx
Assessment Worksheet Aligning Risks, Threats, and Vuln.docxAssessment Worksheet Aligning Risks, Threats, and Vuln.docx
Assessment Worksheet Aligning Risks, Threats, and Vuln.docxfestockton
 
Essay On Cryptography
Essay On CryptographyEssay On Cryptography
Essay On CryptographyHaley Johnson
 
DevOps Support for an Ethical Software Development Life Cycle (SDLC)
DevOps Support for an Ethical Software Development Life Cycle (SDLC)DevOps Support for an Ethical Software Development Life Cycle (SDLC)
DevOps Support for an Ethical Software Development Life Cycle (SDLC)Mark Underwood
 

Similaire à CML's Presentation at FengChia University (20)

Yuntech present
Yuntech presentYuntech present
Yuntech present
 
Era ofdataeconomyv4short
Era ofdataeconomyv4shortEra ofdataeconomyv4short
Era ofdataeconomyv4short
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
 
Social Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network ContextSocial Friend Overlying Communities Based on Social Network Context
Social Friend Overlying Communities Based on Social Network Context
 
Automatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia KalavriAutomatic Detection of Web Trackers by Vasia Kalavri
Automatic Detection of Web Trackers by Vasia Kalavri
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
 
IRJET- Secure E-Documents Storage using Blockchain
IRJET- Secure E-Documents Storage using BlockchainIRJET- Secure E-Documents Storage using Blockchain
IRJET- Secure E-Documents Storage using Blockchain
 
Visual Cryptography Industrial Training Report
Visual Cryptography Industrial Training ReportVisual Cryptography Industrial Training Report
Visual Cryptography Industrial Training Report
 
Relationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine LearningRelationships Matter: Using Connected Data for Better Machine Learning
Relationships Matter: Using Connected Data for Better Machine Learning
 
Blockchain based News Application to combat Fake news
Blockchain based News Application to combat Fake newsBlockchain based News Application to combat Fake news
Blockchain based News Application to combat Fake news
 
Management and analysis of social media data
Management and analysis of social media dataManagement and analysis of social media data
Management and analysis of social media data
 
tweet segmentation
tweet segmentation tweet segmentation
tweet segmentation
 
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’SDiscovering Influential User by Coupling Multiplex Heterogeneous OSN’S
Discovering Influential User by Coupling Multiplex Heterogeneous OSN’S
 
1 IT 140 A Mini History of Text-Based Games Text
1  IT 140 A Mini History of Text-Based Games  Text1  IT 140 A Mini History of Text-Based Games  Text
1 IT 140 A Mini History of Text-Based Games Text
 
1 IT 140 A Mini History of Text-Based Games Text
1  IT 140 A Mini History of Text-Based Games  Text1  IT 140 A Mini History of Text-Based Games  Text
1 IT 140 A Mini History of Text-Based Games Text
 
Assessment Worksheet Aligning Risks, Threats, and Vuln.docx
Assessment Worksheet Aligning Risks, Threats, and Vuln.docxAssessment Worksheet Aligning Risks, Threats, and Vuln.docx
Assessment Worksheet Aligning Risks, Threats, and Vuln.docx
 
Essay On Cryptography
Essay On CryptographyEssay On Cryptography
Essay On Cryptography
 
DevOps Support for an Ethical Software Development Life Cycle (SDLC)
DevOps Support for an Ethical Software Development Life Cycle (SDLC)DevOps Support for an Ethical Software Development Life Cycle (SDLC)
DevOps Support for an Ethical Software Development Life Cycle (SDLC)
 

Dernier

The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 


CML's Presentation at FengChia University

Editor's Notes

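  1. Professor Wang's talk comes first.
  2. The inverted triangle; what 1:1 communication is. Source, Target, Audience.
  3. Introduce Alex.
  4. Ask yourself: how do you search for news? Or have you stopped searching for news?
  5. Four components.
  6. The three commonly accepted factors.
  7. The Occupy movement; the action at UC Davis.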
  8. Sometimes it’s hard to evaluate “spamming”. New: SFW – Likefarm? Is that a ContentFarm?
  9. Every principle has its mind, reason, everything has its causality
  10. SFW – we need to have a better organized presentation for problems. SFW – the defenders concern might be different – we need to consider the risk factor
  11. Shelf life: readers skim messages, so an attacker only needs to “catch” one’s eye to enlarge the influence. https://www.facebook.com/barackobama/posts/10151673679836749 https://www.facebook.com/cnn/posts/313652498762911 SFW – ask the audience “which post has a higher probability of being attacked?”
  12. SFW – watch out for the transition into this slide. SFW – do you want to provide one example for all or most of the slides? SFW – I feel that you should give an example to explain. SFW – Definition**s**
  13. SFW – how should the 10 minutes be interpreted (what are the total time and the attack time)? Naive Bayes: DAV are not independent of each other. AdaBoost: not good with outliers; number of estimators = 50 and learning rate = 1. Decision Tree: good for social network data; we set minimum samples split = 2 and minimum samples leaf = 1, and, as for depth, nodes are expanded until all leaves are pure.
  14. 1. Is IR learnable? 2. There is no difference between Light and Critical malicious URLs, since their performance is quite similar. 3. The increase in recall is high.
  15. SFW – explain “Exact time after last attack”
  16. Why do you choose similarity Fast
  17. Read the slide.
  18. Our first thought is the majority vote algorithm.
  19. where IHOP(Page,Si) denotes hop distance between page and seed Si, using inward edges as connection for BFS; OHOP(Page, Si) denotes hop distance between page and seed Si, using outward edges as connection for BFS.
  20. In particular, since California is much larger than other states in terms of population and economy, “OnlyInYourState.com” splits California into Northern and Southern regions, as shown in the Table. Therefore, both “Only In Northern California” and “Only In Southern California” are used as anchored pages to calculate IHOP(Page, Si) and OHOP(Page, Si), in addition to the other forty-nine anchored pages. Hence N_{anchored pages} is set to 51. Furthermore, since “Only In Idaho” had already been registered, OnlyInYourState.com named its Idaho counterpart “Idaho Only” instead. In general, involving more anchored pages enlarges the BFS coverage of pages.
  21. This probability is not high; however, the baseline BFS-based ML algorithm only cares about the hop distances to the anchored pages. Here INP(Page, Ri) denotes the inward neighborhood location probability between this page and the adjacent pages belonging to the region Ri, and IE(Page, Ri) is the number of inward edges between this page and the adjacent pages belonging to the region Ri.
  22. We took the pages with declared country and city location information as ground truth data. A few pages are excluded because their city names exist in multiple states, which can result in an ambiguous city-to-state mapping. There are 29,849 cities in total in the US.
  23. The training set used 80% of the data, while the test set used the rest. Since the number of classes is rather large, the Random Forest classifier is adopted in preference to the Gradient Boosting classifier [23]. The default parameter sets were applied when using the implementations available in the scikit-learn package [54]. As shown in Table 4.2, the precision, recall, and F1 score of the Random Forest classifier are at least 20% better than those of the Naive Bayes classifier and the AdaBoost classifier. Thus, in the following, we only present results obtained with the Random Forest classifier. The baseline BFS-based ML algorithm with the Random Forest classifier achieved 69% accuracy, which is 10% better than the accuracy of the majority vote algorithm. With the addition of SNP, the advanced BFS-based ML algorithm reached 89% prediction accuracy, an improvement of 20 percentage points over the baseline.
  24. There must be sunlight for photosynthesis to take place.
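
Note 19 defines IHOP and OHOP as BFS hop distances between a page and a seed page over inward and outward edges. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the graph is stored as an adjacency dict of outward edges, that BFS starts at the seed, and that a page unreachable from the seed simply has no distance. The names `hop_distances`, `reverse_graph`, and the toy graph are hypothetical.

```python
from collections import deque


def hop_distances(graph, seed):
    """Breadth-first search from `seed`; returns {page: hop count}.

    `graph` maps each page to the pages its edges point to.
    Pages that cannot be reached from the seed are absent from the result.
    """
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for neighbor in graph.get(page, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[page] + 1
                queue.append(neighbor)
    return dist


def reverse_graph(graph):
    """Flip every edge so that BFS follows inward edges instead of outward ones."""
    reversed_edges = {}
    for source, targets in graph.items():
        reversed_edges.setdefault(source, [])
        for target in targets:
            reversed_edges.setdefault(target, []).append(source)
    return reversed_edges


# Hypothetical toy graph of outward edges (assumption: "A -> B" means A links to B).
edges = {"seed": ["a"], "a": ["b"], "c": ["seed"]}

ohop = hop_distances(edges, "seed")                 # follows outward edges
ihop = hop_distances(reverse_graph(edges), "seed")  # follows inward edges
```

Running one such BFS per anchored page would give each page a vector of hop distances, which is the kind of feature the baseline BFS-based ML algorithm in note 21 relies on.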
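
Note 21 names INP(Page, Ri) and IE(Page, Ri) but the notes do not give the formula that links them. The sketch below assumes the simplest reading, namely that INP is the share of a page's inward edges that come from pages in region Ri, i.e. IE(Page, Ri) divided by the sum of IE(Page, Rj) over all regions. That normalization, the function name, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def inward_neighborhood_probability(inward_neighbors, page_region):
    """Estimate INP(Page, Ri) for every region Ri seen among the inward neighbors.

    inward_neighbors: pages with an edge pointing to the page of interest.
    page_region: known region for some pages; neighbors without a known
                 region are skipped (an assumption of this sketch).
    """
    # IE(Page, Ri): number of inward edges from pages located in region Ri.
    inward_edge_counts = Counter(
        page_region[n] for n in inward_neighbors if n in page_region
    )
    total = sum(inward_edge_counts.values())
    if total == 0:
        return {}
    return {region: count / total for region, count in inward_edge_counts.items()}


# Hypothetical example: five inward neighbors, four with a known region.
neighbors = ["p1", "p2", "p3", "p4", "p5"]
regions = {"p1": "CA", "p2": "CA", "p3": "CA", "p4": "NY"}
inp = inward_neighborhood_probability(neighbors, regions)
```

Under this reading, a page whose inward neighbors mostly sit in one region gets a high probability for that region, which complements hop distance when the page is far from every anchored page.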