Open Web Data Feeds for Cybersecurity & Homeland Security Intelligence
1. Open Web Data Feeds for Cybersecurity & Homeland Security Threat Intelligence
Ohad Flinker | Director of Content & Data Insights
February 2017
2. About Webhose.io Data Feeds
We power big data analytics platforms
(SalesForce, Kantar Media, Hootsuite, Buzzilla, Digitalstakeout, ASRC Federal)
News Sites
Message Boards
Blogs
Webhose.io platform
OSINT
Media Monitoring
Machine Learning
Financial Analysis
Darknet
4. Homeland Security Use Cases
News and media monitoring
Threat actor profile compilation
Crime prevention, investigation, and evidence collection
Machine learning
Incident response and crisis management
5. DHS Recommendations
• Social media monitoring tools/licenses have been purchased
(commercial off-the-shelf or Software as a Service)
• Data from available technologies has been integrated into common operating picture via
web map or other dynamic data feeds
• Technical requirements have been identified and addressed
• Data available from multiple sources; data is standardized upon publication or receipt
• Social media data integrated with other data to produce enhanced maps (aggregation
and fusion of applicable information); multiple data layers are available for consideration
Table 2.3: Phase Three of the Social Media Integration Maturity Model
7. Big Data OSINT
To deliver actionable alerts and insights, you need to develop new capabilities:
Massive volumes of machine readable data (clean, organized, structured)
Continuous discovery of new data sources
Up-to-the-minute current information
Analysis that overcomes anonymity and incompleteness of information
8. OSINT 1.0 The dogdaygod murder plot
Stephen Carl Allwine's murder trial reconvenes today, February 13th, 2017
Reported suicide of his wife Amy in November 2016
Forensic evidence collected
Claimed no knowledge of Darknet
9. OSINT 1.0 The dogdaygod murder plot
Digital trail traced to user ‘dogdaygod’ contracting Besa Mafia “hit service”
… which took his money but never delivered the hit
10. OSINT 1.0 The dogdaygod murder plot
They did, however, leak their entire ‘customer’ and ‘contractor’ list
11. OSINT 1.0 The dogdaygod murder plot
Physical evidence suggested cover up
Claimed to have no knowledge of Darknet
Reddit activity suggests otherwise
Reddit post by the same username
13. OSINT 2.0
Exponential volume of data
Threat actor activity posted in broad daylight
Anonymized and/or encrypted
Besides eBay, messages are often hidden in the “X-rated pornographic pictures
which conceal documents and orders for the next target,” said one intelligence
source.
Several other Mossad operatives spent their time tracking the Internet message board
Reddit. More than once, it had led an operator to a terrorist using hexadecimal characters
and prime numbers. Decoded, they sometimes indicated an attack was being planned or
even about to happen.
15. The Challenge
Actionable intel is significant
Requires a new set of capabilities
Identify threat patterns as they emerge
Analyze structured datasets
16. Case Study: The $5B Credit Card Fraud Market
Researchers used webhose.io data to expose widespread CC fraud
The fraudster “market challenge”
Explicit fraudulent activity on social media will get your account shut down
The fraudster workaround: Create new dummy accounts
17. Case Study: The $5B Credit Card Fraud Market
But how can we identify patterns between one digital identity
18. Case Study: The $5B Credit Card Fraud Market
And multiple dummy accounts generated by thousands of threat actors
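One simple way to approach the question above is to link accounts that advertise the same contact handle. The sketch below is an illustrative assumption, not the researchers' method: the `(account, contact)` input pairs, the `link_accounts` function, and the sample handles are all hypothetical, and a union-find grouping stands in for whatever model was actually used.

```python
from collections import defaultdict

def link_accounts(posts):
    """Group account names that advertise the same contact handle.

    posts: iterable of (account, contact) pairs (hypothetical fields).
    Returns a sorted list of sorted account groups with more than one member.
    """
    parent = {}

    def find(x):
        # Locate the representative of x's group, compressing the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the groups containing a and b.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_account_for = {}
    for account, contact in posts:
        find(account)  # register the account even if it links to nothing
        if contact in first_account_for:
            union(first_account_for[contact], account)
        else:
            first_account_for[contact] = account

    groups = defaultdict(set)
    for account in parent:
        groups[find(account)].add(account)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Made-up sample data: three accounts chained together by shared handles.
sample = [
    ("acct_a", "icq:111"),
    ("acct_b", "icq:111"),
    ("acct_c", "icq:222"),
    ("acct_b", "icq:333"),
    ("acct_d", "icq:333"),
]
linked = link_accounts(sample)
```

Accounts are merged transitively, so two dummy accounts that never share a handle directly still land in the same group when a third account connects them.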
19. Case Study: The $5B Credit Card Fraud Market
Complete price list
Data dump sample
Anonymized contact information
20. The Pattern identified by researchers
1. Identify victim talking about CC information on Twitter
while using benign account (e.g. @harmless-good-guy1)
2. Create new dummy account and engage with victim
(follow, friend, RT using fresh new account @harmless-good-guy2)
3. Send victim link to blog/forum that contains malicious phishing site
4. Harvest victim CC information
5. Post harvested CC database for sale
21. The system to confirm the pattern is widespread
Obtain two datasets over a 48-hour period
by querying the Twitter and Webhose.io APIs
for fraud signal keywords (ICQ, cvv, cvv2, amex)
Multi-layered graph-based model for social engineering vulnerability assessment
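The keyword screening step can be sketched as a local filter over posts that have already been retrieved. This is a minimal illustration under assumptions: the post dictionaries with a `text` field, and the function names `fraud_signals` and `filter_posts`, are hypothetical and do not represent the Twitter or Webhose.io API; only the four keywords come from the slide.

```python
import re

# Fraud-signal keywords listed on the slide.
FRAUD_KEYWORDS = ["icq", "cvv", "cvv2", "amex"]

# Match each keyword as a whole word, case-insensitively.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, FRAUD_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def fraud_signals(text):
    """Return the sorted, de-duplicated fraud-signal keywords found in text."""
    return sorted({match.lower() for match in _PATTERN.findall(text)})

def filter_posts(posts):
    """Keep posts that contain at least one keyword.

    posts: iterable of dicts with a 'text' field (hypothetical schema).
    Each kept post is copied with an added 'signals' list.
    """
    flagged = []
    for post in posts:
        hits = fraud_signals(post.get("text", ""))
        if hits:
            flagged.append({**post, "signals": hits})
    return flagged

# Made-up sample posts for illustration only.
sample_posts = [
    {"id": 1, "text": "Fresh CVV2 dumps, contact ICQ 12345"},
    {"id": 2, "text": "Lovely weather today"},
]
flagged_posts = filter_posts(sample_posts)
```

Whole-word matching keeps unrelated words that merely contain a keyword from being flagged, at the cost of missing deliberately obfuscated spellings.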