Natalino Busa - @natbusa
Global Artificial Intelligence
Conference
AI in Finance:
from Hype to marketing and
cyber security use cases
www.globalbigdataconference.com
Twitter : @bigdataconf
Cognitive Finance Group Advisory Board Member
ING Group Enterprise Architect: Cybersecurity, Fintech
Teradata Head of Applied Data Science
Teradata Global Evangelist on Open Sourced Technologies
O’Reilly Author and Speaker
Philips Senior Researcher, Data Architect
LinkedIn and Twitter:
@natbusa
What about AI in Finance?
The Medici Bank:
Italian: Banco Medici
1397–1494
Data as a Relationship
● Trust
● Transparency of Use
● Customer First
● Regulations and Laws
● Respect and Protect
● Providing a Service
An ethical approach
for Actionable Financial Data
1. Help the customer: Propose, Advise, Select, Filter, Connect, Simplify
2. Protect the customer: Detect, Prevent, Alert, Block, Defend, Identify, Authorize
Personalized Financial
Financial personalized recommenders
● Innovation helps to empower people to make better financial decisions. ING has launched several new omni-channel banking platforms.
● The platform gives customers insights into their personal finances in an easy and intuitive way.
Source: http://www.slideshare.net/ING/4q15-media
Financial personalized recommenders
● It Knows Finance
● Conversational
● Personal
● Actionable
● Predictive
● Reuse Existing Content
Inspiration from the Web
Credit Pre-Authorization
Strategic data-driven initiatives
● Fintech innovation to help strengthen our lending capabilities and better serve our consumer and SME clients.
● Kabbage, one of the leading US-based technology platforms providing automated lending to SMEs.
● In January 2016, ING made an investment in fintech WeLab, which provides consumer loans in China and Hong Kong in a fully automated process that takes just minutes, from application to approval.
Source: http://www.slideshare.net/ING/4q15-media
Approaching (Almost) Any Machine Learning Problem
- Abhishek Thakur, Kaggle Grandmaster -
Raw data (tables, files) → Data munging → Useful data → Feature Engineering → Tabular Data ready for ML (data, labels)
Predictive APIs: How to get there?
● Rule-based System: Input → Hand Designed Program → Output
● Classic Machine Learning: Input → Hand Designed Features → Mapping from features → Output
● Representational Machine Learning: Input → Learned Features → Mapping from features → Output
● Deep Learning (end-to-end learning): Input → Learned Features → Learned Complex features → Mapping from features → Output
Source: Prof. Yoshua Bengio - Deep Learning, https://youtu.be/15h6MeikZNg
From Feature to Architecture Engineering:
Demo:
Credit Payment Defaulting
with TensorFlow and Keras
Methodology
This research looks at customers' default payments in Taiwan and compares the predictive accuracy of the probability of default among six data mining methods. From a risk management perspective, the predictive accuracy of the estimated probability of default is more valuable than the binary classification result (credible or not credible clients).
Data set: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
Step 0: data exploration
Target variable: default payment next month
Color scheme: defaulting (yes) vs. not defaulting
Step 1: feature engineering
pay_1 -1
pay_2 0
pay_3 -1
pay_4 0
pay_5 0
pay_6 0
pay_avgamt1 0.203221
pay_avgamt2 3.72718
pay_avgamt3 1.01611
pay_avgamt4 0.914495
pay_avgamt5 0.0700097
pay_avgamt6 0.0689935
pay_stdavgamt 1.40083
pay_avg -0.333333
pay_std 0.516398
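The aggregate features listed above can be recomputed from the per-month values. The sketch below is a minimal illustration using only the Python standard library; it assumes that `pay_avg` and `pay_std` are the mean and sample standard deviation of `pay_1` to `pay_6`, and that `pay_stdavgamt` is the sample standard deviation of `pay_avgamt1` to `pay_avgamt6`. Those definitions are inferred from the numbers on the slide, not stated by the author.

```python
from statistics import mean, stdev

# Values copied from the slide for one customer.
pay = [-1, 0, -1, 0, 0, 0]                      # pay_1 .. pay_6
pay_avgamt = [0.203221, 3.72718, 1.01611,
              0.914495, 0.0700097, 0.0689935]   # pay_avgamt1 .. pay_avgamt6

# Assumed definitions (inferred, not confirmed by the slides).
features = {
    "pay_avg": mean(pay),
    "pay_std": stdev(pay),               # sample standard deviation (n - 1)
    "pay_stdavgamt": stdev(pay_avgamt),  # spread of the normalized amounts
}

for name, value in features.items():
    print(f"{name:14s} {value:.6f}")
```

Under these assumptions the computed values agree with the slide to roughly four decimal places. The six `pay_avgamt` values also average to about 1, which suggests each monthly amount was divided by the customer's own average amount, though the slides do not say so.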
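Step 1: baseline (e.g. regression)
from keras.models import Sequential
from keras.layers import Dense, Activation

# A single dense unit applied directly to the inputs: no hidden layers.
model = Sequential()
model.add(Dense(1, input_shape=(input_dim,)))
model.add(Activation('relu'))
Network shape: 87 → 1
It's a neural network … with no network :)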
Step 2: deep learning
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

model = Sequential()
# Two wide layers, then two blocks of narrower layers, each followed by dropout.
model.add(Dense(256, input_shape=(input_dim,), activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(64, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.25))
# 10-unit bottleneck, then a single sigmoid output for the default probability.
model.add(Dense(10, activation='sigmoid'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
Network shape: 87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1
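To get a feel for how much larger the deep model is than the baseline, the trainable parameters of a stack of fully connected layers can be counted by hand. This is a small sketch, assuming `input_dim` is 87 as the network diagram suggests; `dense_param_count` is a hypothetical helper, not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Trainable parameters of a stack of fully connected layers.

    A Dense layer mapping n_in -> n_out has n_in * n_out weights
    plus n_out biases. Dropout layers add no parameters.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Assumption: input_dim = 87, read off the network diagram on the slides.
deep = [87, 256, 256, 64, 64, 64, 64, 10, 1]
shallow = [87, 1]

print("deep model parameters:   ", dense_param_count(deep))     # 117909
print("shallow model parameters:", dense_param_count(shallow))  # 88
```

In Keras, `model.summary()` reports the same kind of count per layer.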
Step 3: compare: is deep learning better?
Shallow Logit Model: 87 → 1
Deep Learning: 87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1
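The slides do not say which metric was used for the comparison. One common choice for default-probability models is the area under the ROC curve (AUC), sketched here in plain Python. The labels and scores are made up for illustration and are not results from the talk.

```python
def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half. Runs in O(n_pos * n_neg), which is fine for a
    small illustration but not for large data sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = default) and predicted probabilities, illustration only.
y_true = [0, 0, 1, 1, 0, 1]
shallow_scores = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3]
deep_scores = [0.1, 0.4, 0.6, 0.9, 0.3, 0.5]

print("shallow AUC:", roc_auc(y_true, shallow_scores))
print("deep AUC:   ", roc_auc(y_true, deep_scores))
```

Both models should be scored on the same held-out data, otherwise a gap in AUC may only reflect overfitting of the larger network.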
Step 4: picking the brain of our DL model
Network shapes shown: 87 → 1 and 87 → 256 → 256 → 64 → 64 → 64 → 64 → 10 → 1
Step 5: semantic clustering
Cluster labels shown: Default, Very Safe, Safe, Mixed Group
Hands on with Keras and TensorFlow
Hyper-Parameters tuning
- based on scikit-learn
- 15 classifiers,
- 14 feature preprocessing methods
- 4 data preprocessing methods
- 110 hyperparameters
- Supervised classification challenge:
100 different datasets
https://arxiv.org/abs/1611.03824v1
The API for banking data.
Two levels:
- Transactions
- Risk Scoring
Inspiration from the Web
Card Theft: Geo-Alerting
Clustering geolocated data
using Spark and DBSCAN
How to group users’ events using machine learning and distributed computing
By Natalino Busa
Predictive APIs: Clustering Geolocated Data
@natbusa | linkedin.com: Natalino Busa
Venues and Events
Events clustering
Card Theft/Cloning: DBSCAN and Convex Hulls
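The clustering step can be illustrated on a single machine. The sketch below is a plain-Python DBSCAN over latitude/longitude pairs using the haversine distance; it is not the Spark implementation from the talk, it leaves out the convex-hull step, and the coordinates, `eps_km` and `min_pts` values are made up for illustration.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))

def dbscan(points, eps_km, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if haversine_km(points[i], q) <= eps_km]

    labels = [None] * len(points)      # None = not visited yet
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1             # noise for now; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:   # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Made-up card events: four close together, one far away.
events = [(52.370, 4.895), (52.372, 4.893), (52.368, 4.900),
          (52.371, 4.897), (40.713, -74.006)]
labels = dbscan(events, eps_km=1.0, min_pts=3)
print(labels)  # [0, 0, 0, 0, -1]
```

An event labelled -1, far from every cluster of a customer's usual locations, is the kind of signal a geo-alert could be built on; the actual alerting rule is not described in the slides.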
Kafka+Cassandra+Spark: SMACK stack
Streaming Machine Learning
Kafka
● Streaming Events
● Distributed, Scalable Transport
● Events are persisted
● Decoupled Consumer-Producers
● Topics and Partitions
Cassandra
● Fast writes
● 2D Data Structure
● Replicated
● Tunable consistency
● Multi-Data centers
Spark
● Ad-Hoc Queries
● Joins, Aggregate
● User Defined Functions
● Machine Learning, Advanced Stats and Analytics
Spark: Unified Distributed Computing:
SQL + Machine Learning + Graph Analytics
● Spark core: RDDs
● Libraries: Streaming, SQL, MLlib, GraphX
● Uses: Analytics, Statistics, Data Science, Model Training
● Data Sources: HDFS, NoSQL, SQL
● Also shown: Map-Reduce, HDFS, Kafka, Hive
Spark and Cassandra: distributed goodness
Cassandra: Store all the data (Storage!)
Spark: Analyze all the data (Analytics!)
● DC1: replication factor 3
● DC2: replication factor 3
● DC3: replication factor 3 + Spark Executors
Cassandra - Spark Connector
Cassandra: Store all the data
Spark: Distributed Data Processing, Executors and Workers
Cassandra-Spark Connector: Data locality, Reduce Shuffling, RDDs to Cassandra Partitions
DC3: replication factor 3 + Spark Executors
Cyber security in Finance
Network Intrusion Detection
It contains 130 million flow records involving
12,027 distinct computers over 36 days (not
the full 58 days claimed for the entire data
release).
Each record consists of: time (to nearest
second), duration, source and destination
computer ids, source and destination ports,
protocol, number of packets and number of
bytes
Techniques: TDA, Dimensionality Reduction
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction
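The flow record described above maps naturally onto a small typed structure. The sketch below is one possible representation; the field names, the comma-separated layout, the field order (taken from the order in which the slide lists the fields) and the example line are all assumptions, not the data set's documented format. Ports and protocol are kept as strings in case they are anonymized identifiers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRecord:
    time: int            # seconds, to the nearest second
    duration: int        # seconds
    src_computer: str
    dst_computer: str
    src_port: str
    dst_port: str
    protocol: str
    packets: int
    byte_count: int

def parse_flow(line: str) -> FlowRecord:
    """Parse one comma-separated flow line (field order is an assumption)."""
    t, dur, src, dst, sport, dport, proto, pkts, nbytes = line.strip().split(",")
    return FlowRecord(int(t), int(dur), src, dst, sport, dport,
                      proto, int(pkts), int(nbytes))

# Hypothetical example line, not taken from the data set.
rec = parse_flow("1,0,C1,C2,P80,P443,6,10,5323")
print(rec)
```

Numeric columns such as `duration`, `packets` and `byte_count` are the natural inputs for the dimensionality reduction techniques the slide mentions.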
AI: tools and technologies
Tools for AI and Machine (deep) Learning
… these are just a few examples …
AI: models and algorithms
AI: an ensemble of analytical methods
SQL + Graph + Text + Machine Learning + Voice/Image/Video
AI in Finance:
Recap & Lessons Learned
Takeaways
● AI can be applied in Finance: YES
● Train your AI: Domain Experts + ML
● Use All Tools, All Data
Distributed Computing, Artificial Intelligence, Machine Learning, Statistics, Big/Fast Data, Streaming Computing
LinkedIn and Twitter:
@natbusa
