How can you Identify Fraud in Fintech Lending using Deep Learning
RATNAKAR PANDEY, HEAD OF INDIA ANALYTICS & DATA SCIENCE, KABBAGE
Disclaimer: The views expressed here are solely those of the presenter in his private capacity.
16th October 2018
Legal Disclaimer
“This series is solely for educational purposes only. This series does not intend to be complete or universal in nature and cannot be
considered as an alternative to an expert opinion on any specific issue. The series is based on views of the speaker/facilitator and
NASSCOM does not recommend/endorse the view-points per se and is primarily a medium to disseminate knowledge for the
greater good of the Products ecosystem. Any attendee who opens or otherwise accesses the content of the series at any point of
time, does so at their own risk and acknowledges and agrees that neither NASSCOM and nor its members and affiliates will not
be responsible for any loss or damage suffered by any person.
The content of this webinar series is solely for the purpose of NASSCOM members and NASSCOM digital channels and any
copying/distribution is liable for legal action.”
FRAUD IS A BIG PROBLEM ACROSS THE WORLD
Outline
• Introduction: About Fintech, About Kabbage
• Frauds in Fintech Lending: Drivers, Modus Operandi
• Need for Deep Learning: Existing Methods, Why Deep Learning?
• Suggested Deep Learning Application Areas: Supervised, Unsupervised
• Demo of Multilayer Perceptron (MLP): Classification Case, Approach and Performance
Fintech is an Integral Part of Our Life Now
• $24.7 B invested in global fintech companies in 2016
• 1,076 deals in global fintech companies in 2016
• 50.2% of global customers have done business with fintech
• 20% expected ROI on fintech projects
• 20+ global fintech unicorns
• 10K+ global fintech companies
Sources: KPMG, The Pulse of Fintech Q4 2016 | Capgemini World Fintech Report 2017 | PwC Global Fintech Report 2017 | www.forbes.com
Types of Fintech
• Alternative Lending- Kabbage, Lendingclub, Prosper, Zopa
• Payment / Billing Tech- Stripe, Paytm, Adyen, Ant Financial, Square
• Personal Finance / Asset Management- Creditkarma, Bankrate, NerdWallet
• Robo Advisory- Wealthfront, Betterment, NerdWallet
• Blockchain- Abra, 21, coinbase, Ethereum
Kabbage is Blazing a Trail in Big Data & Fintech
Kabbage is more than a lender for small businesses; our data and technology
platform is now being used as a fully branded product by other lenders, and
our products are expanding. We’ve received numerous awards & recognition,
including-
• CNBC Disruptors 50 list
• Inc. 500 list for three consecutive years
• The Forbes Most Promising Companies lists twice
• Glassdoor’s 2017 Best Places to Work list
Fraud Drivers- Superfast Decision Making and Faceless Channels
• Decisioning within a few minutes
• Application on web and mobile
• May have higher exposure to thin-file and new-to-credit applicants
• More prone to invisible window applications
• Unconventional and evolving data sources
Note: Even with these challenges, the fraud rate in the industry is typically less than 20 bps for more data-savvy lenders
How Can Lending Fraud Be Classified?
First Party
• Who commits? Borrower
• How? First Payment Default, Bust Out, Synthetic Identity, Stacking etc.
• Who is the victim? Lender
Second Party
• Who commits? Someone known to the borrower- lead generator, friends, family, employees etc.
• How? Friendly Fraud- someone misuses the trust
• Who is the victim? Borrower, Lender
Third Party
• Who commits? Someone unknown to the borrower
• How? Fraud rings, Identity Theft, Account Takeover
• Who is the victim? Borrower, Lender
Sample Modus Operandi
• Stolen identity
• Synthetic identity
• May replicate best customer (prime and super prime)
• Falsified info
• No willingness to pay
• Acquire multiple loans in a short window (invisible window)
• May provide all info correctly
• More likely to be on higher side in the risk spectrum
• No or low willingness to pay
• Mimic good payment behavior for significant time
• Bust out when gains are highest
Common Fraud Related Terms- http://www.cpp.co.uk/helpful-info/fraud-glossary-of-terms
Current Situation- Heuristics and Regression Driven Approaches
Intuitive
• Manual Reviews
• Experts Driven
• Gut feeling
Heuristics
• Thumb rules
• Driven by past experience
• Quick decision making
Statistical
• Control/ confidence limits
• Outlier detection/ deviation from norm
• Decision tree, regression, time series
Why go Deep? Explosion of Features and Data Sources
10,000+ features from: Unstructured, Transactional, Social, Device & IP, Third Parties, Bureau
• Uncover patterns that are hard to detect with traditional techniques when the incidence rate is low
• Find latent features (super variables) without significant manual feature engineering
• Real time fraud detection and self learning models using streaming data (Kafka, MapR)
• Ensure consistent customer experience and regulatory compliance
• Higher operational efficiency
• Big data and data exhaust handling capabilities
UNSUPERVISED DEEP LEARNING ALGORITHMS AND USE CASES
Find Anomalies- Autoencoder
• Traditional techniques based on density or distance work better with linearly separable data
• Stacked Autoencoders (SAE) and Deep Belief Networks (DBN) make no assumptions about the distribution of data and work better on non-linearly separable data
• They are unsupervised learning algorithms for feature learning, feature reduction and outlier detection
• Input vectors are used as the target output vectors, and the reconstruction error is computed
• Data points with higher reconstruction error (MSE) are more likely to be outliers
• Helps in detecting different modus operandi of fraudsters
Use Case- Deployment of Autoencoder for Credit Card Fraud Detection
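The reconstruction-error idea above can be illustrated with a deliberately tiny sketch. It is not the stacked autoencoder from the slide: it is a linear autoencoder with one hidden unit and tied weights, trained by plain gradient descent in pure Python. The two-feature data, the function names and the learning settings are all made up for illustration.

```python
def train_linear_autoencoder(rows, lr=0.05, epochs=500):
    """Fit a one-hidden-unit linear autoencoder with tied weights.

    Encoder: code = w . x      Decoder: x_hat = code * w
    """
    dim = len(rows[0])
    w = [0.5] * dim  # fixed starting point so the run is repeatable
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in rows:
            code = sum(wi * xi for wi, xi in zip(w, x))
            resid = [xi - code * wi for xi, wi in zip(x, w)]
            resid_dot_w = sum(ri * wi for ri, wi in zip(resid, w))
            for j in range(dim):
                # Gradient of ||x - (w.x) w||^2 with respect to w_j.
                grad[j] += -2.0 * (code * resid[j] + resid_dot_w * x[j])
        # Step along the mean gradient over all rows.
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


def reconstruction_error(w, x):
    """Mean squared error between x and its reconstruction."""
    code = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - code * wi) ** 2 for xi, wi in zip(x, w)) / len(x)


# Hypothetical "normal" behaviour: the second feature is twice the first.
normal = [(t / 10.0, 2.0 * t / 10.0) for t in range(-10, 11)]
w = train_linear_autoencoder(normal)

# A hypothetical record that breaks the usual relationship between the features.
suspicious = (2.0, -1.0)

print("largest error on normal rows:", max(reconstruction_error(w, x) for x in normal))
print("error on suspicious row:", reconstruction_error(w, suspicious))
```

Rows that follow the learned pattern are reconstructed almost exactly, while the record that violates it keeps a large error. In practice the cut-off on the error would have to be chosen from the data; no threshold is suggested here.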
Sequence Analysis- Recurrent Neural Network (LSTM)
• Recurrent Neural Networks (RNNs) are neural networks with recurrent (feedback) connections, unlike plain feed-forward networks. They are used for sequential data analysis where inputs are not independent and are not of fixed length
• In this case, inputs depend on each other along the time dimension. In other words, what happens at time ‘t’ may depend on what happened at times ‘t-1’, ‘t-2’ and so on
• These are also called ‘memory’ networks, as previous inputs and states persist in the model for a better sequential analysis. They can capture both short term and long term time dependence.
• Long Short Term Memory (LSTM) is one of the most popular deep networks used for sequential data analysis.
• More on LSTM here- https://datafai.com/2018/03/08/recurrent-neural-network-rnn-in-python/
Use Case- Use RNN (LSTM) to analyse web behaviour and logs to detect fraudulent behavior
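How a state persists across time steps can be shown with a minimal sketch of a single recurrent unit. This is a plain recurrent cell, not an LSTM, and the weights are hypothetical fixed numbers rather than trained values.

```python
import math


def run_rnn(sequence, w_in=0.5, w_rec=0.8, bias=0.0):
    """Run one recurrent unit over a sequence and return every hidden state.

    h_t = tanh(w_in * x_t + w_rec * h_{t-1} + bias), starting from h_0 = 0.
    """
    h = 0.0
    states = []
    for x_t in sequence:
        h = math.tanh(w_in * x_t + w_rec * h + bias)
        states.append(h)
    return states


# Same events, different order: the final state differs because the cell
# carries its previous state forward.
burst_early = [1.0, 0.0, 0.0]
burst_late = [0.0, 0.0, 1.0]
final_early = run_rnn(burst_early)[-1]
final_late = run_rnn(burst_late)[-1]

# Sequences of a different length need no change to the cell.
longer = [0.0, 1.0, 0.0, 1.0, 1.0]
longer_states = run_rnn(longer)

print(final_early, final_late, len(longer_states))
```

The two three-step sequences contain the same values, yet end in different states, which is the "what happened at t-1 matters" property described above.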
Find Networks - Clique and Links Graphs
Steps: Form Network, Find Commonalities, Detect Fraudulent Cases
• Use a variety of attributes (on-us/ off-us) to build linkage between known bad customers and other customers with unknown status
• The larger the network, the easier the detection, and vice versa
• Overlap networks using enumerative approaches and find commonalities
• Use graph transduction (t-SNE) to detect potential fraudulent cases by doing peer group (archetype) analysis to separate routine behavior from suspicious behavior - “birds of a feather flock together”
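The linkage step can be sketched as a graph search: applications are connected when they share an attribute value, and anything reachable from a known bad case is surfaced for review. The application IDs, attribute names and values below are invented for illustration, and the sketch covers only the linkage, not the peer group analysis.

```python
from collections import defaultdict, deque


def linked_to_known_bad(applications, known_bad):
    """Return IDs reachable from a known-bad ID through shared attribute values."""
    # Index: (attribute name, value) -> application IDs that carry it.
    by_value = defaultdict(set)
    for app_id, attrs in applications.items():
        for name, value in attrs.items():
            by_value[(name, value)].add(app_id)

    # Breadth-first search outward from the known-bad applications.
    queue = deque(b for b in known_bad if b in applications)
    seen = set(queue)
    while queue:
        current = queue.popleft()
        for name, value in applications[current].items():
            for neighbour in by_value[(name, value)]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
    return seen - set(known_bad)


# Hypothetical applications with two linking attributes.
applications = {
    "A1": {"device": "d-100", "phone": "555-0101"},
    "A2": {"device": "d-100", "phone": "555-0102"},
    "A3": {"device": "d-200", "phone": "555-0102"},
    "A4": {"device": "d-300", "phone": "555-0103"},
}
suspects = linked_to_known_bad(applications, known_bad={"A1"})
print(sorted(suspects))
```

Here A2 shares a device with the known bad A1, and A3 shares a phone number with A2, so both are linked; A4 shares nothing and is left alone.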
SUPERVISED DEEP LEARNING ALGORITHMS AND USE CASES
Real Time Detection- Convolutional Neural Network (CNN)
• Convolutional Neural Networks (CNNs) are particularly useful for spatial data analysis, image recognition, computer vision, natural language processing, signal processing and a variety of other purposes. They are biologically motivated by the functioning of neurons in the visual cortex in response to visual stimuli.
• What makes CNNs much more powerful than other feed-forward networks for image recognition is that they do not require as much human intervention or as many parameters as networks such as MLP do. This is primarily driven by the fact that CNNs have neurons arranged in three dimensions, with local connections and shared weights.
• More on CNN here- https://datafai.com/2018/02/25/deep-learning-convolution-neural-network-cnn-in-python/
Use Case- CNN for real time classification
Labeled Data- Multilayer Perceptron (MLP)
• These are the most basic networks and feed the inputs forward to create the output. They consist of an input layer, an output layer, and one or more hidden layers of interconnected neurons in between.
• They can be used for any supervised regression or classification problem
• They generally use a non-linear activation function such as ReLU or tanh, which makes them more suitable for handling non-linear problems. Training minimizes a loss (the difference between the true output and the computed output) such as Mean Square Error (MSE) or logloss.
• We will do an MLP demo on credit card fraud data
MLP Demo- Case Details
• Anonymized credit card transactions data from European customers
• 30 features (28 anonymized, time elapsed, transaction amount)
• Label- fraud or normal transaction
• 17 bps incidence rate for fraudulent transactions
• 284,807 total transactions in data
Sources: http://mlg.ulb.ac.be | https://www.kaggle.com/dalpozz/creditcardfraud
MLP Demo- Tools and Techniques used
• Python 2.7 or 3.6
• Keras 2.0.2
• TensorFlow 1.0.1
MLP Demo- Traditional Modeling Techniques Process
• Manual feature engineering
• After variable treatments, drop variables with little or no explanatory power, based on weight of evidence (WOE), information value (IV) and distribution
• Look at WOE to create bins etc.
[Charts: WOE, Density Distribution]
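As a rough illustration of the WOE and IV screening used in the traditional process, here is a minimal Python 3 sketch. It uses one common convention, WOE = ln(share of normal cases / share of fraud cases) per bin; some texts invert the ratio. The bin labels and counts are hypothetical.

```python
import math


def woe_and_iv(bins):
    """Compute weight of evidence per bin and the variable's information value.

    bins maps a bin label to (normal_count, fraud_count). Every bin is assumed
    to hold at least one normal and one fraud case, otherwise the log fails.
    """
    total_normal = sum(n for n, _ in bins.values())
    total_fraud = sum(f for _, f in bins.values())
    woe = {}
    iv = 0.0
    for label, (n, f) in bins.items():
        share_normal = n / total_normal
        share_fraud = f / total_fraud
        woe[label] = math.log(share_normal / share_fraud)
        iv += (share_normal - share_fraud) * woe[label]
    return woe, iv


# Hypothetical binning of a transaction-amount variable.
amount_bins = {"low": (80, 2), "high": (20, 8)}
woe, iv = woe_and_iv(amount_bins)
print(woe, iv)
```

A variable whose IV is close to zero separates fraud from normal cases poorly and would be a candidate for dropping; the cut-off is a modelling choice and is not specified in the slides.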
MLP Demo- Network Training
Little or No Manual Feature Engineering
• No over or under sampling
• No variables dropped
• Only standardization of features done
• 75% training/ 25% validation
• No manual binning
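The only preparation listed above, standardization and a 75/25 split, can be sketched in pure Python as follows. Fitting the mean and standard deviation on the training part only is an assumption made here; the slides do not say how it was done. The rows are made-up numbers, and labels are left out for brevity.

```python
import random
import statistics


def split_and_standardize(rows, train_share=0.75, seed=0):
    """Shuffle, split into training/validation, and standardize each column."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_share)
    train, valid = rows[:cut], rows[cut:]

    # Column statistics come from the training rows only (an assumption).
    columns = list(zip(*train))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant columns

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stdevs)]

    return [scale(r) for r in train], [scale(r) for r in valid]


# Hypothetical two-feature data set with 100 rows.
rows = [[float(i), float(i % 7) * 10.0] for i in range(100)]
train, valid = split_and_standardize(rows)
print(len(train), len(valid))
```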
Fitted Network
• Multi Layer Perceptron with three hidden layers.
o Activation function = Sigmoid
o # of neurons = 512 in the input layer
o Each subsequent layer has half the neurons
o Cost function = logloss
o Optimizer = adam
o Epochs= 5
o Dropout rate = 30%
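To get a feel for the size of the fitted network, the sketch below derives the layer widths and counts the trainable parameters. It assumes that "512 in the input layer" means the first of the three hidden layers has 512 neurons, that the 30 features feed into it, and that a single output neuron produces the fraud score. These are readings of the slide, not confirmed details. Dropout adds no parameters.

```python
def halving_widths(first_width, n_layers):
    """Widths of successive layers when each has half the neurons of the previous."""
    return [first_width // (2 ** i) for i in range(n_layers)]


def dense_parameter_count(n_inputs, hidden_widths, n_outputs=1):
    """Weights plus biases of a fully connected feed-forward network."""
    sizes = [n_inputs] + list(hidden_widths) + [n_outputs]
    # Each dense layer has fan_in * fan_out weights and fan_out biases.
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


widths = halving_widths(512, 3)
total_parameters = dense_parameter_count(30, widths)
print(widths, total_parameters)
```

Under these assumptions the network has 512, 256 and 128 hidden neurons and roughly 180 thousand trainable parameters.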
MLP Demo- Performance Summary
Metric                        Value
Accuracy Score                99.9%
Logloss                       0.003
Precision Score               77%
Recall Score                  75%
Area Under the Curve (AUC)    87.4%
FScore                        76.5%
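Two quick checks help in reading these numbers. The sketch below recomputes the F-score from the rounded precision and recall, and computes the accuracy of a trivial model that labels every transaction as normal at the 17 bps incidence rate of this data set. The function names are illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


def majority_class_accuracy(incidence_bps):
    """Accuracy of a model that labels every transaction as normal."""
    return 1 - incidence_bps / 10000


f1 = f_score(0.77, 0.75)
baseline_accuracy = majority_class_accuracy(17)
print(round(f1, 3), round(baseline_accuracy, 4))
```

From the rounded inputs the F-score comes out near 76.0%, close to the 76.5% reported; the small gap is plausibly due to rounding of precision and recall. The trivial model already reaches about 99.83% accuracy, so at this incidence rate precision and recall are more informative than the 99.9% accuracy figure.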
MLP Demo- Hyperparameters Optimization
• Epochs = [5, 10, 15, 20, 25, …]
• Batch Size = [5, 10, 20, 30, 40, …]
• Optimizer = [‘SGD’, ‘Adam’, ‘RMSprop’, …]
• Learning Rate = [0.01, 0.05, 0.1, 0.2, …]
• Momentum = [0.2, 0.4, 0.6, …]
• Weight Initialization = [‘Uniform’, ‘Normal’, …]
• Activation Function = [‘relu’, ‘sigmoid’, ‘tanh’, ‘softmax’, …]
• Drop-out rate = [0.0, 0.2, 0.4, 0.5, …]
• Neurons = [5, 10, 20, 30, 40, …]
The scikit-learn grid search function (GridSearchCV) and design of experiments (screening designs, fractional designs) need to be combined with intuition and expertise to come up with the best network!
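The sketch below shows why a full grid is rarely run as is. It enumerates every combination of only the values explicitly listed above (ignoring the "…") and then draws a small random subset. Random sampling is used here as a simple stand-in; it is not a screening or fractional design. The dictionary keys are illustrative names, not arguments of any specific library.

```python
import itertools
import random

grid = {
    "epochs": [5, 10, 15, 20, 25],
    "batch_size": [5, 10, 20, 30, 40],
    "optimizer": ["SGD", "Adam", "RMSprop"],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "momentum": [0.2, 0.4, 0.6],
    "init": ["uniform", "normal"],
    "activation": ["relu", "sigmoid", "tanh", "softmax"],
    "dropout": [0.0, 0.2, 0.4, 0.5],
    "neurons": [5, 10, 20, 30, 40],
}


def all_combinations(grid):
    """Yield one dict per point of the full grid."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def sample_combinations(grid, n, seed=0):
    """Draw n settings by picking each value independently at random."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in grid.items()}
            for _ in range(n)]


n_total = sum(1 for _ in all_combinations(grid))
trial_settings = sample_combinations(grid, 20)
print(n_total, len(trial_settings))
```

Even this short list of values gives 144,000 combinations, each requiring a full training run, which is why the search has to be narrowed with experimental design and judgment.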
Thank You!
Christopher McDougall- “Every morning in Africa, a gazelle wakes up, it knows it must outrun the fastest lion
or it will be killed. Every morning in Africa, a lion wakes up. It knows it must run faster than the slowest
gazelle, or it will starve. It doesn't matter whether you're the lion or a gazelle-when the sun comes up, you'd
better be running.”
Working in fraud analytics is the same way.
Next Webinar : Go-to-market strategy / Planning
Date : 2nd Nov 2018
Speaker: Ashok Munirathinam, Sr. Director, SAP Cloud Platform
SAP Asia Pacific & Japan
Queries: Ankita@nasscom.in
The basics of sentences session 2pptx copy.pptx
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 

Nasscom how can you identify fraud in fintech lending using deep learning

  • 1. How can you Identify Fraud in Fintech Lending using Deep Learning RATNAKAR PANDEY, HEAD OF INDIA ANALYTICS & DATA SCIENCE, KABBAGE Disclaimer: The views expressed here are solely those of the presenter in his private capacity. 16th October 2018
  • 2. Legal Disclaimer: “This series is solely for educational purposes only. This series does not intend to be complete or universal in nature and cannot be considered as an alternative to an expert opinion on any specific issue. The series is based on views of the speaker/facilitator and NASSCOM does not recommend/endorse the view-points per se and is primarily a medium to disseminate knowledge for the greater good of the Products ecosystem. Any attendee who opens or otherwise accesses the content of the series at any point of time, does so at their own risk and acknowledges and agrees that neither NASSCOM and nor its members and affiliates will not be responsible for any loss or damage suffered by any person. The content of this webinar series is solely for the purpose of NASSCOM members and NASSCOM digital channels and any copying/distribution is liable for legal action.”
  • 3. FRAUD IS A BIG PROBLEM ACROSS THE WORLD
  • 4. Outline: Introduction (About Fintech, About Kabbage); Frauds in Fintech Lending (Drivers, Modus Operandi); Need for Deep Learning (Existing Methods, Why Deep Learning?); Suggested Deep Learning Application Areas (Supervised, Unsupervised); Demo of Multilayer Perceptron (MLP) Classification (Case, Approach and Performance)
  • 5. Fintech is an Integral Part of Our Life Now • $24.7 B invested in global fintech companies in 2016 • 1,076 deals in global fintech companies in 2016 • 50.2% of global customers have done business with fintech • 20% expected ROI on fintech projects • 20+ global fintech unicorns • 10K+ global fintech companies. Types of Fintech: Alternative Lending - Kabbage, Lendingclub, Prosper, Zopa; Payment / Billing Tech - Stripe, Paytm, Adyen, Ant Financial, Square; Personal Finance / Asset Management - Creditkarma, Bankrate, NerdWallet; Robo Advisory - Wealthfront, Betterment, NerdWallet; Blockchain - Abra, 21, coinbase, Ethereum. Sources: KPMG, The Pulse of Fintech Q4 2016 | Capgemini World Fintech Report 2017 | PwC Global Fintech Report 2017 | www.forbes.com
  • 6. Kabbage is Blazing a Trail in Big Data & Fintech. Kabbage is more than a lender for small businesses; our data and technology platform is now being used as a fully branded product by other lenders, and our products are expanding. We’ve received numerous awards and recognition, including: • CNBC Disruptors 50 list • Inc. 500 list for three consecutive years • The Forbes Most Promising Companies list, twice • Glassdoor’s 2017 Best Places to Work list
  • 7. Fraud Drivers- Superfast Decision Making and Faceless Channels • Decisioning within a few minutes • Applications on web and mobile • May have higher exposure to thin-file and new-to-credit applicants • More prone to invisible-window applications • Unconventional and evolving data sources. Note: Even with these challenges, the fraud rate in the industry is typically less than 20 bps for the more data-savvy lenders
  • 8. How Can a Lending Fraud be Classified? • First Party - Who commits: the borrower. How: First Payment Default, Bust Out, Synthetic Identity, Stacking, etc. Victim: lender • Second Party - Who commits: someone known to the borrower (lead generator, friends, family, employees, etc.). How: Friendly Fraud, where someone misuses the borrower's trust. Victim: borrower, lender • Third Party - Who commits: someone unknown to the borrower. How: fraud rings, Identity Theft, Account Takeover. Victim: borrower, lender
  • 9. Sample Modus Operandi • Stolen identity • Synthetic identity • May replicate the best customers (prime and super prime) • Falsified info • No willingness to pay • Acquire multiple loans in a short window (invisible window) • May provide all info correctly • More likely to be on the higher side of the risk spectrum • No or low willingness to pay • Mimic good payment behavior for a significant time • Bust out when gains are highest. Common Fraud Related Terms- http://www.cpp.co.uk/helpful-info/fraud-glossary-of-terms
  • 10. Current Situation- Heuristics and Regression Driven Approaches • Intuitive: manual reviews; expert driven; gut feeling • Heuristics: thumb rules; driven by past experience; quick decision making • Statistical: control/confidence limits; outlier detection/deviation from norm; decision tree, regression, time series
  • 11. Why go Deep? Explosion of Features and Data Sources: 10,000+ features drawn from unstructured, transactional, social, device & IP, third-party and bureau data • Uncover patterns that are hard to detect with traditional techniques when the incidence rate is low • Find latent features (super variables) without significant manual feature engineering • Real-time fraud detection and self-learning models using streaming data (Kafka, MapR) • Ensure consistent customer experience and regulatory compliance • Higher operational efficiency • Big data and data exhaust handling capabilities
  • 12. UNSUPERVISED DEEP LEARNING ALGORITHMS AND USE CASES
  • 13. Find Anomalies- Autoencoder • Traditional techniques based on density or distance work better with linearly separable data • Stacked Autoencoders (SAE) and Deep Belief Networks (DBN) make no assumptions about the distribution of the data and work better on non-linearly separable data • They are unsupervised learning algorithms for feature learning, feature reduction and outlier detection • The input vectors are also used as the target output vectors, and the reconstruction error is computed • Data points with a higher reconstruction error (MSE) are more likely to be outliers • Helps in detecting the different modus operandi of fraudsters. Use Case- Deployment of Autoencoder for Credit Card Fraud Detection
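The reconstruction-error idea on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's model: it assumes NumPy, uses synthetic data, and replaces a trained stacked autoencoder with a linear encoder/decoder obtained in closed form from an SVD (the minimum-MSE linear autoencoder spans the top principal components). The names `fit_linear_autoencoder` and `reconstruction_mse` are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, n_hidden):
    """Return (mean, components) of the best linear encoder/decoder.

    Assumption: a linear stand-in for a stacked autoencoder. Its optimal
    weights span the top principal components, so SVD replaces training.
    """
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_hidden]

def reconstruction_mse(X, mean, components):
    """Encode to the bottleneck, decode back, return per-row MSE."""
    codes = (X - mean) @ components.T      # encode
    X_hat = codes @ components + mean      # decode
    return ((X - X_hat) ** 2).mean(axis=1)

# Synthetic data: 200 "normal" rows that vary only in the first two of
# five features, plus one row that breaks that pattern.
rng = np.random.default_rng(0)
normal = np.zeros((200, 5))
normal[:, :2] = rng.normal(size=(200, 2))
normal += 0.01 * rng.normal(size=normal.shape)
outlier = np.array([[0.0, 0.0, 3.0, 0.0, 0.0]])
X = np.vstack([normal, outlier])

mean, components = fit_linear_autoencoder(X, n_hidden=2)
errors = reconstruction_mse(X, mean, components)

# The row with the largest reconstruction error is the top suspect.
most_suspicious = int(np.argmax(errors))
```

In practice the score would be thresholded (for example at a high percentile of `errors`) rather than taking only the single largest value, and a non-linear autoencoder would be trained on the real features.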
  • 14. Sequence Analysis- Recurrent Neural Network (LSTM) • Recurrent Neural Networks (RNNs) differ from feed-forward networks in that they have recurrent connections; they are used for sequential data analysis, where inputs are not independent and are not of fixed length • In this case, inputs depend on each other along the time dimension. In other words, what happens at time ‘t’ may depend on what happened at time ‘t-1’, ‘t-2’ and so on • These are also called ‘memory’ networks, as previous inputs and states persist in the model to support sequential analysis. They can have both short-term and long-term time dependence • Long Short Term Memory (LSTM) is one of the most popular deep networks used for sequential data analysis • More on LSTM here- https://datafai.com/2018/03/08/recurrent-neural-network-rnn-in-python/ Use Case- Use RNN (LSTM) to analyse web behaviour and logs to detect fraudulent behaviour
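The dependence of the state at time ‘t’ on the state at ‘t-1’ can be shown with a single recurrent unit. This is a minimal sketch under stated assumptions: it is a plain (Elman-style) recurrence with scalar weights chosen by hand, not an LSTM (there are no gates), and the weights `w_in` and `w_rec` and the input values are made up for illustration.

```python
import math

def rnn_final_state(sequence, w_in=0.8, w_rec=0.5, h0=0.0):
    """Run h_t = tanh(w_in * x_t + w_rec * h_{t-1}) over the sequence.

    The hidden state h carries information from earlier steps forward,
    which is the 'memory' the slide refers to.
    """
    h = h0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
    return h

# Same three (hypothetical, scaled) values in a different order.
amounts_rising = [0.1, 0.2, 0.9]
amounts_falling = [0.9, 0.2, 0.1]

state_rising = rnn_final_state(amounts_rising)
state_falling = rnn_final_state(amounts_falling)
```

The two sequences contain identical values, yet the final states differ because order matters to a recurrent model; a feed-forward model fed order-free aggregates of these values would not see the difference.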
  • 15. Find Networks - Clique and Link Graphs. Steps: Form Network, Find Commonalities, Detect Fraudulent Cases • Use a variety of attributes (on-us/off-us) to build linkages between known bad customers and other customers with unknown status • The larger the network, the easier the detection, and vice versa • Overlap networks using enumerative approaches and find commonalities • Use graph transduction (t-SNE) to detect potential fraudulent cases by doing peer group (archetype) analysis to separate routine behavior from suspicious behavior - “birds of a feather flock together”
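The first step on this slide, linking known bad customers to customers of unknown status through shared attributes, can be sketched with a union-find over attribute values. This is a minimal sketch: the attribute names (`device`, `phone`), the application ids and the helper names are hypothetical, and real linkage would need fuzzy matching and rules for attributes that are legitimately shared.

```python
from collections import defaultdict

def linked_groups(applications):
    """Group application ids that share any attribute value (union-find)."""
    parent = {app_id: app_id for app_id in applications}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    first_seen = {}  # (attribute, value) -> first application that used it
    for app_id, attrs in applications.items():
        for key_value in attrs.items():
            if key_value in first_seen:
                parent[find(app_id)] = find(first_seen[key_value])
            else:
                first_seen[key_value] = app_id

    groups = defaultdict(set)
    for app_id in applications:
        groups[find(app_id)].add(app_id)
    return list(groups.values())

def suspects(applications, known_bad):
    """Return unknown-status applications linked to any known bad one."""
    flagged = set()
    for group in linked_groups(applications):
        if group & known_bad:
            flagged |= group - known_bad
    return flagged

# Hypothetical applications: A1-A2 share a device, A2-A3 share a phone.
applications = {
    "A1": {"device": "d-17", "phone": "555-0101"},
    "A2": {"device": "d-17", "phone": "555-0102"},
    "A3": {"device": "d-42", "phone": "555-0102"},
    "A4": {"device": "d-99", "phone": "555-0199"},
}
flagged = suspects(applications, known_bad={"A1"})
```

Here A3 never shares anything directly with the known bad A1, but it is still flagged because it is connected through A2, which is the point of forming the network before looking for commonalities.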
  • 16. SUPERVISED DEEP LEARNING ALGORITHMS AND USE CASES
  • 17. Real Time Detection- Convolutional Neural Network (CNN) • Convolutional Neural Networks (CNNs) are particularly useful for spatial data analysis, image recognition, computer vision, natural language processing, signal processing and a variety of other purposes. They are biologically motivated by how neurons in the visual cortex respond to visual stimuli. • What makes CNNs much more powerful than other feed-forward networks for image recognition is that they do not require as much human intervention or as many parameters as networks such as the MLP. This is primarily because CNNs arrange neurons in three dimensions and share weights across local regions of the input. • More on CNN here- https://datafai.com/2018/02/25/deep-learning-convolution-neural-network-cnn-in-python/ Use Case- CNN for real time classification
  • 18. Labeled Data- Multilayer Perceptron (MLP) • These are the most basic networks; they feed the inputs forward to create the output. They consist of an input layer, an output layer, and one or more hidden layers of interconnected neurons in between. • They can be used for any supervised regression or classification problem • Because they generally use a non-linear activation function such as ReLU or tanh, they are well suited to non-linear problems. Training minimizes a loss (the difference between the true output and the computed output) such as Mean Squared Error (MSE) or logloss. • We will do an MLP demo on credit card fraud data
  • 19. MLP Demo- Case Details • Anonymized credit card transaction data from European customers • 30 features (28 anonymized, time elapsed, transaction amount) • Label- fraud or normal transaction • 17 bps incidence rate for fraudulent transactions • 284,807 total transactions in the data. Sources: http://mlg.ulb.ac.be | https://www.kaggle.com/dalpozz/creditcardfraud
  • 20. MLP Demo- Tools and Techniques Used: Python 2.7 or 3.6; Keras 2.0.2; TensorFlow 1.0.1
  • 21. MLP Demo- Traditional Modeling Techniques Process: Manual Feature Engineering • After variable treatments, drop variables with little or no explanatory power, judged by Weight of Evidence (WOE), Information Value (IV) and distribution • Look at WOE to create bins, etc. [Figures: density distribution; WOE]
  • 22. MLP Demo- Network Training. Little or No Manual Feature Engineering • No over- or under-sampling • No variables dropped • Only standardization of features done • 75% training / 25% validation • No manual binning. Fitted Network • Multilayer Perceptron with three hidden layers o Activation function = sigmoid o # of neurons = 512 in the input layer o Each subsequent layer has half the neurons o Cost function = logloss o Optimizer = adam o Epochs = 5 o Dropout rate = 30%
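The layer layout on this slide can be made concrete with a forward pass. This is a sketch under several assumptions, not the demo's code: it uses NumPy rather than the Keras/TensorFlow stack from the deck, it reads "512 in the input layer" as the width of the first hidden layer (so the three hidden layers are 512, 256 and 128 wide), it applies dropout after every hidden layer, and it runs on random synthetic inputs with untrained weights, so no training loop or adam optimizer appears.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_sizes(n_features, first_width=512, n_hidden=3):
    """Input size, hidden widths that halve each layer, one output unit."""
    widths = [first_width // (2 ** i) for i in range(n_hidden)]
    return [n_features] + widths + [1]

def init_weights(sizes, rng):
    """Small random weights and zero biases for each pair of layers."""
    return [(rng.normal(0.0, 0.05, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(X, weights, dropout_rate=0.0, rng=None):
    """Sigmoid MLP forward pass; dropout only on hidden layers."""
    a = X
    for i, (W, b) in enumerate(weights):
        a = sigmoid(a @ W + b)
        is_hidden = i < len(weights) - 1
        if is_hidden and dropout_rate > 0.0 and rng is not None:
            keep = rng.random(a.shape) >= dropout_rate
            a = a * keep / (1.0 - dropout_rate)  # inverted dropout
    return a.ravel()

def logloss(y_true, p, eps=1e-12):
    """Binary cross-entropy, the cost function named on the slide."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 30))                  # synthetic stand-in, 30 features
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardization only
y = np.array([0, 0, 0, 0, 0, 0, 0, 1])        # made-up labels

sizes = layer_sizes(n_features=30)            # [30, 512, 256, 128, 1]
weights = init_weights(sizes, rng)
probs = forward(X, weights)                   # inference: dropout off
loss = logloss(y, probs)
```

During training, `forward` would be called with `dropout_rate=0.3` and a random generator, and the weights would be updated by the optimizer for five epochs on the 75% training split.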
  • 23. MLP Demo- Performance Summary (Metric: Value) • Accuracy Score: 99.9% • Logloss: 0.003 • Precision Score: 77% • Recall Score: 75% • Area Under the Curve (AUC): 87.4% • F-score: 76.5%
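A short calculation helps put these metrics in context. It uses only figures from the case-details and performance slides (17 bps incidence, 284,807 transactions, 77% precision, 75% recall); the variable names are illustrative, and the fraud count is an estimate from the rounded incidence rate rather than the dataset's actual count.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

total_transactions = 284_807
incidence_bps = 17                     # 1 bps = 0.01% = 1 / 10,000

# Roughly how many fraudulent rows a 17 bps rate implies.
expected_fraud = round(total_transactions * incidence_bps / 10_000)

# Accuracy of a useless model that labels every transaction "normal".
baseline_accuracy = 1 - incidence_bps / 10_000

# F-score implied by the rounded precision and recall on the slide.
f1 = f1_score(0.77, 0.75)
```

The trivial baseline already reaches about 99.83% accuracy, so the 99.9% accuracy figure says little on its own; precision, recall and the F-score are the informative numbers at this incidence rate. The F-score computed from the rounded inputs is about 76.0%, slightly below the reported 76.5%, which may simply reflect rounding of the precision and recall shown.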
  • 24. MLP Demo- Hyperparameter Optimization • Epochs = [5,10,15,20,25…] • Batch Size = [5,10,20,30,40…] • Optimizer = [‘SGD’, ‘Adam’, ‘RMSprop’…] • Learning Rate = [0.01,0.05,0.1,0.2…] • Momentum = [0.2,0.4,0.6,…] • Weight Initialization = [‘Uniform’, ‘Normal’, …] • Activation Function = [‘relu’, ‘sigmoid’, ‘tanh’, ‘softmax’,…] • Drop-out rate = [0.0,0.2,0.4,0.5,…] • Neurons = [5,10,20,30,40…] Grid search in Python scikit-learn (GridSearchCV) and design of experiments (screening designs, fractional designs) need to be combined with intuition and expertise to come up with the best network!
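The size of this search space is why the slide pairs grid search with design of experiments and judgement. The sketch below counts the full grid using only the values explicitly listed on the slide (the elided "…" entries are left out) and then draws a small random subset as a cheap first pass. Random sampling is used here as a simple stand-in and is not a screening or fractional factorial design; the dictionary keys and helper names are hypothetical.

```python
import itertools
import random

# Only the values written out on the slide; the "…" entries are omitted.
grid = {
    "epochs": [5, 10, 15, 20, 25],
    "batch_size": [5, 10, 20, 30, 40],
    "optimizer": ["SGD", "Adam", "RMSprop"],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "momentum": [0.2, 0.4, 0.6],
    "init": ["uniform", "normal"],
    "activation": ["relu", "sigmoid", "tanh", "softmax"],
    "dropout": [0.0, 0.2, 0.4, 0.5],
    "neurons": [5, 10, 20, 30, 40],
}

def full_grid(grid):
    """Lazily yield every combination as a dict (exhaustive grid search)."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def grid_size(grid):
    """Number of combinations without enumerating them."""
    size = 1
    for values in grid.values():
        size *= len(values)
    return size

def random_subset(grid, n, seed=0):
    """Sample n configurations independently per hyperparameter."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in grid.items()} for _ in range(n)]

n_full = grid_size(grid)               # 144,000 combinations
trial_configs = random_subset(grid, n=20)
first_config = next(full_grid(grid))
```

Even with the elided values excluded, an exhaustive search would mean 144,000 network fits before cross-validation, so narrowing the grid with expertise or a designed experiment first is the practical route.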
  • 25. Thank You! Christopher McDougall- “Every morning in Africa, a gazelle wakes up. It knows it must outrun the fastest lion or it will be killed. Every morning in Africa, a lion wakes up. It knows it must run faster than the slowest gazelle, or it will starve. It doesn't matter whether you're the lion or a gazelle - when the sun comes up, you'd better be running.” Working in fraud analytics is the same way.
  • 26. Next Webinar: Go-to-market strategy / Planning. Date: 2nd Nov 2018. Speaker: Ashok Munirathinam, Sr. Director, SAP Cloud Platform, SAP Asia Pacific & Japan. Queries: Ankita@nasscom.in