A.I. for Dynamic Decisioning under Uncertainty
For Real-World Problems in Retail & Financial Trading
Ashwin Rao
VP Data Science at Target & Adjunct Faculty at Stanford
November 28, 2018
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 1 / 35
A.I. for Dynamic Decisioning under Uncertainty
Let’s look at some terms we use to characterize this branch of A.I.
Stochastic: Uncertainty in key quantities, evolving over time
Optimization: A well-defined metric to be maximized (“The Goal”)
Dynamic: Decisions need to be a function of the changing situation
Control: Overpower uncertainty by persistent steering towards goal
Jargon overload due to confluence of Control Theory, O.R. and A.I.
For language clarity, let’s just refer to this area as Stochastic Control
I was introduced to this area through Inventory Control
At Stanford, I teach a course on A.I. for Stochastic Control in Finance
I’m developing an educational codebase for Stochastic Control
Overarching Goal: Blend Theory, Algorithms & Real-World Modeling
Overview
1 The Framework of Stochastic Control
2 Core Problem in Retail: Inventory Control
3 Core Problem in Finance: Portfolio Optimization/Asset Allocation
4 Quick look at a few other problems in Retail and Finance
5 Perspective from the Trenches
The Stochastic Control Framework
Components of the Framework
The Agent and the Environment interact in a time-sequenced loop
Agent responds to [State, Reward] by taking an Action
Environment responds by producing next step’s (random) State
Environment also produces a (random) number denoted as Reward
Goal of Agent is to maximize Expected Sum of all future Rewards
By controlling the (Policy : State → Action) function
This is a dynamic (time-sequenced control) system under uncertainty
Formally known as a Markov Decision Process (MDP)
Formal MDP Framework
The following notation is for discrete time steps. Continuous-time
formulation is analogous (often involving Stochastic Calculus)
States St ∈ S where S is the State Space
Actions At ∈ A where A is the Action Space
Rewards Rt ∈ R denoting numerical feedback
Transitions p(s′, r | s, a) = Pr{St+1 = s′, Rt+1 = r | St = s, At = a}
γ ∈ [0,1] is the Discount Factor for Reward when defining Return
Return Gt = Rt+1 + γ ⋅ Rt+2 + γ² ⋅ Rt+3 + ...
Policy π(a | s) is the probability that the Agent takes action a in state s
The goal is to find a policy that maximizes E[Gt | St = s] for all s ∈ S
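As a minimal illustration (a sketch, not from the slides), the Return for a sampled reward sequence can be computed by backward accumulation:

```python
# Discounted Return G_t = R_{t+1} + gamma*R_{t+2} + gamma^2*R_{t+3} + ...
def discounted_return(rewards, gamma):
    g = 0.0
    for r in reversed(rewards):  # accumulate from the end of the episode
        g = r + gamma * g
    return g

rewards = [1.0, 0.0, 2.0]                     # R_{t+1}, R_{t+2}, R_{t+3}
print(discounted_return(rewards, gamma=0.9))  # 1 + 0.9*0 + 0.81*2 = 2.62
```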
Many real-world problems fit this MDP framework
Self-driving vehicle (speed/steering to optimize safety/time)
Game of Chess (Boolean Reward at end of game)
Complex Logistical Operations (eg: movements in a Warehouse)
Make a humanoid robot walk/run on difficult terrains
Manage an investment portfolio
Control a power station
Optimal decisions during a football game
Strategy to win an election (high-complexity MDP)
Why are these problems hard?
State space can be large or complex (involving many variables)
Sometimes, Action space is also large or complex
No direct feedback on “correct” Actions (only feedback is Reward)
Actions can have delayed consequences (late Rewards)
Time-sequenced complexity (eg: Actions influence future Actions)
Agent often doesn’t know the Model of the Environment
“Model” refers to probabilities of state-transitions and rewards
So, Agent has to learn the Model AND solve for the Optimal Policy
Value Function and Bellman Equations
Value Function (under policy π): Vπ(s) = E[Gt | St = s] for all s ∈ S
Vπ(s) = ∑a π(a | s) ∑s′,r p(s′, r | s, a) ⋅ (r + γ ⋅ Vπ(s′)) for all s ∈ S
Optimal Value Function V∗(s) = maxπ Vπ(s) for all s ∈ S
V∗(s) = maxa ∑s′,r p(s′, r | s, a) ⋅ (r + γ ⋅ V∗(s′)) for all s ∈ S
There exists an Optimal Policy π∗ achieving V∗(s) for all s ∈ S
The above recursive equations are called Bellman equations
In continuous time, referred to as Hamilton-Jacobi-Bellman (HJB)
The algorithms based on Bellman equations are broadly classified as:
Dynamic Programming
Reinforcement Learning
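To make the Bellman Optimality Equation concrete, here is Value Iteration (a basic DP algorithm) on a tiny hypothetical MDP; the transition probabilities and rewards are invented purely for illustration:

```python
# Value Iteration: V(s) <- max_a sum_{s',r} p(s',r|s,a) * (r + gamma * V(s'))
# Tiny hypothetical MDP: states {0,1}, actions {0,1};
# p maps (s, a) -> list of (probability, next_state, reward).
p = {
    (0, 0): [(1.0, 0, 0.0)],
    (0, 1): [(0.8, 1, 1.0), (0.2, 0, 0.0)],
    (1, 0): [(1.0, 1, 2.0)],
    (1, 1): [(1.0, 0, 0.0)],
}
gamma = 0.9
V = {0: 0.0, 1: 0.0}
for _ in range(500):  # repeatedly apply the Bellman Optimality operator
    V = {s: max(sum(pr * (r + gamma * V[s2]) for pr, s2, r in p[(s, a)])
                for a in (0, 1))
         for s in (0, 1)}
print(V)  # V[1] -> 2/(1-0.9) = 20; V[0] -> (0.8*(1+0.9*20))/(1-0.18) ≈ 18.54
```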
Dynamic Programming versus Reinforcement Learning
When Model is known ⇒ Dynamic Programming (DP)
DP Algorithms take advantage of knowledge of probabilities
So, DP Algorithms do not require interaction with the environment
When Model is unknown ⇒ Reinforcement Learning (RL)
RL Algorithms interact with the Environment and incrementally learn
Environment interaction could be real interaction or a simulator
RL approach: Try different actions & learn what works, what doesn’t
RL Algorithms’ key challenge is to tradeoff “explore” versus “exploit”
DP or RL, a good approximation of the Value Function is vital to success
Deep Neural Networks are typically used for function approximation
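By contrast with the DP sketch above, a minimal tabular Q-learning sketch (one standard RL algorithm; the simulator and all parameters are invented) learns from interaction alone, trading off "explore" versus "exploit" via ε-greedy action selection:

```python
import random
random.seed(0)

# Hypothetical simulator: 2 states, 2 actions; step(s, a) -> (next_state, reward).
def step(s, a):
    if (s, a) == (0, 1):
        return (1, 1.0) if random.random() < 0.8 else (0, 0.0)
    if (s, a) == (1, 0):
        return (1, 2.0)
    return (0, 0.0)          # (0,0) stays at 0; (1,1) jumps back to 0

gamma, alpha, eps = 0.9, 0.1, 0.2
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
s = 0
for _ in range(20000):
    # epsilon-greedy: explore with prob eps, else exploit current Q estimates
    a = random.choice((0, 1)) if random.random() < eps \
        else max((0, 1), key=lambda a: Q[(s, a)])
    s2, r = step(s, a)
    # Q-learning update towards the sampled Bellman Optimality target
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
    s = s2
print(max((0, 1), key=lambda a: Q[(1, a)]))  # greedy action learned in state 1
```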
Inventory Control starts with the Newsvendor Problem
Newsvendor problem is a single-period Inventory Control problem
Daily demand for newspapers is a random variable x
The newsvendor has an estimate of the PDF f (x) of daily demand
For each newspaper that stays unsold, we suffer an Excess cost h
Think of h as the purchase price minus salvage price
For each newspaper we’re short on, we suffer a Deficit Cost p
Think of p as the missed profits (sale price minus purchase price)
But p should also include potential loss of future customers
What is the optimum # of newspapers to bring in the morning?
To minimize the expected cost (function of f , h and p)
The Newsvendor Problem
Solution to the Newsvendor problem
For tractability, we assume the newspaper quantity is a continuous variable x
Then, we need to solve for the optimal supply S that minimizes the expected cost:
g(S) = h ⋅ ∫₀^S (S − x) ⋅ f(x) ⋅ dx + p ⋅ ∫_S^∞ (x − S) ⋅ f(x) ⋅ dx
Setting g′(S) = 0, we get:
Optimal Supply S∗ = F⁻¹(p / (p + h))
where F(y) = ∫₀^y f(x) ⋅ dx is the CDF of daily demand
p / (p + h) is known as the critical fractile
It is the fraction of days the newsvendor does not go “out-of-stock” (the stock-out fraction is h / (p + h))
Assuming the newsvendor always brings this optimal supply S∗
Solution details and connections with Financial Options Pricing here
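A sketch of the critical-fractile formula, assuming (purely for illustration) Normal daily demand so that F⁻¹ is available in the Python standard library:

```python
from statistics import NormalDist

def newsvendor_optimal_supply(mu, sigma, p, h):
    """S* = F^{-1}(p / (p + h)) for the assumed Normal demand CDF F."""
    return NormalDist(mu, sigma).inv_cdf(p / (p + h))

# Example: mean demand 100, std dev 20, deficit cost 3, excess cost 1
# Critical fractile = 3/4, so stock to the 75th percentile of demand
print(round(newsvendor_optimal_supply(100, 20, p=3.0, h=1.0), 2))  # ≈ 113.49
```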
Multi-period: Single-store, Single-item Inventory Control
The store experiences random daily demand given by PDF f (x)
The store can order daily from a supplier carrying infinite inventory
There’s a cost associated with ordering, and order arrives in L days
Like newsvendor, there’s an Excess Cost h and Deficit Cost p
This is an MDP where State is current Inventory Level at the store
State also includes current in-transit inventory (from supplier)
Action is quantity to order in any given State
Reward function has h, p (just like newsvendor), and ordering cost
Transition probabilities are governed by demand distribution f (x)
This has a closed-form solution, similar to the newsvendor formula
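A minimal simulator for a policy of this shape (a hypothetical order-up-to rule with zero lead time and invented demand/cost parameters), useful for sanity-checking average daily cost:

```python
import random
random.seed(42)

def simulate_order_up_to(S, days=10000, mean_demand=10.0, h=1.0, p=4.0, K=2.0):
    """Average daily cost of the policy 'order up to level S each morning'.
    h: excess (holding) cost, p: deficit cost, K: fixed ordering cost."""
    inv, total = 0.0, 0.0
    for _ in range(days):
        if inv < S:                       # place an order to raise inventory to S
            total += K
            inv = S
        demand = random.expovariate(1.0 / mean_demand)
        if demand <= inv:
            total += h * (inv - demand)   # leftover units incur excess cost
            inv -= demand
        else:
            total += p * (demand - inv)   # unmet demand incurs deficit cost
            inv = 0.0
    return total / days

# A crude 1-D policy search over order-up-to levels for this toy setting
for S in (10, 14, 18, 22):
    print(S, round(simulate_order_up_to(S), 2))
```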
Optimal Policy: Order Point and Order-Up-To Level
Adding real-world frictions and constraints
Inventory is integer-valued, and orders are in casepack units
Excess/Deficit costs are not linear functions of inventory
Perishability, Obsolescence, End-of-season involve big costs
Need to factor in labor costs of handling cases and singles
Limits on shipping/receiving dates/times
Often, there is a constraint on minimum presentation quantities
Store inventory cannot exceed a threshold (eg: Shelf space)
Supplier has constraints on min and max order quantities
Uncertainty with the time for order arrival
There are approximate closed-form solutions in some cases
But general case requires generic DP or RL Algorithms
Multi-node and Multi-item Inventory Control
In practice, Inventory flows through a network of warehouses
From source (suppliers) to destination (stores or homes)
So, we have to solve a multi-“node” Inventory Control problem
State is joint inventory across all nodes (and between nodes)
Action is recommended movements of inventory between nodes
Reward is the aggregate of daily costs across the network
In addition, we have multi-item constraints
Space and Throughput constraints are multi-item constraints
So, real-world problem is multi-node and multi-item (giant MDP)
Single-period Portfolio Optimization
We start with a simple single-period portfolio optimization problem
The simple setup helps develop understanding of the core concepts
Assume we have one risky asset and one riskless asset
The return (over the single period) of the risky asset is N(µ, σ²)
The return of the riskless asset is deterministic (= r < µ)
Start with $1, aim to maximize our period-ending Expected Wealth
This means we invest fully in the risky asset (since µ > r)
But people are risk-averse and will trade higher returns for lower risk
The exact Risk-Return tradeoff is specified through Utility of Wealth
Utility is a concave function of Wealth describing Risk-Aversion
For an intro to Risk-Aversion and Utility Theory, see here
The goal is to maximize period-ending Expected Utility of Wealth
Concave Utility of Consumption ⇒ E[U(x)] < U(E[x])
Solution to Single-period Portfolio Optimization
W denotes end-of-period Wealth
α denotes fraction to invest in risky asset (1 − α is fraction in riskless)
W ∼ N(1 + r + α(µ − r), α²σ²)
Let the Utility of Wealth function be U(W) = −e^(−βW) for β > 0
where β is the coefficient (extent) of Risk-Aversion
So we maximize over α the Expected Utility:
E[−e^(−βW)] = −e^(−β(1+r+α(µ−r)) + β²α²σ²/2) = g(α)
Setting g′(α) = 0, we get:
α∗ = (µ − r) / (βσ²)
This is the fundamental investment fraction in Portfolio Optimization
This fraction generalizes to multi-period and multiple risky assets
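The closed form can be sanity-checked numerically against a brute-force search over α (a sketch; all parameter values invented for illustration):

```python
import math

def expected_utility(alpha, mu, r, sigma, beta):
    # E[-exp(-beta*W)] for W ~ N(1 + r + alpha*(mu - r), alpha^2 * sigma^2)
    return -math.exp(-beta * (1 + r + alpha * (mu - r))
                     + beta**2 * alpha**2 * sigma**2 / 2)

mu, r, sigma, beta = 0.08, 0.02, 0.2, 2.0
alpha_star = (mu - r) / (beta * sigma**2)       # closed form: 0.75
best = max((a / 1000 for a in range(2001)),     # grid search over [0, 2]
           key=lambda a: expected_utility(a, mu, r, sigma, beta))
print(alpha_star, best)                         # both ≈ 0.75
```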
Multi-Period: Merton’s Portfolio Optimization Problem
You will live for (deterministic) T more years (say 20 years)
Current Wealth is W0 (say $ 10 million)
Assume you have no further income and no debts
You can invest in (allocate to) n risky assets and a riskless asset
Each asset has known normal distribution of returns
Allowed to long or short any fractional quantities of assets
Trading in continuous time 0 ≤ t < T, with no transaction costs
You can consume any fractional amount of wealth at any time
Dynamic Decision: Optimal Allocation and Consumption at each time
To maximize lifetime-aggregated Utility of Consumption
Consumption Utility assumed to have constant Relative Risk-Aversion
Problem Notation
For ease of exposition, we state the problem for 1 risky asset
Riskless asset: dRt = r ⋅ Rt ⋅ dt
Risky asset: dSt = µ ⋅ St ⋅ dt + σ ⋅ St ⋅ dzt (i.e. Geometric Brownian)
µ > r > 0,σ > 0 (for n assets, we work with a covariance matrix)
Wealth at time t is denoted by Wt > 0
Fraction of wealth allocated to risky asset denoted by π(t,Wt)
Fraction of wealth in riskless asset will then be 1 − π(t,Wt)
Wealth consumption denoted by c(t,Wt) ≥ 0
Utility of Consumption function U(x) = x^(1−β) / (1 − β) for 0 < β ≠ 1
β = (constant) Relative Risk-Aversion = −x ⋅ U′′(x) / U′(x)
Continuous-Time Stochastic Control Problem
Balance constraint implies the following process for Wealth Wt
dWt = ((πt ⋅ (µ − r) + r) ⋅ Wt − ct) ⋅ dt + πt ⋅ σ ⋅ Wt ⋅ dzt
At any time t, determine optimal [π(t,Wt),c(t,Wt)] to maximize:
E[∫_t^T γ^(s−t) ⋅ (c_s^(1−β) / (1 − β)) ⋅ ds | Wt]
where 0 ≤ γ ≤ 1 is the annualized discount factor
Think of this as a continuous-time Stochastic Control problem
State is (t,Wt), Action is [πt,ct], Reward per unit time is U(ct)
Stochastic process for Wt above defines the transition probabilities
Find Policy (t,Wt) → [πt,ct] that maximizes the Expected Return
Note: ct ≥ 0, but πt is unconstrained
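The wealth process above can be simulated with a simple Euler–Maruyama discretization (a sketch; constant allocation π and proportional consumption ct = k ⋅ Wt, with all parameter values invented):

```python
import random
random.seed(7)

# Euler–Maruyama discretization of
#   dW = ((pi*(mu - r) + r)*W - c) dt + pi*sigma*W dz,  with c = k*W
mu, r, sigma, pi, k = 0.08, 0.02, 0.2, 0.5, 0.03
W, dt, T = 1.0, 1 / 252, 10.0      # start with wealth 1, daily steps, 10 years
t = 0.0
while t < T:
    dz = random.gauss(0.0, dt**0.5)                       # Brownian increment
    W += ((pi * (mu - r) + r) * W - k * W) * dt + pi * sigma * W * dz
    t += dt
print(round(W, 3))                 # one sample path's terminal wealth
```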
Outline of Solution
Optimal Value Function V∗(t, Wt) has a recursive formulation:
V∗(t, Wt) = maxπ,c E[∫_t^t₁ γ^s ⋅ (c_s^(1−β) / (1 − β)) ⋅ ds + V∗(t₁, W_t₁)]
Re-expressed as a Hamilton-Jacobi-Bellman (HJB) formulation:
maxπt,ct E[dV∗(t, Wt) + γ^t ⋅ (c_t^(1−β) / (1 − β)) ⋅ dt] = 0
Standard HJB calculus (Ito’s Lemma followed by partial derivatives w.r.t. πt, ct) gives us a PDE for V∗(t, Wt)
We can reduce the PDE to a tractable ODE with a guessed solution
All the gory details in this lecture
This solution outline is broadly applicable to continuous-time problems
Gaining Insights into the Solution
Optimal Allocation π∗(t, Wt) = (µ − r) / (σ²β) (independent of t and Wt)
Optimal Fractional Consumption c∗(t, Wt) / Wt depends only on t (= f(t))
With Optimal Allocation & Consumption, the Wealth process is:
dWt / Wt = (r + (µ − r)² / (σ²β) − f(t)) ⋅ dt + ((µ − r) / (σβ)) ⋅ dzt
Expected Portfolio Return is constant over time (= r + (µ − r)² / (σ²β))
Fractional Consumption f(t) increases over time
Expected Rate of Wealth Growth r + (µ − r)² / (σ²β) − f(t) decreases over time
If r + (µ − r)² / (σ²β) > f(0), we start by Consuming < Expected Portfolio Growth and over time, we Consume > Expected Portfolio Growth
Wealth Growth Volatility is constant (= (µ − r) / (σβ))
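Plugging illustrative numbers (invented, not from the slides) into these closed-form quantities:

```python
mu, r, sigma, beta = 0.08, 0.03, 0.25, 2.0   # hypothetical market / risk-aversion

pi_star = (mu - r) / (sigma**2 * beta)             # constant optimal allocation
exp_return = r + (mu - r)**2 / (sigma**2 * beta)   # constant expected portfolio return
vol = (mu - r) / (sigma * beta)                    # constant wealth growth volatility
print(round(pi_star, 3), round(exp_return, 3), round(vol, 3))  # 0.4 0.05 0.1
```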
Portfolio Return versus Consumption Rate
Porting this to Real-World Portfolio Optimization
Analytical tractability in Merton’s formulation was due to:
Normal distribution of asset returns
Constant Relative Risk-Aversion
Frictionless, continuous trading
However, real-world situation involves:
Discrete amounts of assets to hold and discrete quantities of trades
Transaction costs
Locked-out days for trading
Non-stationary/arbitrary/correlated processes of multiple assets
Changing/uncertain risk-free rate
Consumption constraints
Arbitrary Risk-Aversion/Utility specification
⇒ Approximate Dynamic Programming or Reinforcement Learning
Large Action Space points to Policy Gradient Algorithms
Overview of a few other problems
Financial Trading: American Options Pricing
Financial Trading: Trade Order Execution
Retail: Clearance Pricing
American Options Pricing
American option can be exercised anytime before option maturity
Key decision at any time is to exercise or continue
The default algorithm is Backward Induction on a tree/grid
But it doesn’t work for path-dependent options
Also, it’s not feasible when state dimension is large
Industry-Standard: Longstaff-Schwartz’s simulation-based algorithm
RL is an attractive alternative to Longstaff-Schwartz
RL is straightforward once Optimal Exercise is modeled as an MDP
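For reference, the Backward Induction baseline on a (CRR) binomial tree for an American put (a standard textbook sketch; the parameter values are invented):

```python
import math

def american_put_binomial(S0, K, r, sigma, T, n):
    """Backward Induction on a Cox-Ross-Rubinstein binomial tree."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1 / u
    q = (math.exp(r * dt) - d) / (u - d)          # risk-neutral up-probability
    disc = math.exp(-r * dt)
    # option values at maturity (node j = number of up-moves)
    vals = [max(K - S0 * u**j * d**(n - j), 0.0) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):                # step backwards through the tree
        vals = [max(K - S0 * u**j * d**(i - j),                    # exercise now
                    disc * (q * vals[j + 1] + (1 - q) * vals[j]))  # or continue
                for j in range(i + 1)]
    return vals[0]

print(round(american_put_binomial(S0=100, K=100, r=0.05, sigma=0.2, T=1.0, n=500), 2))
```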
MDP for Optimal Options Exercise
State is [Current Time, History of Underlying Security Prices]
Action is Boolean: Exercise (i.e., Payoff and Stop) or Continue
Reward always 0, except upon Exercise (= Payoff)
State-transitions governed by Underlying Prices’ Stochastic Process
Optimal Policy ⇒ Optimal Stopping ⇒ Option Price
All the details in this lecture
Can be generalized to other Optimal Stopping problems
Optimal Trade Order Execution (controlling Price Impact)
You are tasked with selling a large quantity of a (relatively illiquid) stock
You have a fixed horizon over which to complete the sale
The goal is to maximize aggregate sales proceeds over the horizon
If you sell too fast, Price Impact will result in poor sales proceeds
If you sell too slow, you risk running out of time
We need to model temporary and permanent Price Impacts
Objective should incorporate penalty for variance of sales proceeds
Which is equivalent to maximizing aggregate Utility of sales proceeds
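A toy calculation (linear temporary impact with invented parameters, ignoring permanent impact and the variance penalty) showing why spreading the sale helps:

```python
def proceeds(schedule, price=100.0, theta=0.001):
    """Sale proceeds under a hypothetical linear temporary impact:
    selling n shares in one slice executes at (price - theta * n)."""
    return sum(n * (price - theta * n) for n in schedule)

X, T = 10_000, 10
all_at_once = proceeds([X])          # dump everything in one slice
uniform = proceeds([X / T] * T)      # spread uniformly across the horizon
print(all_at_once, uniform)          # uniform schedule yields higher proceeds
```

With linear impact, proceeds lose theta ⋅ Σnₜ², so (by convexity) the uniform schedule maximizes risk-neutral proceeds; the variance penalty is what pushes real solutions away from uniform.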
MDP for Optimal Trade Order Execution
State is [Time Remaining, Stock Remaining to be Sold, Market Info]
Action is Quantity of Stock to Sell at current time
Reward is Utility of Sales Proceeds (i.e., Variance-adjusted-Proceeds)
Reward & State-transitions governed by Price Impact Model
Real-world Model can be quite complex (Order Book Dynamics)
Clearance Pricing
You are a few weeks away from end-of-season (eg: Christmas Trees)
Assume you have too much inventory in your store
What is the optimal sequence of price markdowns?
So as to maximize your total profit (sales revenue minus costs)
Note: There is a non-trivial cost of performing a markdown
If price markdowns are small, we end up with surplus at season-end
Surplus often needs to be disposed at poor salvage price
If price reductions are large, we run out of Christmas trees early
“Stockout” cost is considered to be large during holiday season
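A tiny finite-horizon DP sketch of this setting (the demand model, allowed prices, and all costs are invented for illustration):

```python
from functools import lru_cache
import math

# Hypothetical toy model: daily demand is Poisson, with mean rising as price falls.
PRICES = (10.0, 8.0, 6.0)                    # allowed levels; markdowns only
MEAN_DEMAND = {10.0: 1.0, 8.0: 2.0, 6.0: 4.0}
MARKDOWN_COST, SALVAGE = 3.0, 1.0

def pois(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

@lru_cache(maxsize=None)
def V(days_left, inv, price):
    """Max expected remaining profit, via Backward Induction over the horizon."""
    if days_left == 0 or inv == 0:
        return SALVAGE * inv                 # dispose leftovers at salvage price
    best = -float("inf")
    for p in PRICES:
        if p > price:                        # prices can only be marked down
            continue
        cost = MARKDOWN_COST if p < price else 0.0
        lam = MEAN_DEMAND[p]
        ev = 0.0
        for d in range(inv):                 # demand d < inv: sell d units
            ev += pois(lam, d) * (p * d + V(days_left - 1, inv - d, p))
        tail = 1.0 - sum(pois(lam, d) for d in range(inv))
        ev += tail * (p * inv + V(days_left - 1, 0, p))   # sell out completely
        best = max(best, ev - cost)
    return best

print(round(V(5, 8, 10.0), 2))   # 5 days left, 8 units on hand, price still 10
```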
MDP for Clearance Pricing
State is [Days Left, Current Inventory, Current Price, Market Info]
Action is Price Markdown
Reward includes Sales revenue, markdown cost, stockout cost, salvage
Reward & State-transitions governed by Price Elasticity of Demand
Real-world Model can be quite complex (eg: competitor pricing)
Perspective (and a bit of “advice”) from the Trenches
I always start with a simple version of the problem to develop intuition
My first line of attack is DP customized to the problem structure
RL Algorithms that are my personal favorites (links to my lectures):
Deep Q-Network (DQN): Experience Replay, 2nd Target Network
Least Squares Policy Iteration (LSPI) - Batch Linear System
Exact Gradient Temporal-Difference (GTD)
Policy Gradient (esp. Natural Gradient, TRPO)
I prefer to separate Model Estimation from Policy Optimization
So we can customize RL algorithms to take advantage of:
Knowledge of transition probabilities
Knowledge of reward function
Any problem-specific structure that simplifies the algorithm
Feature Engineering based on known closed-form approximations
Many real-world, large-scale problems ultimately come down to
suitable choices of DNN architectures and hyperparameter tuning

Demystifying the Bias-Variance TradeoffAshwin Rao
 
Category Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesCategory Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesAshwin Rao
 
Risk Pooling sensitivity to Correlation
Risk Pooling sensitivity to CorrelationRisk Pooling sensitivity to Correlation
Risk Pooling sensitivity to CorrelationAshwin Rao
 
Abstract Algebra in 3 Hours
Abstract Algebra in 3 HoursAbstract Algebra in 3 Hours
Abstract Algebra in 3 HoursAshwin Rao
 
OmniChannelNewsvendor
OmniChannelNewsvendorOmniChannelNewsvendor
OmniChannelNewsvendorAshwin Rao
 
The Newsvendor meets the Options Trader
The Newsvendor meets the Options TraderThe Newsvendor meets the Options Trader
The Newsvendor meets the Options TraderAshwin Rao
 
The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||Ashwin Rao
 

Plus de Ashwin Rao (20)

Stochastic Control/Reinforcement Learning for Optimal Market Making
Stochastic Control/Reinforcement Learning for Optimal Market MakingStochastic Control/Reinforcement Learning for Optimal Market Making
Stochastic Control/Reinforcement Learning for Optimal Market Making
 
Adaptive Multistage Sampling Algorithm: The Origins of Monte Carlo Tree Search
Adaptive Multistage Sampling Algorithm: The Origins of Monte Carlo Tree SearchAdaptive Multistage Sampling Algorithm: The Origins of Monte Carlo Tree Search
Adaptive Multistage Sampling Algorithm: The Origins of Monte Carlo Tree Search
 
Fundamental Theorems of Asset Pricing
Fundamental Theorems of Asset PricingFundamental Theorems of Asset Pricing
Fundamental Theorems of Asset Pricing
 
Evolutionary Strategies as an alternative to Reinforcement Learning
Evolutionary Strategies as an alternative to Reinforcement LearningEvolutionary Strategies as an alternative to Reinforcement Learning
Evolutionary Strategies as an alternative to Reinforcement Learning
 
Understanding Dynamic Programming through Bellman Operators
Understanding Dynamic Programming through Bellman OperatorsUnderstanding Dynamic Programming through Bellman Operators
Understanding Dynamic Programming through Bellman Operators
 
Stochastic Control of Optimal Trade Order Execution
Stochastic Control of Optimal Trade Order ExecutionStochastic Control of Optimal Trade Order Execution
Stochastic Control of Optimal Trade Order Execution
 
Overview of Stochastic Calculus Foundations
Overview of Stochastic Calculus FoundationsOverview of Stochastic Calculus Foundations
Overview of Stochastic Calculus Foundations
 
Risk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility TheoryRisk-Aversion, Risk-Premium and Utility Theory
Risk-Aversion, Risk-Premium and Utility Theory
 
Value Function Geometry and Gradient TD
Value Function Geometry and Gradient TDValue Function Geometry and Gradient TD
Value Function Geometry and Gradient TD
 
HJB Equation and Merton's Portfolio Problem
HJB Equation and Merton's Portfolio ProblemHJB Equation and Merton's Portfolio Problem
HJB Equation and Merton's Portfolio Problem
 
Policy Gradient Theorem
Policy Gradient TheoremPolicy Gradient Theorem
Policy Gradient Theorem
 
A Quick and Terse Introduction to Efficient Frontier Mathematics
A Quick and Terse Introduction to Efficient Frontier MathematicsA Quick and Terse Introduction to Efficient Frontier Mathematics
A Quick and Terse Introduction to Efficient Frontier Mathematics
 
Recursive Formulation of Gradient in a Dense Feed-Forward Deep Neural Network
Recursive Formulation of Gradient in a Dense Feed-Forward Deep Neural NetworkRecursive Formulation of Gradient in a Dense Feed-Forward Deep Neural Network
Recursive Formulation of Gradient in a Dense Feed-Forward Deep Neural Network
 
Demystifying the Bias-Variance Tradeoff
Demystifying the Bias-Variance TradeoffDemystifying the Bias-Variance Tradeoff
Demystifying the Bias-Variance Tradeoff
 
Category Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesCategory Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) pictures
 
Risk Pooling sensitivity to Correlation
Risk Pooling sensitivity to CorrelationRisk Pooling sensitivity to Correlation
Risk Pooling sensitivity to Correlation
 
Abstract Algebra in 3 Hours
Abstract Algebra in 3 HoursAbstract Algebra in 3 Hours
Abstract Algebra in 3 Hours
 
OmniChannelNewsvendor
OmniChannelNewsvendorOmniChannelNewsvendor
OmniChannelNewsvendor
 
The Newsvendor meets the Options Trader
The Newsvendor meets the Options TraderThe Newsvendor meets the Options Trader
The Newsvendor meets the Options Trader
 
The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||The Fuss about || Haskell | Scala | F# ||
The Fuss about || Haskell | Scala | F# ||
 

Dernier

Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solidnamansinghjarodiya
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfRajuKanojiya4
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communicationpanditadesh123
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxsiddharthjain2303
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate productionChinnuNinan
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxVelmuruganTECE
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating SystemRashmi Bhat
 
Risk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectRisk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectErbil Polytechnic University
 
Crystal Structure analysis and detailed information pptx
Crystal Structure analysis and detailed information pptxCrystal Structure analysis and detailed information pptx
Crystal Structure analysis and detailed information pptxachiever3003
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdfCaalaaAbdulkerim
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...VICTOR MAESTRE RAMIREZ
 

Dernier (20)

Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Engineering Drawing section of solid
Engineering Drawing     section of solidEngineering Drawing     section of solid
Engineering Drawing section of solid
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdf
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
multiple access in wireless communication
multiple access in wireless communicationmultiple access in wireless communication
multiple access in wireless communication
 
Energy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptxEnergy Awareness training ppt for manufacturing process.pptx
Energy Awareness training ppt for manufacturing process.pptx
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Crushers to screens in aggregate production
Crushers to screens in aggregate productionCrushers to screens in aggregate production
Crushers to screens in aggregate production
 
Internet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptxInternet of things -Arshdeep Bahga .pptx
Internet of things -Arshdeep Bahga .pptx
 
Main Memory Management in Operating System
Main Memory Management in Operating SystemMain Memory Management in Operating System
Main Memory Management in Operating System
 
Risk Management in Engineering Construction Project
Risk Management in Engineering Construction ProjectRisk Management in Engineering Construction Project
Risk Management in Engineering Construction Project
 
Crystal Structure analysis and detailed information pptx
Crystal Structure analysis and detailed information pptxCrystal Structure analysis and detailed information pptx
Crystal Structure analysis and detailed information pptx
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdf
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 

A.I. for Dynamic Decision-Making in Retail and Finance

  • 1. A.I. for Dynamic Decisioning under Uncertainty
    For Real-World Problems in Retail & Financial Trading
    Ashwin Rao
    VP Data Science at Target & Adjunct Faculty at Stanford
    November 28, 2018
    Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 1 / 35
  • 2. A.I. for Dynamic Decisioning under Uncertainty
    Let’s look at some terms we use to characterize this branch of A.I.:
    Stochastic: Uncertainty in key quantities, evolving over time
    Optimization: A well-defined metric to be maximized (“The Goal”)
    Dynamic: Decisions need to be a function of the changing situations
    Control: Overpower uncertainty by persistent steering towards the goal
    Jargon overload due to the confluence of Control Theory, O.R. and A.I.
    For clarity of language, let’s just refer to this area as Stochastic Control
    I was introduced to this area through Inventory Control
    At Stanford, I teach a course on A.I. for Stochastic Control in Finance
    I’m developing an educational codebase for Stochastic Control
    Overarching Goal: Blend Theory, Algorithms & Real-World Modeling
  • 3. Overview
    1 The Framework of Stochastic Control
    2 Core Problem in Retail: Inventory Control
    3 Core Problem in Finance: Portfolio Optimization/Asset Allocation
    4 Quick look at a few other problems in Retail and Finance
    5 Perspective from the Trenches
  • 4. The Stochastic Control Framework
  • 5. Components of the Framework
    The Agent and the Environment interact in a time-sequenced loop
    The Agent responds to [State, Reward] by taking an Action
    The Environment responds by producing the next step’s (random) State
    The Environment also produces a (random) number denoted as the Reward
    The Goal of the Agent is to maximize the Expected Sum of all future Rewards
    By controlling the (Policy : State → Action) function
    This is a dynamic (time-sequenced control) system under uncertainty
    Formally known as a Markov Decision Process (MDP)
  • 6. Formal MDP Framework
    The following notation is for discrete time steps; the continuous-time formulation is analogous (often involving Stochastic Calculus)
    States St ∈ S, where S is the State Space
    Actions At ∈ A, where A is the Action Space
    Rewards Rt ∈ R, denoting numerical feedback
    Transitions p(s′, r | s, a) = Pr{St+1 = s′, Rt+1 = r | St = s, At = a}
    γ ∈ [0, 1] is the Discount Factor for Rewards when defining the Return
    Return Gt = Rt + γ ⋅ Rt+1 + γ² ⋅ Rt+2 + ...
    Policy π(a | s) is the probability that the Agent takes action a in state s
    The goal is to find a policy that maximizes E[Gt | St = s] for all s ∈ S
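The Return defined on the slide above can be sketched as a backward fold over a finite reward sequence (a minimal illustration, not part of the deck):

```python
# Sketch: the Return G_t = R_t + γ·R_{t+1} + γ²·R_{t+2} + ... from the slide,
# computed for a finite list of rewards (earliest reward first).

def discounted_return(rewards, gamma):
    """Fold from the back: G = R + gamma * (return of the tail)."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, `discounted_return([1.0, 1.0, 1.0], 0.5)` is 1 + 0.5 + 0.25 = 1.75.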
  • 7. Many real-world problems fit this MDP framework
    Self-driving vehicle (speed/steering to optimize safety/time)
    Game of Chess (Boolean Reward at end of game)
    Complex Logistical Operations (eg: movements in a Warehouse)
    Make a humanoid robot walk/run on difficult terrains
    Manage an investment portfolio
    Control a power station
    Optimal decisions during a football game
    Strategy to win an election (high-complexity MDP)
  • 8. Why are these problems hard?
    State space can be large or complex (involving many variables)
    Sometimes, the Action space is also large or complex
    No direct feedback on “correct” Actions (the only feedback is the Reward)
    Actions can have delayed consequences (late Rewards)
    Time-sequenced complexity (eg: Actions influence future Actions)
    The Agent often doesn’t know the Model of the Environment
    “Model” refers to the probabilities of state-transitions and rewards
    So, the Agent has to learn the Model AND solve for the Optimal Policy
  • 9. Value Function and Bellman Equations
    Value Function (under policy π): Vπ(s) = E[Gt | St = s] for all s ∈ S
    Vπ(s) = Σ_a π(a | s) Σ_{s′,r} p(s′, r | s, a) ⋅ (r + γVπ(s′)) for all s ∈ S
    Optimal Value Function: V∗(s) = max_π Vπ(s) for all s ∈ S
    V∗(s) = max_a Σ_{s′,r} p(s′, r | s, a) ⋅ (r + γV∗(s′)) for all s ∈ S
    There exists an Optimal Policy π∗ achieving V∗(s) for all s ∈ S
    The above recursive equations are called the Bellman equations
    In continuous time, they are referred to as the Hamilton-Jacobi-Bellman (HJB) equation
    The algorithms based on Bellman equations are broadly classified as:
    Dynamic Programming
    Reinforcement Learning
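The Bellman Optimality Equation above can be turned into a fixed-point iteration. A minimal value-iteration sketch, on a 2-state MDP invented purely for illustration:

```python
# Value iteration for V*(s) = max_a Σ_{s',r} p(s',r|s,a)·(r + γ·V*(s')).
# The toy MDP at the bottom is invented for illustration.

def value_iteration(states, actions, p, gamma, tol=1e-10):
    """p[(s, a)] is a list of (probability, next_state, reward) triples."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v = max(sum(pr * (r + gamma * V[s1]) for pr, s1, r in p[(s, a)])
                    for a in actions)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

# Toy MDP: from state 0, "go" reaches absorbing state 1 and earns reward 1;
# every other transition earns 0.
TOY_P = {(0, "stay"): [(1.0, 0, 0.0)], (0, "go"): [(1.0, 1, 1.0)],
         (1, "stay"): [(1.0, 1, 0.0)], (1, "go"): [(1.0, 1, 0.0)]}
```

With γ = 0.9, the iteration converges to V∗(0) = 1 (take "go" immediately) and V∗(1) = 0.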
  • 10. Dynamic Programming versus Reinforcement Learning
    When the Model is known ⇒ Dynamic Programming (DP)
    DP Algorithms take advantage of knowledge of the probabilities
    So, DP Algorithms do not require interaction with the environment
    When the Model is unknown ⇒ Reinforcement Learning (RL)
    RL Algorithms interact with the Environment and incrementally learn
    Environment interaction could be real interaction or a simulator
    RL approach: Try different actions & learn what works, what doesn’t
    RL Algorithms’ key challenge is to tradeoff “explore” versus “exploit”
    Whether DP or RL, a good approximation of the Value Function is vital to success
    Deep Neural Networks are typically used for function approximation
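The model-free "try actions and learn what works" idea can be sketched as tabular Q-learning with ε-greedy exploration. The chain environment below (4 states, move left/right, reward 1 for reaching state 3) is an invented toy, not from the slides:

```python
import random

# Tabular Q-learning sketch: the agent only sees (next_state, reward, done)
# from step(), never the transition probabilities.

def q_learning(step, reset, n_states, n_actions, episodes=500,
               alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = reset()
        done = False
        while not done:
            if rng.random() < eps:                 # explore
                a = rng.randrange(n_actions)
            else:                                  # exploit current estimate
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s1, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s1])
            Q[s][a] += alpha * (target - Q[s][a])  # incremental update
            s = s1
    return Q

def reset():
    return 0

def step(s, a):                 # action 1 moves right, 0 moves left
    s1 = min(3, s + 1) if a == 1 else max(0, s - 1)
    return s1, (1.0 if s1 == 3 else 0.0), s1 == 3
```

After training, the greedy policy moves right everywhere, with Q-values approaching 0.81, 0.9, 1.0 along the chain.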
  • 11. Inventory Control starts with the Newsvendor Problem
    The Newsvendor problem is a single-period Inventory Control problem
    Daily demand for newspapers is a random variable x
    The newsvendor has an estimate of the PDF f(x) of daily demand
    For each newspaper that stays unsold, we suffer an Excess Cost h
    Think of h as the purchase price minus the salvage price
    For each newspaper we’re short on, we suffer a Deficit Cost p
    Think of p as the missed profits (sale price minus purchase price)
    But p should also include the potential loss of future customers
    What is the optimum # of newspapers to bring in the morning?
    To minimize the expected cost (a function of f, h and p)
  • 12. The Newsvendor Problem
  • 13. Solution to the Newsvendor problem
    For tractability, we treat the newspaper quantity as a continuous variable x
    Then, we need to solve for the optimal supply S that minimizes the expected cost
    g(S) = h ∫_0^S (S − x) ⋅ f(x) ⋅ dx + p ∫_S^∞ (x − S) ⋅ f(x) ⋅ dx
    Setting g′(S) = 0, we get:
    Optimal Supply S∗ = F⁻¹(p / (p + h))
    where F(y) = ∫_0^y f(x) dx is the CDF of daily demand
    p / (p + h) is known as the critical fractile
    It is the fraction of days on which the newsvendor does not go “out-of-stock”
    Assuming the newsvendor always brings this optimal supply S∗
    Solution details and connections with Financial Options Pricing here
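The critical-fractile formula S∗ = F⁻¹(p/(p + h)) can be evaluated numerically, assuming (purely for illustration) that daily demand is Normal(µ, σ):

```python
from statistics import NormalDist

# Sketch of the newsvendor solution under an assumed Normal demand model.

def newsvendor_optimal_supply(mu, sigma, p, h):
    """S* = F^{-1}(p / (p + h)) for Normal(mu, sigma) daily demand."""
    return NormalDist(mu, sigma).inv_cdf(p / (p + h))
```

With p = h the fractile is 0.5 and S∗ equals mean demand µ; when the deficit cost dominates (p > h), S∗ rises above the mean.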
  • 14. Multi-period: Single-store, Single-item Inventory Control
    The store experiences random daily demand given by PDF f(x)
    The store can order daily from a supplier carrying infinite inventory
    There’s a cost associated with ordering, and the order arrives in L days
    Like the newsvendor, there’s an Excess Cost h and a Deficit Cost p
    This is an MDP where the State is the current Inventory Level at the store
    The State also includes the current in-transit inventory (from the supplier)
    The Action is the quantity to order in any given State
    The Reward function involves h, p (just like the newsvendor), and the ordering cost
    Transition probabilities are governed by the demand distribution f(x)
    This has a closed-form solution, similar to the newsvendor formula
  • 15. Optimal Policy: Order Point and Order-Up-To Level
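The order-point/order-up-to policy named above can be sketched in simulation: when inventory drops below the order point s, order up to level S. The uniform demand, unit costs, zero lead time and lost-sales treatment below are all simplifying assumptions for illustration, far cruder than the real problem:

```python
import random

# Simulation sketch of an order-point / order-up-to inventory policy.
# Demand model, costs, and zero lead time are invented assumptions.

def simulate_order_up_to(s, S, days, demand, h=1.0, p=4.0, seed=0):
    rng = random.Random(seed)
    inv = S
    total_cost = 0.0
    for _ in range(days):
        d = demand(rng)
        sold = min(inv, d)
        total_cost += p * (d - sold)   # deficit cost on lost sales
        inv -= sold
        total_cost += h * inv          # excess (holding) cost on leftovers
        if inv < s:                    # order point hit: order up to S
            inv = S                    # assumes zero lead time
    return total_cost / days           # average daily cost
```

Comparing a sensible policy against a degenerate one that never reorders shows the cost gap, e.g. with `demand = lambda rng: rng.randint(0, 10)`.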
  • 16. Adding real-world frictions and constraints
    Inventory is integer-valued, and orders are in casepack units
    Excess/Deficit costs are not linear functions of inventory
    Perishability, Obsolescence, End-of-season involve big costs
    Need to factor in labor costs of handling cases and singles
    Limits on shipping/receiving dates/times
    Often, there is a constraint on minimum presentation quantities
    Store inventory cannot exceed a threshold (eg: Shelf space)
    Supplier has constraints on min and max order quantities
    Uncertainty in the time of order arrival
    There are approximate closed-form solutions in some cases
    But the general case requires generic DP or RL Algorithms
  • 17. Multi-node and Multi-item Inventory Control
    In practice, Inventory flows through a network of warehouses
    From source (suppliers) to destination (stores or homes)
    So, we have to solve a multi-“node” Inventory Control problem
    State is the joint inventory across all nodes (and between nodes)
    Action is the recommended movements of inventory between nodes
    Reward is the aggregate of daily costs across the network
    In addition, we have multi-item constraints
    Space and Throughput constraints are multi-item constraints
    So, the real-world problem is multi-node and multi-item (a giant MDP)
  • 18. Single-period Portfolio Optimization
    We start with a simple single-period portfolio optimization problem
    The simple setup helps develop understanding of the core concepts
    Assume we have one risky asset and one riskless asset
    The return (over the single period) of the risky asset is N(µ, σ²)
    The return of the riskless asset is deterministic (= r < µ)
    Start with $1, and aim to maximize period-ending Expected Wealth
    This means we invest fully in the risky asset (since µ > r)
    But people are risk-averse and will trade higher returns for lower risk
    The exact Risk-Return tradeoff is specified through Utility of Wealth
    Utility is a concave function of Wealth describing Risk-Aversion
    For an intro to Risk-Aversion and Utility Theory, see here
    The goal is to maximize period-ending Expected Utility of Wealth
  • 19. Concave Utility of Consumption ⇒ E[U(x)] < U(E[x])
  • 20. Solution to Single-period Portfolio Optimization
    W denotes end-of-period Wealth
    α denotes the fraction to invest in the risky asset (1 − α is the fraction in the riskless asset)
    W ∼ N(1 + r + α(µ − r), α²σ²)
    Let the Utility of Wealth function be U(W) = −e^(−βW) for β > 0
    where β is the coefficient (extent) of Risk-Aversion
    So we maximize over α the Expected Utility
    g(α) = E[−e^(−βW)] = −e^(−β(1 + r + α(µ − r)) + β²α²σ²/2)
    Setting the α-derivative of the exponent to 0, we get:
    α∗ = (µ − r) / (βσ²)
    This is the fundamental investment fraction in Portfolio Optimization
    This fraction generalizes to multi-period settings and to multiple risky assets
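The closed form α∗ = (µ − r)/(βσ²) can be checked against a brute-force grid search over the CARA expected utility g(α) above; the parameter values below are illustrative:

```python
import math

# Sketch: closed-form risky-asset fraction versus a numerical scan of
# g(α) = -exp(-β(1 + r + α(µ-r)) + β²α²σ²/2).

def optimal_alpha(mu, r, sigma, beta):
    return (mu - r) / (beta * sigma ** 2)

def expected_utility(alpha, mu, r, sigma, beta):
    # E[-exp(-βW)] for W ~ N(1 + r + α(µ-r), α²σ²)
    return -math.exp(-beta * (1 + r + alpha * (mu - r))
                     + beta ** 2 * alpha ** 2 * sigma ** 2 / 2)
```

With µ = 8%, r = 2%, σ = 20% and β = 2, the formula gives α∗ = 0.06 / 0.08 = 0.75, and the grid argmax agrees.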
  • 21. Multi-Period: Merton’s Portfolio Optimization Problem
    You will live for a (deterministic) T more years (say 20 years)
    Current Wealth is W0 (say $10 million)
    Assume you have no further income and no debts
    You can invest in (allocate to) n risky assets and a riskless asset
    Each asset has a known normal distribution of returns
    You are allowed to long or short any fractional quantities of assets
    Trading is in continuous time 0 ≤ t < T, with no transaction costs
    You can consume any fractional amount of wealth at any time
    Dynamic Decision: Optimal Allocation and Consumption at each time
    To maximize lifetime-aggregated Utility of Consumption
    Consumption Utility is assumed to have constant Relative Risk-Aversion
  • 22. Problem Notation
    For ease of exposition, we state the problem for 1 risky asset
    Riskless asset: dRt = r ⋅ Rt ⋅ dt
    Risky asset: dSt = µ ⋅ St ⋅ dt + σ ⋅ St ⋅ dzt (i.e., Geometric Brownian Motion)
    µ > r > 0, σ > 0 (for n assets, we work with a covariance matrix)
    Wealth at time t is denoted by Wt > 0
    Fraction of wealth allocated to the risky asset is denoted by π(t, Wt)
    Fraction of wealth in the riskless asset is then 1 − π(t, Wt)
    Wealth consumption is denoted by c(t, Wt) ≥ 0
    Utility of Consumption function U(x) = x^(1−β) / (1 − β) for 0 < β ≠ 1
    β = (constant) Relative Risk-Aversion −x ⋅ U′′(x) / U′(x)
  • 23. Continuous-Time Stochastic Control Problem
    The balance constraint implies the following process for Wealth Wt:
    dWt = ((πt ⋅ (µ − r) + r) ⋅ Wt − ct) ⋅ dt + πt ⋅ σ ⋅ Wt ⋅ dzt
    At any time t, determine the optimal [π(t, Wt), c(t, Wt)] to maximize:
    E[ ∫_t^T γ^(s−t) ⋅ (c_s^(1−β) / (1 − β)) ⋅ ds | Wt ]
    where 0 ≤ γ ≤ 1 is the annualized discount factor
    Think of this as a continuous-time Stochastic Control problem
    State is (t, Wt), Action is [πt, ct], Reward per unit time is U(ct)
    The stochastic process for Wt above defines the transition probabilities
    Find the Policy (t, Wt) → [πt, ct] that maximizes the Expected Return
    Note: ct ≥ 0, but πt is unconstrained
  • 24. Outline of Solution
    The Optimal Value Function V∗(t, Wt) has a recursive formulation:
    V∗(t, Wt) = max_{π,c} E[ ∫_t^(t1) γ^s ⋅ (c_s^(1−β) / (1 − β)) ⋅ ds + V∗(t1, W_t1) ]
    Re-expressed as a Hamilton-Jacobi-Bellman (HJB) formulation:
    max_{πt,ct} E[ dV∗(t, Wt) + γ^t ⋅ (c_t^(1−β) / (1 − β)) ⋅ dt ] = 0
    Standard HJB calculus (Ito’s Lemma, followed by partial derivatives w.r.t. πt and ct) gives us a PDE for V∗(t, Wt)
    We can reduce the PDE to a tractable ODE with a guessed solution
    All the gory details are in this lecture
    This solution outline is broadly applicable to continuous-time problems
  • 25. Gaining Insights into the Solution
    Optimal Allocation π∗(t, Wt) = (µ − r) / (σ²β) (independent of t and Wt)
    Optimal Fractional Consumption c∗(t, Wt) / Wt depends only on t (= f(t))
    With Optimal Allocation & Consumption, the Wealth process is:
    dWt / Wt = (r + (µ − r)²/(σ²β) − f(t)) ⋅ dt + ((µ − r)/(σβ)) ⋅ dzt
    Expected Portfolio Return is constant over time (= r + (µ − r)²/(σ²β))
    Fractional Consumption f(t) increases over time
    Expected Rate of Wealth Growth r + (µ − r)²/(σ²β) − f(t) decreases over time
    If r + (µ − r)²/(σ²β) > f(0), we start by Consuming < Expected Portfolio Growth, and over time we Consume > Expected Portfolio Growth
    Wealth Growth Volatility is constant (= (µ − r)/(σβ))
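A Monte Carlo sketch of the optimally-allocated Wealth process above, with consumption switched off (f(t) = 0) purely to check the constant drift r + (µ − r)²/(σ²β) and volatility (µ − r)/(σβ). Parameters in the usage are illustrative, not from the slides:

```python
import math
import random

# Exact GBM sampling of W_T under the Merton-optimal allocation, no consumption.

def simulate_terminal_wealth(mu, r, sigma, beta, T=1.0, n_paths=20000, seed=0):
    rng = random.Random(seed)
    a = r + (mu - r) ** 2 / (sigma ** 2 * beta)   # expected portfolio return
    b = (mu - r) / (sigma * beta)                 # wealth growth volatility
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        total += math.exp((a - b * b / 2) * T + b * math.sqrt(T) * z)
    return total / n_paths                        # estimate of E[W_T], W_0 = 1
```

With µ = 0.08, r = 0.02, σ = 0.2, β = 2, the drift is a = 0.065 and the Monte Carlo mean of W_T lands near e^0.065 ≈ 1.067.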
  • 26. Portfolio Return versus Consumption Rate
  • 27. Porting this to Real-World Portfolio Optimization
    Analytical tractability in Merton’s formulation was due to:
    Normal distribution of asset returns
    Constant Relative Risk-Aversion
    Frictionless, continuous trading
    However, the real-world situation involves:
    Discrete amounts of assets to hold, and discrete quantities of trades
    Transaction costs
    Locked-out days for trading
    Non-stationary/arbitrary/correlated processes of multiple assets
    Changing/uncertain risk-free rate
    Consumption constraints
    Arbitrary Risk-Aversion/Utility specification
    ⇒ Approximate Dynamic Programming or Reinforcement Learning
    The large Action Space points to Policy Gradient Algorithms
  • 28. Overview of a few other problems
    Financial Trading: American Options Pricing
    Financial Trading: Trade Order Execution
    Retail: Clearance Pricing
  • 29. American Options Pricing
    An American option can be exercised anytime before option maturity
    The key decision at any time is whether to exercise or continue
    The default algorithm is Backward Induction on a tree/grid
    But it doesn’t work for path-dependent options
    Also, it’s not feasible when the state dimension is large
    Industry-Standard: Longstaff-Schwartz’s simulation-based algorithm
    RL is an attractive alternative to Longstaff-Schwartz
    RL is straightforward once Optimal Exercise is modeled as an MDP
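The backward-induction "default algorithm" mentioned above can be sketched for an American put on a binomial tree. The CRR parameterization is one standard choice; the parameters in the usage are illustrative:

```python
import math

# Backward induction on a CRR binomial tree: at each node, option value is
# the max of immediate exercise and discounted risk-neutral continuation.

def american_put_binomial(S0, K, r, sigma, T, n):
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1 / u
    q = (math.exp(r * dt) - d) / (u - d)       # risk-neutral up-probability
    disc = math.exp(-r * dt)
    # option values at maturity (j = number of up-moves)
    vals = [max(K - S0 * u ** j * d ** (n - j), 0.0) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):             # step back through the tree
        vals = [max(K - S0 * u ** j * d ** (i - j),                # exercise
                    disc * (q * vals[j + 1] + (1 - q) * vals[j]))  # continue
                for j in range(i + 1)]
    return vals[0]
```

A quick sanity check: the deep in-the-money put is worth at least its intrinsic value, reflecting the early-exercise right.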
  • 30. MDP for Optimal Options Exercise
    State is [Current Time, History of Underlying Security Prices]
    Action is Boolean: Exercise (i.e., Payoff and Stop) or Continue
    Reward is always 0, except upon Exercise (= Payoff)
    State-transitions are governed by the Underlying Prices’ Stochastic Process
    Optimal Policy ⇒ Optimal Stopping ⇒ Option Price
    All the details are in this lecture
    Can be generalized to other Optimal Stopping problems
Optimal Trade Order Execution (controlling Price Impact)
You are tasked with selling a large quantity of a (relatively less-liquid) stock
You have a fixed horizon over which to complete the sale
The goal is to maximize aggregate sales proceeds over the horizon
If you sell too fast, Price Impact will result in poor sales proceeds
If you sell too slow, you risk running out of time
We need to model temporary and permanent Price Impacts
Objective should incorporate penalty for variance of sales proceeds
Which is equivalent to maximizing aggregate Utility of sales proceeds
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 31 / 35
MDP for Optimal Trade Order Execution
State is [Time Remaining, Stock Remaining to be Sold, Market Info]
Action is Quantity of Stock to Sell at current time
Reward is Utility of Sales Proceeds (i.e., Variance-adjusted-Proceeds)
Reward & State-transitions governed by Price Impact Model
Real-world Model can be quite complex (Order Book Dynamics)
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 32 / 35
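A minimal simulation of a linear temporary/permanent impact model (in the Almgren-Chriss spirit) shows the trade-off the MDP must resolve. All parameter values and the two candidate schedules are invented for illustration.

```python
import numpy as np

# Illustrative linear impact model; all numbers are assumptions
x_total, n_steps, p0 = 100_000, 10, 50.0
gamma, eta, sigma = 2.5e-6, 1e-5, 0.1  # permanent impact, temporary impact, noise

def simulate(schedule, rng):
    """Total sales proceeds for a given share schedule under the toy impact model."""
    p, proceeds = p0, 0.0
    for n_t in schedule:
        proceeds += n_t * (p - eta * n_t)   # temporary impact hits this slice only
        p -= gamma * n_t                    # permanent impact persists
        p += sigma * rng.standard_normal()  # exogenous price move
    return proceeds

twap = np.full(n_steps, x_total / n_steps)  # equal slices
front = np.linspace(2.0, 0.0, n_steps)
front *= x_total / front.sum()              # front-loaded schedule, same total

runs = 2000
twap_p = np.array([simulate(twap, np.random.default_rng(i)) for i in range(runs)])
front_p = np.array([simulate(front, np.random.default_rng(i)) for i in range(runs)])
```

Front-loading finishes the sale earlier, so proceeds vary less across price paths, but its larger slices pay more temporary impact; the variance penalty (Utility) in the objective is precisely what arbitrates between the two.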
Clearance Pricing
You are a few weeks away from end-of-season (e.g., Christmas Trees)
Assume you have too much inventory in your store
What is the optimal sequence of price markdowns?
So as to maximize your total profit (sales revenue minus costs)
Note: There is a non-trivial cost of performing a markdown
If price markdowns are small, we end up with surplus at season-end
Surplus often needs to be disposed of at a poor salvage price
If price markdowns are large, we run out of Christmas trees early
“Stockout” cost is considered to be large during holiday season
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 33 / 35
MDP for Clearance Pricing
State is [Days Left, Current Inventory, Current Price, Market Info]
Action is Price Markdown
Reward includes Sales revenue, markdown cost, stockout cost, salvage
Reward & State-transitions governed by Price Elasticity of Demand
Real-world Model can be quite complex (e.g., competitor pricing)
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 34 / 35
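At small scale this MDP yields to tabular finite-horizon DP over [days left, inventory, current price], exactly the "simple version first" approach advocated later in the deck. The exponential demand model, Poisson arrivals, penalty sizes, and price grid below are all invented for illustration.

```python
import numpy as np
from math import exp, factorial

# Toy clearance-pricing DP; every parameter here is an illustrative assumption
prices = [10.0, 8.0, 6.0, 4.0]      # allowed price points (markdowns only)
days, max_inv = 5, 20
salvage = 1.0                        # per-unit salvage value at season end
markdown_cost = 5.0                  # fixed cost of performing a markdown
stockout_pen = 20.0                  # penalty when demand exceeds inventory

def demand_rate(p):
    """Price elasticity of demand: Poisson rate decreasing in price."""
    return 8.0 * exp(-0.2 * p)

n_p = len(prices)
# V[d, i, c] = optimal expected profit-to-go: d days left, inventory i, price index c
V = np.zeros((days + 1, max_inv + 1, n_p))
V[0] = salvage * np.arange(max_inv + 1)[:, None]  # season end: salvage leftovers

for d in range(1, days + 1):
    for inv in range(max_inv + 1):
        for cur in range(n_p):
            best = -np.inf
            for new in range(cur, n_p):  # price can only move down
                lam = demand_rate(prices[new])
                pmf = [exp(-lam) * lam ** k / factorial(k) for k in range(inv + 1)]
                tail = 1.0 - sum(pmf)    # P(demand exceeds inventory)
                val = -markdown_cost if new != cur else 0.0
                for k in range(inv + 1):  # sell exactly k units today
                    val += pmf[k] * (prices[new] * k + V[d - 1, inv - k, new])
                # demand exceeds stock: sell out and pay the stockout penalty
                val += tail * (prices[new] * inv + V[d - 1, 0, new] - stockout_pen)
                best = max(best, val)
            V[d, inv, cur] = best

opt_value = V[days, max_inv, 0]
```

Reading off the argmax instead of the max recovers the optimal markdown policy; the RL methods discussed in the deck take over once Market Info makes this state space too large to enumerate.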
Perspective (and a bit of “advice”) from the Trenches
I always start with a simple version of the problem to develop intuition
My first line of attack is DP customized to the problem structure
RL Algorithms that are my personal favorites (links to my lectures):
Deep Q-Network (DQN): Experience Replay, 2nd Target Network
Least Squares Policy Iteration (LSPI) - Batch Linear System
Exact Gradient Temporal-Difference (GTD)
Policy Gradient (esp. Natural Gradient, TRPO)
I prefer to separate Model Estimation from Policy Optimization
So we can customize RL algorithms to take advantage of:
Knowledge of transition probabilities
Knowledge of reward function
Any problem-specific structure that simplifies the algorithm
Feature Engineering based on known closed-form approximations
Many real-world, large-scale problems ultimately come down to suitable choices of DNN architectures and hyperparameter tuning
Ashwin Rao (Target/Stanford) A.I For Retail/Finance November 28, 2018 35 / 35
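The two DQN ingredients named here — experience replay and a second, frozen target network — reduce to a few lines each. The linear Q-function below is a stand-in so the sketch stays self-contained; a DNN would take its place in practice.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Uniform experience replay: break temporal correlation by sampling uniformly."""

    def __init__(self, capacity):
        self.buf = deque(maxlen=capacity)  # old transitions fall off automatically

    def push(self, s, a, r, s2, done):
        self.buf.append((s, a, r, s2, done))

    def sample(self, k):
        batch = random.sample(self.buf, k)
        s, a, r, s2, done = (np.array(x) for x in zip(*batch))
        return s, a, r, s2, done

    def __len__(self):
        return len(self.buf)

def td_targets(r, s2, done, w_target, gamma=0.99):
    """Bootstrapped targets from the frozen target weights, not the online ones."""
    q_next = s2 @ w_target  # Q(s', a) for every action under a linear Q-function
    return r + gamma * (1.0 - done) * q_next.max(axis=1)
```

Periodically copying the online weights into `w_target` (a "hard" update every C steps) is what keeps the bootstrapped targets stable while the online network trains on replayed minibatches.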