The document proposes a design for an event-based, cross-product risk management system using Apache Geode. Key elements include splitting data into regions for trades, markets, and results; placing regions to optimize performance of risk calculations; using PDX to bridge languages; integrating market and trade data streams; running a proprietary math library on a shared compute grid or inside Geode; and building real-time risk views. The system aims to provide a consolidated risk view across trading products and systems.
1. Wall Street Derivative Risk
Solutions Using
Apache Geode (Incubating)
By Andre Langevin & Mike Stolz
2. Safe Harbor Statement
The following is intended to outline the general direction of Pivotal's offerings. It is
intended for information purposes only and may not be incorporated into any
contract. Any information regarding pre-release of Pivotal offerings, future updates
or other planned modifications is subject to ongoing evaluation by Pivotal and is
subject to change. This information is provided without warranty of any kind, express
or implied, and is not a commitment to deliver any material, code, or functionality,
and should not be relied upon in making purchasing decisions regarding Pivotal's
offerings. These purchasing decisions should only be based on features currently
available. The development, release, and timing of any features or functionality
described for Pivotal's offerings in this presentation remain at the sole discretion of
Pivotal. Pivotal has no obligation to update forward looking information in this
presentation.
3. Mike Stolz
Principal Engineer, Pivotal
• Mike Stolz is a Principal Engineer at Pivotal and leads the
Product Management team for GemFire. This involves
constant communication with users to learn what features
are needed, prioritization of those features, and managing
the product roadmap.
• Mike brings a twenty year history in Financial Services that
culminated in ten years at Merrill Lynch where he was
Director and Chief Architect for Fixed Income, Currencies,
Commodities, Liquidity and Risk technology globally.
• Mike became the first customer to deploy GemFire in
production as an important component of Merrill’s Currency
Trading, Credit Trading and Enterprise Risk systems.
• Mike left Merrill and joined the GemFire team as VP of
Architecture in 2007 and has served in several roles ever
since, through the acquisition of GemStone by SpringSource,
then VMware, and the spin-off to Pivotal.
Andre Langevin
Banking Industry Consultant
• Andre has spent the past fifteen years providing technology
solutions to a broad spectrum of firms in the financial
industry, including banks, asset managers and stock
exchanges.
• An “expert’s expert” on Risk Systems, Andre has led teams
developing innovative real-time risk solutions for the fixed
income and derivatives trading desks at several banks,
including RBC Capital Markets, BMO Capital Markets and
CIBC World Markets. Many of these systems implemented
the industry leading designs Mike Stolz pioneered using
GemFire.
• Andre’s Risk Systems team at BMO Financial Group was a
leading banking industry user of GemFire, deploying
applications in market risk, counterparty credit risk, CVA and
pre-deal credit.
• Andre plans to one day write a book entitled, “The History
of Risk Systems.”
Introduction to Presenters
5. A Crash Course in Wall Street Trading
• Big Wall Street firms have “FICC” trading business organized by market:
• Each business will trade “cash” and derivative products, but traders specialize in one or the other
• There may be a team of traders working to trade a single product
• Trading systems are product specific and often highly specialized:
• May have up to 50 different booking points for transactions
• Multiple instances of the same trading system, deployed in different regions
• Electronic markets mean that there are often external booking points to consolidate
• Managing these businesses requires a consolidated view of risk:
• Risk factors span products and markets – it is not sufficient to just look at the risk by trade or book
• Risk measures must be both fast and detailed to be relevant on the trading floor
• Desk heads aggregate risk from individual trades to stay within desk limits for risk
• Business heads aggregate risk across desks to stay within the firm’s risk appetite and regulatory limits
FICC
”Fixed Income
Commodities &
Currencies”
6. Calculating Risk
• What is the “risk” that we are trying to measure?
• Trades are valued by discounting their estimated future cash flows
• Discount factors are based on observable market data
• Movement in markets can change the value of your trades
• “Trade Risk” is the sensitivity of each trade to changes in market data
• Markets are represented using curves:
• A curve is defined using observable rates and prices and then “built” into a smooth, consistent “zero curve” using
interpolation
• Consistency is paramount:
• Most firms have a proprietary math library used for valuation and risk
• Use the same market data in all calculations to avoid basis differences
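The valuation and sensitivity ideas above can be sketched in a few lines. This is a minimal, illustrative example only (the class and the toy cash-flow layout are assumptions, not the proprietary math library the slides describe): present value is the sum of discounted cash flows, and "trade risk" is the change in that value under a small bump to the curve.

```java
import java.util.List;

// Minimal sketch: value a trade by discounting estimated future cash
// flows off a flat zero rate, and estimate risk by bumping that rate.
public class DiscountingSketch {

    // Each cash flow is {time in years, amount}; PV = sum amount * e^(-r*t).
    static double presentValue(List<double[]> cashFlows, double zeroRate) {
        double pv = 0.0;
        for (double[] cf : cashFlows) {
            double t = cf[0];
            double amount = cf[1];
            pv += amount * Math.exp(-zeroRate * t);
        }
        return pv;
    }

    // "Trade risk" here: PV change for a 1 basis point parallel bump.
    static double dv01(List<double[]> cashFlows, double zeroRate) {
        double bump = 0.0001; // 1bp
        return presentValue(cashFlows, zeroRate + bump)
             - presentValue(cashFlows, zeroRate);
    }

    public static void main(String[] args) {
        // Toy bond-like trade: coupon of 5 in one year, 105 in two years.
        List<double[]> flows = List.of(new double[]{1.0, 5.0},
                                       new double[]{2.0, 105.0});
        System.out.printf("PV   = %.4f%n", presentValue(flows, 0.03));
        System.out.printf("DV01 = %.6f%n", dv01(flows, 0.03));
    }
}
```

A real system interpolates a full zero curve rather than using one flat rate, but the bump-and-revalue pattern is the same.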
7. Technology Solutions that Work Badly
• The easiest thing to do is just book all of your trades using one trading system!
• Trading systems are product specific for many very good reasons, so this idea is a non-starter
• How about booking all of the hedges into the primary trading system?
• Cash product systems can’t price derivatives, so you have to invent simple “proxies” for them
• Have to build live feeds from one trading system into another – or book duplicates by hand
• The back office has to remove the duplicates before settlement and accounting
• How about adding up all of the risk from each trading system into a report?
• Almost impossible to make the valuations consistent across systems:
o Different yield curve definitions, and different market data sources feeding curves
o Different math libraries, and often a technology challenge to make them run fast enough
o Different calculation methodologies (is relative risk up or down?)
• Difficult to achieve speed needed to accurately compute hedge requirements
Cash Products
Cash products are
securities that are
settled
immediately for a
cash payment,
such as stocks and
bonds.
8. Filling in the Details of the Design
Event-based cross-product risk system using Geode
9. Designing and Naming Data Objects
• The trade data model serves two distinct purposes:
• Descriptive data is only used for aggregation and viewing
• Model parameters are only needed to calculate risk
• Can be split into two regions to optimize performance
• Market data should follow the calculation design:
• Model data to align to the calculation engine’s math library to
reduce format conversions downstream
• Use “dot” notation to give business-friendly keys to objects:
• Create compound keys like “USD.LIBOR.3M” and ”USD.LIBOR.6M”
to allow business users to “guess” a key easily – promotes use of
Geode data in secondary applications and spreadsheets
• Values in the “dot” name are repeated as attributes of the object
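The "dot" naming convention can be captured in a small key class. This is a hedged sketch (the class and field names are assumptions): the compound key is guessable by business users, and each segment of the name is repeated as an attribute of the object.

```java
// Sketch of the "dot" key convention, e.g. "USD.LIBOR.3M". The segments
// of the key are repeated as attributes so they remain queryable.
public class CurveKey {
    final String currency; // e.g. "USD"
    final String index;    // e.g. "LIBOR"
    final String tenor;    // e.g. "3M"

    CurveKey(String currency, String index, String tenor) {
        this.currency = currency;
        this.index = index;
        this.tenor = tenor;
    }

    // Compound key a business user can guess in a spreadsheet.
    String toKey() {
        return currency + "." + index + "." + tenor;
    }

    static CurveKey parse(String key) {
        String[] parts = key.split("\\.");
        return new CurveKey(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        CurveKey k = CurveKey.parse("USD.LIBOR.3M");
        System.out.println(k.toKey()); // USD.LIBOR.3M
    }
}
```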
10. Region Design
• Trade and market data regions:
• Both may be high velocity, but with a low number of contributors
• Curve definitions are updated slowly but used constantly
• Typically a curve embeds a list of rates – leave it
denormalized if rates are updated slowly
• If calculation engine supports it, create a second region to cache
built interest rate and credit curves (building a credit curve is 80%
of the valuation time for a credit default swap)
• Consider splitting model parameters from descriptive data to
reduce amount of data flowing to compute grid
• Foreign exchange quotes are typically small and updated daily
• Interest rates change slowly and are referenced constantly
• Computational results and aggregation:
• Risk results will be the largest and highest traffic region
• Pre-aggregate risk inside Geode to support lower powered
consumers (e.g. web pages)
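The region layout described above might be declared with gfsh roughly as follows. This is an illustrative sketch: all region names are assumptions, and the PARTITION/REPLICATE choices simply mirror the bullets (partition the large trade and result data, replicate the small, slowly changing market data).

```
create region --name=TradeDescriptions --type=PARTITION
create region --name=TradeModelParams --type=PARTITION
create region --name=CurveDefinitions --type=REPLICATE
create region --name=BuiltCurves --type=PARTITION
create region --name=FxRates --type=REPLICATE
create region --name=RiskResults --type=PARTITION
```

Splitting TradeDescriptions from TradeModelParams reflects the point above: only the model parameters need to flow to the compute grid.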
11. Placing Regions on the Cluster
• Region placement optimizes the solution’s performance:
• Consider placement of market data and trades holistically to make the risk
calculation efficient – keep all data for a given calculation on one machine
• Partition the trades regions to balance the cluster:
• Partition trade region to maximize parallel execution during compute
• Use a business term (e.g. valuation curve, currency, industry) that can be
used to partition both the trade and market data sets
• Partition or replicate market data to optimize computations:
• Replicate interest rates and foreign exchange rates to all nodes
• Replicate or partition curve data to maximize collocation of trades with
their market data to minimize cross-member network traffic
• When using an external compute grid, if you can host Geode servers on
the same hardware that is hosting the compute grid nodes, this technique
should also be applied to the local Geode caches on the compute grid
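In Geode, custom partitioning is done with a PartitionResolver; the heart of one is extracting the shared business term from the key. A minimal sketch of that routing logic, assuming dot-notation keys whose first segment is the valuation currency (the trade key format shown is hypothetical):

```java
// Sketch of the routing-key logic a Geode PartitionResolver would wrap:
// partition trades and market data on the same term (here, currency) so
// that a calculation finds all of its inputs on one member.
public class CurrencyRouting {

    // "USD.LIBOR.3M" (market data) and a hypothetical trade key like
    // "USD.IRS.12345" both route on their first segment.
    static String routingKey(String key) {
        int dot = key.indexOf('.');
        return dot < 0 ? key : key.substring(0, dot);
    }

    public static void main(String[] args) {
        // Both resolve to "USD", so they land in collocated buckets.
        System.out.println(routingKey("USD.LIBOR.3M"));  // USD
        System.out.println(routingKey("USD.IRS.12345")); // USD
    }
}
```

Returning the same routing object for a trade and its curves is what keeps the calculation's inputs off the network.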
12. PDX Integration Glue
• PDX serialization is an effective cross-language bridge:
• PDX data objects bridge solution components in Java, C# and
C++
• Avoid language-specific data types (e.g. C# date types) that
don’t have equivalents in other languages
• Structure PDX objects to optimize performance:
• Need to strike the right balance between containment
relationships and externalization of sub-objects or lists into
separate objects
• Balance speed of lookup with memory consumption
• Need to consider data colocation
• JSON is a good message format:
• PDX natively handles JSON, but not XML
• C# works well with JSON, so the calculation engine and the
downstream consumers should consume it easily
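One concrete way to avoid language-specific types in PDX objects is to carry dates as epoch milliseconds in a long, which serializes identically from Java, C# and C++ clients. A hedged sketch (the class and field names are assumptions, not the deck's actual data model):

```java
import java.time.Instant;

// Illustrative PDX-friendly trade header: the trade date is a portable
// long (epoch millis) rather than a language-specific date type such as
// C#'s DateTime, converting to a rich type only at the edges.
public class TradeHeader {
    String tradeId;
    String book;
    long tradeDateEpochMillis;

    Instant tradeDate() {
        return Instant.ofEpochMilli(tradeDateEpochMillis);
    }

    public static void main(String[] args) {
        TradeHeader t = new TradeHeader();
        t.tradeId = "T-1001";
        t.tradeDateEpochMillis = 0L;
        System.out.println(t.tradeDate()); // 1970-01-01T00:00:00Z
    }
}
```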
13. Getting Trade Data into Geode
• Message formats vary by product type:
• OTC derivatives are usually captured in XML
documents
• Bond trading systems use FIX or similar (e.g. TOMS)
• Proprietary formats from legacy trading systems
• Broker messages in an application server:
• Transactional message consumer is best pattern
• XML-to-object parsing tools readily available
• Spring Cloud Data Flow is ideal for this purpose
• Trade data capture is transactional:
• Best practice is to make end-to-end process a
transaction, but may need to split into two legs
based on source of messages
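The XML-to-object leg of trade capture can be done with standard JDK parsing tools. A minimal sketch, assuming a simplified message shape (real OTC feeds use richer formats such as FpML; the element names here are hypothetical):

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Sketch of parsing an inbound trade message before putting it into a
// Geode region; in practice this sits inside a transactional consumer.
public class TradeMessageParser {

    static String parseTradeId(String xml) throws Exception {
        DocumentBuilder db =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("tradeId").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String msg = "<trade><tradeId>T-1001</tradeId>"
                   + "<notional>5000000</notional></trade>";
        System.out.println(parseTradeId(msg)); // T-1001
    }
}
```

The parsed identifier would become the region key (or feed the dot-notation key described earlier), with the put executed inside the message transaction.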
14. Getting Market Data into Geode
• Market data feeds have many proprietary formats
• Market data is often exceptionally fast moving:
• Foreign exchange quotes for the major currency pairs can reach 70,000
messages/second
• Market data can also be very slow moving:
• Rate fixings like LIBOR are once per day
• Illiquid securities may not even be quoted daily
• Conflate fast market data by sampling:
• Discard inbound ticks that don’t move the market sufficiently
• Sample down to a rate that your compute farm can accommodate
• External client required to conflate within message queue
• Gate market data into batches:
• Push complete update of all market data at pre-determined intervals
• Day open and close by trading location (New York, London, Hong Kong)
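Conflation by sampling reduces to a simple filter: publish a tick only when it moves the market by more than a threshold. A hedged sketch (the threshold and class are assumptions; production conflators typically also gate on a time interval):

```java
// Sketch of tick conflation: drop inbound quotes that don't move the
// market enough, so downstream compute sees a sustainable message rate.
public class TickConflator {
    private final double minMove;
    private double lastPublished = Double.NaN;

    TickConflator(double minMove) {
        this.minMove = minMove;
    }

    // Returns true if this tick should be published downstream.
    boolean accept(double price) {
        if (Double.isNaN(lastPublished)
                || Math.abs(price - lastPublished) >= minMove) {
            lastPublished = price;
            return true;
        }
        return false; // conflated away
    }

    public static void main(String[] args) {
        TickConflator c = new TickConflator(0.0005); // ~5 pips on FX
        System.out.println(c.accept(1.1000)); // true  (first tick)
        System.out.println(c.accept(1.1001)); // false (move too small)
        System.out.println(c.accept(1.1010)); // true
    }
}
```

As the slide notes, this filter has to run in an external client in front of the message queue; Geode itself sees only the surviving ticks.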
15. Crunching Numbers on a Shared Grid
• Most trading firms have a proprietary math library:
• Developed by internal quantitative teams to ensure
consistency
• Usually coded in C++ or C# to take advantage of Intel
compute grid
• Pushing Geode events to an external compute grid:
• Typical compute grid has a “head node” or “broker”
• Use Asynchronous Event Queue (“AEQ”) to collect events
for grid’s broker to process
• Stateless grid nodes can synchronously put results back to
Geode regions to ensure results are captured
• Caching locally on the grid to accelerate performance:
• Grid nodes can use Geode client-side caching proxies
• Use client-side region interest registration to ensure
updates are pushed to grid nodes
o Can use wildcards on keys (see dot notation)
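Wiring an AEQ to the trade region might look roughly like the cache.xml fragment below. This is an illustrative sketch only: the queue id, batch settings, listener class and region name are all assumptions, with the listener standing in for the component that forwards batches to the grid's broker.

```xml
<cache xmlns="http://geode.apache.org/schema/cache" version="1.0">
  <!-- Collect events in batches for the external grid's broker -->
  <async-event-queue id="grid-feed" batch-size="500"
                     batch-time-interval="200" parallel="true">
    <async-event-listener>
      <class-name>com.example.risk.GridBrokerForwarder</class-name>
    </async-event-listener>
  </async-event-queue>
  <!-- Attach the queue to the region holding valuation inputs -->
  <region name="TradeModelParams" refid="PARTITION">
    <region-attributes async-event-queue-ids="grid-feed"/>
  </region>
</cache>
```

A parallel queue keeps event delivery spread across the partitioned region's hosts rather than funneling everything through one member.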
16. Crunching Numbers Inside Geode
• Running the math inside Geode is dramatically faster:
• Securities Technology Analysis Center (STAC) Report Issue
0.99 found that trade valuations running inside GemFire were
76 times faster than a traditional grid
• Using the Geode grid as a compute grid:
• Math library must be coded in Java (most are C++ or C#)
• Try to use function parameters to define data model
• Opportunities to cache frequently used derived results
• Using cache listeners to propagate valuation events:
• Use cache listener to detect data updates in regions that
contain valuation inputs (e.g. new trade, market data updates)
o Do not listen to “jittery” regions, such as exchange rates
• Encapsulate math into functions that cache listener can
execute
• Ensure regions are properly partitioned in order to get parallel
execution across the grid
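The "cache frequently used derived results" point can be sketched with a memoizing map keyed by the dot-notation curve name. This is illustrative only: `buildCurve` is a stand-in for the proprietary math library call, and the cached array is a placeholder, but the pattern (build once per market-data version, reuse across trades) is what recovers the ~80% of CDS valuation time spent building credit curves.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of caching expensive derived results (built curves) so that
// every trade valuation on a member reuses the same built curve.
public class BuiltCurveCache {
    private final Map<String, double[]> cache = new ConcurrentHashMap<>();
    private int builds = 0; // instrumentation for this sketch only

    double[] curveFor(String curveKey) {
        return cache.computeIfAbsent(curveKey, this::buildCurve);
    }

    // Stand-in for the proprietary library's curve construction.
    private double[] buildCurve(String curveKey) {
        builds++;
        return new double[]{0.010, 0.012, 0.015}; // placeholder zero rates
    }

    int buildCount() { return builds; }

    public static void main(String[] args) {
        BuiltCurveCache c = new BuiltCurveCache();
        c.curveFor("USD.ACME.SNRFOR"); // built on first use
        c.curveFor("USD.ACME.SNRFOR"); // served from cache
        System.out.println(c.buildCount()); // 1
    }
}
```

When market data updates invalidate a curve, the entry is simply removed (or, as the region-design slide suggests, the built curves live in their own Geode region and are replaced on update).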
17. Ticking Risk Views
• Roll-your-own client applications to view ticking risk:
• Desktop applications can use the client libraries to
receive events from the cluster using Continuous
Queries, which can then be displayed in real time
• Server hosted applications can use Continuous Queries
or Asynchronous Event Queues
• Integrating packaged products:
• Some specialty products handle streaming risk:
o Armanta Theatre™
o ION Enterprise Risk™
• Integrate using custom Java components
• The traders will always want spreadsheets:
• Write an Excel plug-in
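A Continuous Query is registered with an OQL statement and Geode then pushes matching create/update events to the client for real-time display. For example, a desk-level blotter might subscribe to its slice of the results region with something like the following (the region and field names are assumptions):

```
SELECT * FROM /RiskResults r WHERE r.desk = 'USD_SWAPS'
```

Each event carries the changed risk result, which the desktop application renders as ticking cells.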
18. White Paper
Want to learn more?
Download the whitepaper at:
http://pivotal.io/building-wall-street-risk-applications
19. Geode Community
Join the Apache Geode Community!
Check out: http://geode.incubator.apache.org
Subscribe: user-subscribe@geode.incubator.apache.org
Download: http://geode.incubator.apache.org/releases
20. Learn More. Stay Connected.
Related session: Spring Data and In-Memory Data Management In Action
https://2016.event.springoneplatform.io/schedule/sessions/spring_data_and_in_memory_data_management_in_action.html
John Blum and Luke Shannon – August 4, 9:00 AM
@springcentral
spring.io/blog
@pivotal
pivotal.io/blog
@pivotalcf
http://engineering.pivotal.io