Josh Clemm
www.linkedin.com/in/joshclemm
SCALING LINKEDIN
A BRIEF HISTORY
“Scaling = replacing all the components
of a car while driving it at 100mph”
Via Mike Krieger, “Scaling Instagram”
LinkedIn started back in 2003 to
“connect to your network for better job
opportunities.”
It had 2,700 members in its first week.
[Image: first-week growth guesses from the founding team]
[Chart: LinkedIn members by year, 2003 to 2015; y-axis from 0M to 400M; labeled values 32M and 400M]
Fast forward to today...
LinkedIn is a global site with over 400 million
members
Web pages and mobile traffic are served at
tens of thousands of queries per second
Backend systems serve millions of queries per
second
LINKEDIN SCALE TODAY
How did we get there?
Let’s start from
the beginning
[Diagram: LEO connected to a single DB]
● Huge monolithic app
called Leo
● Java, JSP, Servlets,
JDBC
● Served every page,
same SQL database
Circa 2003
LINKEDIN’S ORIGINAL ARCHITECTURE
So far so good, but two areas to improve:
1. The growing member to member
connection graph
2. The ability to search those members
● Needed to live in-memory for top
performance
● Used graph traversal queries not suitable for
the shared SQL database.
● Different usage profile than other parts of site
MEMBER CONNECTION GRAPH
So, a dedicated service was created.
LinkedIn’s first service.
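To illustrate why graph traversal fits an in-memory service better than a shared SQL database, here is a minimal sketch of a connection graph with a breadth-first "degrees of separation" lookup. The class name, method names, and member ids are hypothetical and are not taken from LinkedIn's actual graph service.

```python
from collections import deque


class MemberGraph:
    """Hypothetical in-memory connection graph (adjacency sets keyed by member id)."""

    def __init__(self):
        self._adj = {}

    def connect(self, a, b):
        # Connections are mutual, so store the edge in both directions.
        self._adj.setdefault(a, set()).add(b)
        self._adj.setdefault(b, set()).add(a)

    def distance(self, source, target, max_degree=3):
        """Breadth-first search; returns the degree of separation or None."""
        if source == target:
            return 0
        seen = {source}
        frontier = deque([(source, 0)])
        while frontier:
            member, depth = frontier.popleft()
            if depth == max_degree:
                continue  # do not expand past the degree limit
            for neighbor in self._adj.get(member, ()):
                if neighbor == target:
                    return depth + 1
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        return None


graph = MemberGraph()
graph.connect("alice", "bob")
graph.connect("bob", "carol")
graph.connect("carol", "dave")
print(graph.distance("alice", "carol"))
```

Each hop is a dictionary lookup in memory; the same traversal against a relational table would need one self-join per degree.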
● Social networks need powerful search
● Lucene was used on top of our member graph
MEMBER SEARCH
LinkedIn’s second service.
LINKEDIN WITH CONNECTION GRAPH
AND SEARCH
[Diagram, circa 2004: LEO, DB, Member Graph (reached over RPC), and Lucene search, with arrows labeled Connection / Profile Updates]
Getting better, but the single database was
under heavy load.
Vertical scaling helped, but we needed to
offload the read traffic...
● Master/slave concept
● Read-only traffic from replica
● Writes go to main DB
● Early version of Databus kept DBs in sync
REPLICA DBs
[Diagram: Main DB feeding a Databus relay, which feeds the Replica DBs]
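The read/write split described above can be sketched as follows. This is only an illustration under stated assumptions: the `ReplicatedDatabase` class is hypothetical, plain dictionaries stand in for databases, and the change log is a stand-in for a change stream such as Databus, not its real interface.

```python
import itertools


class ReplicatedDatabase:
    """Hypothetical router: writes hit the main DB, reads rotate across replicas."""

    def __init__(self, replica_count):
        self.main = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self._change_log = []  # stand-in for a change stream such as Databus

    def write(self, key, value):
        # All writes go to the main DB and are recorded for replication.
        self.main[key] = value
        self._change_log.append((key, value))

    def sync_replicas(self):
        # Replay pending changes on every replica, in commit order.
        for key, value in self._change_log:
            for replica in self.replicas:
                replica[key] = value
        self._change_log.clear()

    def read(self, key):
        # Read-only traffic is spread round-robin over the replicas.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("member:1", "Josh")
stale = db.read("member:1")   # replicas lag until the change stream is applied
db.sync_replicas()
fresh = db.read("member:1")
print(stale, fresh)
```

The sketch also shows the trade-off: a read that arrives before replication catches up sees stale data, and every write still lands on the one main DB.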
● Good medium term solution
● We could vertically scale servers for a while
● Master DBs have finite scaling limits
● These days, LinkedIn DBs use partitioning
REPLICA DBs TAKEAWAYS
[Diagram, circa 2006: LEO, Member Graph (RPC), Search, Main DB (R/W), and Replica DBs (R/O) fed by the Databus relay; arrows labeled Connection Updates and Profile Updates]
LINKEDIN WITH REPLICA DBs
As LinkedIn continued to grow, the
monolithic application Leo was becoming
problematic.
Leo was difficult to release and debug, and the
site kept going down...
IT WAS TIME TO...
Kill LEO
[Diagram: LEO alongside a Public Profile Web App, a Profile Service, a Recruiter Web App, and yet another service]
Extracting services (Java Spring MVC) from
legacy Leo monolithic application
Circa 2008 on
SERVICE ORIENTED ARCHITECTURE
● Goal - create vertical stack of
stateless services
● Frontend servers fetch data
from many domains, build
HTML or JSON response
● Mid-tier services host APIs,
business logic
● Data-tier or back-tier services
encapsulate data domains
[Diagram: vertical stack of Profile Web App, Profile Service, Profile DB]
SERVICE ORIENTED ARCHITECTURE
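As a rough sketch of the vertical stack described above, the three tiers can be modeled as plain objects that hold no per-request state. All class names, function names, and sample data here are hypothetical.

```python
import json


# Data tier: owns one data domain and hides how it is stored.
class ProfileDataService:
    def __init__(self):
        self._rows = {42: {"first": "Ada", "last": "Lovelace", "title": "Engineer"}}

    def get(self, member_id):
        return self._rows.get(member_id)


# Mid tier: exposes an API and applies business logic.
class ProfileService:
    def __init__(self, data_service):
        self._data = data_service

    def display_profile(self, member_id):
        row = self._data.get(member_id)
        if row is None:
            return None
        return {"name": f"{row['first']} {row['last']}", "headline": row["title"]}


# Frontend tier: gathers data from services and builds the response.
def render_profile_json(profile_service, member_id):
    profile = profile_service.display_profile(member_id)
    if profile is None:
        return json.dumps({"error": "not found"})
    return json.dumps(profile, sort_keys=True)


service = ProfileService(ProfileDataService())
print(render_profile_json(service, 42))
```

Because no tier keeps state between requests, any instance can serve any request, which is what lets each tier scale out by adding instances.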
[Diagram, top to bottom: Browser / App; Frontend Web App; Groups, Connections, and Profile Content Services; Mid-tier Services; Edu Data Service and other Data Services; DB, Voldemort, Hadoop, Kafka]
EXAMPLE MULTI-TIER ARCHITECTURE AT LINKEDIN
PROS
● Stateless services
easily scale
● Decoupled domains
● Build and deploy
independently
CONS
● Ops overhead
● Introduces backwards
compatibility issues
● Leads to complex call
graphs and fanout
SERVICE ORIENTED ARCHITECTURE COMPARISON
bash$ eh -e %%prod | awk -F. '{ print $2 }' | sort | uniq | wc -l
756
● In 2003, LinkedIn had one service (Leo)
● By 2010, LinkedIn had over 150 services
● Today in 2015, LinkedIn has over 750 services
SERVICES AT LINKEDIN
Getting better, but LinkedIn was
experiencing hypergrowth...
● Simple way to reduce load on
servers and speed up responses
● Mid-tier caches store derived
objects from different domains,
reduce fanout
● Caches in the data layer
● We use memcached, Couchbase,
even Voldemort
[Diagram: Frontend Web App, Mid-tier Service with a Cache, and DB with a Cache]
CACHING
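A minimal sketch of the caching pattern, assuming a read-through cache with a time-to-live in front of a slower store. The `TtlCache` class is hypothetical, and a dictionary stands in for the database; it does not use memcached, Couchbase, or Voldemort.

```python
import time


class TtlCache:
    """Hypothetical read-through cache with a time-to-live per entry."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]
        # Miss or expired: fall through to the backing store.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        self._entries.pop(key, None)


database = {"member:1": "Josh"}
cache = TtlCache(loader=database.get, ttl_seconds=60)
cache.get("member:1")
cache.get("member:1")          # served from the cache
database["member:1"] = "Joshua"
cache.invalidate("member:1")   # writers must invalidate, or readers see stale data
print(cache.get("member:1"), cache.misses)
```

The explicit `invalidate` call is the hard part in practice: every write path has to know which cached entries it affects.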
“There are only two hard problems in
Computer Science:
Cache invalidation, naming things, and
off-by-one errors.”
Via Twitter by Kellan Elliott-McCrea
and later Jonathan Feinberg
CACHING TAKEAWAYS
● Caches are easy to add in the beginning, but
complexity adds up over time.
● Over time LinkedIn removed many mid-tier
caches because of the complexity around
invalidation
● We kept caches closer to data layer
CACHING TAKEAWAYS (cont.)
● Services must handle full load; caches
improve speed and are not permanent
load-bearing solutions
● We’ll use a low latency solution like
Voldemort when appropriate and precompute
results
LinkedIn’s hypergrowth was extending to
the vast amounts of data it collected.
Individual pipelines to route that data
weren’t scaling. A better solution was
needed...
KAFKA MOTIVATIONS
● LinkedIn generates a ton of data
○ Pageviews
○ Edits on profile, companies, schools
○ Logging, timing
○ Invites, messaging
○ Tracking
● Billions of events every day
● Separate and independently created pipelines
routed this data
A WHOLE LOT OF CUSTOM PIPELINES...
As LinkedIn needed to scale, each pipeline
needed to scale.
Distributed pub-sub messaging platform as LinkedIn’s
universal data pipeline
KAFKA
[Diagram: Kafka in the middle, connecting Frontend services, a Backend Service, Oracle, DWH, Monitoring, Analytics, and Hadoop]
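The pub-sub idea can be sketched with an append-only log per topic and an independent read offset per consumer. This is a hypothetical in-memory stand-in written for illustration; it is not the Kafka client API, and the topic and consumer names are made up.

```python
from collections import defaultdict


class TopicLog:
    """Hypothetical in-memory stand-in for a pub-sub log; not the Kafka API."""

    def __init__(self):
        self._messages = defaultdict(list)   # topic -> append-only list
        self._offsets = defaultdict(int)     # (topic, consumer) -> next index

    def publish(self, topic, message):
        self._messages[topic].append(message)

    def poll(self, topic, consumer):
        """Return messages this consumer has not seen yet and advance its offset."""
        start = self._offsets[(topic, consumer)]
        batch = self._messages[topic][start:]
        self._offsets[(topic, consumer)] = start + len(batch)
        return batch


log = TopicLog()
log.publish("pageviews", {"member": 1, "page": "/profile"})
log.publish("pageviews", {"member": 2, "page": "/jobs"})

monitoring_batch = log.poll("pageviews", consumer="monitoring")
hadoop_batch = log.poll("pageviews", consumer="hadoop")
log.publish("pageviews", {"member": 1, "page": "/feed"})
monitoring_next = log.poll("pageviews", consumer="monitoring")
print(len(monitoring_batch), len(hadoop_batch), len(monitoring_next))
```

Producers publish once, and each consumer reads at its own pace, so adding a new downstream system does not require a new custom pipeline.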
BENEFITS
● Enabled near realtime access to any data source
● Empowered Hadoop jobs
● Allowed LinkedIn to build realtime analytics
● Vastly improved site monitoring capability
● Enabled devs to visualize and track call graphs
● Over 1 trillion messages published per day, 10 million
messages per second
KAFKA AT LINKEDIN
OVER 1 TRILLION PUBLISHED DAILY
Let’s end with
the modern years
● Services extracted from Leo or created new
were inconsistent and often tightly coupled
● Rest.li was our move to a data model centric
architecture
● It ensured a consistent stateless RESTful API
model across the company.
REST.LI
● By using JSON over HTTP, our new APIs
supported non-Java-based clients.
● By using Dynamic Discovery (D2), we got
load balancing, discovery, and scalability of
each service API.
● Today, LinkedIn has 1130+ Rest.li resources
and over 100 billion Rest.li calls per day
REST.LI (cont.)
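To give a feel for what discovery plus client-side load balancing provides, here is a small sketch of a registry that hands out hosts round-robin. It is an assumption-laden illustration: the `ServiceRegistry` class, its methods, and the host names are hypothetical and do not reflect the real D2 implementation or API.

```python
import itertools


class ServiceRegistry:
    """Hypothetical registry mapping a service name to its live hosts."""

    def __init__(self):
        self._hosts = {}
        self._cursors = {}

    def announce(self, service, host):
        self._hosts.setdefault(service, []).append(host)
        self._reset_cursor(service)

    def withdraw(self, service, host):
        self._hosts[service].remove(host)
        self._reset_cursor(service)

    def _reset_cursor(self, service):
        # Restart the rotation whenever membership changes.
        self._cursors[service] = itertools.cycle(list(self._hosts[service]))

    def pick(self, service):
        """Round-robin over the hosts currently announced for a service."""
        if not self._hosts.get(service):
            raise LookupError(f"no hosts for {service}")
        return next(self._cursors[service])


registry = ServiceRegistry()
registry.announce("profiles", "host-a:8080")
registry.announce("profiles", "host-b:8080")
picks = [registry.pick("profiles") for _ in range(3)]
registry.withdraw("profiles", "host-a:8080")
after_withdraw = registry.pick("profiles")
print(picks, after_withdraw)
```

Callers ask for a service by name and never hard-code a host, so instances can be added or removed without touching the clients.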
Rest.li Automatic API-documentation
REST.LI (cont.)
Rest.li R2/D2 tech stack
REST.LI (cont.)
LinkedIn’s success with Data infrastructure
like Kafka and Databus led to the
development of more and more scalable
Data infrastructure solutions...
● It was clear LinkedIn could build data
infrastructure that enables long term growth
● LinkedIn doubled down on infra solutions like:
○ Storage solutions
■ Espresso, Voldemort, Ambry (media)
○ Analytics solutions like Pinot
○ Streaming solutions
■ Kafka, Databus, and Samza
○ Cloud solutions like Helix and Nuage
DATA INFRASTRUCTURE
DATABUS
LinkedIn is a global company and was
continuing to see large growth. How else
to scale?
● Natural progression of horizontally scaling
● Replicate data across many data centers using
storage technology like Espresso
● Pin users to geographically close data center
● Difficult but necessary
MULTIPLE DATA CENTERS
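Pinning a member to a geographically close data center can be sketched as a nearest-neighbor choice over the healthy sites. The data center names and coordinates below are hypothetical placeholders, not LinkedIn's real locations, and great-circle distance is only an assumed proxy for network proximity.

```python
import math

# Hypothetical data center names and coordinates (latitude, longitude).
DATA_CENTERS = {
    "us-west": (45.5, -122.7),
    "us-east": (39.0, -77.5),
    "asia": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pin_member(location, healthy=DATA_CENTERS):
    """Pick the closest healthy data center for a member's location."""
    if not healthy:
        raise RuntimeError("no healthy data center available")
    return min(healthy, key=lambda name: haversine_km(location, DATA_CENTERS[name]))


paris = (48.86, 2.35)
primary = pin_member(paris)
# Fail over: drop the primary from the healthy set and pick again.
fallback = pin_member(paris, healthy={n for n in DATA_CENTERS if n != primary})
print(primary, fallback)
```

The failover step only works if the member's data is already replicated to the fallback site.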
● Multiple data centers are imperative to
maintain high availability.
● You need to avoid any single point of failure
not just for each service, but the entire site.
● LinkedIn runs out of three main data centers,
additional PoPs around the globe, and more
coming online every day...
MULTIPLE DATA CENTERS
LinkedIn's operational setup as of 2015
(circles represent data centers, diamonds represent PoPs)
Of course LinkedIn’s scaling story is never
this simple, so what else have we done?
● Each of LinkedIn’s critical systems has
undergone its own rich history of scale
(graph, search, analytics, profile backend,
comms, feed)
● LinkedIn uses Hadoop / Voldemort for insights
like People You May Know, Similar profiles,
Notable Alumni, and profile browse maps.
WHAT ELSE HAVE WE DONE?
● Re-architected frontend approach using
○ Client templates
○ BigPipe
○ Play Framework
● LinkedIn added multiple tiers of proxies using
Apache Traffic Server and HAProxy
● We improved the performance of servers with
new hardware, advanced system tuning, and
newer Java runtimes.
WHAT ELSE HAVE WE DONE? (cont.)
Scaling sounds easy and quick to do, right?
“Hofstadter's Law: It always takes longer
than you expect, even when you take
into account Hofstadter's Law.”
Via Douglas Hofstadter,
Gödel, Escher, Bach: An Eternal Golden Braid
Josh Clemm
www.linkedin.com/in/joshclemm
THANKS!
● Blog version of this slide deck
https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin
● Visual story of LinkedIn’s history
https://ourstory.linkedin.com/
● LinkedIn Engineering blog
https://engineering.linkedin.com
● LinkedIn Open-Source
https://engineering.linkedin.com/open-source
● LinkedIn’s communication system slides which
include earliest LinkedIn architecture
http://www.slideshare.net/linkedin/linkedins-communication-architecture
● Slides which include earliest LinkedIn data infra work
http://www.slideshare.net/r39132/linkedin-data-infrastructure-qcon-london-2012
LEARN MORE
● Project Inversion - internal project to enable developer
productivity (trunk based model), faster deploys, unified
services
http://www.bloomberg.com/bw/articles/2013-04-10/inside-operation-inversion-the-code-freeze-that-saved-linkedin
● LinkedIn’s use of Apache Traffic Server
http://www.slideshare.net/thenickberry/reflecting-a-year-after-migrating-to-apache-traffic-server
● Multi Data Center - testing failovers
https://www.linkedin.com/pulse/armen-hamstra-how-he-broke-linkedin-got-promoted-angel-au-yeung
LEARN MORE (cont.)
● History and motivation around Kafka
http://www.confluent.io/blog/stream-data-platform-1/
● Thinking about streaming solutions as a commit log
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
● Kafka enabling monitoring and alerting
http://engineering.linkedin.com/52/autometrics-self-service-metrics-collection
● Kafka enabling real-time analytics (Pinot)
http://engineering.linkedin.com/analytics/real-time-analytics-massive-scale-pinot
● Kafka’s current use and future at LinkedIn
http://engineering.linkedin.com/kafka/kafka-linkedin-current-and-future
● Kafka processing 1 trillion events per day
https://engineering.linkedin.com/apache-kafka/how-we_re-improving-and-advancing-kafka-linkedin
LEARN MORE - KAFKA
● Open sourcing Databus
https://engineering.linkedin.com/data-replication/open-sourcing-databus-linkedins-low-latency-change-data-capture-system
● Samza streams to help LinkedIn view call graphs
https://engineering.linkedin.com/samza/real-time-insights-linkedins-performance-using-apache-samza
● Real-time analytics (Pinot)
http://engineering.linkedin.com/analytics/real-time-analytics-massive-scale-pinot
● Introducing Espresso data store
http://engineering.linkedin.com/espresso/introducing-espresso-linkedins-hot-new-distributed-document-store
LEARN MORE - DATA INFRASTRUCTURE
● LinkedIn’s use of client templates
○ Dust.js
http://www.slideshare.net/brikis98/dustjs
○ Profile
http://engineering.linkedin.com/profile/engineering-new-linkedin-profile
● Big Pipe on LinkedIn’s homepage
http://engineering.linkedin.com/frontend/new-technologies-new-linkedin-home-page
● Play Framework
○ Introduction at LinkedIn
https://engineering.linkedin.com/play/composable-and-streamable-play-apps
○ Switching to non-blocking asynchronous model
https://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell
LEARN MORE - FRONTEND TECH
● Introduction to Rest.li and how it helps LinkedIn scale
http://engineering.linkedin.com/architecture/restli-restful-service-architecture-scale
● How Rest.li expanded across the company
http://engineering.linkedin.com/restli/linkedins-restli-moment
LEARN MORE - REST.LI
● JVM memory tuning
http://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications
● System tuning
http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases
● Optimizing JVM tuning automatically
https://engineering.linkedin.com/java/optimizing-java-cms-garbage-collections-its-difficulties-and-using-jtune-solution
LEARN MORE - SYSTEM TUNING
LinkedIn continues to grow quickly and there’s
still a ton of work we can do to improve.
We’re working on problems that very few ever
get to solve - come join us!
WE’RE HIRING
Scaling LinkedIn - A Brief History

Contenu connexe

Tendances

Understanding Reddit: The Social Media Superpower You've Probably Never Heard Of
Understanding Reddit: The Social Media Superpower You've Probably Never Heard OfUnderstanding Reddit: The Social Media Superpower You've Probably Never Heard Of
Understanding Reddit: The Social Media Superpower You've Probably Never Heard OfBrent Csutoras
 
Painting a Vision for Your Product
Painting a Vision for Your ProductPainting a Vision for Your Product
Painting a Vision for Your ProductAtlassian
 
Product Led Growth: The Rise of the User
Product Led Growth: The Rise of the UserProduct Led Growth: The Rise of the User
Product Led Growth: The Rise of the UserOpenView
 
Agile for Marketing
Agile for MarketingAgile for Marketing
Agile for MarketingHubSpot
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
How to Define Your Product Roadmap by Dan Olsen
How to Define Your Product Roadmap by Dan OlsenHow to Define Your Product Roadmap by Dan Olsen
How to Define Your Product Roadmap by Dan OlsenDan Olsen
 
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)Growth Hacking Asia
 
LinkedIn Marketing Strategy
LinkedIn Marketing StrategyLinkedIn Marketing Strategy
LinkedIn Marketing StrategyFisher Laishram
 
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...StephenLeo7
 
How to create pro slides in less time: don't worry, be crappy!
How to create pro slides in less time:  don't worry, be crappy!How to create pro slides in less time:  don't worry, be crappy!
How to create pro slides in less time: don't worry, be crappy!Chiara Ojeda
 
Product Roadmaps - Tips on how to create and manage roadmaps
Product Roadmaps - Tips on how to create and manage roadmapsProduct Roadmaps - Tips on how to create and manage roadmaps
Product Roadmaps - Tips on how to create and manage roadmapsMarc Abraham
 
Content Guide - How To Improve Content Creation
Content Guide - How To Improve Content CreationContent Guide - How To Improve Content Creation
Content Guide - How To Improve Content CreationIlya Bilbao
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learngnakan
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapAnant Corporation
 
From Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover WeeklyFrom Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover WeeklyChris Johnson
 
Storytelling For Product Managers
Storytelling For Product ManagersStorytelling For Product Managers
Storytelling For Product ManagersProduct School
 
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixTableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixBlake Irvine
 
API Security Best Practices & Guidelines
API Security Best Practices & GuidelinesAPI Security Best Practices & Guidelines
API Security Best Practices & GuidelinesPrabath Siriwardena
 
Launching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's BackLaunching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's Backjoshelman
 

Tendances (20)

Understanding Reddit: The Social Media Superpower You've Probably Never Heard Of
Understanding Reddit: The Social Media Superpower You've Probably Never Heard OfUnderstanding Reddit: The Social Media Superpower You've Probably Never Heard Of
Understanding Reddit: The Social Media Superpower You've Probably Never Heard Of
 
Painting a Vision for Your Product
Painting a Vision for Your ProductPainting a Vision for Your Product
Painting a Vision for Your Product
 
Product Led Growth: The Rise of the User
Product Led Growth: The Rise of the UserProduct Led Growth: The Rise of the User
Product Led Growth: The Rise of the User
 
Agile for Marketing
Agile for MarketingAgile for Marketing
Agile for Marketing
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
How to Define Your Product Roadmap by Dan Olsen
How to Define Your Product Roadmap by Dan OlsenHow to Define Your Product Roadmap by Dan Olsen
How to Define Your Product Roadmap by Dan Olsen
 
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)
Growth Hacking Fundamentals @ Echelon Jakarta (by Growth Hacking Asia)
 
LinkedIn Marketing Strategy
LinkedIn Marketing StrategyLinkedIn Marketing Strategy
LinkedIn Marketing Strategy
 
ChatGPT, Generative AI and Microsoft Copilot: Step Into the Future - Geoff Ab...
ChatGPT, Generative AI and Microsoft Copilot: Step Into the Future - Geoff Ab...ChatGPT, Generative AI and Microsoft Copilot: Step Into the Future - Geoff Ab...
ChatGPT, Generative AI and Microsoft Copilot: Step Into the Future - Geoff Ab...
 
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...
From Lab to Life - Lessons from Developing and Deploying Real World LLM Appli...
 
How to create pro slides in less time: don't worry, be crappy!
How to create pro slides in less time:  don't worry, be crappy!How to create pro slides in less time:  don't worry, be crappy!
How to create pro slides in less time: don't worry, be crappy!
 
Product Roadmaps - Tips on how to create and manage roadmaps
Product Roadmaps - Tips on how to create and manage roadmapsProduct Roadmaps - Tips on how to create and manage roadmaps
Product Roadmaps - Tips on how to create and manage roadmaps
 
Content Guide - How To Improve Content Creation
Content Guide - How To Improve Content CreationContent Guide - How To Improve Content Creation
Content Guide - How To Improve Content Creation
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learn
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
 
From Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover WeeklyFrom Idea to Execution: Spotify's Discover Weekly
From Idea to Execution: Spotify's Discover Weekly
 
Storytelling For Product Managers
Storytelling For Product ManagersStorytelling For Product Managers
Storytelling For Product Managers
 
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at NetflixTableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
Tableau Conference 2018: Binging on Data - Enabling Analytics at Netflix
 
API Security Best Practices & Guidelines
API Security Best Practices & GuidelinesAPI Security Best Practices & Guidelines
API Security Best Practices & Guidelines
 
Launching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's BackLaunching a Rocketship Off Someone Else's Back
Launching a Rocketship Off Someone Else's Back
 

En vedette

Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...
Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...
Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...Aatif Awan
 
LinkedIn Networking for Professionals
LinkedIn Networking for ProfessionalsLinkedIn Networking for Professionals
LinkedIn Networking for ProfessionalsChristine Dubyts
 
LinkedIn presentation
LinkedIn presentationLinkedIn presentation
LinkedIn presentationjkwong5
 
A Business case study on LinkedIn
A Business case study on LinkedInA Business case study on LinkedIn
A Business case study on LinkedInMayank Banerjee
 
How LinkedIn built a Community of Half a Billion
How LinkedIn built a Community of Half a BillionHow LinkedIn built a Community of Half a Billion
How LinkedIn built a Community of Half a BillionAatif Awan
 
Linkedin Series B Pitch Deck
Linkedin Series B Pitch DeckLinkedin Series B Pitch Deck
Linkedin Series B Pitch DeckJoseph Hsieh
 

En vedette (6)

Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...
Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...
Lessons learned from growing LinkedIn to 400m members - Growth Hackers Confer...
 
LinkedIn Networking for Professionals
LinkedIn Networking for ProfessionalsLinkedIn Networking for Professionals
LinkedIn Networking for Professionals
 
LinkedIn presentation
LinkedIn presentationLinkedIn presentation
LinkedIn presentation
 
A Business case study on LinkedIn
A Business case study on LinkedInA Business case study on LinkedIn
A Business case study on LinkedIn
 
How LinkedIn built a Community of Half a Billion
How LinkedIn built a Community of Half a BillionHow LinkedIn built a Community of Half a Billion
How LinkedIn built a Community of Half a Billion
 
Linkedin Series B Pitch Deck
Linkedin Series B Pitch DeckLinkedin Series B Pitch Deck
Linkedin Series B Pitch Deck
 

Similaire à Scaling LinkedIn - A Brief History

Common Characteristics Of Wireless Devices
Common Characteristics Of Wireless DevicesCommon Characteristics Of Wireless Devices
Common Characteristics Of Wireless DevicesMichelle Benedict
 
Managing Large Flask Applications On Google App Engine (GAE)
Managing Large Flask Applications On Google App Engine (GAE)Managing Large Flask Applications On Google App Engine (GAE)
Managing Large Flask Applications On Google App Engine (GAE)Emmanuel Olowosulu
 
LinkedIn Graph Presentation
LinkedIn Graph PresentationLinkedIn Graph Presentation
LinkedIn Graph PresentationAmy W. Tang
 
Security Policies For Schema Less Or Dynamic Schema...
Security Policies For Schema Less Or Dynamic Schema...Security Policies For Schema Less Or Dynamic Schema...
Security Policies For Schema Less Or Dynamic Schema...Christina Boetel
 
Achieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactAchieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactElasticsearch
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Adrian Cockcroft
 
The great migration embracing serverless first
The great migration  embracing serverless first The great migration  embracing serverless first
The great migration embracing serverless first AngelaTimofte1
 
Building data pipelines at Shopee with DEC
Building data pipelines at Shopee with DECBuilding data pipelines at Shopee with DEC
Building data pipelines at Shopee with DECRim Zaidullin
 
#dbhouseparty - Should I be building Microservices?
#dbhouseparty - Should I be building Microservices?#dbhouseparty - Should I be building Microservices?
#dbhouseparty - Should I be building Microservices?Tammy Bednar
 
LinkedIn Infrastructure (analytics@webscale, at fb 2013)
LinkedIn Infrastructure (analytics@webscale, at fb 2013)LinkedIn Infrastructure (analytics@webscale, at fb 2013)
LinkedIn Infrastructure (analytics@webscale, at fb 2013)Jun Rao
 
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...SnapLogic
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksDatabricks
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtTechWell
 
Linked in stream experimentation framework
Linked in stream experimentation frameworkLinked in stream experimentation framework
Linked in stream experimentation frameworkJoseph Adler
 
Lightbend Fast Data Platform
Lightbend Fast Data PlatformLightbend Fast Data Platform
Lightbend Fast Data PlatformLightbend
 
CQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureCQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureThomas Jaskula
 
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...Daniel Zivkovic
 

Similaire à Scaling LinkedIn - A Brief History (20)

Symphony Driver Essay
Symphony Driver EssaySymphony Driver Essay
Symphony Driver Essay
 
Common Characteristics Of Wireless Devices
Common Characteristics Of Wireless DevicesCommon Characteristics Of Wireless Devices
Common Characteristics Of Wireless Devices
 
Managing Large Flask Applications On Google App Engine (GAE)
Managing Large Flask Applications On Google App Engine (GAE)Managing Large Flask Applications On Google App Engine (GAE)
Managing Large Flask Applications On Google App Engine (GAE)
 
LinkedIn Graph Presentation
LinkedIn Graph PresentationLinkedIn Graph Presentation
LinkedIn Graph Presentation
 
Just do it!
Just do it!Just do it!
Just do it!
 
Security Policies For Schema Less Or Dynamic Schema...
Security Policies For Schema Less Or Dynamic Schema...Security Policies For Schema Less Or Dynamic Schema...
Security Policies For Schema Less Or Dynamic Schema...
 
Achieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impactAchieving cyber mission assurance with near real-time impact
Achieving cyber mission assurance with near real-time impact
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3)
 
The great migration embracing serverless first
The great migration  embracing serverless first The great migration  embracing serverless first
The great migration embracing serverless first
 
Building data pipelines at Shopee with DEC
Building data pipelines at Shopee with DECBuilding data pipelines at Shopee with DEC
Building data pipelines at Shopee with DEC
 
#dbhouseparty - Should I be building Microservices?
#dbhouseparty - Should I be building Microservices?#dbhouseparty - Should I be building Microservices?
#dbhouseparty - Should I be building Microservices?
 
LinkedIn Infrastructure (analytics@webscale, at fb 2013)
LinkedIn Infrastructure (analytics@webscale, at fb 2013)LinkedIn Infrastructure (analytics@webscale, at fb 2013)
LinkedIn Infrastructure (analytics@webscale, at fb 2013)
 
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl...
 
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at DatabricksLessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
 
From Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical DebtFrom Monoliths to Services: Paying Your Technical Debt
From Monoliths to Services: Paying Your Technical Debt
 
ESGYN Overview
ESGYN OverviewESGYN Overview
ESGYN Overview
 
Linked in stream experimentation framework
Linked in stream experimentation frameworkLinked in stream experimentation framework
Linked in stream experimentation framework
 
Lightbend Fast Data Platform
Lightbend Fast Data PlatformLightbend Fast Data Platform
Lightbend Fast Data Platform
 
CQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architectureCQRS recipes or how to cook your architecture
CQRS recipes or how to cook your architecture
 
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...
Running Business Analytics for a Serverless Insurance Company - Joe Emison & ...
 

Scaling LinkedIn - A Brief History

  • 2. “Scaling = replacing all the components of a car while driving it at 100mph.” Via Mike Krieger, “Scaling Instagram”
  • 3. LinkedIn started back in 2003 to “connect to your network for better job opportunities.” It had 2700 members in its first week.
  • 4. First week growth guesses from founding team
  • 5. [Chart: LinkedIn membership by year, 2003 to 2015; y-axis from 0M to 400M, with data labels 32M and 400M] Fast forward to today...
  • 6. LINKEDIN SCALE TODAY ● LinkedIn is a global site with over 400 million members ● Web pages and mobile traffic are served at tens of thousands of queries per second ● Backend systems serve millions of queries per second
  • 7. How did we get there?
  • 10. LINKEDIN’S ORIGINAL ARCHITECTURE (circa 2003) ● Huge monolithic app called Leo ● Java, JSP, Servlets, JDBC ● Served every page, same SQL database [Diagram: Leo in front of a single DB]
  • 11. So far so good, but two areas to improve: 1. The growing member to member connection graph 2. The ability to search those members
  • 13. MEMBER CONNECTION GRAPH ● Needed to live in-memory for top performance ● Used graph traversal queries not suitable for the shared SQL database ● Different usage profile than other parts of site. So, a dedicated service was created: LinkedIn’s first service.
  • 15. MEMBER SEARCH ● Social networks need powerful search ● Lucene was used on top of our member graph. This was LinkedIn’s second service.
  • 16. LINKEDIN WITH CONNECTION GRAPH AND SEARCH (circa 2004) [Diagram: Leo, DB, Member Graph, and Lucene, with labels "RPC" and "Connection / Profile Updates"]
  • 17. Getting better, but the single database was under heavy load. Vertically scaling helped, but we needed to offload the read traffic...
  • 18. REPLICA DBs ● Master/slave concept ● Read-only traffic from replica ● Writes go to main DB ● Early version of Databus kept DBs in sync [Diagram: Main DB, Databus relay, and several Replica DBs]
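The read/write split described on this slide can be illustrated with a small in-memory sketch. This is not LinkedIn's code or the Databus API: the class `ReplicatedDatabase`, its `relay()` method, and the round-robin read policy are all hypothetical names and choices made for illustration. Writes go to the main store, reads rotate across replicas, and a relay step replays the ordered change log onto each replica.

```python
import itertools


class ReplicatedDatabase:
    """Toy main/replica setup: writes hit the main store, reads rotate over replicas."""

    def __init__(self, replica_count=2):
        self.main = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._change_log = []                 # ordered stream of committed writes
        self._applied = [0] * replica_count   # per-replica position in the change log
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # All writes go to the main DB and are recorded in commit order.
        self.main[key] = value
        self._change_log.append((key, value))

    def relay(self):
        # Hypothetical stand-in for a change-capture relay:
        # replay the changes each replica has not yet seen.
        for i, replica in enumerate(self.replicas):
            for key, value in self._change_log[self._applied[i]:]:
                replica[key] = value
            self._applied[i] = len(self._change_log)

    def read(self, key):
        # Read-only traffic is served by replicas, chosen round-robin.
        return self.replicas[next(self._next_replica)].get(key)
```

Note that a read issued before `relay()` runs returns stale data (here, `None`). That replication lag is the usual price of offloading reads to replicas.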
  • 19. REPLICA DBs TAKEAWAYS ● Good medium term solution ● We could vertically scale servers for a while ● Master DBs have finite scaling limits ● These days, LinkedIn DBs use partitioning
  • 20. LINKEDIN WITH REPLICA DBs (circa 2006) [Diagram: Leo, Member Graph, and Search; Main DB (R/W) and Replica DBs (R/O) with a Databus relay; labels "RPC", "Connection Updates", and "Profile Updates"]
  • 21. As LinkedIn continued to grow, the monolithic application Leo was becoming problematic. Leo was difficult to release and debug, and the site kept going down...
  • 25. IT WAS TIME TO... Kill LEO
  • 26. SERVICE ORIENTED ARCHITECTURE (circa 2008 on) Extracting services (Java Spring MVC) from legacy Leo monolithic application [Diagram: Public Profile Web App, Profile Service, Recruiter Web App, and "Yet another Service" alongside Leo]
  • 27. SERVICE ORIENTED ARCHITECTURE ● Goal - create vertical stack of stateless services ● Frontend servers fetch data from many domains, build HTML or JSON response ● Mid-tier services host APIs, business logic ● Data-tier or back-tier services encapsulate data domains [Diagram: Profile Web App, Profile Service, Profile DB]
  • 29. EXAMPLE MULTI-TIER ARCHITECTURE AT LINKEDIN [Diagram labels: Browser / App; Frontend Web App; Profile, Connections, and Groups Content Services; Mid-tier Services; Edu Data Service and Data Services; DB, Voldemort, Hadoop, Kafka]
  • 30. SERVICE ORIENTED ARCHITECTURE COMPARISON PROS ● Stateless services easily scale ● Decoupled domains ● Build and deploy independently CONS ● Ops overhead ● Introduces backwards compatibility issues ● Leads to complex call graphs and fanout
  • 31. SERVICES AT LINKEDIN bash$ eh -e %%prod | awk -F. '{ print $2 }' | sort | uniq | wc -l (output: 756) ● In 2003, LinkedIn had one service (Leo) ● By 2010, LinkedIn had over 150 services ● Today in 2015, LinkedIn has over 750 services
  • 32. Getting better, but LinkedIn was experiencing hypergrowth...
  • 34. CACHING ● Simple way to reduce load on servers and speed up responses ● Mid-tier caches store derived objects from different domains, reduce fanout ● Caches in the data layer ● We use memcache, couchbase, even Voldemort [Diagram: Frontend Web App, Mid-tier Service with Cache, DB with Cache]
  • 35. “There are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors.” Via Twitter by Kellan Elliott-McCrea and later Jonathan Feinberg
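A cache placed close to the data layer often follows the cache-aside pattern with invalidation on write. The sketch below is an illustration under assumptions, not LinkedIn's implementation: `CachedStore`, its TTL parameter, and the plain dict used as the backing store are hypothetical. The cache only speeds up reads; every miss still falls through to the backing store, which must be able to serve the request.

```python
import time


class CachedStore:
    """Cache-aside wrapper around a dict-like backing store (hypothetical example)."""

    def __init__(self, backing_store, ttl_seconds=60.0, clock=time.monotonic):
        self._db = backing_store
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}      # key -> (value, expires_at)
        self.db_reads = 0     # counts how often the backing store is hit

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]   # fresh cache hit
        # Miss or expired entry: read from the backing store and repopulate.
        self.db_reads += 1
        value = self._db.get(key)
        self._cache[key] = (value, self._clock() + self._ttl)
        return value

    def put(self, key, value):
        # Write to the source of truth, then invalidate the cached copy.
        self._db[key] = value
        self._cache.pop(key, None)
```

Because the writer and the cache live in the same layer here, invalidation is a single `pop`. A mid-tier cache holding derived objects from several domains has no such single place to invalidate from.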
  • 36. CACHING TAKEAWAYS ● Caches are easy to add in the beginning, but complexity adds up over time. ● Over time LinkedIn removed many mid-tier caches because of the complexity around invalidation ● We kept caches closer to data layer
  • 37. CACHING TAKEAWAYS (cont.) ● Services must handle full load - caches improve speed; they are not permanent load-bearing solutions ● We’ll use a low latency solution like Voldemort when appropriate and precompute results
  • 38. LinkedIn’s hypergrowth was extending to the vast amounts of data it collected. Individual pipelines to route that data weren’t scaling. A better solution was needed...
  • 40. KAFKA MOTIVATIONS ● LinkedIn generates a ton of data ○ Pageviews ○ Edits on profile, companies, schools ○ Logging, timing ○ Invites, messaging ○ Tracking ● Billions of events every day ● Separate and independently created pipelines routed this data
  • 42. A WHOLE LOT OF CUSTOM PIPELINES... As LinkedIn needed to scale, each pipeline needed to scale.
  • 43. KAFKA Distributed pub-sub messaging platform as LinkedIn’s universal data pipeline [Diagram labels: Frontend service, Frontend service, Backend Service, Kafka, DWH, Monitoring, Analytics, Hadoop, Oracle]
  • 44. KAFKA AT LINKEDIN: BENEFITS ● Enabled near realtime access to any data source ● Empowered Hadoop jobs ● Allowed LinkedIn to build realtime analytics ● Vastly improved site monitoring capability ● Enabled devs to visualize and track call graphs ● Over 1 trillion messages published per day, 10 million messages per second
  • 45. OVER 1 TRILLION PUBLISHED DAILY
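The idea of one shared pipeline replacing many point-to-point ones can be sketched as an append-only log per topic, where each consumer group tracks its own read offset. The code below is a toy, single-process illustration and does not use the Kafka client API; `MiniLog`, `publish`, and `poll` are hypothetical names.

```python
from collections import defaultdict


class MiniLog:
    """Toy pub-sub log: append-only topics, per-group read offsets."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> ordered list of messages
        self._offsets = defaultdict(int)   # (group, topic) -> next offset to read

    def publish(self, topic, message):
        # Producers only append; they know nothing about consumers.
        self._topics[topic].append(message)
        return len(self._topics[topic]) - 1   # offset of the new message

    def poll(self, group, topic, max_messages=100):
        # Each consumer group reads independently from its own position.
        start = self._offsets[(group, topic)]
        batch = self._topics[topic][start:start + max_messages]
        self._offsets[(group, topic)] = start + len(batch)
        return batch
```

Adding a new consumer (say, a monitoring system) means picking a new group name and polling; no producer changes and no new pipeline is needed.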
  • 46. Let’s end with the modern years
  • 48. REST.LI ● Services extracted from Leo or newly created were inconsistent and often tightly coupled ● Rest.li was our move to a data model centric architecture ● It ensured a consistent stateless RESTful API model across the company.
  • 49. REST.LI (cont.) ● By using JSON over HTTP, our new APIs supported non-Java-based clients. ● By using Dynamic Discovery (D2), we got load balancing, discovery, and scalability of each service API. ● Today, LinkedIn has 1130+ Rest.li resources and over 100 billion Rest.li calls per day
  • 51. REST.LI (cont.) Rest.li R2/D2 tech stack
  • 52. LinkedIn’s success with Data infrastructure like Kafka and Databus led to the development of more and more scalable Data infrastructure solutions...
  • 53. DATA INFRASTRUCTURE ● It was clear LinkedIn could build data infrastructure that enables long term growth ● LinkedIn doubled down on infra solutions like: ○ Storage solutions ■ Espresso, Voldemort, Ambry (media) ○ Analytics solutions like Pinot ○ Streaming solutions ■ Kafka, Databus, and Samza ○ Cloud solutions like Helix and Nuage
  • 55. LinkedIn is a global company and was continuing to see large growth. How else to scale?
  • 56. MULTIPLE DATA CENTERS ● Natural progression of horizontally scaling ● Replicate data across many data centers using storage technology like Espresso ● Pin users to geographically close data center ● Difficult but necessary
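Pinning a user to a geographically close data center can be sketched as a nearest-neighbor choice over the currently healthy sites. Everything specific here is an assumption for illustration: the data center names and coordinates in `DATA_CENTERS` are made up, and great-circle distance is only one possible proximity measure (real systems may use measured latency instead).

```python
import math

# Hypothetical data centers as (latitude, longitude); not LinkedIn's actual sites.
DATA_CENTERS = {
    "west": (37.4, -122.1),
    "east": (39.0, -77.5),
    "apac": (1.35, 103.8),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pin_user(user_location, healthy=None):
    """Return the closest data center, restricted to the healthy ones if given."""
    candidates = healthy if healthy is not None else DATA_CENTERS.keys()
    return min(candidates, key=lambda name: haversine_km(user_location, DATA_CENTERS[name]))
```

Passing a reduced `healthy` set models a failover: users pinned to a failed site are re-pinned to the next closest one, which only works if their data has already been replicated there.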
  • 57. ● Multiple data centers are imperative to maintain high availability. ● You need to avoid any single point of failure not just for each service, but the entire site. ● LinkedIn runs out of three main data centers, additional PoPs around the globe, and more coming online every day... MULTIPLE DATA CENTERS
  • 58. MULTIPLE DATA CENTERS LinkedIn's operational setup as of 2015 (circles represent data centers, diamonds represent PoPs)
  • 59. Of course LinkedIn’s scaling story is never this simple, so what else have we done?
  • 60. WHAT ELSE HAVE WE DONE? ● Each of LinkedIn’s critical systems has undergone its own rich history of scale (graph, search, analytics, profile backend, comms, feed) ● LinkedIn uses Hadoop / Voldemort for insights like People You May Know, Similar profiles, Notable Alumni, and profile browse maps.
  • 61. WHAT ELSE HAVE WE DONE? (cont.) ● Re-architected frontend approach using ○ Client templates ○ BigPipe ○ Play Framework ● LinkedIn added multiple tiers of proxies using Apache Traffic Server and HAProxy ● We improved the performance of servers with new hardware, advanced system tuning, and newer Java runtimes.
  • 62. Scaling sounds easy and quick to do, right?
  • 63. “Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.” Via Douglas Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid
  • 65. LEARN MORE ● Blog version of this slide deck https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin ● Visual story of LinkedIn’s history https://ourstory.linkedin.com/ ● LinkedIn Engineering blog https://engineering.linkedin.com ● LinkedIn Open-Source https://engineering.linkedin.com/open-source ● LinkedIn’s communication system slides which include earliest LinkedIn architecture http://www.slideshare.net/linkedin/linkedins-communication-architecture ● Slides which include earliest LinkedIn data infra work http://www.slideshare.net/r39132/linkedin-data-infrastructure-qcon-london-2012
  • 66. LEARN MORE (cont.) ● Project Inversion - internal project to enable developer productivity (trunk based model), faster deploys, unified services http://www.bloomberg.com/bw/articles/2013-04-10/inside-operation-inversion-the-code-freeze-that-saved-linkedin ● LinkedIn’s use of Apache Traffic Server http://www.slideshare.net/thenickberry/reflecting-a-year-after-migrating-to-apache-traffic-server ● Multi Data Center - testing failovers https://www.linkedin.com/pulse/armen-hamstra-how-he-broke-linkedin-got-promoted-angel-au-yeung
  • 67. LEARN MORE - KAFKA ● History and motivation around Kafka http://www.confluent.io/blog/stream-data-platform-1/ ● Thinking about streaming solutions as a commit log https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying ● Kafka enabling monitoring and alerting http://engineering.linkedin.com/52/autometrics-self-service-metrics-collection ● Kafka enabling real-time analytics (Pinot) http://engineering.linkedin.com/analytics/real-time-analytics-massive-scale-pinot ● Kafka’s current use and future at LinkedIn http://engineering.linkedin.com/kafka/kafka-linkedin-current-and-future ● Kafka processing 1 trillion events per day https://engineering.linkedin.com/apache-kafka/how-we_re-improving-and-advancing-kafka-linkedin
  • 68. LEARN MORE - DATA INFRASTRUCTURE ● Open sourcing Databus https://engineering.linkedin.com/data-replication/open-sourcing-databus-linkedins-low-latency-change-data-capture-system ● Samza streams to help LinkedIn view call graphs https://engineering.linkedin.com/samza/real-time-insights-linkedins-performance-using-apache-samza ● Real-time analytics (Pinot) http://engineering.linkedin.com/analytics/real-time-analytics-massive-scale-pinot ● Introducing Espresso data store http://engineering.linkedin.com/espresso/introducing-espresso-linkedins-hot-new-distributed-document-store
  • 69. LEARN MORE - FRONTEND TECH ● LinkedIn’s use of client templates ○ Dust.js http://www.slideshare.net/brikis98/dustjs ○ Profile http://engineering.linkedin.com/profile/engineering-new-linkedin-profile ● BigPipe on LinkedIn’s homepage http://engineering.linkedin.com/frontend/new-technologies-new-linkedin-home-page ● Play Framework ○ Introduction at LinkedIn https://engineering.linkedin.com/play/composable-and-streamable-play-apps ○ Switching to non-blocking asynchronous model https://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell
  • 70. LEARN MORE - REST.LI ● Introduction to Rest.li and how it helps LinkedIn scale http://engineering.linkedin.com/architecture/restli-restful-service-architecture-scale ● How Rest.li expanded across the company http://engineering.linkedin.com/restli/linkedins-restli-moment
  • 71. LEARN MORE - SYSTEM TUNING ● JVM memory tuning http://engineering.linkedin.com/garbage-collection/garbage-collection-optimization-high-throughput-and-low-latency-java-applications ● System tuning http://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases ● Optimizing JVM tuning automatically https://engineering.linkedin.com/java/optimizing-java-cms-garbage-collections-its-difficulties-and-using-jtune-solution
  • 72. LinkedIn continues to grow quickly and there’s still a ton of work we can do to improve. We’re working on problems that very few ever get to solve - come join us! WE’RE HIRING