2. Giving a @twitter talk at Columbia
University talking about Twitter’s
Numbers!
22 Feb via Twitter for iPhone
from Mudd Building at Columbia University
500 West 120th Street
New York, New York
17. How big are they?
1 tweet text = 140 characters
≈ 200 bytes
18. 1200 tweets per second
≈ 230 KB/sec
≈ 14 MB/min
≈ 19 GB/day
Just tweet text!
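The figures on slides 17 and 18 can be reproduced with a few lines of arithmetic. This is a minimal sketch; it assumes binary units (1 KB = 1024 bytes), which is the reading under which the slide's rounded numbers come out.

```python
# Inputs taken from the slides.
TWEET_BYTES = 200       # slide 17: one tweet text is roughly 200 bytes
TWEETS_PER_SEC = 1200   # slide 18

# Assumption: the slide uses binary units (1 KB = 1024 bytes).
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3

bytes_per_sec = TWEET_BYTES * TWEETS_PER_SEC
bytes_per_min = bytes_per_sec * 60
bytes_per_day = bytes_per_min * 60 * 24

print(f"{bytes_per_sec / KB:.0f} KB/sec")   # slide rounds this to 230
print(f"{bytes_per_min / MB:.1f} MB/min")   # slide rounds this to 14
print(f"{bytes_per_day / GB:.1f} GB/day")   # slide rounds this to 19
```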
19. MySQL
Can’t generate IDs fast enough
Centralized and a single point of failure
snowflake
Highly available and uncoordinated (10kqps)
Compatible with the ecosystem
http://github.com/twitter/snowflake
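One way to generate IDs without a central database is to pack a timestamp, a worker number, and a per-millisecond counter into a single integer, so each worker can mint IDs on its own. The sketch below illustrates that idea only. The bit widths (41-bit timestamp, 10-bit worker id, 12-bit sequence), the epoch value, and the `IdGenerator` class are assumptions made for illustration, not details taken from the slide.

```python
import threading
import time

# Assumed layout (not stated on the slide):
# 41-bit millisecond timestamp | 10-bit worker id | 12-bit sequence.
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1
CUSTOM_EPOCH_MS = 1_288_834_974_657  # assumed custom epoch, for illustration


class IdGenerator:
    """Hypothetical uncoordinated ID generator: one instance per worker."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._clock = clock          # returns the current time in milliseconds
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to issue IDs that could collide with earlier ones.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - CUSTOM_EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = IdGenerator(worker_id=1)
ids = [gen.next_id() for _ in range(10_000)]
print(len(set(ids)) == len(ids))  # every ID is unique
print(ids == sorted(ids))         # IDs from one worker are time-ordered
```

Because the worker id is part of every ID, two workers never need to talk to each other to avoid collisions, which is what removes the single point of failure.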
20. 1 TB generated per day
10 TB generated per day
Photo used under Creative Commons from champura
21. 10 TB
per day in total
≈ 120 MB per sec
= 80 MB per sec
Photo used under Creative Commons from Mac Users Guide
22.
23. Where do they go?
Followed by
Following
Asymmetric Digraph
24. Digraph
Need to represent this
[Figure: digraph with nodes 1, 2, 3, 4]
Matrix
Naïve implementation is not scalable
[Figure: 4 × 4 matrix with rows and columns labeled 1 to 4]
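To see why the matrix does not scale, compare it with per-user adjacency sets on a tiny example. This is a sketch under stated assumptions: the edge list is hypothetical (the edges drawn on the slide are not recoverable), and the 200M user count is borrowed from slide 35.

```python
from collections import defaultdict

# Hypothetical (follower, followed) edges on the four nodes from the slide.
EDGES = [(1, 2), (1, 3), (2, 3), (4, 1)]
N = 4

# Naive representation: an N x N matrix, one cell per ordered pair of users.
matrix = [[0] * N for _ in range(N)]
for src, dst in EDGES:
    matrix[src - 1][dst - 1] = 1

# Sparse alternative: store only the edges that exist, indexed both ways
# because the graph is asymmetric (following is not the same as followed by).
following = defaultdict(set)
followed_by = defaultdict(set)
for src, dst in EDGES:
    following[src].add(dst)
    followed_by[dst].add(src)


def dense_matrix_bits(users):
    """Bits needed for a dense matrix at one bit per cell."""
    return users * users


print(matrix)
print(dict(following), dict(followed_by))
# 200M users at one bit per cell, expressed in petabytes:
print(dense_matrix_bits(200_000_000) / 8 / 1e15)
```

The matrix grows with the square of the user count regardless of how many edges exist, while the adjacency sets grow only with the number of follow relationships.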
26. flockdb
Distributed graph database
High rate of CRUD operations
Complex set arithmetic queries
http://github.com/twitter/flockdb
Photo used under Creative Commons from jurvetson
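"Set arithmetic queries" on a follow graph can be pictured as unions, intersections, and differences over follower and following sets. The sketch below is a single-process, in-memory illustration only; `EdgeStore` and its methods are hypothetical names and are not FlockDB's API.

```python
class EdgeStore:
    """Hypothetical in-memory store of directed follow edges."""

    def __init__(self):
        self._following = {}   # user -> set of users they follow
        self._followers = {}   # user -> set of users following them

    def add(self, src, dst):
        """Create the edge src -> dst (src follows dst)."""
        self._following.setdefault(src, set()).add(dst)
        self._followers.setdefault(dst, set()).add(src)

    def remove(self, src, dst):
        """Delete the edge src -> dst if it exists."""
        self._following.get(src, set()).discard(dst)
        self._followers.get(dst, set()).discard(src)

    def following(self, user):
        return frozenset(self._following.get(user, ()))

    def followers(self, user):
        return frozenset(self._followers.get(user, ()))


store = EdgeStore()
for src, dst in [("a", "b"), ("b", "a"), ("c", "a"), ("c", "b"), ("d", "a")]:
    store.add(src, dst)

# Intersection: users who follow both "a" and "b".
common_followers = store.followers("a") & store.followers("b")
# Intersection across directions: users "a" follows who also follow "a" back.
mutuals = store.following("a") & store.followers("a")
# Difference: followers of "a" who do not follow "b".
only_a = store.followers("a") - store.followers("b")

print(sorted(common_followers), sorted(mutuals), sorted(only_a))
```

Keeping both directions indexed makes each query a plain set operation, at the cost of writing every edge twice.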
27. @ladygaga
mother mons†er
8.3 million followers
@justinbieber
Justin Bieber
7.5 million followers
@BarackObama
44th President of the United States
6.7 million followers
@raffi
me!
0.007 million followers
28.
29. How do they get out?
10B API calls per day ≈ 100,000 calls per second
30. REST API
XML/JSON API over HTTP
Poll-based system / pseudo real-time
hosebird
Streaming API
Long poll HTTP
Near real-time delivery of Tweets
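The difference between "pseudo real-time" polling and "near real-time" streaming comes down to how long a Tweet waits before a client sees it. The toy model below is a sketch, not a client for either API: the 30-second poll interval, the 0.2-second push latency, and the Tweet timestamps are all made-up values.

```python
import math


def poll_delivery_time(created_at, poll_interval):
    """When a client polling at 0, P, 2P, ... first sees the Tweet."""
    return math.ceil(created_at / poll_interval) * poll_interval


def stream_delivery_time(created_at, push_latency):
    """When a client on an open streaming connection sees the Tweet."""
    return created_at + push_latency


# Hypothetical Tweet creation times (seconds) and delivery parameters.
tweets = [0.5, 12.0, 29.9, 31.0]
POLL_INTERVAL = 30.0
PUSH_LATENCY = 0.2

poll_delays = [poll_delivery_time(t, POLL_INTERVAL) - t for t in tweets]
stream_delays = [stream_delivery_time(t, PUSH_LATENCY) - t for t in tweets]

print("mean poll delay:  ", sum(poll_delays) / len(poll_delays))
print("mean stream delay:", sum(stream_delays) / len(stream_delays))
```

With polling, the delay depends on where a Tweet falls inside the poll interval, so it can be anything up to the full interval; with streaming it is bounded by the push latency.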
35. Where do we want to be?
Today - 200M people generate ~1200 TPS
Tomorrow - we want to support half the world and all its devices
(5B phones and 6B people)
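A rough sense of the gap between today and tomorrow comes from scaling the current rate by population. This is a back-of-the-envelope sketch that assumes the Tweets-per-person rate stays constant, which the slide does not claim.

```python
# Figures from the slide.
PEOPLE_TODAY = 200_000_000
TPS_TODAY = 1200
WORLD_POPULATION = 6_000_000_000

# "Half the world" as a target audience.
target_people = WORLD_POPULATION // 2

# Assumption: Tweets per second scale linearly with the number of people.
growth_factor = target_people // PEOPLE_TODAY
target_tps = TPS_TODAY * target_people // PEOPLE_TODAY

print(f"{growth_factor}x more people, about {target_tps} TPS")
```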
36. Real challenges in front of us
Real time
Indexing, search, and analytics
Relevance systems
Graph databases
Storage
Scalability and efficiency