6. Same reason most of you do. It’s new and cool and we wanted to check it out. We become cool by association. But mostly because we like learning new things.
7. That last slide was kind of a lie too. We started with Cassandra. Cassandra was written by Facebook, and Facebook is really cool. We wanted to be as cool as them.
8. Why Not Cassandra? Thrift. “Thrift is a software framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, and OCaml.” Eff that. We’re a startup.
9. So MongoDB it Was. Also, MongoDB Happened to be in NYC. We are in NYC. NYC is Cool. Proof that NYC is cool.
10. What You Should Know MongoDB is not relational. It’s also not schemaless, even though they love to say that (applications always have schemas/data models). Right tool for the right job: logging, queues, aggregate analytics. Don’t get confused with ORM. Return what you need. Don’t worry about document size limits.
11. Aggregate Analytics Lots of “stuff” happens at Buddy Media. Need to keep track of it all. Need it to be real time. Need to be able to group it by various levels and resolutions. Need to be able to create new metrics on the fly. Write heavy, read light.
14. The Event Listener Node.js is the perfect event listener. Evented IO like Twisted or EventMachine. 2 days of development (maybe ~100 lines of JS). 0 lost events. 0 downtime. Just don’t upgrade.
17. Creating a Metric A pageview happened and I want to update metrics for the client the page belongs to. metrics.update({'name': 'client.pageview', 'period': 'minute', 'start_date': '2010-05-12 12:50:00'}, {'$inc': {'aggregates.1034': 1}}, upsert=True)
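A rough sketch of how the selector and `$inc` document for a pageview might be built from a raw timestamp. The helper names `bucket_start` and `pageview_update` are made up for illustration, and the `hour` and `day` formats are an assumption about how the other resolutions are keyed; only the `minute` shape appears on the slide.

```python
from datetime import datetime

# Assumed bucket formats: each one truncates a timestamp to the start of its period.
_FORMATS = {
    "minute": "%Y-%m-%d %H:%M:00",
    "hour": "%Y-%m-%d %H:00:00",
    "day": "%Y-%m-%d 00:00:00",
}


def bucket_start(ts, period):
    """Hypothetical helper: return the start of the bucket that ts falls into."""
    return ts.strftime(_FORMATS[period])


def pageview_update(client_id, ts, period="minute"):
    """Hypothetical helper: build the (selector, update) pair for one pageview."""
    selector = {
        "name": "client.pageview",
        "period": period,
        "start_date": bucket_start(ts, period),
    }
    # One counter per client, stored under the shared "aggregates" sub-document.
    change = {"$inc": {"aggregates.%d" % client_id: 1}}
    return selector, change


selector, change = pageview_update(1034, datetime(2010, 5, 12, 12, 50, 42))
```

The pair could then be handed to an upsert call like the one on the slide. Calling `pageview_update` once per period would keep the minute, hour, and day documents in step, assuming that is how the resolutions are maintained.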
19. What about another client? If a second pageview comes in for a different client, we end up updating the exact same record. Thus our last metric becomes: { "_id" : ObjectId("4da45cf6306a22719829b71b"), "aggregates" : { "1034" : 1, "1213" : 1 }, "end_date" : "2010-05-12 12:54:59", "name" : "client.pageview", "period" : "minute", "start_date" : "2010-05-12 12:50:00", "total" : 11 }
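To see why both clients land in the same record, here is a small in-memory imitation of an upsert with `$inc`. It runs against a plain Python list, not MongoDB, and `upsert_inc` is a hypothetical stand-in that only covers exact-match selectors and dotted `$inc` paths.

```python
def upsert_inc(collection, selector, inc):
    """Imitate update(selector, {'$inc': inc}, upsert=True) on a list of dicts."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in selector.items()):
            break  # found the existing bucket document
    else:
        # No match: the upsert creates a new document from the selector fields.
        doc = dict(selector)
        collection.append(doc)

    for path, amount in inc.items():
        *parents, leaf = path.split(".")
        node = doc
        for key in parents:
            node = node.setdefault(key, {})  # create sub-documents as needed
        node[leaf] = node.get(leaf, 0) + amount  # a missing counter starts at 0
    return doc


metrics = []
selector = {
    "name": "client.pageview",
    "period": "minute",
    "start_date": "2010-05-12 12:50:00",
}
upsert_inc(metrics, selector, {"aggregates.1034": 1})  # first client creates the doc
upsert_inc(metrics, selector, {"aggregates.1213": 1})  # second client reuses it
```

The selector never mentions the client, so every client active in the same minute shares one document and only the key under `aggregates` differs.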
20. Some Queries 1. Get pageviews for all clients that occurred on May 12 between 12:50 and 12:51 (1 document, n entries): db.metrics.find({ name: "client.pageview", period: "minute", start_date: "2010-05-12 12:50:00" }); 2. Get pageviews for client 1034 that occurred on May 12 between 12:50 and 12:51 (1 document, 1 entry): db.metrics.find({ name: "client.pageview", period: "minute", start_date: "2010-05-12 12:50:00" }, { "aggregates.1034": 1 });
21. More Queries 1. Get pageviews for all clients that occurred on May 12 and graph by hour (24 documents, n entries): db.metrics.find({ name: "client.pageview", period: "hour", start_date: /2010-05-12/ }); 2. Get pageviews for client 1034 that occurred on May 12 and graph by minute (1440 documents, 1 entry): db.metrics.find({ name: "client.pageview", period: "minute", start_date: /2010-05-12/ }, { "aggregates.1034": 1 });
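The document counts on these query slides can be checked with a toy model. The sketch below imitates exact-match and regex filters plus a dotted-path projection over plain dicts. `find` and `project` are hypothetical stand-ins, the sample counters are made up, and unlike MongoDB the projection here does not return `_id`.

```python
import re


def project(doc, projection):
    """Keep only the dotted paths named in the projection."""
    out = {}
    for path in projection:
        keys = path.split(".")
        node, target = doc, out
        for key in keys[:-1]:
            node = node.get(key, {})
            target = target.setdefault(key, {})
        if keys[-1] in node:
            target[keys[-1]] = node[keys[-1]]
    return out


def find(collection, query, projection=None):
    """Match each field exactly, or by regex search when given a compiled pattern."""
    results = []
    for doc in collection:
        matched = True
        for key, want in query.items():
            have = doc.get(key)
            if hasattr(want, "search"):  # compiled regular expression
                matched = isinstance(have, str) and want.search(have) is not None
            else:
                matched = have == want
            if not matched:
                break
        if matched:
            results.append(project(doc, projection) if projection else doc)
    return results


# Made-up sample data: one hourly document per hour for May 12 and May 13.
metrics = [
    {
        "name": "client.pageview",
        "period": "hour",
        "start_date": "2010-05-%02d %02d:00:00" % (day, hour),
        "aggregates": {"1034": hour, "1213": 1},
    }
    for day in (12, 13)
    for hour in range(24)
]

query = {
    "name": "client.pageview",
    "period": "hour",
    "start_date": re.compile("2010-05-12"),
}
all_clients = find(metrics, query)                       # 24 documents, n entries each
one_client = find(metrics, query, {"aggregates.1034": 1})  # 24 documents, 1 entry each
```

Because `start_date` is stored as a string, a date prefix match is enough to pull a whole day of buckets, and the projection trims each document down to the one client being graphed.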