SlideShare une entreprise Scribd logo
1  sur  58
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
Neo4j       and a
                 NOSQL overview



                                  #neo4j
Emil Eifrem                       @emileifrem
CEO, Neo Technology               emil@neotechnology.com
What's the plan?

  Why now? – Four trends

  NOSQL overview

  Graph databases && Neo4j

  Conclusions
Trend 1:
data set size




40
2007            Source: IDC 2007
988

Trend 1:
data set size




40
2007            2010   Source: IDC 2007
Trend 2: connectedness
                                                                                           Giant
                                                                                          Global
                                                                                          Graph
                                                                                          (GGG)
 Information connectivity




                                                                             Ontologies


                                                                      RDF

                                                                                  Folksonomies
                                                                  Tagging


                                                      Wikis             User-
                                                                      generated
                                                                       content
                                                              Blogs


                                                     RSS


                                         Hypertext


                               Text
                            documents     web 1.0             web 2.0             “web 3.0”
                                        1990         2000                   2010                   2020
Trend 3: semi-structure
  Individualization of content!
     In the salary lists of the 1970s, all elements had exactly
     one job
     In the salary lists of the 2000s, we need 5 job columns!
     Or 8? Or 15?

  Trend accelerated by the decentralization of content
  generation that is the hallmark of the age of participation
  (“web 2.0”)
Aside: RDBMS performance
                                                            Relational database
               Salary List
 Performance




                             Majority of
                             Webapps



                                           Social network

                                                                 Semantic Trading




                                                  }           custom


                                                Data complexity
Trend 4: architecture


         1990s: Database as integration hub


          Application   Application   Application




                           DB
Trend 4: architecture


        2000s:   (Slowly towards...)   Decoupled services
                    with their own backend

            Service             Service        Service




             DB                   DB            DB
Why NOSQL 2009?

 Trend 1: Size.

 Trend 2: Connectivity.

 Trend 3: Semi-structure.

 Trend 4: Architecture.
NOSQL
overview
First off: the name

  NoSQL is NOT “Never SQL”

  NoSQL is NOT “No To SQL”
NOSQL
      is simply


N ot O nly S Q L !
Four (emerging) NOSQL categories
 Key-value stores
   Based on Amazon's Dynamo paper
   Data model: (global) collection of K-V pairs
   Example: Dynomite, Voldemort, Tokyo*

 ColumnFamily / BigTable clones
   Based on Google's BigTable paper
   Data model: big table, column families
   Example: HBase, Hypertable, Cassandra
Four (emerging) NOSQL categories
 Document databases
   Inspired by Lotus Notes
   Data model: collections of K-V collections
   Example: CouchDB, MongoDB, Riak

 Graph databases
   Inspired by Euler & graph theory
   Data model: nodes, rels, K-V on both
   Example: AllegroGraph, Sones, Neo4j
NOSQL data models
            Key-value stores
Data size




                          Bigtable clones


                                            Document
                                            databases


                                                           Graph databases




                                                        Data complexity
NOSQL data models
               Key-value stores
   Data size




                             Bigtable clones


                                               Document
                                               databases


                                                              Graph databases

                                                                          (This is still billio ns of
 90%                                                                      nodes & relationships)
  of
 use
cases




                                                           Data complexity
Graph DBs
& Neo4j intro
The Graph DB model: representation
 Core abstractions:
                                              name = “Emil”

   Nodes
                                              age = 29
                                              sex = “yes”


   Relationships between nodes
   Properties on both                     1                         2



                         type = KNOWS
                         time = 4 years                       3

                                                                  type = car
                                                                  vendor = “SAAB”
                                                                  model = “95 Aero”
Example: The Matrix
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory


// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory
Transaction tx = graphDb.beginTx();

// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly

tx.commit();
Code (1b): Def ning RelationshipTypes
             i
// In package org.neo4j.graphdb
public interface RelationshipType
{
   String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
   private final String name;
   MyDynamicRelType( String name ){ this.name = name; }
   public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
   KNOWS,
   WORKS_FOR,
}
Whiteboard friendly



                                 owns
                      Björn                  Big Car
                        build             drives


                                DayCare
The Graph DB model: traversal
 Traverser framework for
 high-performance traversing                    name = “Emil”
                                                age = 31

 across the node space                          sex = “yes”




                                            1                         2



                           type = KNOWS
                           time = 4 years                       3

                                                                    type = car
                                                                    vendor = “SAAB”
                                                                    model = “95 Aero”
Example: Mr Anderson’s friends
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   ReturnableEvaluator.ALL_BUT_START_NODE,
   RelTypes.KNOWS,
   Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       friendsTraverser.currentPosition().getDepth(),
       friend.getProperty( "name" ) );
}
name = “The Architect”
                              name = “Morpheus”
                              rank = “Captain”
 name = “Thomas Anderson”
                              occupation = “Total badass”                                            42
 age = 29
                                                  disclosure = public


                  KNOWS                            KNOWS                                              CODED_BY
                                                                                 KNO
      1                                  7                              3           W     S

                  KN                                                                              13

                                              S
                     OW                                   name = “Cypher”

                                         KNOW
                          S                               last name = “Reagan”
                                                                                               name = “Agent Smith”
                                                                        disclosure = secret    version = 1.0b
          age = 3 days                                                  age = 6 months         language = C++
                                     2

                              name = “Trinity”
                                                                    $ bin/start-neo-example
                                                                    Mr Anderson's friends:

                                                                    At      depth     1   =>   Morpheus
friendsTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,                                    At      depth     1   =>   Trinity
  StopEvaluator.END_OF_GRAPH,                                       At      depth     2   =>   Cypher
  ReturnableEvaluator.ALL_BUT_START_NODE,
  RelTypes.KNOWS,
                                                                    At      depth     3   =>   Agent Smith
  Direction.OUTGOING );                                             $
Example: Friends in love?
                                                                            name = “The Architect”
                               name = “Morpheus”
                               rank = “Captain”
name = “Thomas Anderson”
                               occupation = “Total badass”                                           42
age = 29
                                                  disclosure = public


                   KNOWS                           KNOWS                                              CODED_BY
                                                                                 KNO
     1                                    7                             3           W   S

                   KN                                                                            13
                                             S
                                        KNOW


                      OW                                  name = “Cypher”
                           S                              last name = “Reagan”
                                                                                              name = “Agent Smith”
         LO                                                             disclosure = secret   version = 1.0b
            VE                                                          age = 6 months        language = C++
               S
                                      2

                               name = “Trinity”
Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   new ReturnableEvaluator()
   {
       public boolean isReturnableNode( TraversalPosition pos )
       {
          return pos.currentNode().hasRelationship(
              RelTypes.LOVES, Direction.OUTGOING );
       }
   },
   RelTypes.KNOWS,
   Direction.OUTGOING );
Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       loveTraverser.currentPosition().getDepth(),
       person.getProperty( "name" ) );
}
name = “The Architect”
                                name = “Morpheus”
                                rank = “Captain”
 name = “Thomas Anderson”
                                occupation = “Total badass”                                            42
 age = 29
                                                    disclosure = public


                    KNOWS                            KNOWS                         KNO                  CODED_BY
      1                                    7                              3           W   S

                    KN                                                                             13

                                                S
                                                            name = “Cypher”

                                           KNOW
                       OW
                            S                               last name = “Reagan”
                                                                                                name = “Agent Smith”
          LO                                                              disclosure = secret   version = 1.0b
             VE                                                           age = 6 months        language = C++
                S
                                       2

                                name = “Trinity”
                                                                     $ bin/start-neo-example
new ReturnableEvaluator()
                                                                     Who’s a lover?
{
  public boolean isReturnableNode(
    TraversalPosition pos)
                                                                     At depth 1 => Trinity
  {                                                                  $
    return pos.currentNode().
      hasRelationship( RelTypes.LOVES,
         Direction.OUTGOING );
  }
},
Bonus code: domain model
    How do you implement your domain model?
    Use the delegator pattern, i.e. every domain entity wraps a
    Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
   private final Node underlyingNode;
   PersonImpl( Node node ){ this.underlyingNode = node; }

    public String getName()
    {
       return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
       this.underlyingNode.setProperty( "name", name );
    }
}
Domain layer frameworks
 Qi4j (www.qi4j.org)
   Framework for doing DDD in pure Java5
   Def nes Entities / Associations / Properties
      i
      Sound familiar? Nodes / Rel’s / Properties!
   Neo4j is an “EntityStore” backend

 Jo4neo (http://code.google.com/p/jo4neo)
   Annotation driven
   Weaves Neo4j-backed persistence into domain objects
   at runtime
Neo4j system characteristics
  Disk-based
     Native graph storage engine with custom binary on-disk
     format
  Transactional
    JTA/JTS, XA, 2PC, Tx recovery, deadlock detection,
    MVCC, etc
  Scales up
    Many billions of nodes/rels/props on single JVM
  Robust
    6+ years in 24/7 production
Social network pathE xists()
          12
                               ~1k persons
7         1
                           3   Avg 50 friends per
                               person
                               pathExists(a, b) limit
                               depth 4
    36
         41
                 5
                      77
                               Two backends
                               Eliminate disk IO so
                               warm up caches
Social network pathE xists()

                 2
                Emil
         1                                    5
                                      7
        Mike                                Kevin
                           3        John
                         Marcus
                   9                4
                 Bruce            Leigh

                                  # persons query time
Relational database                   1 000 2 000 ms
Graph database (Neo4j)                1 000      2 ms
Graph database (Neo4j)            1 000 000      2 ms
Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)

-   Lacks in tool and framework support
-   Few other implementations => potential lock in
-   No support for ad-hoc queries
Query languages
 SPARQL – “SQL for linked data”
    Ex: ”SELECT ?person WHERE {
          ?person neo4j:KNOWS ?friend .
          ?friend neo4j:KNOWS ?foe .
          ?foe neo4j:name “Larry Ellison” .
     }”


 Gremlin – “perl for graphs”
    Ex: ”./outE[@label='KNOWS']/inV[@age   > 30]/@name”
    Chec it out at http://try.neo4j.org
The Neo4j ecosystem
 Neo4j is an embedded database
   Tiny teeny lil jar f le
                      i
 Component ecosystem
   index
   meta-model
   graph-matching
   remote-graphdb
   sparql-engine
   ...
 See http://components.neo4j.org
Language bindings
 Neo4j.py – bindings for Jython and CPython
   http://components.neo4j.org/neo4j.py
 Neo4jrb – bindings for JRuby (incl RESTful API)
   http://wiki.neo4j.org/content/Ruby
 Neo4jrb-simple
   http://github.com/mdeiters/neo4jr-simple
 Clojure
   http://wiki.neo4j.org/content/Clojure
 Scala (incl RESTful API)
   http://wiki.neo4j.org/content/Scala
Grails Neoclipse screendump
Scale out – replication
  Neo4j HA in internal beta testing with cust's now
  Master-slave replication, 1st configuration
     MySQL style... ish
     Except all instances can write, synchronously between
     writing slave & master (strong consistency)
     Updates are asynchronously propagated to the other
     slaves (eventual consistency)
  This can handle billions of entities...
  … but not 100B
Scale out – partitioning
  “Sharding possible today”
     … but you have to do manual work
     … just as with MySQL
  Transparent partitioning? Neo4j 2.0
    100B? Easy to say. Sliiiiightly harder to do.
    Fundamentals: BASE & eventual consistency
    Generic clustering algorithm as base case, but give lots of
    knobs for developers
While we're on future stuff
  Neo4j 1.1 [July 2010]
     Full JMX / SNMP support
     Event framework (feedback plz!)
     Traverser framework iteration
     …
  Neo4j HA FCS this summer, GA this fall
  REST-ful API (already out in 0.8, please give feedback for 1.0!)
  Framework integration: Roo, Grails
  Database integration: CouchDB, MySQL
  Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports)
  Client tools: Webling, Neoclipse, Monitoring console
How ego are you? (aka other impls?)
  Franz’ A lleg roG ra ph      (http://agraph.franz.com)

     Proprietary, Lisp, RDF-oriented but real graphdb
  Sones g ra phD B      (http://sones.com)

     Proprietary, .NET, in beta
  Twitter's Floc k D B    (http://github.com/twitter/flockdb)

     Twitter's graph database for large and shallow graphs
  Google P reg el   (http://bit.ly/dP9IP)

    We are oh-so-secret
  Some academic papers from ~10 years ago
     G = {V, E} #FAIL
Conclusion
 Graphs && Neo4j => teh awesome!
 Available NOW under AGPLv3 / commercial license
   AGPLv3: “if you’re open source, we’re open source”
   If you have proprietary software? Must buy a commercial
   license
   But the first one is free!
 Download
   http://neo4j.org
 Feedback
   http://lists.neo4j.org
Questions?




             Image credit: lost again! Sorry :(
http://neotechnology.com

Contenu connexe

En vedette

Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsStéphane Fréchette
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL ServerStéphane Fréchette
 
Intro to Spring Data Neo4j
Intro to Spring Data Neo4jIntro to Spring Data Neo4j
Intro to Spring Data Neo4jjexp
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 

En vedette (7)

Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server Professionals
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL Server
 
Intro to Spring Data Neo4j
Intro to Spring Data Neo4jIntro to Spring Data Neo4j
Intro to Spring Data Neo4j
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 

Similaire à NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)

A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)Emil Eifrem
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011jexp
 
Django and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assDjango and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assTobias Lindaaker
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jTobias Lindaaker
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaConnected Data World
 
using Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundryusing Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud FoundryJoshua Long
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databasessjwoodman
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql MovementAjit Koti
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jDebanjan Mahata
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetupJoshua Bae
 
20130204 graph to-pacer-xml
20130204 graph to-pacer-xml20130204 graph to-pacer-xml
20130204 graph to-pacer-xmlDavid Colebatch
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewDavid Chou
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseDipti Borkar
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 

Similaire à NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010) (20)

A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011
 
Django and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assDjango and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks ass
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
No Sql
No SqlNo Sql
No Sql
 
Anti-social Databases
Anti-social DatabasesAnti-social Databases
Anti-social Databases
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora Lassila
 
Grails goes Graph
Grails goes GraphGrails goes Graph
Grails goes Graph
 
using Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundryusing Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundry
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databases
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
STI Summit 2011 - Digital Worlds
STI Summit 2011 - Digital WorldsSTI Summit 2011 - Digital Worlds
STI Summit 2011 - Digital Worlds
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql Movement
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetup
 
20130204 graph to-pacer-xml
20130204 graph to-pacer-xml20130204 graph to-pacer-xml
20130204 graph to-pacer-xml
 
Graph Theory and Databases
Graph Theory and DatabasesGraph Theory and Databases
Graph Theory and Databases
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and Couchbase
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 

Plus de Emil Eifrem

Startups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionStartups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionEmil Eifrem
 
GraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteGraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteEmil Eifrem
 
An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)Emil Eifrem
 
Startups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyStartups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyEmil Eifrem
 
An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)Emil Eifrem
 
An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)Emil Eifrem
 
NOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteNOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteEmil Eifrem
 

Plus de Emil Eifrem (7)

Startups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionStartups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 edition
 
GraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteGraphConnect SF 2013 Keynote
GraphConnect SF 2013 Keynote
 
An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)
 
Startups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyStartups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon Valley
 
An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)
 
An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)
 
NOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteNOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynote
 

Dernier

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 

Dernier (20)

Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 

NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)

  • 1. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 2. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 3. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 4. Neo4j and a NOSQL overview #neo4j Emil Eifrem @emileifrem CEO, Neo Technology emil@neotechnology.com
  • 5. What's the plan? Why now? – Four trends NOSQL overview Graph databases && Neo4j Conclusions
  • 6. Trend 1: data set size 40 2007 Source: IDC 2007
  • 7. 988 Trend 1: data set size 40 2007 2010 Source: IDC 2007
  • 8. Trend 2: connectedness Giant Global Graph (GGG) Information connectivity Ontologies RDF Folksonomies Tagging Wikis User- generated content Blogs RSS Hypertext Text documents web 1.0 web 2.0 “web 3.0” 1990 2000 2010 2020
  • 9. Trend 3: semi-structure Individualization of content! In the salary lists of the 1970s, all elements had exactly one job In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15? Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
  • 10. Aside: RDBMS performance Relational database Salary List Performance Majority of Webapps Social network Semantic Trading } custom Data complexity
  • 11. Trend 4: architecture 1990s: Database as integration hub Application Application Application DB
  • 12. Trend 4: architecture 2000s: (Slowly towards...) Decoupled services with their own backend Service Service Service DB DB DB
  • 13. Why NOSQL 2009? Trend 1: Size. Trend 2: Connectivity. Trend 3: Semi-structure. Trend 4: Architecture.
  • 15. First off: the name NoSQL is NOT “Never SQL” NoSQL is NOT “No To SQL”
  • 16. NOSQL is simply N ot O nly S Q L !
  • 17. Four (emerging) NOSQL categories Key-value stores Based on Amazon's Dynamo paper Data model: (global) collection of K-V pairs Example: Dynomite, Voldemort, Tokyo* ColumnFamily / BigTable clones Based on Google's BigTable paper Data model: big table, column families Example: HBase, Hypertable, Cassandra
  • 18. Four (emerging) NOSQL categories Document databases Inspired by Lotus Notes Data model: collections of K-V collections Example: CouchDB, MongoDB, Riak Graph databases Inspired by Euler & graph theory Data model: nodes, rels, K-V on both Example: AllegroGraph, Sones, Neo4j
  • 19. NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases Data complexity
  • 20. NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases (This is still billio ns of 90% nodes & relationships) of use cases Data complexity
  • 22. The Graph DB model: representation Core abstractions: name = “Emil” Nodes age = 29 sex = “yes” Relationships between nodes Properties on both 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 23. Example: The Matrix name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 24. Code (1): Building a node space GraphDatabaseService graphDb = ... // Get factory // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly
  • 25. Code (1): Building a node space GraphDatabaseService graphDb = ... // Get factory Transaction tx = graphDb.beginTx(); // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly tx.commit();
  • 26. Code (1b): Def ning RelationshipTypes i // In package org.neo4j.graphdb public interface RelationshipType { String name(); } // In package org.yourdomain.yourapp // Example on how to roll dynamic RelationshipTypes class MyDynamicRelType implements RelationshipType { private final String name; MyDynamicRelType( String name ){ this.name = name; } public String name() { return this.name; } } // Example on how to kick it, static-RelationshipType-like enum MyStaticRelTypes implements RelationshipType { KNOWS, WORKS_FOR, }
  • 27. Whiteboard friendly owns Björn Big Car build drives DayCare
  • 28.
  • 29. The Graph DB model: traversal Traverser framework for high-performance traversing name = “Emil” age = 31 across the node space sex = “yes” 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 30. Example: Mr Anderson’s friends name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 31. Code (2): Traversing a node space // Instantiate a traverser that returns Mr Anderson's friends Traverser friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, Direction.OUTGOING ); // Traverse the node space and print out the result System.out.println( "Mr Anderson's friends:" ); for ( Node friend : friendsTraverser ) { System.out.printf( "At depth %d => %s%n", friendsTraverser.currentPosition().getDepth(), friend.getProperty( "name" ) ); }
  • 32. name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity” $ bin/start-neo-example Mr Anderson's friends: At depth 1 => Morpheus friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, At depth 1 => Trinity StopEvaluator.END_OF_GRAPH, At depth 2 => Cypher ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, At depth 3 => Agent Smith Direction.OUTGOING ); $
  • 33. Example: Friends in love? name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S KNOW OW name = “Cypher” S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity”
  • 34. Code (3a): Custom traverser // Create a traverser that returns all “friends in love” Traverser loveTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, new ReturnableEvaluator() { public boolean isReturnableNode( TraversalPosition pos ) { return pos.currentNode().hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } }, RelTypes.KNOWS, Direction.OUTGOING );
  • 35. Code (3a): Custom traverser // Traverse the node space and print out the result System.out.println( "Who’s a lover?" ); for ( Node person : loveTraverser ) { System.out.printf( "At depth %d => %s%n", loveTraverser.currentPosition().getDepth(), person.getProperty( "name" ) ); }
  • 36. name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS KNO CODED_BY 1 7 3 W S KN 13 S name = “Cypher” KNOW OW S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity” $ bin/start-neo-example new ReturnableEvaluator() Who’s a lover? { public boolean isReturnableNode( TraversalPosition pos) At depth 1 => Trinity { $ return pos.currentNode(). hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } },
  • 37. Bonus code: domain model How do you implement your domain model? Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive: // In package org.yourdomain.yourapp class PersonImpl implements Person { private final Node underlyingNode; PersonImpl( Node node ){ this.underlyingNode = node; } public String getName() { return (String) this.underlyingNode.getProperty( "name" ); } public void setName( String name ) { this.underlyingNode.setProperty( "name", name ); } }
  • 38. Domain layer frameworks Qi4j (www.qi4j.org) Framework for doing DDD in pure Java5 Def nes Entities / Associations / Properties i Sound familiar? Nodes / Rel’s / Properties! Neo4j is an “EntityStore” backend Jo4neo (http://code.google.com/p/jo4neo) Annotation driven Weaves Neo4j-backed persistence into domain objects at runtime
  • 39. Neo4j system characteristics Disk-based Native graph storage engine with custom binary on-disk format Transactional JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc Scales up Many billions of nodes/rels/props on single JVM Robust 6+ years in 24/7 production
  • 40. Social network pathE xists() 12 ~1k persons 7 1 3 Avg 50 friends per person pathExists(a, b) limit depth 4 36 41 5 77 Two backends Eliminate disk IO so warm up caches
  • 41. Social network pathE xists() 2 Emil 1 5 7 Mike Kevin 3 John Marcus 9 4 Bruce Leigh # persons query time Relational database 1 000 2 000 ms Graph database (Neo4j) 1 000 2 ms Graph database (Neo4j) 1 000 000 2 ms
  • 42.
  • 43.
  • 44. Pros & Cons compared to RDBMS + No O/R impedance mismatch (whiteboard friendly) + Can easily evolve schemas + Can represent semi-structured info + Can represent graphs/networks (with performance) - Lacks in tool and framework support - Few other implementations => potential lock in - No support for ad-hoc queries
  • 45. Query languages SPARQL – “SQL for linked data” Ex: ”SELECT ?person WHERE { ?person neo4j:KNOWS ?friend . ?friend neo4j:KNOWS ?foe . ?foe neo4j:name “Larry Ellison” . }” Gremlin – “perl for graphs” Ex: ”./outE[@label='KNOWS']/inV[@age > 30]/@name” Chec it out at http://try.neo4j.org
  • 46. The Neo4j ecosystem Neo4j is an embedded database Tiny teeny lil jar f le i Component ecosystem index meta-model graph-matching remote-graphdb sparql-engine ... See http://components.neo4j.org
  • 47. Language bindings Neo4j.py – bindings for Jython and CPython http://components.neo4j.org/neo4j.py Neo4jrb – bindings for JRuby (incl RESTful API) http://wiki.neo4j.org/content/Ruby Neo4jrb-simple http://github.com/mdeiters/neo4jr-simple Clojure http://wiki.neo4j.org/content/Clojure Scala (incl RESTful API) http://wiki.neo4j.org/content/Scala
  • 48.
  • 49.
  • 51. Scale out – replication Neo4j HA in internal beta testing with cust's now Master-slave replication, 1st configuration MySQL style... ish Except all instances can write, synchronously between writing slave & master (strong consistency) Updates are asynchronously propagated to the other slaves (eventual consistency) This can handle billions of entities... … but not 100B
  • 52. Scale out – partitioning “Sharding possible today” … but you have to do manual work … just as with MySQL Transparent partitioning? Neo4j 2.0 100B? Easy to say. Sliiiiightly harder to do. Fundamentals: BASE & eventual consistency Generic clustering algorithm as base case, but give lots of knobs for developers
  • 53. While we're on future stuff Neo4j 1.1 [July 2010] Full JMX / SNMP support Event framework (feedback plz!) Traverser framework iteration … Neo4j HA FCS this summer, GA this fall REST-ful API (already out in 0.8, please give feedback for 1.0!) Framework integration: Roo, Grails Database integration: CouchDB, MySQL Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports) Client tools: Webling, Neoclipse, Monitoring console
  • 54. How ego are you? (aka other impls?) Franz’ A lleg roG ra ph (http://agraph.franz.com) Proprietary, Lisp, RDF-oriented but real graphdb Sones g ra phD B (http://sones.com) Proprietary, .NET, in beta Twitter's Floc k D B (http://github.com/twitter/flockdb) Twitter's graph database for large and shallow graphs Google P reg el (http://bit.ly/dP9IP) We are oh-so-secret Some academic papers from ~10 years ago G = {V, E} #FAIL
  • 55. Conclusion Graphs && Neo4j => teh awesome! Available NOW under AGPLv3 / commercial license AGPLv3: “if you’re open source, we’re open source” If you have proprietary software? Must buy a commercial license But the first one is free! Download http://neo4j.org Feedback http://lists.neo4j.org
  • 56.
  • 57. Questions? Image credit: lost again! Sorry :(