Chapter 3
Implementation with NoSQL Databases
Document Databases (MongoDB)
Graph Databases (Neo4j)
Document Databases
 Documents are the main concept in document
databases.
 The database stores and retrieves documents,
which can be XML, JSON, BSON, and so on.
 These documents are self-describing, hierarchical
tree data structures which can consist of maps,
collections, and scalar values.
 The documents stored are similar to each other but
do not have to be exactly the same.
 Document databases store documents in the value
part of the key-value store; think of document
databases as key-value stores where the value is
examinable.
What Is a Document Database?
 Document databases are considered to be non-
relational (or NoSQL) databases.
 Instead of storing data in fixed rows and columns,
document databases use flexible documents.
 Document databases are the most popular
alternative to tabular, relational databases.
 They do not have a set number of fields, slots, etc.
and there are no empty spaces: missing information is
simply omitted rather than leaving an empty slot
for it. Data can be added, edited, removed, and
queried.
 The keys assigned to each document are unique
identifiers required to access data within the
database, usually a path, string or Uniform Resource
Identifier. IDs tend to be indexed in the database to
speed up data retrieval.
 The following list helps draw a parallel between the
two types of databases:
SQL                      MongoDB
Table                    Collection
Row                      Document
Column                   Field
Primary key              _id field (an ObjectId by default)
Index                    Index
View                     View
Nested table or object   Embedded document
Array                    Array
What are documents?
 A document is a record in a document database. A
document typically stores information about one
object and any of its related metadata.
 Documents store data in field-value pairs. The
values can be a variety of types and structures,
including strings, numbers, dates, arrays, or objects.
Documents can be stored in formats like
JSON, BSON, and XML.
 Example:
Collections
 A collection is a group of documents.
 Collections typically store documents that have
similar contents.
 Not all documents in a collection are required to
have the same fields, because document databases
have a flexible schema.
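As a minimal sketch of the flexible schema described above, the following Python snippet models a collection as a plain list of dictionaries. The `customers` collection and all of its fields are made up for illustration; this is not the API of any particular document database.

```python
import json

# A hypothetical "customers" collection: each document is a set of field-value pairs.
customers = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    # This document omits "email" and adds an embedded document and an array.
    {"_id": 2, "name": "Ravi", "address": {"city": "Pune"}, "tags": ["vip", "beta"]},
]


def field_names(collection):
    """Return the set of field names used by each document, keyed by _id."""
    return {doc["_id"]: set(doc) for doc in collection}


fields = field_names(customers)

# Documents serialize naturally to JSON, one of the storage formats named above.
as_json = json.dumps(customers[1], sort_keys=True)

print(fields)
print(as_json)
```

Both documents live in the same collection even though their fields differ; the missing `email` is simply absent rather than stored as an empty slot.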
CRUD operations
 Document databases typically have an API or query
language that allows developers to execute the CRUD
(create, read, update, and delete) operations.
 Create:
 Documents can be created in the database. Each document
has a unique identifier.
 Read:
 Documents can be read from the database. The API or query
language allows developers to query for documents using their
unique identifiers or field values. Indexes can be added to the
database in order to increase read performance.
 Update:
 Existing documents can be updated — either in whole or in
part.
 Delete:
 Documents can be deleted from the database.
Features
 Consistency
 Availability
 Transactions
 Document model
 Flexible schema
 Distributed and resilient
 Querying through an API or query language
 Consistency:
 Consistency in MongoDB database is configured by using
the replica sets and choosing to wait for the writes to
be replicated to all the slaves or a given number of
slaves.
 Every write can specify the number of servers the write
has to be propagated to before it returns as successful.
 Similar to various options available for read, you can
change the settings to achieve strong write consistency, if
desired.
 By default, a write is reported successful once the
database receives it; you can change this so as to wait for
the writes to be synced to disk or to propagate to two or
more slaves.
This is known as WriteConcern.
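The idea of waiting for a given number of servers before reporting success can be sketched with a toy model. The `ToyReplicaSet` class and its `w` parameter below are hypothetical names for illustration only; real replica sets replicate asynchronously and use timeouts, which this sketch leaves out.

```python
class ToyReplicaSet:
    """Toy model: one primary plus secondaries, each holding a list of writes."""

    def __init__(self, secondaries, reachable):
        # reachable[i] says whether secondary i can currently receive writes.
        self.primary = []
        self.secondaries = [[] for _ in range(secondaries)]
        self.reachable = list(reachable)

    def write(self, doc, w=1):
        """Apply doc on the primary, replicate it, and report success only
        if at least w nodes (the primary included) hold the write."""
        self.primary.append(doc)
        acks = 1  # the primary itself
        for log, up in zip(self.secondaries, self.reachable):
            if up:
                log.append(doc)
                acks += 1
        return acks >= w


# Two secondaries, one of which is unreachable.
rs = ToyReplicaSet(secondaries=2, reachable=[True, False])
ok_w1 = rs.write({"x": 1}, w=1)  # primary alone is enough
ok_w2 = rs.write({"x": 2}, w=2)  # primary plus the reachable secondary
ok_w3 = rs.write({"x": 3}, w=3)  # cannot reach three nodes
print(ok_w1, ok_w2, ok_w3)
```

A higher `w` trades write latency and availability for stronger durability guarantees, which is the tuning knob the bullets above describe.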
 Availability:
 The CAP theorem dictates that we can have only
two of Consistency, Availability, and Partition
Tolerance.
 Document databases try to improve on availability by
replicating data using the master-slave setup.
 The same data is available on multiple nodes and
the clients can get to the data even when the primary
node is down.
 Usually, the application code does not have to
determine if the primary node is available or not.
 MongoDB implements replication, providing high
availability using replica sets.
 Transactions:
 Transactions at the single-document level are known as
atomic transactions.
 Transactions involving more than one operation are
not possible in MongoDB releases before 4.0, which added
multi-document ACID transactions; other products such as
RavenDB also support transactions across multiple
operations.
 By default, all writes are reported as successful.
 A finer control over the write can be achieved by using
the WriteConcern parameter.
 Document model
 Data is stored in documents (unlike other databases that
store data in structures like tables or graphs).
 Documents map to objects in most popular programming
languages, which allows developers to rapidly develop
their applications.
 Flexible schema:
 Document databases have a flexible schema, meaning
that not all documents in a collection need to have the
same fields.
 Note that some document databases support schema
validation, so the schema can be optionally locked down.
 Distributed and resilient:
 Document databases are distributed, which allows for
horizontal scaling (typically cheaper than vertical scaling)
and data distribution.
 Document databases provide resiliency through
replication.
 Querying through an API or query language:
 Document databases have an API or query language that
allows developers to execute the CRUD operations on the
database.
 Developers have the ability to query for documents based
on unique identifiers or field values.
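The CRUD operations and the two query styles (by unique identifier or by field values) can be illustrated with a small in-memory sketch. `ToyCollection` and its method names are assumptions made for this example and do not correspond to a real driver API.

```python
import itertools


class ToyCollection:
    """In-memory stand-in for a document collection (illustrative only)."""

    def __init__(self):
        self._docs = {}                 # _id -> document
        self._ids = itertools.count(1)  # generates unique identifiers

    def insert(self, doc):
        """Create: store a copy of doc under a fresh unique _id."""
        _id = next(self._ids)
        self._docs[_id] = {**doc, "_id": _id}
        return _id

    def find_by_id(self, _id):
        """Read by unique identifier."""
        return self._docs.get(_id)

    def find(self, **criteria):
        """Read by field values: every given field must match."""
        return [d for d in self._docs.values()
                if all(d.get(k) == v for k, v in criteria.items())]

    def update(self, _id, changes):
        """Update in part: only the given fields change."""
        if _id not in self._docs:
            return False
        self._docs[_id].update(changes)
        return True

    def delete(self, _id):
        """Delete: remove the document if it exists."""
        return self._docs.pop(_id, None) is not None


posts = ToyCollection()
a = posts.insert({"title": "Hello", "author": "asha"})
b = posts.insert({"title": "Graphs", "author": "ravi", "tags": ["neo4j"]})
posts.update(a, {"title": "Hello again"})
posts.delete(b)
by_author = posts.find(author="asha")
print(by_author)
```

The `find` method scans every document; an index on a field would replace that scan with a direct lookup, which is why indexes improve read performance.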
Suitable Use Cases
 Event Logging
 Content Management Systems, Blogging
Platforms
 Web Analytics or Real-Time Analytics
 E-Commerce Applications
 Event Logging
 Applications have different event logging needs; within
the enterprise, there are many different applications that
want to log events.
 Document databases can store all these different types of
events and can act as a central data store for event
storage.
 This is especially true when the type of data being
captured by the events keeps changing.
 Events can be sharded by the name of the application
where the event originated or by the type of event such as
order_processed or customer_logged.
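Sharding events by application name can be sketched as hash-based routing. The function names, the `app` field, and the shard count of 4 are all assumptions for illustration; real databases offer their own shard-key mechanisms.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route(events, shard_count, key_field="app"):
    """Distribute event documents across shards by the chosen key field."""
    shards = [[] for _ in range(shard_count)]
    for event in events:
        shards[shard_for(event[key_field], shard_count)].append(event)
    return shards


# Events from different applications carry different fields.
events = [
    {"app": "billing", "type": "order_processed", "order_id": 17},
    {"app": "portal", "type": "customer_logged", "user": "asha"},
    {"app": "billing", "type": "order_processed", "order_id": 18, "retries": 2},
]
shards = route(events, shard_count=4)
billing_shard = shard_for("billing", 4)
print(billing_shard, [len(s) for s in shards])
```

Passing `key_field="type"` would shard by event type instead, the other option mentioned above.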
 Content Management Systems, Blogging
Platforms
 Since document databases have no predefined schemas
and usually understand JSON documents, they work well
in content management systems or applications for
publishing websites, managing user comments, user
registrations, profiles, web-facing documents.
 Web Analytics or Real-Time Analytics
 Document databases can store data for real-time
analytics; since parts of the document can be updated, it’s
very easy to store page views or unique visitors, and new
metrics can be easily added without schema changes
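Updating part of a document to track metrics might look like the sketch below. The `increment` helper is hypothetical, comparable in spirit to MongoDB's `$inc` update operator but written here as plain Python over a dictionary.

```python
def increment(doc, path, amount=1):
    """Increment a counter at a dotted path, creating missing levels."""
    *parents, leaf = path.split(".")
    node = doc
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = node.get(leaf, 0) + amount
    return doc


page = {"_id": "/home", "views": 10}
increment(page, "views")              # existing metric
increment(page, "unique_visitors")    # new metric, no schema change needed
increment(page, "by_country.IN", 3)   # nested metric inside an embedded document
print(page)
```

Only the named counter changes on each call; the rest of the document is left untouched.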
 E-Commerce Applications
 E-commerce applications often need to have flexible
schema for products and orders, as well as the ability to
evolve their data models without expensive database
refactoring or data migration
Examples of Document Data Models
 Amazon DocumentDB
 MongoDB
 Cosmos DB
 ArangoDB
 Couchbase Server
 CouchDB
Advantages:
 The document model is ubiquitous, intuitive, and
enables rapid software development.
 The flexible schema allows for the data model to
change as an application's requirements change.
 Document databases have rich APIs and query
languages that allow developers to easily interact
with their data.
 Document databases are distributed (allowing for
horizontal scaling as well as global data distribution)
and resilient.
Disadvantages:
 Weak Atomicity:
 It lacks in supporting multi-document ACID transactions. A
change in the document data model involving two
collections then requires two separate queries, i.e.
one for each collection, which breaks atomicity
requirements.
 Consistency Check Limitations:
 One can search the collections and documents that are
not connected to an author collection, but doing this might
degrade database
performance.
 Security:
 Nowadays many web applications lack security which in
turn results in the leakage of sensitive data. This is
a point of concern, so one must pay attention to
web app vulnerabilities.
Graph Databases
 A graph database is a type of database used to
represent the data in the form of a graph.
 A graph database is a type of NoSQL database that
is designed to handle data with complex
relationships and interconnections.
 In a graph database, data is stored as nodes and
edges, where nodes represent entities and edges
represent the relationships between those entities.
 The concept of a Graph Database is based on the
theory of graphs. It was introduced in the year 2000.
 They are commonly referred to NoSql databases as
data is stored using nodes, relationships and
properties instead of traditional databases.
 A graph database is very useful for heavily
interconnected data. Here relationships between
data are given priority and therefore the relationships
can be easily visualized. They are flexible as new
data can be added without hampering the old ones.
They are useful in the fields of social networking,
fraud detection, AI Knowledge graphs etc.
 It has three components:
 nodes, relationships, and properties.
 Nodes:
 represent the objects or instances.
 They are equivalent to a row in database.
 The node basically acts as a vertex in a graph.
 The nodes are grouped by applying a label to each
member.
 Relationships:
 They are basically the edges in the graph.
 They have a specific direction, type and form patterns of
the data.
 They basically establish relationship between nodes.
 Properties:
 They are the information associated with the nodes.
 Once we have a graph of these nodes and edges
created, we can query the graph in many ways.
 A query on the graph is also known as traversing the
graph.
 An advantage of the graph databases is that we can
change the traversing requirements without having to
change the nodes or edges.
 In graph databases, traversing the joins or relationships
is very fast.
 The relationship between nodes is not calculated at
query time but is actually persisted as a relationship.
 Traversing persisted relationships is faster than
calculating them for every query.
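A minimal sketch of nodes, relationships, and properties, with relationships persisted as adjacency lists so that a traversal only follows stored edges. `ToyGraph` and the sample data are invented for illustration and do not reflect how any particular graph database stores data internally.

```python
from collections import deque


class ToyGraph:
    """Toy property graph: labeled nodes and directed, typed relationships."""

    def __init__(self):
        self.nodes = {}  # node id -> {"labels": set, "props": dict}
        self.out = {}    # node id -> list of (rel_type, target id, props)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"labels": {label}, "props": props}
        self.out.setdefault(node_id, [])

    def relate(self, start, rel_type, end, **props):
        # Both ends must exist, so no dangling relationships are created.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both nodes must exist")
        self.out[start].append((rel_type, end, props))

    def traverse(self, start, rel_type, max_depth):
        """Breadth-first: ids reachable via rel_type within max_depth hops."""
        seen, found = {start}, []
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue
            for rtype, target, _props in self.out[node]:
                if rtype == rel_type and target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, depth + 1))
        return found


g = ToyGraph()
for name in ["ann", "bob", "cy", "dee"]:
    g.add_node(name, "Person", name=name)
g.relate("ann", "FRIEND", "bob", since=2019)
g.relate("bob", "FRIEND", "cy")
g.relate("cy", "FRIEND", "dee")
g.relate("ann", "WORKS_WITH", "dee")

friends_1 = g.traverse("ann", "FRIEND", 1)
friends_2 = g.traverse("ann", "FRIEND", 2)
colleagues = g.traverse("ann", "WORKS_WITH", 1)
print(friends_1, friends_2, colleagues)
```

The three calls change the traversal requirements (relationship type and depth) without touching the stored nodes or edges.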
Features
 Consistency
 Transactions
 Availability
 Query Features
 Consistency
 Since graph databases are operating on connected
nodes, most graph database solutions usually do not
support distributing the nodes on different servers.
 There are some solutions, however, that support node
distribution across a cluster of servers, such as
InfiniteGraph.
 Within a single server, data is always consistent,
especially in Neo4J which is fully ACID-compliant.
 When running Neo4J in a cluster, a write to the master is
eventually synchronized to the slaves, while slaves are
always available for read.
 Writes to slaves are allowed and are immediately
synchronized to the master; other slaves will not be
synchronized immediately, though—they will have to wait
for the data to propagate from the master.
 Graph databases ensure consistency through
transactions. They do not allow dangling relationships:
The start node and end node always have to exist, and
nodes can only be deleted if they don’t have any
relationships attached to them.
 Transactions
 Neo4J is ACID-compliant. Before changing any nodes or
adding any relationships to existing nodes, we have to
start a transaction.
 A transaction has to be marked as success, otherwise
Neo4J assumes that it was a failure and rolls it back
when finish is issued.
 Setting success without issuing finish also does not
commit the data to the database.
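The success/finish behaviour described above can be modeled with a toy transaction object. `ToyTransaction` is a hypothetical Python stand-in written for this example; it is not the database's own transaction API.

```python
class ToyTransaction:
    """Changes are staged and applied only if success() precedes finish()."""

    def __init__(self, store):
        self._store = store
        self._staged = {}
        self._success = False
        self.finished = False

    def set(self, key, value):
        """Stage a change; nothing reaches the store yet."""
        self._staged[key] = value

    def success(self):
        """Mark the transaction as successful (does not commit by itself)."""
        self._success = True

    def finish(self):
        """Commit if marked successful, otherwise discard staged changes."""
        if self._success:
            self._store.update(self._staged)
        self._staged = {}
        self.finished = True


store = {}

# Case 1: finish without success, so the change is rolled back.
tx = ToyTransaction(store)
tx.set("ann", {"name": "Ann"})
tx.finish()
after_rollback = dict(store)

# Case 2: success alone does not commit; finish does.
tx = ToyTransaction(store)
tx.set("bob", {"name": "Bob"})
tx.success()
before_finish = dict(store)
tx.finish()
after_commit = dict(store)

print(after_rollback, before_finish, after_commit)
```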
 Availability
 Neo4J, as of version 1.8, achieves high availability by
providing for replicated slaves.
 These slaves can also handle writes: When they are
written to, they synchronize the write to the current
master, and the write is committed first at the master and
then at the slave.
 Other slaves will eventually get the update.
 Neo4J uses the Apache ZooKeeper [ZooKeeper] to keep
track of the last transaction IDs persisted on each slave
node and the current master node.
 If the server is the first one to join the cluster, it becomes
the master; when a master goes down, the cluster elects
a master from the available nodes, thus providing high
availability.
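The join and election rules can be sketched as follows. The text does not say how the new master is chosen, so picking the server with the highest last transaction ID is an assumption of this sketch, as are the `ToyCluster` class and its method names.

```python
class ToyCluster:
    """Toy cluster membership with a single master (illustrative only)."""

    def __init__(self):
        self.last_tx = {}   # server name -> last persisted transaction id
        self.master = None

    def join(self, name, last_tx_id=0):
        self.last_tx[name] = last_tx_id
        if self.master is None:
            # The first server to join the cluster becomes the master.
            self.master = name

    def fail(self, name):
        del self.last_tx[name]
        if name == self.master:
            # Assumption: elect the available server with the most recent
            # transaction id; the real election rule may differ.
            self.master = (max(self.last_tx, key=self.last_tx.get)
                           if self.last_tx else None)


cluster = ToyCluster()
cluster.join("a", last_tx_id=10)
cluster.join("b", last_tx_id=8)
cluster.join("c", last_tx_id=10)
first_master = cluster.master
cluster.fail("a")
new_master = cluster.master
print(first_master, new_master)
```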
 Query Features
 Neo4J also has the Cypher [Cypher] query language
for querying the graph.
 Cypher needs a node to START the query. The start
node can be identified by its node ID, a list of node IDs,
or index lookups. Later Cypher versions dropped the START
clause, so queries there usually begin with MATCH.
 Cypher uses the MATCH keyword for matching
patterns in relationships; the WHERE keyword filters
the properties on a node or relationship. The RETURN
keyword specifies what gets returned by the query:
nodes, relationships, or fields on the nodes or
relationships.
 Outside these query languages, Neo4J allows you to
query the graph for properties of the nodes, traverse
the graph, or navigate the node relationships using
language bindings.
 Properties of a node can be indexed using the indexing
service.
 Similarly, properties of relationships or edges can be
indexed, so a node or edge can be found by the value.
Indexes should be queried to find the starting node to
begin a traversal.
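The roles of an index lookup, a match pattern, a filter, and a return clause can be imitated in plain Python. This is only an analogy to the query steps described above, with invented sample data; it is not how a Cypher query is executed.

```python
# Nodes with properties, and directed, typed relationships.
nodes = {
    1: {"name": "Ann", "age": 34},
    2: {"name": "Bob", "age": 27},
    3: {"name": "Cy", "age": 41},
}
relationships = [(1, "FRIEND", 2), (1, "FRIEND", 3), (2, "FRIEND", 3)]

# A property index on "name": value -> node ids, used to find the start node.
name_index = {}
for node_id, props in nodes.items():
    name_index.setdefault(props["name"], []).append(node_id)


def friends_older_than(start_name, min_age):
    """Start at the indexed node(s), match outgoing FRIEND relationships,
    filter on a property, and return one field of each matching node."""
    results = []
    for start in name_index.get(start_name, []):          # start node via index
        for src, rel_type, dst in relationships:
            if src == start and rel_type == "FRIEND":      # pattern match
                if nodes[dst]["age"] > min_age:            # property filter
                    results.append(nodes[dst]["name"])     # returned field
    return results


result = friends_older_than("Ann", 30)
print(result)
```

The index is consulted once to find where to begin; everything after that follows relationships from the start node.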
Advantages
 Establishing the relationships with external sources as
well.
 No joins are required since relationships are already
specified.
 Query performance depends on concrete relationships and not on
the total amount of data.
 It is flexible and agile.
 It is easy to manage the data in terms of a graph.
 Efficient data modeling:
 Graph databases allow for efficient data modeling by
representing data as nodes and edges. This allows for more
flexible and scalable data modeling than traditional relational
databases.
 Flexible relationships:
 Graph databases are designed to handle complex relationships
and interconnections between data elements. This makes them
well-suited for applications that require deep and complex queries,
such as social networks, recommendation engines, and fraud
detection systems.
 High performance:
 Graph databases are optimized for handling large and complex
datasets, making them well-suited for applications that require
high levels of performance and scalability.
 Scalability:
 Graph databases can be easily scaled horizontally, allowing
additional servers to be added to the cluster to handle increased
data volume or traffic.
 Easy to use:
 Graph databases are typically easier to use than traditional
relational databases. They often have a simpler data model and
query language, and can be easier to maintain and scale.
Disadvantages
 Often for complex relationships speed becomes
slower in searching.
 The query language is platform dependent.
 They are inappropriate for transactional data.
 They have a smaller user base.
 Limited use cases: Graph databases are not suitable
for all applications. They may not be the best choice
for applications that require simple queries or that
deal primarily with data that can be easily
represented in a traditional relational database.
 Specialized knowledge:
 Graph databases may require specialized knowledge and
expertise to use effectively, including knowledge of graph
theory and algorithms.
 Immature technology:
 The technology for graph databases is relatively new and
still evolving, which means that it may not be as stable or
well-supported as traditional relational databases.
 Integration with other tools:
 Graph databases may not be as well-integrated with other
tools and systems as traditional relational databases,
which can make it more difficult to use them in
conjunction with other technologies.
Use cases of graph databases
 Fraud detection
 Connected Data
 Recommendation engines
 Route optimization
 Pattern discovery
 Knowledge management

How to convert PDF to text with Nanonets
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 

3.Implementation with NOSQL databases Document Databases (Mongodb).pptx

  • 1. Chapter 3: Implementation with NoSQL Databases: Document Databases (MongoDB), Graph Databases (Neo4j)
  • 2. Document Databases
      - Documents are the main concept in document databases.
      - The database stores and retrieves documents, which can be XML, JSON, BSON, and so on.
      - These documents are self-describing, hierarchical tree data structures that can consist of maps, collections, and scalar values.
      - The documents stored are similar to each other but do not have to be exactly the same.
      - Document databases store documents in the value part of a key-value store; think of a document database as a key-value store where the value is examinable.
  • 3. What Is a Document Database?
      - Document databases are considered to be non-relational (or NoSQL) databases.
      - Instead of storing data in fixed rows and columns, document databases use flexible documents.
      - Document databases are among the most popular alternatives to tabular, relational databases.
      - Documents do not have a set number of fields or slots, and there are no empty spaces: missing information is simply omitted rather than leaving an empty slot for it.
      - Data can be added, edited, removed, and queried.
  • 4.
      - The keys assigned to each document are unique identifiers required to access data within the database, usually a path, string, or Uniform Resource Identifier. IDs tend to be indexed in the database to speed up data retrieval.
      - The following list draws a parallel between the two types of databases:

          SQL                      MongoDB
          Table                    Collection
          Row                      Document
          Column                   Field
          Primary key              ObjectId
          Index                    Index
          View                     View
          Nested table or object   Embedded document
          Array                    Array
  • 5. What are documents?
      - A document is a record in a document database. A document typically stores information about one object and any of its related metadata.
      - Documents store data in field-value pairs. The values can be a variety of types and structures, including strings, numbers, dates, arrays, or objects.
      - Documents can be stored in formats like JSON, BSON, and XML.
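The field-value structure described above can be illustrated with a small sketch. The customer document below is entirely made up (its field names and values are assumptions for illustration); it only shows how strings, numbers, dates, arrays, and embedded objects fit into one JSON document.

```python
import json
from datetime import date

# Hypothetical customer document: field-value pairs with nested values.
customer = {
    "_id": "cust-1001",                           # unique identifier (made-up value)
    "name": "Ada Lovelace",                       # string
    "loyalty_points": 120,                        # number
    "joined": date(2024, 4, 1).isoformat(),       # date, kept as an ISO string in JSON
    "tags": ["vip", "newsletter"],                # array
    "address": {"city": "London", "zip": "N1"},   # embedded object
}

# Serialize to JSON text and parse it back; the structure survives the round trip.
as_json = json.dumps(customer, indent=2)
restored = json.loads(as_json)
```

Plain JSON has no date type, which is one reason binary formats such as BSON add richer types; here the date is simply stored as text.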
  • 7. Collections
      - A collection is a group of documents.
      - Collections typically store documents that have similar contents.
      - Not all documents in a collection are required to have the same fields, because document databases have a flexible schema.
  • 8. CRUD operations
      - Document databases typically have an API or query language that allows developers to execute the CRUD (create, read, update, and delete) operations.
      - Create: Documents can be created in the database. Each document has a unique identifier.
      - Read: Documents can be read from the database. The API or query language allows developers to query for documents using their unique identifiers or field values. Indexes can be added to the database to increase read performance.
      - Update: Existing documents can be updated, either in whole or in part.
      - Delete: Documents can be deleted from the database.
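The four operations can be sketched with a toy in-memory collection. This is not the API of MongoDB or any other product: the `Collection` class and its method names are assumptions made for illustration, and it also shows that two documents in one collection need not share the same fields.

```python
import uuid


class Collection:
    """Toy in-memory collection; not a real database client."""

    def __init__(self):
        self._docs = {}  # maps unique identifier -> document (a dict)

    def create(self, doc):
        # Use the caller's identifier if given, otherwise generate one.
        doc_id = doc.get("_id") or uuid.uuid4().hex
        self._docs[doc_id] = {**doc, "_id": doc_id}
        return doc_id

    def read(self, **criteria):
        # Return copies of every document whose fields match all criteria.
        return [
            dict(d) for d in self._docs.values()
            if all(d.get(k) == v for k, v in criteria.items())
        ]

    def update(self, doc_id, changes):
        # Partial update: only the given fields change.
        self._docs[doc_id].update(changes)

    def delete(self, doc_id):
        self._docs.pop(doc_id, None)


posts = Collection()
first = posts.create({"title": "Hello", "author": "ann"})
second = posts.create({"title": "Graphs", "author": "bob", "tags": ["neo4j"]})  # extra field is fine
posts.update(first, {"title": "Hello, world"})
posts.delete(second)
```

A real document database would add indexes so that `read` does not have to scan every document.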
  • 9. Features
      - Consistency
      - Availability
      - Transactions
      - Document model
      - Flexible schema
      - Distributed and resilient
      - Querying through an API or query language
  • 10. Consistency
      - Consistency in MongoDB is configured through replica sets, by choosing whether to wait for a write to be replicated to all the slaves (secondaries, in current MongoDB terminology) or to a given number of them.
      - Every write can specify the number of servers the write has to be propagated to before it returns as successful.
      - As with the options available for reads, you can change the settings to achieve strong write consistency, if desired.
      - In the MongoDB versions described here, a write is by default reported successful once the database receives it (newer MongoDB releases default to majority acknowledgment). You can change this to wait for the write to be synced to disk or propagated to two or more slaves. This setting is known as the WriteConcern.
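The idea of a write that must reach a given number of servers before it counts as successful can be sketched as a small simulation. The `ReplicaSet` class, its `reachable` parameter, and the `w` argument are assumptions for illustration; no real replication or MongoDB driver call is involved.

```python
class ReplicaSet:
    """Toy model of one primary plus secondaries; replication is simulated."""

    def __init__(self, secondaries, reachable):
        self.primary = {}
        self.secondaries = [{} for _ in range(secondaries)]
        self.reachable = reachable  # how many secondaries can currently be reached

    def write(self, key, value, w=1):
        """Return True only if at least `w` servers (primary included) hold the write."""
        self.primary[key] = value
        acknowledged = 1  # the primary itself
        for node in self.secondaries[: self.reachable]:
            node[key] = value
            acknowledged += 1
        return acknowledged >= w


rs = ReplicaSet(secondaries=2, reachable=1)
ok_default = rs.write("order:1", "paid")      # w=1: the primary alone is enough
ok_two = rs.write("order:2", "paid", w=2)     # primary plus one secondary
ok_all = rs.write("order:3", "paid", w=3)     # needs both secondaries; one is unreachable
```

Note that this sketch does not undo a write that misses its requested level; it only reports that the level was not reached. Higher values of `w` trade write latency and availability for stronger guarantees.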
  • 11. Availability
      - The CAP theorem dictates that we can have only two of Consistency, Availability, and Partition Tolerance.
      - Document databases try to improve on availability by replicating data using the master-slave setup.
      - The same data is available on multiple nodes, and clients can get to the data even when the primary node is down.
      - Usually, the application code does not have to determine whether the primary node is available.
      - MongoDB implements replication, providing high availability using replica sets.
  • 12. Transactions
      - Transactions at the single-document level are known as atomic transactions.
      - In the MongoDB versions described here, transactions involving more than one operation are not possible (MongoDB has supported multi-document transactions since version 4.0). Products such as RavenDB also support transactions across multiple operations.
      - By default, all writes are reported as successful.
      - Finer control over when a write counts as successful can be achieved with the WriteConcern parameter.
  • 13. Document model
      - Data is stored in documents (unlike other databases that store data in structures like tables or graphs).
      - Documents map to objects in most popular programming languages, which allows developers to rapidly develop their applications.
      - Flexible schema: Document databases have a flexible schema, meaning that not all documents in a collection need to have the same fields.
      - Note that some document databases support schema validation, so the schema can optionally be locked down.
  • 14. Distributed and resilient
      - Document databases are distributed, which allows for horizontal scaling (typically cheaper than vertical scaling) and data distribution.
      - Document databases provide resiliency through replication.
      - Querying through an API or query language: Document databases have an API or query language that allows developers to execute the CRUD operations on the database.
      - Developers can query for documents based on unique identifiers or field values.
  • 15. Suitable Use Cases
      - Event Logging
      - Content Management Systems, Blogging Platforms
      - Web Analytics or Real-Time Analytics
      - E-Commerce Applications
  • 16. Event Logging
      - Applications have different event logging needs; within the enterprise, there are many different applications that want to log events.
      - Document databases can store all these different types of events and can act as a central data store for event storage.
      - This is especially true when the type of data being captured by the events keeps changing.
      - Events can be sharded by the name of the application where the event originated or by the type of event, such as order_processed or customer_logged.
  • 17. Content Management Systems, Blogging Platforms
      - Since document databases have no predefined schemas and usually understand JSON documents, they work well in content management systems or applications for publishing websites, managing user comments, user registrations, profiles, and web-facing documents.
      - Web Analytics or Real-Time Analytics: Document databases can store data for real-time analytics. Since parts of a document can be updated, it is very easy to store page views or unique visitors, and new metrics can be added easily without schema changes.
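The analytics case, updating part of a document and adding a new metric without a schema change, can be sketched as follows. The per-page document layout and the function names are assumptions for illustration, and plain dictionaries stand in for stored documents.

```python
from collections import defaultdict

# One metrics document per page, keyed by URL path (hypothetical layout).
metrics = defaultdict(dict)


def record_hit(path, visitor_id):
    doc = metrics[path]
    doc["page_views"] = doc.get("page_views", 0) + 1   # update one field in place
    doc.setdefault("visitors", [])
    if visitor_id not in doc["visitors"]:
        doc["visitors"].append(visitor_id)             # track unique visitors


def record_metric(path, name, amount=1):
    # A new metric is just a new field; existing documents need no migration.
    doc = metrics[path]
    doc[name] = doc.get(name, 0) + amount


record_hit("/home", "u1")
record_hit("/home", "u2")
record_hit("/home", "u1")
record_metric("/home", "shares")
```

In a real database, the increment would normally be done with an atomic partial-update operation rather than a read followed by a write.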
  • 18. E-Commerce Applications
      - E-commerce applications often need a flexible schema for products and orders, as well as the ability to evolve their data models without expensive database refactoring or data migration.
  • 19. Examples of Document Databases
      - Amazon DocumentDB
      - MongoDB
      - Cosmos DB
      - ArangoDB
      - Couchbase Server
      - CouchDB
  • 20. Advantages
      - The document model is ubiquitous, intuitive, and enables rapid software development.
      - The flexible schema allows the data model to change as an application's requirements change.
      - Document databases have rich APIs and query languages that allow developers to easily interact with their data.
      - Document databases are distributed (allowing for horizontal scaling as well as global data distribution) and resilient.
  • 21. Disadvantages
      - Weak atomicity: Document databases have traditionally lacked support for multi-document ACID transactions. A change involving two collections requires two separate queries, one for each collection, which breaks atomicity requirements.
      - Consistency check limitations: One can search for collections and documents that are not connected to a parent collection (for example, an author collection), but doing so can hurt database performance.
      - Security: Many web applications lack adequate security, which results in leaks of sensitive data, so web application vulnerabilities need attention.
  • 22. Graph Databases
      - A graph database is a type of database used to represent data in the form of a graph.
      - A graph database is a type of NoSQL database that is designed to handle data with complex relationships and interconnections.
      - In a graph database, data is stored as nodes and edges, where nodes represent entities and edges represent the relationships between those entities.
      - The concept of a graph database is based on graph theory. Graph databases were introduced around the year 2000.
  • 23.
      - They are commonly classed as NoSQL databases because data is stored using nodes, relationships, and properties instead of the tables of traditional databases.
      - A graph database is very useful for heavily interconnected data. Relationships between data are given priority, so they can be easily visualized.
      - They are flexible: new data can be added without disturbing the existing data.
      - They are useful in fields such as social networking, fraud detection, and AI knowledge graphs.
  • 24.
      - A graph database has three components: nodes, relationships, and properties.
      - Nodes: represent the objects or instances.
      - They are equivalent to a row in a relational database.
      - A node acts as a vertex in the graph.
      - Nodes are grouped by applying a label to each member.
  • 25.
      - Relationships: They are the edges in the graph.
      - They have a specific direction and type, and they form the patterns of the data.
      - They establish the relationships between nodes.
      - Properties: They are the information associated with nodes (relationships can carry properties as well).
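The three components can be sketched as plain data structures. The class and field names below are assumptions for illustration, not the storage model of Neo4j or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set = field(default_factory=set)         # labels group nodes
    properties: dict = field(default_factory=dict)   # information attached to the node


@dataclass
class Relationship:
    start: Node            # relationships are directed: start -> end
    end: Node
    rel_type: str          # for example "FRIEND" or "WORKS_AT"
    properties: dict = field(default_factory=dict)


anna = Node(1, {"Person"}, {"name": "Anna"})
barbara = Node(2, {"Person"}, {"name": "Barbara"})
acme = Node(3, {"Company"}, {"name": "Acme"})

relationships = [
    Relationship(anna, barbara, "FRIEND", {"since": 2019}),
    Relationship(anna, acme, "WORKS_AT"),
]

# Labels make it easy to pick out one group of nodes.
people = [n.properties["name"] for n in (anna, barbara, acme) if "Person" in n.labels]
```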
  • 28.
      - Once we have a graph of these nodes and edges created, we can query the graph in many ways.
      - A query on the graph is also known as traversing the graph.
      - An advantage of graph databases is that we can change the traversing requirements without having to change the nodes or edges.
      - In graph databases, traversing the joins or relationships is very fast.
      - The relationship between nodes is not calculated at query time but is actually persisted as a relationship.
      - Traversing persisted relationships is faster than calculating them for every query.
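Traversal over persisted relationships can be sketched with an adjacency list and a breadth-first walk. The sample names and the `traverse` function are assumptions for illustration; the point is that the edges are stored once and simply followed, and that a different traversal needs only different arguments, not different data.

```python
from collections import deque

# Persisted relationships: each node keeps its outgoing edges as (type, target).
edges = {
    "Anna": [("FRIEND", "Barbara"), ("FRIEND", "Carol")],
    "Barbara": [("FRIEND", "Dawn")],
    "Carol": [("FRIEND", "Dawn"), ("LIKES", "Cycling")],
    "Dawn": [],
    "Cycling": [],
}


def traverse(start, rel_type, max_depth):
    """Breadth-first walk along `rel_type` edges, up to `max_depth` hops."""
    seen, found = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for kind, target in edges[node]:
            if kind == rel_type and target not in seen:
                seen.add(target)
                found.append(target)
                queue.append((target, depth + 1))
    return found


friends = traverse("Anna", "FRIEND", max_depth=1)
friends_of_friends = traverse("Anna", "FRIEND", max_depth=2)
liked_by_carol = traverse("Carol", "LIKES", max_depth=1)
```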
  • 29. Features
      - Consistency
      - Transactions
      - Availability
      - Query Features
  • 30. Consistency
      - Since graph databases operate on connected nodes, most graph database solutions do not support distributing the nodes across different servers.
      - There are some solutions, however, that support node distribution across a cluster of servers, such as InfiniteGraph.
      - Within a single server, data is always consistent, especially in Neo4j, which is fully ACID-compliant.
      - When running Neo4j in a cluster, a write to the master is eventually synchronized to the slaves, while slaves are always available for reads.
  • 31.
      - Writes to slaves are allowed and are immediately synchronized to the master; other slaves are not synchronized immediately, though: they have to wait for the data to propagate from the master.
      - Graph databases ensure consistency through transactions. They do not allow dangling relationships: the start node and end node always have to exist, and nodes can only be deleted if they don't have any relationships attached to them.
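The rule against dangling relationships can be sketched with a toy graph that enforces it. The `Graph` class and its method names are assumptions for illustration and do not reproduce any product's API.

```python
class Graph:
    """Toy graph that refuses to create or leave dangling relationships."""

    def __init__(self):
        self.nodes = set()
        self.relationships = set()  # (start, type, end) triples

    def add_node(self, name):
        self.nodes.add(name)

    def relate(self, start, rel_type, end):
        # Both ends of a relationship must already exist.
        if start not in self.nodes or end not in self.nodes:
            raise ValueError("both start and end node must exist")
        self.relationships.add((start, rel_type, end))

    def delete_node(self, name):
        # A node can be deleted only when nothing is attached to it.
        if any(name in (s, e) for s, _, e in self.relationships):
            raise ValueError(f"{name} still has relationships attached")
        self.nodes.discard(name)


g = Graph()
g.add_node("Anna")
g.add_node("Barbara")
g.relate("Anna", "FRIEND", "Barbara")

try:
    g.delete_node("Anna")
    deleted_early = True
except ValueError:
    deleted_early = False

g.relationships.discard(("Anna", "FRIEND", "Barbara"))  # remove the relationship first
g.delete_node("Anna")
```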
  • 32. Transactions
      - Neo4j is ACID-compliant. Before changing any nodes or adding any relationships to existing nodes, we have to start a transaction.
      - A transaction has to be marked as success; otherwise Neo4j assumes that it was a failure and rolls it back when finish is issued.
      - Setting success without issuing finish also does not commit the data to the database.
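The success and finish semantics described above can be sketched with a toy transaction that buffers its changes. This is not the Neo4j API: the `Transaction` class, its `set` method, and the dictionary used as a store are assumptions for illustration.

```python
class Transaction:
    """Toy transaction with success()/finish() semantics; not a real driver."""

    def __init__(self, store):
        self._store = store
        self._pending = {}
        self._successful = False

    def set(self, key, value):
        self._pending[key] = value      # buffered, not yet visible in the store

    def success(self):
        self._successful = True         # marks the work as good; commits nothing yet

    def finish(self):
        if self._successful:
            self._store.update(self._pending)   # commit
        self._pending = {}                      # otherwise the work is discarded


store = {}

tx = Transaction(store)
tx.set("Anna", "Person")
tx.finish()                       # no success() call, so the work is rolled back
after_rollback = dict(store)

tx = Transaction(store)
tx.set("Anna", "Person")
tx.success()                      # success() alone commits nothing
before_finish = dict(store)
tx.finish()                       # success() followed by finish() commits
after_commit = dict(store)
```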
  • 33. Availability
      - Neo4j, as of version 1.8, achieves high availability by providing for replicated slaves.
      - These slaves can also handle writes: when they are written to, they synchronize the write to the current master, and the write is committed first at the master and then at the slave.
      - Other slaves will eventually get the update.
      - Neo4j uses Apache ZooKeeper [ZooKeeper] to keep track of the last transaction IDs persisted on each slave node and the current master node.
      - If a server is the first one to join the cluster, it becomes the master; when a master goes down, the cluster elects a master from the available nodes, thus providing high availability.
  • 34. Query Features
      - Neo4j also has the Cypher [Cypher] query language for querying the graph.
      - In the Cypher version described here, a query needs a node to START from. The start node can be identified by its node ID, a list of node IDs, or index lookups.
      - Cypher uses the MATCH keyword for matching patterns in relationships; the WHERE keyword filters the properties on a node or relationship.
  • 35.
      - The RETURN keyword specifies what gets returned by the query: nodes, relationships, or fields on the nodes or relationships.
      - Outside these query languages, Neo4j allows you to query the graph for properties of the nodes, traverse the graph, or navigate the nodes' relationships using language bindings.
      - Properties of a node can be indexed using the indexing service.
      - Similarly, properties of relationships or edges can be indexed, so a node or edge can be found by its value. Indexes should be queried to find the starting node to begin a traversal.
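The roles of the four keywords can be sketched in plain Python. This is not Cypher and not a Neo4j binding; the sample data, the `name_index` dictionary, and the `friends_older_than` function are assumptions for illustration, with each step labeled by the keyword it imitates.

```python
# Toy data: nodes with properties, plus directed, typed relationships.
nodes = {
    1: {"name": "Barbara", "age": 41},
    2: {"name": "Carol", "age": 29},
    3: {"name": "Dawn", "age": 35},
}
relationships = [(1, "FRIEND", 2), (1, "FRIEND", 3), (2, "FRIEND", 3)]

# Index on the "name" property, used only to find the starting node.
name_index = {props["name"]: node_id for node_id, props in nodes.items()}


def friends_older_than(start_name, min_age):
    start = name_index[start_name]                            # START: index lookup
    matched = [end for s, kind, end in relationships          # MATCH: outgoing FRIEND edges
               if s == start and kind == "FRIEND"]
    kept = [n for n in matched if nodes[n]["age"] > min_age]  # WHERE: filter on a property
    return [nodes[n]["name"] for n in kept]                   # RETURN: one field per node


result = friends_older_than("Barbara", 30)
```

The index is consulted once to find where to begin; everything after that follows stored relationships rather than scanning all nodes.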
  • 36. Advantages
      - Relationships can also be established with external sources.
      - No joins are required, since relationships are already specified.
      - Query performance depends on the concrete relationships traversed, not on the total amount of data.
      - It is flexible and agile.
      - It is easy to manage the data in terms of a graph.
      - Efficient data modeling: Graph databases allow for efficient data modeling by representing data as nodes and edges. This allows for more flexible and scalable data modeling than traditional relational databases.
  • 37.
      - Flexible relationships: Graph databases are designed to handle complex relationships and interconnections between data elements. This makes them well-suited for applications that require deep and complex queries, such as social networks, recommendation engines, and fraud detection systems.
      - High performance: Graph databases are optimized for handling large and complex datasets, making them well-suited for applications that require high levels of performance and scalability.
      - Scalability: Graph databases can be easily scaled horizontally, allowing additional servers to be added to the cluster to handle increased data volume or traffic.
      - Easy to use: Graph databases are typically easier to use than traditional relational databases. They often have a simpler data model and query language, and can be easier to maintain and scale.
  • 38. Disadvantages
      - Searches can become slower when relationships are complex.
      - The query language is platform dependent.
      - They are inappropriate for transactional data.
      - They have a smaller user base.
      - Limited use cases: Graph databases are not suitable for all applications. They may not be the best choice for applications that require simple queries or that deal primarily with data that can be easily represented in a traditional relational database.
  • 39.
      - Specialized knowledge: Graph databases may require specialized knowledge and expertise to use effectively, including knowledge of graph theory and algorithms.
      - Immature technology: The technology for graph databases is relatively new and still evolving, which means that it may not be as stable or well-supported as traditional relational databases.
      - Integration with other tools: Graph databases may not be as well-integrated with other tools and systems as traditional relational databases, which can make it more difficult to use them in conjunction with other technologies.
  • 40. Use cases of graph databases
      - Fraud detection
      - Connected data
      - Recommendation engines
      - Route optimization
      - Pattern discovery
      - Knowledge management