Understanding how memory is managed with MongoDB is instrumental in maximizing database performance and hardware utilization. This talk covers the workings of low-level operating system components like the page cache and memory-mapped files. We will examine the differences between RAM, SSDs and hard disk drives to help you choose the right hardware configuration. Finally, we will learn how to monitor and analyze memory and disk usage using the MongoDB Management Service (MMS), Linux administration commands and MongoDB commands.
2. Hello everyone, my name is Alon Horev. I'm based in Israel and I work at Intucell, which was acquired by Cisco. I'm a Python developer and I lead Intucell's data team. About two years ago we migrated our product off MySQL and started working with MongoDB. I want to start by introducing our use case for MongoDB: we've built a system that optimizes cellular networks automatically. Optimizing cellular networks is about making your data connection faster and improving the quality of your calls.
3. The way we do this is pretty simple: we collect a lot of statistics about what goes on in the network, like how many calls are taking place or how many users are connected to an antenna. We then analyze this information to identify things like which antennas are overloaded. Once we know what the problems in the network are, we act: we change parameters in the network. For example, we would force your phone to use a different antenna so you get better service. As you can see, this process is cyclic: we collect more statistics to make further changes and to make sure we improved the network. This happens all the time, even here right now, with AT&T. In the process of working with MongoDB we learned a lot about database and server performance. I personally spent a lot of time monitoring and optimizing storage and memory usage, which brings me to this lecture.
4. Today I'm going to try to give you an understanding of how MongoDB manages memory. So first, what is 'memory management' when it comes to MongoDB? Memory is a fast but limited and expensive resource; memory management is about deciding what data to keep in memory.
5. Why should you care about memory management? It has a huge impact on performance and costs. This matters to both developers and DBAs: as a developer you can optimize the schema and queries for better memory usage, and as a DBA you can monitor and predict performance issues related to memory usage. I'm pretty sure every MongoDB administrator has asked himself at least once: how much memory do I really need? Before we dive in I want to tell you a little secret: MongoDB doesn't actually manage memory. It leaves that responsibility to the operating system.
6. Within the operating system there's a stack of components that MongoDB depends on to manage memory, and each component relies on the component below it. This talk is structured around this stack. We'll start with the low-level components, the storage devices: disks and RAM. We'll continue with the page cache and memory-mapped files, which are part of the operating system's kernel. And we'll finish with MongoDB's usage of these mechanisms. Let's talk about storage.
7. There are different types of storage devices with different characteristics; we'll review hard disk drives (HDDs), solid state drives (SSDs) and RAM. Let's start by breaking these into categories: HDDs and SSDs are persistent and RAM isn't, but RAM is really fast. That's why every computer has both types of storage, one persistent (an HDD or an SSD) and one volatile (RAM).
8. Now let's compare throughput. As I said before, RAM is fast: it can go as fast as 6400 MB/s for reads and writes. SSDs are about 10 times slower than RAM; modern SSDs can reach a read rate of 650 MB/s and a little less for writes. HDDs are much slower, ranging from 1 MB/s to 160 MB/s for reads and writes. The reason there's such variance in HDD speed is that throughput is highly affected by access patterns. Specifically with HDDs, random access is much slower than sequential access, because an HDD contains a mechanical arm that needs to move on almost every random access. Sadly for us, databases do a lot of random I/O, which means that if you're running a query on data that's not in memory and therefore has to be read from disk, you're seeing a penalty of about two orders of magnitude on response times. The next characteristic is price. To make the comparison easier we'll compare the price per GB. It's not surprising that there's a correlation between price and throughput: the more you pay for each GB, the better the throughput you get. So hard drives are really cheap at 5 cents per GB, SSDs are 10 times more expensive and RAM is 100 times more expensive.
9. Is this information sufficient to choose the optimal hardware configuration? I think not; your application's requirements are also part of the equation. For example, if your application is an archive that saves huge amounts of data that is rarely accessed, you can go for a large HDD and save a lot of money. Later on we'll see how you can take measurements of things like RAM and capacity, and then you'll be able to determine what kind of hardware configuration you need.
10. Now let's zoom out of storage and move up to the next layer, which is the page cache.
11. The page cache is part of the operating system's kernel, and whenever a program does file I/O, like reads and writes, it always goes through the page cache. The page cache makes reads faster by keeping popular chunks of data in memory, and makes writes faster by letting the application write to memory and not to disk. So we can say the page cache was invented to combine the disk's persistence with the memory's speed. It's about having the best of both worlds.
12. So, it's called the page cache, but what is a page? A page is a 4K chunk of data, and each file is broken into pages. The number of pages belonging to a file is simply the file's size divided by 4K, rounded up. Looking at the example, you can see a file spanning 3 pages because it's 10 kilobytes in size; the grey area is an unused part of the last page, as the file's size isn't a multiple of 4 kilobytes. The page cache's job is to determine which pages to keep in memory.
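To make the arithmetic concrete, here's a minimal Python sketch of the page count calculation (the 4096-byte constant matches the common Linux page size):

    import os

    PAGE_SIZE = 4096  # 4K

    def pages_in_file(path):
        """A file's page count: its size divided by 4K, rounded up."""
        size = os.path.getsize(path)
        return (size + PAGE_SIZE - 1) // PAGE_SIZE

    # A 10 KB file spans 3 pages; the tail of the third page is unused.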
13. Let's dive a little deeper and see what happens behind the scenes when we read from a file. We have a process running in user space and it's reading 100 bytes from a file. Through a system call we get to the kernel, where the page cache handles the read request. First, the page cache translates the position and the count of bytes to read into a list of pages. If we read 100 bytes from the beginning of the file, the result of this step would be the first page. The next thing the page cache does is check whether the page exists in the cache; if it doesn't, the data has to be read from disk and then stored in the cache. Once the page is in the cache we reach the last step, which is to copy the data to the user-space application. So that's how a read works.
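The translation from a read request to a page list is plain integer arithmetic; here's a rough Python sketch of what the kernel computes (not the actual kernel code, of course):

    PAGE_SIZE = 4096

    def pages_for_read(position, count):
        """Map a read request (byte offset + count) to the page numbers it touches."""
        first = position // PAGE_SIZE
        last = (position + count - 1) // PAGE_SIZE
        return list(range(first, last + 1))

    pages_for_read(0, 100)     # [0]    -- 100 bytes from the start hit only page 0
    pages_for_read(4000, 200)  # [0, 1] -- this request straddles a page boundary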
14. The page cache also handles writes. This time our process is calling the write system call. The page cache copies the data from the process into the relevant pages and marks them as dirty. That's all it does: it changes data in memory. It gives the impression the data has been written, when in fact it has been written only to memory and not to disk. If an application then reads from the file, it will get the latest data from memory, because dirty pages must stay in the cache. Having dirty pages is somewhat dangerous for two reasons: first, they will be lost if the operating system crashes; second, if there's a lack of memory, they can't be freed. The solution to both problems is to flush the dirty pages to disk. There's a thread in the kernel that flushes pages after they've stayed in the cache for some time or when memory needs to be freed. If a process wants to make sure the data is flushed to disk, it can call the fsync system call, which triggers a flush for a specific file or even the entire file system. MongoDB calls fsync every 30 seconds to make sure the data is backed by disk.
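You can see both behaviors from Python; in this small sketch the write only dirties a page in the cache, and os.fsync is what forces it to disk (the file name is just an example):

    import os

    with open("data.log", "ab") as f:
        f.write(b"some record\n")  # lands in the page cache as a dirty page
        f.flush()                  # Python's buffer -> page cache, still not on disk
        os.fsync(f.fileno())       # tell the kernel: flush this file's dirty pages to disk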
15. I mentioned that the page cache frees pages when memory is running low; this procedure is called page reclamation. There are different page reclamation policies. A page reclamation policy is an algorithm that answers a simple question: "what's the next page that can be freed?" In Linux, the simple answer is: "the one that was least recently used". It turns out page reclamation happens all the time, even on healthy systems; it doesn't mean you're out of memory. That's because the page cache is greedy and will try to use all the free memory on your machine to cache the file system. To understand how much memory is used by the page cache, you can use the free command.
16. free is a Linux program that displays memory usage statistics. Let's try to interpret its output. When running free with -g it prints units in GB. The first line reveals the total amount of memory, which is 64GB; out of that, 61GB is used and 3GB is free. Then, out of the 61GB that is used, 55GB is cached data. These are pages in the page cache. The second line counts the cached data as free, so we suddenly have only 5GB of used memory. This is memory directly allocated by programs. The reason cached memory can be considered free is that even though the memory is used, it will be freed if programs need it. As soon as programs allocate memory and the free memory runs out, the page cache shrinks and frees pages.
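free gets these numbers from /proc/meminfo; here's a small Python sketch that reads the same fields directly (values in that file are in kB):

    def meminfo_gb(field):
        """Return one /proc/meminfo field (e.g. 'MemTotal', 'Cached') in GB."""
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1]) / 1024.0 ** 2  # kB -> GB

    print(meminfo_gb("MemTotal"), meminfo_gb("MemFree"), meminfo_gb("Cached"))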
18. Memory mapping of files is an alternative mechanism for reading from and writing to files. Instead of calling the read() and write() system calls, a process can map part of a file into memory, and every access the process makes to that memory translates into a file read or write. On the left you can see a process with a memory region that is mapped to a segment of a file: memory addresses 100 to 200 are mapped to a file segment that starts at 400 and ends at 500. A write to memory address 100 is translated into a write to the file at offset 400. Mapping a file into memory doesn't necessarily load its data into memory: if a process reads from a page that is not in memory, the infamous page fault is triggered. The code in the kernel that handles page faults tells the page cache to load the required pieces of data from disk and then serves the read. So memory mapping has several advantages over regular file I/O. First, it's fast: there's no system call involved and no copying of memory; reads and writes access memory that is allocated in the page cache. Second, it takes the responsibility for memory management away from the user. As we've seen earlier, the page cache determines what's actually stored in memory.
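Here's a minimal Python sketch of the mechanism, assuming a file called data.bin that is at least one page long:

    import mmap, os

    fd = os.open("data.bin", os.O_RDWR)
    m = mmap.mmap(fd, 4096)  # map the first page of the file

    data = m[:5]      # a read; may trigger a page fault on first access
    m[:5] = b"hello"  # a write; dirties the page in the page cache
    m.flush()         # ask the kernel to write the dirty page back

    m.close()
    os.close(fd)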
19. In this example two processes map the same region of a file into memory. Only one copy of this data will occupy memory, or even less if it's not accessed. Historically this mechanism was invented to reduce the memory usage of processes: whenever you execute a program, the program's code and its shared libraries are mapped into memory. So if you open 10 instances of Chrome, its code still appears only once in memory.
20. Now let's see how MongoDB uses this stack of components.
21. MongoDB maps all its data into memory. This includes the documents, the indexes and the journal. When running top you can actually see how much memory is mapped and how much is used. The left column, called VIRT, stands for virtual memory: once a process maps files into memory, they're accounted under virtual memory. When journaling is enabled, MongoDB actually maps the data files twice, so this figure is twice the amount on disk, which here is about 273GB. RES stands for resident memory and is the amount of the virtual memory that's actually located in RAM. SHR stands for shared resident memory: out of the 24GB of resident memory, 23GB is data from memory-mapped files, which is sharable.
22. It turns out this very cool strategy for managing memory also has problems. The biggest problem is that MongoDB has no control over what is kept in memory. You can't tell MongoDB: promise me this document or collection stays in memory, thereby ensuring fast access. Why is this a problem? I'll give you some examples:
1. The first example is warm-up: after restarting your server, none of the data is in memory, so every page that is accessed for the first time triggers a page fault and the query takes longer.
2. The second example is what I call expensive queries: queries that aren't indexed well, or that request data that is hardly ever accessed. When these things happen, documents are loaded into memory at the cost of evicting other, more important documents. Why does this happen? As we've seen before, the page cache frees the least recently used pages first.
There are things you can do to mitigate this problem.
23. What we did is protect MongoDB with an API. The API enforces index usage, so MongoDB reads fewer documents into memory. Another thing the API does is pass a query timeout to make sure costly queries are cancelled. The API doesn't have to be complicated; it can be a simple module sitting on top of the MongoDB driver. Let's look at an example: this is a Python function called find_samples, and it's used whenever we want to run a find query on the collection named samples. The function accepts two parameters that define a date range: start_time and end_time. By forcing the user to pass a date range we make sure the query is indexed. You could add further validations to make sure the range isn't too big or doesn't go too far back in history.
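The slide's code isn't reproduced here, so what follows is a hedged reconstruction with pymongo; the database name, the time field name and the 30-second timeout are my assumptions, and max_time_ms is the driver's way of passing the server-side timeout mentioned above:

    from pymongo import MongoClient

    samples = MongoClient().cellular.samples  # hypothetical database/collection

    def find_samples(start_time, end_time, **criteria):
        """Every samples query goes through here: the indexed date range is
        mandatory, and a server-side timeout cancels costly queries."""
        if end_time <= start_time:
            raise ValueError("empty date range")
        query = dict(criteria, time={"$gte": start_time, "$lt": end_time})
        return samples.find(query).max_time_ms(30 * 1000)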
24. Another challenge worth mentioning is the lack of prioritization between processes. When processes allocate a lot of memory, the page cache shrinks automatically, and since MongoDB relies on the page cache, you could say MongoDB's memory shrinks automatically. In other words, MongoDB has a lower priority over memory than other processes. Since MongoDB just becomes slower when it doesn't have enough memory, you need to be careful with other processes running on the same server. You can mitigate this phenomenon by isolating MongoDB: don't run it on the same server as memory- or disk-intensive applications. The last challenge I'd like to tackle is estimating how much memory is required, also known as the size of the working set.
25. So what is the working set? It is the data that your application reads regularly and that should be returned in a timely manner; therefore it should fit in memory. The working set contains more than documents: it also includes indexes and some padding. To emphasize the padding issue, let's look at an example memory page. As I mentioned before, a page's size is 4K. This page includes 3 documents, and between the documents there's some padding. This padding allows for expansion of existing documents or insertion of new ones. Out of the three documents, only document number 2 is accessed regularly. So even though only a small part of this page is actually used, the whole page is kept in memory; the page cache can't keep half pages in memory. This brings us to the conclusion that it's really hard to measure the size of the working set by simply looking at the count or size of the documents being queried. Still, there are several tools to help you estimate how much memory a collection should require.
26. The tools fall into two categories: planning and monitoring.
27. Planning is about predicting how much memory each collection is going to need. Let's take a real-world example. In one of our collections we keep a month of history. Out of that month, we know our application often queries the last two weeks and sometimes the week before that. The last two weeks are considered "hot data" because they have to be in memory; the week before that is considered warm: it doesn't have to be in memory, but we should still take it into account so it won't push out the hot data. If we add some spare room to compensate for padding and such, it's safe to assume 3 out of the 4 weeks should fit in memory. You can use the collection stats command to get important metrics like the size of the indexes and the size of the data, and roughly calculate how much memory the collection is going to require. Once you have a running database you can use several monitoring tools to analyze the working set.
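For example, here's a sketch of that rough calculation with pymongo; the database and collection names are placeholders, and the 3/4 factor comes from the hot-plus-warm estimate above:

    from pymongo import MongoClient

    db = MongoClient().cellular  # hypothetical database name
    stats = db.command("collstats", "samples")

    data_gb = stats["size"] / 1024.0 ** 3             # size of the documents
    index_gb = stats["totalIndexSize"] / 1024.0 ** 3  # size of all the indexes
    print("should fit in memory: ~%.1f GB" % ((data_gb + index_gb) * 3 / 4))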
28. When I think about monitoring tools, they generally fall into two categories:
1. One is online monitoring, which is basically seeing what's going on at the moment. This category includes running Linux commands like top and iostat, or mongo commands like currentOp, mongostat and mongomem.
2. The second category is offline monitoring, which is more about collecting and aggregating historical data. One example is the profiling collection, which collects slow queries over time. Another example is MMS, or other graphing tools like Graphite, that collect different metrics over time. These are used for identifying trends and correlations and for predicting growth.
Let's start with the online tools.
29. mongomem is a great tool for memory-use analysis. It's written in Python by the people at a company called Wish, so you'll have to install it manually; it doesn't come packaged with MongoDB. mongomem won't tell you how much memory you need, but it will tell you how much memory each collection is using at the moment. Here's an example output: each line shows how many megabytes of the collection are in memory. The top collection in this example is the oplog, with more than 11GB of data in memory out of almost 50GB of data, so about 22% of the collection is in memory. The last line shows the total amount of memory used by MongoDB out of the total data size; in this example we have 16GB of data in memory out of 280GB of total data. Since I've got 16GB of memory on this machine, we can see all the memory is being used. But what does this say about the working set? Is it larger than memory? In other words, do we have enough memory? Well, we can't say, because it's possible there's data in memory that is hardly ever accessed; the page cache just didn't have to reclaim those pages.
30. What you can do to test how much RAM MongoDB actually uses is the following procedure:
1. The first thing you have to do is stop the database.
2. Then you need to clear the page cache (a sketch follows after this list); this invokes code in the kernel that drops all pages from memory.
3. The next step is to start the database.
4. After that, you need to invoke the queries that cover your working set: queries that access all the data you expect to have in memory.
5. At this point, running mongomem will give you a more accurate picture of how much memory is required.
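A sketch of step 2 in Python (it must run as root; the same thing is usually done with echo 3 > /proc/sys/vm/drop_caches):

    import os

    # drop_caches only evicts clean pages, so flush dirty pages to disk first.
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = drop the page cache plus dentries and inodes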
31. Before looking at additional tools I want to answer a simple question: how do we know when something is wrong? What do we need to monitor? And since we're talking about memory: how do we know we don't have enough of it? The phenomenon of not having enough memory is called thrashing. When the OS is thrashing, it's because an application is constantly accessing pages that are not in memory, and the OS is busy handling the page faults and reading the pages from disk. So the first thing to monitor is page faults, and since it's hard to tell how many page faults are too many, you should also look at disk utilization: if the disk is utilized 100% of the time, you're in trouble. There are a lot of other things that go wrong, like a lot of queries being queued and high locking ratios, but these are just symptoms.
32. I usually use iostat to look at disk utilization. Here's an example output of the command. The rightmost column shows the disk utilization and reveals a disk that is busy 100% of the time. The second column shows that the disk serves 570 reads per second, and the third column shows the number of writes per second, which is zero. If this is happening constantly, the working set does not fit in memory. Along with iostat, I frequently use mongostat.
33. mongostat comes packaged with MongoDB and uses the underlying serverStatus command. It displays a bunch of interesting metrics, like the number of page faults and queued reads. It's pretty hard to say how many page faults are too many, but more than one or two hundred page faults per second is an indication of a lot of data being read from disk. If this happens over long periods of time, it could be an indication that the working set does not fit in RAM. If the number of queued reads is larger than a hundred over long periods of time, it could also be an indication that the working set doesn't fit in RAM. It's often important to look at these parameters over time in order to determine whether there's a sudden spike or a recurring problem. This brings me to offline monitoring.
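If you want the raw numbers mongostat reads, you can call serverStatus yourself; here's a sketch with pymongo (the field layout matches the MMAP-era servers this talk is about):

    from pymongo import MongoClient

    status = MongoClient().admin.command("serverStatus")

    # Cumulative page fault count (Linux); sample it twice and take the
    # difference to get page faults per second.
    faults = status["extra_info"]["page_faults"]

    # Readers currently waiting in queue.
    queued_reads = status["globalLock"]["currentQueue"]["readers"]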
34. Tools like MMS or Graphite can show you these important metrics over time. Using one of these tools is mandatory for a production system; I cannot tell you how useful they are. Whenever we get a ticket about a performance problem, we put our Sherlock hats on and start an investigation. We look at metrics related to our application, but also at a lot of metrics related to MongoDB and how they change over time: we look at the number of queries, the number of documents in collections and tens of other metrics. I'd like to show you an example workflow of a ticket. Try to picture this: it was a quiet evening, I was about to go to sleep, when I got an automated email saying that one of our shards was misbehaving. What were the symptoms? It had more than 300 queries just waiting in queue. What do I do next?
35. I immediately open Graphite. This is a screenshot of the number of page faults, in green, and the number of queued readers, in blue. By looking at the history you can spot two trends:
1. First, there's a spike of high load every hour. This is actually normal, since we're doing hourly aggregations of our data.
2. The second trend is a massive rise in page faults and queued queries at exactly 20:00. At this point there's an impact on users, as a lot of queries take a very long time.
Why is this happening? Has the working set outgrown memory?
36. Let's look at another screenshot of the same time frame. This time we look at other metrics: in blue is the number of queries, in green is the number of updates and in red is the disk utilization. Remember that disk utilization is measured as a percentage, so even though the graph is lower than the others, we can still see that at 20:00 the disk was constantly utilized at 100%. When looking at the updates vs. the queries, it's obvious that a huge amount of updates is hurting query performance: we were busy writing to disk. In this case an application change was the root cause of the problem; the application simply started updating a lot more documents. So using Graphite we were able to trace the problem to a specific change in our application, and later on we modified our schema to reduce the document size and the load on the disk. This brings me to the next topic, which is optimization.
37. When optimizing memory usage, the main target is to reduce the amount of memory your application requires. The smaller the collections and documents are, the faster the queries will be, not just in terms of memory but also disk: if documents are smaller, less disk access is required to read them. There are several optimizations you can do when it comes to schema:
1. First, shorten the keys. We started with long names like firstName, then shortened them to a single word or acronym, and finally used one or two letters, since it had a huge impact on the size of our data. By shortening the keys we reduced the size of our data by more than 50%. There is a big downside to doing this, because it obscures the data, but fortunately we have an API that hides this ugly implementation detail, so it doesn't affect our users (see the sketch after this list).
2. Another thing to consider is the tradeoff between the number of documents and their size: in many use cases it's more efficient to store a small number of large documents rather than a large number of small ones.
We've previously seen how padding occupies memory; by changing the padding factor and running repair every once in a while you can reduce the padding overhead. The next thing you can optimize is indexes.
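Here's a sketch of how such an API can hide the short keys; the mapping itself is made up for illustration:

    # Documents are stored with one-letter keys; callers see the full names.
    FIELDS = {"firstName": "f", "lastName": "l", "creation_time": "c"}
    REVERSE = {short: name for name, short in FIELDS.items()}

    def to_storage(doc):
        return {FIELDS.get(k, k): v for k, v in doc.items()}

    def from_storage(doc):
        return {REVERSE.get(k, k): v for k, v in doc.items()}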
38. The first thing you should know is that unused indexes are still accessed whenever documents are inserted, updated or deleted. Try to identify those and remove them. Use sparse indexes when only some of the documents will have the indexed attribute, as they use less space (a sketch follows at the end of this slide). The last thing I want to talk about is how much of an index is located in memory. The answer is: it depends. If the entire index is accessed by queries, then the entire index should be located in memory. If only a single part of the index is used, only that part has to fit in memory. Let's look at a few examples to emphasize the difference. You can imagine an index as a segment of memory; the red marks are locations frequently accessed by queries. The first example is an index on a date field called creation_time. Each inserted document inserts a value larger than all previous ones, so the rightmost part of the index is updated. In many such indexes only recent history is frequently accessed, so only the rightmost part of the index will be located in memory. The second example is an index on a person's name. The index accesses will probably distribute evenly across the entire index, so most of it will be located in memory.
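Creating a sparse index with pymongo is a one-liner; in this sketch the collection and field names are made up:

    from pymongo import MongoClient

    people = MongoClient().mydb.people  # hypothetical collection

    # The sparse index only holds entries for documents that actually have
    # a 'nickname' field, so it's smaller than a regular index would be.
    people.create_index("nickname", sparse=True)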
39. So let's summarize what we've learned:
1. We've seen how memory management works. We started with disks and RAM, went up the stack to the page cache, whose sole purpose is to improve read and write performance by using memory, continued to memory-mapped files, which translate memory accesses, reads and writes, into file reads and writes, and finished with MongoDB's usage of these mechanisms.
2. We've talked about the challenges this strategy presents, like predicting and measuring the size of the working set.
3. We then talked about monitoring, which is something you have to do if you have a database running in production.
4. We finished with schema and index optimizations, which are crucial for cutting costs and improving performance.
40. And that's it! I hope you enjoyed my talk, and thanks for having me.