Introduction to MongoDB
"MongoDB is a scalable, high-performance, open source, schema-free, document-oriented database."
Goal: be the best database for (most) web apps
...not the best DB for everything.
What do Web Apps Want?
Performance
Scalability
Availability
What else do developers want?
Easier development
Less maintenance (at 3am)
Great support
Goals
Easy-to-use
Fast
Always available
Easy to scale
5-minute install
Download binaries from
http://www.mongodb.org
and run!
Mongo knows JavaScript
$ ./mongo
MongoDB shell version: 1.3.2
url: test
connecting to: test
type "help" for help
> x = 123+1
124
>
mongo.kylebanker.com
Basic Usage
$ ./mongo
MongoDB shell version: 1.3.2
url: test
connecting to: test
type "help" for help
> db
test
>
Database ≈ Database
> use foo
switched to db foo
> db
foo
>
Collection ≈ Table
> db.my_collection
foo.my_collection
> db.jzxgtx.find()
> db.pltzbt.count()
0
>
Document ≈ Row
{
"title" : "My first blog post",
"author" : "Fred",
"content" : "Hello, world!",
"comments" : []
}
Documents
Language     Document
JavaScript   {"foo" : "bar"}
Perl         {"foo" => "bar"}
PHP          array("foo" => "bar")
Python       {"foo" : "bar"}
Ruby         {"foo" => "bar"}
Java         DBObject obj = new BasicDBObject("foo", "bar");
Documents
> db.posts.insert({
... "title" : "My first blog post",
... "author" : "Fred",
... "content" : "Hello, world!",
... "comments" : [] })
>
Documents
> post = db.posts.findOne()
{
"_id" : ObjectId("2fe3e4d892aa73234c910bed"),
"title" : "My first blog post",
"author" : "Fred",
"content" : "Hello, world!",
"comments" : []
}
>
ObjectId
Q: How do we make a unique ID quickly that is guaranteed to
be unique across multiple servers?
A:
> print(post._id)
2fe3e4d892aa73234c910bed
|------||----||--||----|
   ts    mac  pid  inc
Built-in document creation timestamp
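The field layout in the diagram can be checked by slicing the hex string by hand. This is a minimal sketch in plain JavaScript, not driver code: the function name parseObjectId is made up for illustration, and the field widths (4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter) are read off the diagram above.

```javascript
// Split a 24-character ObjectId hex string into the four fields from the diagram.
// parseObjectId is a hypothetical helper, not part of the mongo shell.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hex characters");
  }
  const timestampSeconds = parseInt(hex.slice(0, 8), 16); // ts: seconds since the epoch
  return {
    timestampSeconds,
    createdAt: new Date(timestampSeconds * 1000), // the built-in creation timestamp
    machine: hex.slice(8, 14),                    // mac: machine identifier
    pid: parseInt(hex.slice(14, 18), 16),         // pid: process id
    inc: parseInt(hex.slice(18, 24), 16),         // inc: per-process counter
  };
}

const parts = parseObjectId("2fe3e4d892aa73234c910bed");
console.log(parts.createdAt.toISOString(), parts.machine, parts.pid, parts.inc);
```

Because the timestamp leads the id, ids generated later compare greater, which is why no separate "created at" field is needed.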
A few examples...
Chess
{
"name" : "black king",
"symbol" : "♚",
"pos" : {
"x" : "e",
"y" : 8
}
}
Moving
> db.chess.update(
... {"name" : "black king"},
... {"$inc" : {"pos.y" : -1}})
> db.chess.find({"name" : "black king"})
{
"name" : "black king",
"symbol" : "♚",
"pos" : {"x" : "e", "y" : 7}
}
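To make the effect of "$inc" with a dotted field name concrete, here is a rough plain-JavaScript imitation of what the update does to a single document. It is an illustration only, not how the server implements it; applyInc is a hypothetical helper name.

```javascript
// Illustration only: mimics what {"$inc": {"pos.y": -1}} does to one document.
// applyInc is a hypothetical helper, not a MongoDB API.
function applyInc(doc, incSpec) {
  for (const [path, amount] of Object.entries(incSpec)) {
    const keys = path.split(".");
    const last = keys.pop();
    let target = doc;
    for (const key of keys) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {}; // create missing intermediate objects
      }
      target = target[key];
    }
    target[last] = (target[last] || 0) + amount; // a missing field starts from 0
  }
  return doc;
}

const king = { name: "black king", symbol: "♚", pos: { x: "e", y: 8 } };
applyInc(king, { "pos.y": -1 });
console.log(king.pos); // { x: 'e', y: 7 }
```

The point of the modifier is that only the change is sent to the server, rather than the whole replacement document.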
Adding Information
> db.chess.update(
... {"name" : /pawn/},
... {"$set" : {"importance" : 1}})
> db.chess.findOne({"symbol" : "♙"})
{
"name" : "white pawn a",
"symbol" : "♙",
"pos" : {"x" : "a", "y" : 2},
"importance" : 1
}
Types
null        date
boolean     binary data
integer     object id
long        regular expression
double      code
string      max value
array       min value
object
Querying
> db.chess.find()
Querying
> db.chess.find().sort(
... {"pos.y" : -1, "pos.x" : 1})
Querying
> db.chess.find(
... {"name" : /white/}).sort(
... {"pos.y" : -1, "pos.x" : 1})
Querying
> db.chess.find().sort(
... {"importance" : -1}).limit(1)
Querying
> db.chess.find().sort(
... {"importance" : -1}).limit(1).skip(1)
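The sort, limit, and skip calls above can be pictured on a plain array. This sketch uses made-up sample pieces (only the pawn's importance of 1 comes from the slides) and ordinary JavaScript instead of a live collection. One detail worth knowing: the server applies skip before limit no matter which order they are chained in, so limit(1).skip(1) returns the second result.

```javascript
// Hypothetical sample data; only the pawn's importance of 1 comes from the slides.
const pieces = [
  { name: "white pawn a", pos: { x: "a", y: 2 }, importance: 1 },
  { name: "black king", pos: { x: "e", y: 8 }, importance: 10 },
  { name: "white queen", pos: { x: "d", y: 1 }, importance: 9 },
  { name: "black rook a", pos: { x: "a", y: 8 }, importance: 5 },
];

// Equivalent of sort({"pos.y": -1, "pos.x": 1}): y descending, then x ascending.
const boardOrder = [...pieces].sort(
  (p, q) =>
    q.pos.y - p.pos.y ||
    (p.pos.x < q.pos.x ? -1 : p.pos.x > q.pos.x ? 1 : 0)
);

// Equivalent of sort({"importance": -1}).limit(1).skip(1):
// sort, then skip one document, then take one.
const byImportance = [...pieces].sort((p, q) => q.importance - p.importance);
const secondMostImportant = byImportance.slice(1, 1 + 1);

console.log(boardOrder.map((p) => p.name));
console.log(secondMostImportant.map((p) => p.name));
```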
Query Conditions
> db.chess.find({"pos.y" : 1})
Query Conditions
> db.chess.find({"pos.y" :
... {"$gt" : 2, "$lt" : 7}})
Query Conditions
> db.chess.find({"pos.y" :
... {"$gt" : 2, "$lt" : 7},
... "pos.x" :
... {"$ne" : "b"}})
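One way to read these query documents is as a set of tests that every matching document must pass. The sketch below is a toy matcher in plain JavaScript under that reading. It handles only exact values and the $gt, $gte, $lt, $lte, and $ne operators on scalar fields; the helper names getPath and matches and the sample positions are invented for illustration.

```javascript
// Toy matcher: an illustration of query semantics, not MongoDB's implementation.
const operators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
  $ne: (value, other) => value !== other,
};

// Resolve a dotted path such as "pos.y" against a document.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Every field in the query must match; every operator on a field must hold.
function matches(doc, query) {
  return Object.entries(query).every(([path, condition]) => {
    const value = getPath(doc, path);
    if (condition !== null && typeof condition === "object") {
      return Object.entries(condition).every(([op, arg]) => {
        if (!(op in operators)) throw new Error("unsupported operator " + op);
        return operators[op](value, arg);
      });
    }
    return value === condition; // plain value means equality
  });
}

// Hypothetical positions for three pieces.
const board = [
  { name: "white pawn a", pos: { x: "a", y: 4 } },
  { name: "white pawn b", pos: { x: "b", y: 4 } },
  { name: "black king", pos: { x: "e", y: 8 } },
];
const query = { "pos.y": { $gt: 2, $lt: 7 }, "pos.x": { $ne: "b" } };
const found = board.filter((piece) => matches(piece, query));
console.log(found.map((piece) => piece.name)); // [ 'white pawn a' ]
```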
"$gte"
"<null>"
""
'$gte' or "\$gte"
Or define your own!
":gte"
"=gte"
"?gte"
" gte"
" ♕ gte"
"xgte"
Other Neat Stuff
Analytics
{
"uri" : "/blog",
"pageviews" : 0
}
Analytics - Increment Pageviews
page = db.analytics.findOne({"uri" : "/blog"});
if (page != null) {
page.pageviews++;
db.analytics.save(page);
}
else {
db.analytics.insert({
"uri" : "/blog",
"pageviews" : 1});
}
...that's 1 round trip + 1 update or insert.
Analytics - Upsert
db.analytics.update(
{ "uri" : "/blog" },
{ "$inc" : { "pageviews" : 1 } },
{ "upsert" : true });
...all in one update!
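The upsert logic ("increment if the page exists, otherwise create it with a count of 1") can be sketched without a database. Here a JavaScript Map stands in for the analytics collection; incrementPageviews is a hypothetical name, and the real upsert happens in a single server-side operation rather than in client code like this.

```javascript
// A Map keyed by uri stands in for the analytics collection (illustration only).
const analytics = new Map();

// Mimics update({uri}, {"$inc": {pageviews: 1}}, upsert): create on first hit,
// increment on every later hit.
function incrementPageviews(uri) {
  const page = analytics.get(uri) || { uri, pageviews: 0 }; // "insert" if missing
  page.pageviews += 1;
  analytics.set(uri, page);
  return page;
}

incrementPageviews("/blog");
incrementPageviews("/blog");
incrementPageviews("/about");
console.log(analytics.get("/blog").pageviews); // 2
console.log(analytics.get("/about").pageviews); // 1
```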
Capped Collections
4 GB, 50 MB, 97 documents, etc.
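A capped collection has a fixed maximum size and, once full, ages out its oldest documents to make room for new ones. The sketch below imitates that behavior for a cap expressed as a document count. CappedList is an invented class for illustration and says nothing about how the server stores the data.

```javascript
// Illustration of capped behavior: fixed capacity, oldest documents age out first.
// CappedList is a hypothetical stand-in, not a MongoDB API.
class CappedList {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // drop the oldest document
    }
  }

  // Documents come back in insertion order.
  find() {
    return [...this.docs];
  }
}

const log = new CappedList(3);
for (let i = 1; i <= 5; i++) {
  log.insert({ n: i });
}
console.log(log.find().map((doc) => doc.n)); // [ 3, 4, 5 ]
```

This makes capped collections a natural fit for logs and other "most recent N" data.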
Storing Files - GridFS
Max document size: 4 MB
For files larger than 4 MB:
[Figure: file J is split into chunks; each chunk refers to the _id (J) of its entry in the files collection]
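The chunking idea can be sketched in a few lines: cut the file's bytes into fixed-size pieces, tag each piece with the file's _id and a sequence number, and put them back together in order when reading. The field names files_id and n follow the GridFS convention; the function names and the tiny 4-byte chunk size are assumptions chosen so the example is easy to follow (Node's Buffer is used for the bytes).

```javascript
// Sketch of GridFS-style chunking; splitIntoChunks and reassemble are hypothetical helpers.
function splitIntoChunks(fileId, data, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId, // points back at the files document
      n,                // position of this chunk within the file
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

// Put the chunks back in order and join them.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((chunk) => chunk.data));
}

const file = { _id: "J", length: 10 };
const data = Buffer.from("0123456789");
const chunks = splitIntoChunks(file._id, data, 4); // tiny chunk size for the demo
console.log(chunks.map((chunk) => chunk.data.toString())); // [ '0123', '4567', '89' ]
console.log(reassemble(chunks).toString()); // 0123456789
```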
Storing Files - GridFS
$grid = $db->get_gridfs();
$grid->insert($fh,
{"permissions" => 644,
"comment" => "vacation pics"});
Storing Files - GridFS
$file = $grid->find_one($query);
$file->print($another_fh);
JavaScript Functions
> db.eval("function() { return 'hello'; }")
hello
>
> func = "function(x) {" +
... "return 'hello '+ x + '!';" +
... "}"
> db.eval(func, ["joe"])
hello joe!
"Stored Procedures"
> db.system.js.insert({
... "_id" : "x",
... "value" : 3});
>
> db.system.js.insert({
... "_id" : "y",
... "value" : 4});
>
> db.eval("return x+y");
7
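The effect shown above is that values saved in system.js become visible by name to code run on the server. A loose way to picture it in plain JavaScript is to pass the stored values into a function built from the code string. evalWithStored and storedJs are invented names, and this is only an analogy for the scoping, not how the server evaluates code.

```javascript
// Mirrors the two system.js documents above: _id becomes the name, value the value.
const storedJs = { x: 3, y: 4 };

// Analogy only: expose each stored value as a named parameter of the evaluated body.
function evalWithStored(body, stored) {
  const names = Object.keys(stored);
  const fn = new Function(...names, body);
  return fn(...names.map((name) => stored[name]));
}

console.log(evalWithStored("return x+y", storedJs)); // 7
```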
Goals
Easy-to-use
Fast
Always available
Easy to scale
Query Optimizer
[Figure: candidate query plans A, B, and C; as soon as one plan (A) finishes, the optimizer is done]
Performance
"Any operation that takes longer than 0 milliseconds
is suspect."
- Mongo user on IRC
Performance
100,000 inserts (with indexes for MongoDB and MySQL)
Database    Average Insert
MongoDB     0.00011
MySQL       0.00083
CouchDB     0.00640
Memcached   0.00279
Performance
Database    Average Query (Indexed)
MongoDB     0.00035
MySQL       0.00772
CouchDB     0.01640
Memcached   0.00015
The Space-Time Continuum
Time
Space
Care and Feeding of a Mongod
Server
Laptop
As data size grows:
Keep your database in memory
Keep indexes in memory
Keep the portion of the indexes you're using in memory
Otherwise, it hits the disk and takes forever.
Goals
Easy-to-use
Fast
Always available
Easy to scale
master
slave slave slave
Replica Pairs
[Figure: a master and a slave; when the master goes down, the slave takes over as master, and the old master comes back as a slave]
Coming soon... replica sets
slave
slave
slave
master
Eventual Consistency
Hey Twitter, I'm eating
a donut.
Fred
Okay
Hey Twitter, I'm eating
a donut.
x 1,000,000
Fred
Fail Whale
Okay
Hey Twitter, I'm eating
a donut.
Fred
Okay
Hey Twitter, I'm eating
a donut.
Nothing new
from Fred
Fred
Bob
One sec
Hey Twitter, I'm eating
a donut.
Fred
Wait a sec, Bob
Bob
I'd rather have Twitter up and know
what Fred is doing later.
Bob
...eventual consistency works for your customers
I MUST KNOW WHAT FRED IS
DOING RIGHT THIS SECOND!
Bob
...you need "real" consistency
Transactions
A Transaction
Insert this.
Okay, got it.
Phew, my data's safe.
Get Paranoid
Insert this.
Okay, got it.
Phew, my data's safe.
Get Paranoid
? I have no idea what
you're talking about.
Get Real Paranoid
Write this to disk
I know better than
he does, I'll just
let this sit in a
buffer for a while.
All over it!
Trust No One!
...trust a bunch of ones. Mostly.
Goals
Easy-to-use
Fast
Always available
Easy to scale
WARNING! α (alpha)
shard
shard
shard
shard
I want Bob's comments
Those are stored here.
Comments from Bob?
Here you go.
Server #3, you're
getting overloaded.
Configuration
Server #3, lockdown.
Split the data,
migrate half to the
new shard.
Router, server #3 has
A-M, server #5 has N-Z.
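The router's job after the split boils down to a range lookup. A minimal sketch, assuming comments are sharded by the author's name and using the A-M and N-Z ranges from the dialogue above; the ranges table and routeFor function are illustrative names, not the actual router's data structures.

```javascript
// Hypothetical routing table after the split described above.
const ranges = [
  { from: "A", to: "M", server: 3 },
  { from: "N", to: "Z", server: 5 },
];

// Pick the server whose range covers the first letter of the author's name.
function routeFor(author) {
  const initial = author[0].toUpperCase();
  const range = ranges.find((r) => initial >= r.from && initial <= r.to);
  if (!range) {
    throw new Error('no shard covers "' + author + '"');
  }
  return range.server;
}

console.log(routeFor("Bob")); // 3
console.log(routeFor("Nina")); // 5
```

The application only ever talks to the router, so a split like this needs no change in client code.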
Shards
Configuration
Routers
Scaling
NoSQL Live
March 11
in Boston
See www.10gen.com/events for details
Thank You!
www.mongodb.org
irc.freenode.net#mongodb
http://groups.google.com/group/mongodb-user
kristina@mongodb.org
@kchodorow