Notes From A MongoDB Presentation I Attended

Sep 7, 2012

These are some sparse notes I took at a free MongoDB presentation I attended at Urban Rethink in Downtown Orlando, FL. The event was called An Evening with MongoDB.

Features of MongoDB

BSON is normal JSON but with typed data
Ad-hoc query support - secondary indexes
effective for searching an array of values or a set
has TTL collections that expire data
various indexes - 2d, sparse ...
geospatial indexes - like with Zillow
lacks
- joins
- multi-collection transactions (two-phase commit)
- no transactions, but operations on a single document are atomic
search for the MongoDB Cookbook
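
To illustrate the "typed data" point about BSON, here is a minimal Python sketch using only the standard library. It does not use BSON itself; it only shows what plain JSON is missing. BSON has a dedicated date type, while plain JSON does not, so a date has to be flattened to a string. The document and its field names are made up for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical document; the field names are invented for this example.
doc = {
    "user": "alice",
    "visits": 42,
    "last_seen": datetime(2012, 9, 7, 18, 30, tzinfo=timezone.utc),
}

def to_plain_json(document):
    """Serialize to plain JSON, flattening dates to ISO strings (the type is lost)."""
    return json.dumps(document, default=lambda value: value.isoformat(), sort_keys=True)

try:
    json.dumps(doc)
    json_handles_dates = True
except TypeError:
    # Plain JSON has no date type, so the standard encoder refuses the datetime.
    json_handles_dates = False

# After a round trip through plain JSON the date comes back as a string.
round_tripped = json.loads(to_plain_json(doc))
```

A typed format avoids this kind of lossy conversion, which is what the speaker seemed to mean by "JSON but with typed data".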

Replication

arbiter node prevents 2 nodes from gaining master status if the master fails
recommends 3 nodes
with getLastError, data sent to a failing master is not lost and will be retried on the new master
mostly checked on the application side
can check the health of a node
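
The reason for the arbiter and the 3-node recommendation can be sketched with a simple majority check. This is not MongoDB's election code, only an illustration of the rule that a node needs a strict majority of all configured votes to become master. The function name is hypothetical.

```python
def has_majority(reachable_votes, total_votes):
    """True if the reachable members hold a strict majority of all votes."""
    return reachable_votes > total_votes // 2

# Two data nodes, master fails: the survivor sees only its own vote.
two_node_failover = has_majority(reachable_votes=1, total_votes=2)

# Two data nodes plus an arbiter, master fails: survivor + arbiter = 2 of 3.
with_arbiter_failover = has_majority(reachable_votes=2, total_votes=3)

# Network split in a 3-vote set: only one side can ever hold a majority,
# so two nodes cannot both become master.
side_a = has_majority(reachable_votes=2, total_votes=3)
side_b = has_majority(reachable_votes=1, total_votes=3)
```

With an odd number of votes, a split can never leave both sides with a majority, which is the tie-breaking role of the arbiter.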

Sharding

horizontal scaling
no application logic needed
range based partition mechanism
shard key - can be composite keys (e.g. username/date)
geospatial sharding in future releases
replacing a shard will slow down application as it rebuilds
driver connects to mongos, which takes care of sharding
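
A rough sketch of range-based partitioning on a composite shard key, as a router might do it. The split points, shard names, and the `route` function are all hypothetical; a real cluster manages chunk ranges itself, so this only shows the idea of mapping a (username, date) key to a range.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries on a composite (username, date) key.
# Chunk i holds keys k with SPLIT_POINTS[i-1] <= k < SPLIT_POINTS[i].
SPLIT_POINTS = [("h", ""), ("p", "")]
SHARDS = ["shard0", "shard1", "shard2"]

def route(username, date):
    """Pick the shard whose key range contains (username, date)."""
    return SHARDS[bisect_right(SPLIT_POINTS, (username, date))]

alice_shard = route("alice", "2012-09-07")
mike_shard = route("mike", "2012-09-07")
zoe_shard = route("zoe", "2012-09-07")
```

Because tuples compare field by field, the username decides the range first and the date only breaks ties, which is how a composite key behaves under range partitioning.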

Mongo 2.2

concurrency improvements
- db level locking
- improved yielding
TTL will delete data after a certain amount of time (like session data). Deletion happens sometime after the expiry time and can be backed up; the check runs at a one-minute interval.
Ask for a soft delete feature with the TTL - it currently only deletes the data
Can normalize data using a pson function. No triggers yet. application has to check data being inserted
JIRA ticketing system for bugs, requests...
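
The TTL behavior described above can be sketched as a periodic sweep. This is a pure-Python illustration, not MongoDB's implementation. In MongoDB the expiry is configured with the `expireAfterSeconds` option on an index over a date field; the `ttl_sweep` function and the session documents here are hypothetical.

```python
from datetime import datetime, timedelta

def ttl_sweep(documents, date_field, expire_after_seconds, now):
    """Return the documents that survive one pass of a TTL-style sweep."""
    cutoff = now - timedelta(seconds=expire_after_seconds)
    survivors = []
    for doc in documents:
        stamp = doc.get(date_field)
        # Documents without the date field are never expired.
        if stamp is None or stamp > cutoff:
            survivors.append(doc)
    return survivors

sessions = [
    {"sid": "a", "created": datetime(2012, 9, 7, 12, 0, 0)},
    {"sid": "b", "created": datetime(2012, 9, 7, 12, 50, 0)},
    {"sid": "c"},
]

# Session "a" expired at 13:00:00 but is only removed when the sweep runs.
sweep_time = datetime(2012, 9, 7, 13, 0, 30)
remaining = ttl_sweep(sessions, "created", 3600, sweep_time)
```

The gap between the expiry time and the next sweep is why data disappears "sometime after" the configured time rather than exactly on it.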

Aggregation Framework

MapReduce - scans collections and finds data you're looking for. Groups and sorts it. Then reduces and outputs it to a collection to analyze it.
- Overkill for simple aggregation tasks
- Implemented with JS
- single threaded
- difficult to debug
- concurrency
  - appearance of parallelism

Aggregation Functions

$project
$match
- match in pipeline
$group
- group during aggregation
- group by a constant _id to group on the entire collection
- array of set operators
- $addToSet takes out duplicates, $push doesn't
$unwind
- operates on an array field and then you can group on it
$sort
- has to wait for all operators to finish
$limit
$skip
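
To make $unwind and the difference between $addToSet and $push concrete, here is a pure-Python imitation of those operators. It is not how the server implements them; the `unwind` and `group` helpers and the sample posts are hypothetical.

```python
def unwind(documents, field):
    """Emit one copy of each document per element of its array field."""
    for doc in documents:
        for item in doc.get(field, []):
            yield {**doc, field: item}

def group(documents, key_field, value_field):
    """Group by key_field, collecting value_field both ways."""
    groups = {}
    for doc in documents:
        entry = groups.setdefault(doc[key_field], {"pushed": [], "added_to_set": []})
        value = doc[value_field]
        entry["pushed"].append(value)             # like $push: keeps duplicates
        if value not in entry["added_to_set"]:    # like $addToSet: drops duplicates
            entry["added_to_set"].append(value)
    return groups

posts = [
    {"author": "ann", "tags": ["db", "nosql"]},
    {"author": "ann", "tags": ["db"]},
]
unwound = list(unwind(posts, "tags"))
by_tag = group(unwound, "tags", "author")
```

After unwinding, each tag sits in its own document, so grouping on the former array field becomes an ordinary group. Real $addToSet output order is not guaranteed; the list here just keeps first-seen order.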

extending framework can add new pipeline operators and expressions

$out and $tee for output and debugging

early filtering

match
sort
limit
group
sort
project
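
The stage order listed above could be written as a pipeline like the following. This is only a sketch of the data structure; the collection fields (`status`, `created`, `city`) are invented. A list like this is what a driver's aggregate call takes (for example pymongo's `collection.aggregate(pipeline)`), but nothing is sent to a server here.

```python
# Filter early so later stages see as few documents as possible.
pipeline = [
    {"$match": {"status": "active"}},                      # match
    {"$sort": {"created": -1}},                            # sort
    {"$limit": 100},                                       # limit
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},    # group
    {"$sort": {"count": -1}},                              # sort
    {"$project": {"_id": 0, "city": "$_id", "count": 1}},  # project
]

# Each stage is a single-key dict; pull out the operator names in order.
stage_order = [next(iter(stage)) for stage in pipeline]
```

Putting $match and $limit ahead of $group keeps the in-memory stages small.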

operation memory limits
$group and $sort run entirely in memory. warning at > 5% and error at > 10% of memory
returns results, but returns an error code

sharding - splits the pipeline at the first $group or $sort

github.com/rozza/demos

OpenShift

Titanium studio

can port code to different languages from js
ideal for mongo because it uses js
application as a service

github.com/beershift
