These are some sparse notes I took at a free MongoDB presentation I attended at Urban Rethink in Downtown Orlando, FL. The event was called An Evening with MongoDB.
Features of MongoDB
- BSON is normal JSON but with typed data
- Ad-hoc query support, with secondary indexes
- effective for searching an array of values or a set
- has TTL collections that expire data
- various indexes: 2d, sparse, ...
- geospatial indexes, like with Zillow
- lacks
  - joins
  - multi-collection transactions (two-phase commit)
  - no transactions, but operations are atomic at the document level
  - search for the MongoDB cookbook
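A rough illustration of the array search mentioned in the feature list, in plain Python rather than against a live server. The `matches` helper and the sample documents are hypothetical; the sketch only mimics how an equality query on an array field matches any document whose array contains the value.

```python
def matches(doc, field, value):
    """Mimic equality matching: a scalar field matches itself,
    and an array field matches if it contains the value."""
    stored = doc.get(field)
    if isinstance(stored, list):
        return value in stored
    return stored == value

# Hypothetical sample documents.
posts = [
    {"_id": 1, "tags": ["mongodb", "nosql"]},
    {"_id": 2, "tags": ["sql"]},
    {"_id": 3, "tags": "mongodb"},
]

hits = [p["_id"] for p in posts if matches(p, "tags", "mongodb")]
print(hits)
```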
Replication
- an arbiter node prevents two nodes from both gaining master status if the master fails
- recommends 3 nodes
- with getLastError, data sent to a failing master is not lost; the write is retried on the new master
- mostly checked on the application side
- can check the health of a node
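A small sketch of why three nodes are recommended, assuming a master is elected only when a strict majority of voting members can reach each other. The function name `can_elect_master` is hypothetical and this is not driver or server code, only the arithmetic behind the arbiter.

```python
def can_elect_master(voting_members, reachable_members):
    """A master can be elected only by a strict majority of voters."""
    return 2 * reachable_members > voting_members

# Two data nodes, no arbiter: losing one leaves 1 of 2, which is no majority.
print(can_elect_master(2, 1))
# Two data nodes plus an arbiter: losing one leaves 2 of 3, a majority.
print(can_elect_master(3, 2))
# A network split of a 3-member set: the side with 1 member cannot elect.
print(can_elect_master(3, 1))
```

Because only one side of a split can hold a majority, two nodes can never both become master at the same time.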
Sharding
- horizontal scaling
- no application logic needed
- range based partition mechanism
- shard key: can be a composite key (e.g. username/date)
- geospatial sharding in future releases
- replacing a shard will slow down the application as it rebuilds
- driver connects to mongos, which takes care of sharding
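A minimal sketch of range-based partitioning on a composite shard key, to show the idea behind the routing that mongos does for the driver. The split points, shard names, and the `route` function are all made up for illustration; the real chunk metadata is managed by the cluster.

```python
import bisect

# Hypothetical chunk boundaries on a composite (username, date) shard key.
# Shard i holds keys in the range [split_points[i-1], split_points[i]).
split_points = [("h", ""), ("p", "")]
shards = ["shard0", "shard1", "shard2"]

def route(username, date):
    """Return the shard whose key range contains (username, date)."""
    key = (username, date)
    return shards[bisect.bisect_right(split_points, key)]

print(route("alice", "2012-09-01"))
print(route("mallory", "2012-09-01"))
print(route("zed", "2012-09-01"))
```

Tuples compare element by element, so the username decides the range first and the date only breaks ties, which is how a composite key orders documents.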
Mongo 2.2
- concurrency improvements
  - db-level locking
  - improved yielding
- TTL will delete data after a certain amount of time (like session data). Deletion happens sometime after the expiry time, since the check runs at a one-minute interval, and the data can be backed up.
- Ask for a soft-delete feature with the TTL; it currently only deletes the data
- Can normalize data using a pson function. No triggers yet; the application has to check data being inserted
- JIRA ticketing system for bugs, requests...
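To make the TTL behaviour concrete, here is a plain-Python sketch of one expiry sweep. The `expired` and `ttl_pass` helpers, the `created_at` field, and the 30-minute lifetime are assumptions for illustration; the server does this in a background task, not in application code.

```python
from datetime import datetime, timedelta

def expired(doc, now, ttl_seconds, field="created_at"):
    """True if the document's timestamp is older than the TTL."""
    return doc[field] + timedelta(seconds=ttl_seconds) <= now

def ttl_pass(collection, now, ttl_seconds):
    """One sweep of a hypothetical background task: keep only live documents."""
    return [d for d in collection if not expired(d, now, ttl_seconds)]

now = datetime(2012, 9, 1, 12, 0, 0)
sessions = [
    {"_id": "a", "created_at": now - timedelta(minutes=45)},
    {"_id": "b", "created_at": now - timedelta(minutes=5)},
]
# Sessions live for 30 minutes; "a" is past its expiry, "b" is not.
remaining = ttl_pass(sessions, now, ttl_seconds=30 * 60)
print([d["_id"] for d in remaining])
```

Since the sweep only runs periodically, a document can outlive its expiry time by up to one interval. On a real collection the lifetime is set with the `expireAfterSeconds` option when creating the index on the date field.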
Aggregation Framework
- MapReduce: scans collections and finds the data you're looking for, groups and sorts it, then reduces it and outputs it to a collection for analysis.
  - Overkill for simple aggregation tasks
  - Implemented with JS
  - single threaded
  - difficult to debug
- concurrency
  - appearance of parallelism
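The MapReduce flow described above (map, group by key, reduce) can be sketched in a few lines of plain Python. This is not how the server runs it (there the functions are written in JS); `map_reduce`, `emit_total`, `sum_values`, and the order documents are hypothetical names for illustration.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, group emitted values by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in sorted(grouped.items())}

# Hypothetical order documents.
orders = [
    {"customer": "ann", "total": 10},
    {"customer": "bob", "total": 5},
    {"customer": "ann", "total": 7},
]

def emit_total(doc):
    # Map step: emit one (key, value) pair per document.
    yield doc["customer"], doc["total"]

def sum_values(key, values):
    # Reduce step: collapse all values for a key into one result.
    return sum(values)

result = map_reduce(orders, emit_total, sum_values)
print(result)
```

For a simple sum per customer like this one, the machinery is the overkill the notes mention, which is the case the aggregation framework is meant to cover.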
Aggregation Functions
- $project
- $match
  - match in pipeline
- $group
  - group during aggregation
  - group by a constant id to group over the entire collection
  - array of set operators
  - $addToSet takes out duplicates, $push doesn't
- $unwind
  - operates on an array field, and then you can group on it
- $sort
  - has to wait for all operators to finish
- $limit
- $skip
$out and $tee for output and debugging
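A sketch of how `$unwind` feeds `$group`, and how `$addToSet` differs from `$push`. The pipeline is written as plain Python data, the shape a driver would accept, but nothing here talks to a server: the `unwind` helper and the sample documents are hypothetical and only imitate the operators' behaviour.

```python
# Pipeline as plain data; a constant _id groups over the whole collection.
pipeline = [
    {"$unwind": "$tags"},
    {"$group": {"_id": None,
                "unique_tags": {"$addToSet": "$tags"},
                "all_tags": {"$push": "$tags"}}},
]

def unwind(docs, field):
    """Emit one copy of each document per element of its array field."""
    for doc in docs:
        for item in doc[field]:
            yield {**doc, field: item}

docs = [
    {"_id": 1, "tags": ["a", "b"]},
    {"_id": 2, "tags": ["b", "c"]},
]

unwound = list(unwind(docs, "tags"))
all_tags = [d["tags"] for d in unwound]   # like $push: keeps duplicates
unique_tags = sorted(set(all_tags))       # like $addToSet: drops duplicates
print(len(unwound), all_tags, unique_tags)
```

The `sorted` call is only there to make the output stable; a set operator gives no ordering guarantee.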
early filtering
- match
- sort
- limit
- group
- sort
- project
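The stage order above could look like this as a pipeline, written as plain Python data. The field names (`status`, `created`, `city`) are invented for the example; only the ordering of the stages comes from the notes.

```python
# Stage order from the notes; field names are hypothetical.
pipeline = [
    {"$match": {"status": "active"}},                    # filter as early as possible
    {"$sort": {"created": -1}},
    {"$limit": 100},                                     # cut the working set before grouping
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$project": {"city": "$_id", "count": 1, "_id": 0}},
]

stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

Putting `$match` and `$limit` first means the later, more expensive stages see fewer documents.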
operation memory limits
$group and $sort run entirely in memory: a warning at > 5% of memory and an error at > 10%
returns results, but returns an error code
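A small arithmetic sketch of the memory limits. The thresholds are parameters; the defaults (warning above 5% of RAM, error above 10%) are an assumption based on the 2.2-era documentation, and `memory_status` is a hypothetical helper, not a server API.

```python
def memory_status(bytes_used, physical_ram, warn_at=0.05, error_at=0.10):
    """Classify a cumulative stage's memory use against RAM fractions."""
    fraction = bytes_used / physical_ram
    if fraction > error_at:
        return "error"
    if fraction > warn_at:
        return "warning"
    return "ok"

GIB = 1024 ** 3
ram = 16 * GIB  # assumed machine size for the example
print(memory_status(0.5 * GIB, ram))  # about 3% of RAM
print(memory_status(1 * GIB, ram))    # 6.25% of RAM
print(memory_status(2 * GIB, ram))    # 12.5% of RAM
```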
sharding - splits the pipeline at the first $group or $sort
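The split on a sharded cluster can be pictured like this. It is a simplification under the assumption that everything before the first `$group` or `$sort` runs on each shard and the rest runs where the results are merged; the real server may push part of the grouping down as well. `split_pipeline` and the field names are hypothetical.

```python
def split_pipeline(pipeline):
    """Split at the first $group or $sort: earlier stages can run on each
    shard, the rest run where the shard results are merged."""
    for i, stage in enumerate(pipeline):
        if "$group" in stage or "$sort" in stage:
            return pipeline[:i], pipeline[i:]
    return pipeline, []

pipeline = [
    {"$match": {"status": "active"}},
    {"$project": {"city": 1}},
    {"$group": {"_id": "$city", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
]
shard_part, merge_part = split_pipeline(pipeline)
print([next(iter(s)) for s in shard_part])
print([next(iter(s)) for s in merge_part])
```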
github.com/rozza/demos
OpenShift
Titanium Studio
- can port code from JS to different languages
- ideal for Mongo because it uses JS
- application as a service