Notes From A MongoDB Presentation I Attended

Sep 7, 2012

These are some sparse notes I took at a free MongoDB presentation I attended at Urban Rethink in downtown Orlando, FL. The event was called "An Evening with MongoDB".

Features of MongoDB

  • BSON is normal JSON but with typed data
  • Ad-hoc query support - secondary indexes
  • effective for searching an array of values or a set
  • has TTL collections that expire data
  • various indexes - 2d, sparse ...
  • geospatial indexes - like with Zillow
  • lacks
    • joins
    • multi-collection transactions (two-phase commit)
    • no transactions, but operations on a single document are atomic
  • search for the MongoDB Cookbook


Replication

  • an arbiter node prevents two nodes from both gaining master status if the master fails
  • recommends 3 nodes
  • checking getLastError avoids losing data sent to a failing master; the write is retried on the new master
  • mostly checked on the application side
  • can check the health of a node


Sharding

  • horizontal scaling
  • no application logic needed
  • range based partition mechanism
  • shard key - can be a composite key (e.g. username/date)
  • geospatial sharding in future releases
  • replacing a shard will slow down the application as it rebuilds
  • the driver connects to mongos, which takes care of sharding

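
Range-based partitioning on a composite shard key can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how mongos is implemented; the split points, the shard names, and the `route` function are all hypothetical, and a real cluster maps many chunks onto each shard.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries on a composite shard key (username, date).
# Range i holds keys >= SPLIT_POINTS[i-1] and < SPLIT_POINTS[i].
SPLIT_POINTS = [("g", ""), ("n", ""), ("t", "")]
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def route(username: str, date: str) -> str:
    """Pick the shard whose key range contains (username, date)."""
    return SHARDS[bisect_right(SPLIT_POINTS, (username, date))]


print(route("alice", "2012-09-07"))    # shard0
print(route("mallory", "2012-09-07"))  # shard1
print(route("zoe", "2012-09-07"))      # shard3
```

Because tuples compare field by field, all documents for one username land in the same range, ordered by date within it.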

Mongo 2.2

  • concurrency improvements
    • db level locking
    • improved yielding
  • TTL will delete data after a certain amount of time (useful for things like session data). Deletion happens sometime after the expiry time and can be backed up; the cleanup runs at a minute interval.
  • Ask for a soft-delete feature with the TTL - it currently only deletes the data
  • Can normalize data using a pson function. No triggers yet, so the application has to check data being inserted
  • JIRA ticketing system for bugs, requests...
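
To make the TTL behaviour concrete, here is a small simulation of one cleanup pass in plain Python. It is a sketch of the idea only: in MongoDB itself you would create a TTL index with the `expireAfterSeconds` option on a date field and the server would do the deleting. The `sweep_expired` function, the `created` field, and the 30-minute lifetime are assumptions made for this example.

```python
from datetime import datetime, timedelta


def sweep_expired(docs, field, expire_after, now):
    """Return only the documents whose timestamp in `field` has not expired."""
    cutoff = now - expire_after
    return [doc for doc in docs if doc[field] > cutoff]


now = datetime(2012, 9, 7, 12, 0, 0)
sessions = [
    {"user": "a", "created": now - timedelta(minutes=45)},
    {"user": "b", "created": now - timedelta(minutes=5)},
]

# With a 30-minute lifetime, only the 5-minute-old session survives the sweep.
remaining = sweep_expired(sessions, "created", timedelta(minutes=30), now)
print([doc["user"] for doc in remaining])  # ['b']
```

Since the sweep only runs periodically, an expired document can linger until the next pass, which matches the "sometime after the time" note above.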

Aggregation Framework

  • MapReduce - scans collections and finds the data you're looking for, groups and sorts it, then reduces it and outputs it to a collection for analysis.
    • Overkill for simple aggregation tasks
    • Implemented with JS
    • single threaded
    • difficult to debug
    • concurrency
      • appearance of parallelism
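
The map, group, reduce flow described above can be sketched in plain Python. In MongoDB the map and reduce functions are written in JavaScript and run on the server; this is just an illustration of the shape of the computation, with a made-up `map_reduce` helper and made-up order data.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, group emitted values by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in sorted(grouped.items())}


# Hypothetical data: count orders per customer.
orders = [{"customer": "ann"}, {"customer": "bob"}, {"customer": "ann"}]
result = map_reduce(
    orders,
    map_fn=lambda doc: [(doc["customer"], 1)],
    reduce_fn=lambda key, values: sum(values),
)
print(result)  # {'ann': 2, 'bob': 1}
```

Even for a simple count you have to write two functions, which is the "overkill for simple aggregation tasks" point.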

Aggregation Functions

  • $project
  • $match
    • match in pipeline
  • $group
    • group during aggregation
    • group by a constant _id to group the entire collection
    • array and set operators
    • $addToSet takes out duplicates, $push doesn't
  • $unwind
    • operates on an array field so that you can then group on it
  • $sort
    • has to wait for all preceding operators to finish
  • $limit
  • $skip
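
Here is a rough plain-Python imitation of $unwind followed by $group, showing the difference between $push and $addToSet. It is a sketch of the semantics only, not the server's implementation; the `unwind` and `group` helpers and the sample posts are made up, and in MongoDB the order of elements produced by $addToSet is not guaranteed.

```python
def unwind(docs, field):
    """Like $unwind: emit one copy of each document per element of an array field."""
    for doc in docs:
        for item in doc[field]:
            yield {**doc, field: item}


def group(docs, key, field):
    """Like $group with both a $push and an $addToSet accumulator on `field`."""
    groups = {}
    for doc in docs:
        acc = groups.setdefault(doc[key], {"pushed": [], "set": []})
        acc["pushed"].append(doc[field])   # $push keeps duplicates
        if doc[field] not in acc["set"]:   # $addToSet drops duplicates
            acc["set"].append(doc[field])
    return groups


posts = [
    {"author": "ann", "tags": ["mongo", "nosql"]},
    {"author": "ann", "tags": ["mongo"]},
]
result = group(unwind(posts, "tags"), key="author", field="tags")
print(result["ann"]["pushed"])  # ['mongo', 'nosql', 'mongo']
print(result["ann"]["set"])     # ['mongo', 'nosql']
```
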
Extending the framework: new pipeline operators and expressions can be added

$out and $tee for output and debugging

early filtering

  • match
  • sort
  • limit
  • group
  • sort
  • project
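
The stage order above could be written as a pipeline like the one below. This is a sketch under assumptions: the field names (`status`, `created`, `city`, `amount`) are hypothetical, and with PyMongo the list would be passed to a collection's `aggregate()` method. Here it is only built and inspected, so it runs without a database.

```python
# Hypothetical field names; the stage order follows the list above.
pipeline = [
    {"$match": {"status": "active"}},  # filter first to shrink the input
    {"$sort": {"created": -1}},
    {"$limit": 1000},
    {"$group": {"_id": "$city", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
    {"$project": {"city": "$_id", "total": 1, "_id": 0}},
]

# Each stage is a one-key dict, so the first key is the operator name.
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

Putting $match (and $limit) ahead of $group means the expensive in-memory stages see fewer documents.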


operation memory limits
$group and $sort run entirely in memory. Warning at > 5% of memory and error at > 10%.
Returns results, but also returns an error code

sharding - splits the pipeline at the first $group or $sort
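
Finding that split point can be sketched as below. This only illustrates where the cut falls; the `split_pipeline` helper and the sample stages are made up, and the notes do not cover how the stage at the split point itself is divided between the shards and mongos.

```python
def split_pipeline(pipeline):
    """Split a pipeline at its first $group or $sort stage.

    Returns (head, tail): head holds the stages before the split point,
    tail starts with the first $group/$sort stage.
    """
    for index, stage in enumerate(pipeline):
        if "$group" in stage or "$sort" in stage:
            return pipeline[:index], pipeline[index:]
    return pipeline, []


pipeline = [
    {"$match": {"status": "active"}},
    {"$project": {"city": 1, "amount": 1}},
    {"$group": {"_id": "$city", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]
head, tail = split_pipeline(pipeline)
print([next(iter(s)) for s in head])  # ['$match', '$project']
print([next(iter(s)) for s in tail])  # ['$group', '$sort']
```
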

github.com/rozza/demos

OpenShift

Titanium Studio

  • can port code to different languages from JS
  • ideal for Mongo because it uses JS
  • application as a service

github.com/beershift
