
Foursquare & MongoDB

Harry Heymann May 21, 2010

Foursquare Overview
● A location based social network. Allows users to "check in" at bars, restaurants, shopping destinations, etc. to share their location with friends.
● Rewards users with virtual prizes (points, badges) that can sometimes lead to real life rewards (5 checkins at a restaurant might get you a free appetizer)
● ~1.3M registered users. ~615k checkins/day. Nearly 50M checkins total. Very rapid growth.

Basic technical details
● Written in Scala (a somewhat new language: what you'd get if Java & ML had a baby. It runs on the JVM)
● Uses a web framework called Lift
● Originally used a single PostgreSQL instance as the data store.
● Scaling up on a SQL database can be frustrating (replication & sharding don't work as easily as one would like) so we're moving to MongoDB.
● Built in geospatial capabilities (obviously very important to foursquare) are a very nice bonus.

Transition to Mongo
● Currently writing checkins, tips, venues (and various things related to venues) to MongoDB. All writes still go to PostgreSQL as well.
● Slowly migrating various reads. Migrating geo related queries first. Checkins a high priority (due to the fact that they represent the bulk of our data). Other items later.
● Exclusively use mongo for our "Who's here" server. A short lived record of where our users are at any given time (contains 3 hours worth of data).

Geospatial Indexes
● MongoDB conveniently supports geospatial indexes out of the box.
● Currently limited to Earth like dimensions (+/- 180 degrees in each of 2 axes).
● It cheats to make the math easier/faster by assuming a flat earth where 1 degree of latitude or longitude is always the same distance. This is fine as long as you are dealing with relatively small distances (as foursquare does)
● Implemented using geographic hash codes atop standard MongoDB b-trees
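To make the flat earth shortcut concrete, here is a minimal sketch that compares a plain Euclidean distance in degrees with a great-circle (haversine) distance. It is not MongoDB's code; the function names, the mean Earth radius, and the sample coordinates are assumptions chosen for illustration.

```javascript
// Illustrative sketch only, not MongoDB's implementation.
const EARTH_RADIUS_KM = 6371; // mean Earth radius (assumed approximation)
const KM_PER_DEGREE = (Math.PI * EARTH_RADIUS_KM) / 180; // about 111 km

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Flat earth: treat [lat, lng] as plain x/y coordinates measured in degrees.
function flatDistanceDegrees([lat1, lng1], [lat2, lng2]) {
  return Math.hypot(lat2 - lat1, lng2 - lng1);
}

// Great-circle distance on a sphere, using the haversine formula.
function haversineKm([lat1, lng1], [lat2, lng2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLng = toRadians(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Two points 0.01 degrees apart east-west at roughly New York's latitude.
const here = [40.72, -73.99];
const there = [40.72, -73.98];
console.log("flat distance (degrees):", flatDistanceDegrees(here, there));
console.log("flat distance as km:", flatDistanceDegrees(here, there) * KM_PER_DEGREE);
console.log("great-circle km:", haversineKm(here, there));
```

The flat model gives every degree the same length, so east-west distances away from the equator come out longer than they really are. For ranking nearby venues within a small area that distortion is roughly uniform, which is why the shortcut is usually acceptable.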

Creating Geospatial Indexes
● Indicate the "2d" index type:
    db.venues.ensureIndex({latlng: "2d"})
● Specify additional fields if you plan on using compound geospatial queries (more on these in a moment):
    db.venues.ensureIndex({latlng: "2d", closed: 1, keywordList: 1})
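The "2d" index type is described in this deck as geographic hash codes stored in ordinary b-trees. The sketch below shows the general geohash idea only: quantize each coordinate, then interleave the bits so that nearby points tend to share a long prefix and therefore sort close together. The bit width, the function names, and the sample points are assumptions, and this is not MongoDB's actual encoding.

```javascript
// Illustrative geohash sketch; not MongoDB's real key format.
// Each axis is assumed to span -180..180, matching the limits in the slides.
function geoHashBits([lat, lng], bitsPerAxis = 16) {
  const cells = 2 ** bitsPerAxis;
  // Map a coordinate in [-180, 180] to an integer cell in [0, cells - 1].
  const quantize = (value) => {
    const scaled = Math.floor(((value + 180) / 360) * cells);
    return Math.min(cells - 1, Math.max(0, scaled));
  };
  const x = quantize(lat);
  const y = quantize(lng);
  let hash = "";
  // Interleave bits from most significant to least significant.
  for (let bit = bitsPerAxis - 1; bit >= 0; bit--) {
    hash += (x >> bit) & 1;
    hash += (y >> bit) & 1;
  }
  return hash;
}

// Number of leading characters two hashes have in common.
function sharedPrefixLength(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const manhattan = geoHashBits([40.72, -73.99]);
const aFewBlocksAway = geoHashBits([40.73, -73.98]);
const sydney = geoHashBits([-33.87, 151.21]);
console.log(sharedPrefixLength(manhattan, aFewBlocksAway)); // long shared prefix
console.log(sharedPrefixLength(manhattan, sydney)); // short or empty shared prefix
```

Because the hash is a single sortable string, a b-tree range scan over a prefix covers one grid cell. Points on either side of a cell boundary can still have very different hashes, so a real implementation also has to look at neighbouring cells.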

Take care: 1k limit on key size
● If you have a compound geospatial index and you don't take care it can be easy to go over the MongoDB limit of 1K. For example if the index is on {latlng: "2d", keywordList: 1} then the following venue would be a problem:
    {latlng: [40, -72], keywordList: ["some", "venue", "with", "a", "whole", "lot", "of", "different", "words", "in", "the", "name", "of", "the", "venue", "it", "just", "keeps", "going", "on", "and", "on", "forever", "without", "ever", "seeming", "to", "stop"]}
● In these cases the individual item will be dropped from the index making it impossible to query.
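One way to stay under the limit is to trim the keyword list before saving a venue. The sketch below is a rough guard under stated assumptions: the 800 byte budget, the per-keyword overhead, and the function names are all hypothetical, and the estimate does not reproduce MongoDB's real key size accounting. It uses Node.js's `Buffer.byteLength` to count UTF-8 bytes.

```javascript
// Rough guard against oversized index keys. All constants are assumptions.
const KEY_BUDGET_BYTES = 800; // assumed safety margin below the 1K limit
const PER_KEYWORD_OVERHEAD = 8; // guessed per-element overhead, not a MongoDB figure

// Estimated cost of one keyword: its UTF-8 length plus the assumed overhead.
function keywordCost(word) {
  return Buffer.byteLength(word, "utf8") + PER_KEYWORD_OVERHEAD;
}

// Estimated total size of a keyword list.
function estimateKeywordBytes(keywords) {
  return keywords.reduce((total, word) => total + keywordCost(word), 0);
}

// Keep keywords in order until the next one would exceed the budget.
function trimKeywords(keywords, budget = KEY_BUDGET_BYTES) {
  const kept = [];
  let used = 0;
  for (const word of keywords) {
    const cost = keywordCost(word);
    if (used + cost > budget) break;
    kept.push(word);
    used += cost;
  }
  return kept;
}

// Hypothetical venue with far too many keywords.
const venue = { latlng: [40, -72], keywordList: new Array(200).fill("keyword") };
venue.keywordList = trimKeywords(venue.keywordList);
console.log(venue.keywordList.length, estimateKeywordBytes(venue.keywordList));
```

Trimming from the end keeps the earliest words, which assumes the most important keywords come first. Dropping very common words first would be another reasonable policy.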

Basic Geospatial Queries
● Find the closest 20 venues to a given location:
    db.venues.find({latlng: {$near: [40.72, -73.99]}}).limit(20)
● Find up to the closest 20 venues to a given location that are within 1 degree of the location:
    db.venues.find({latlng: {$near: [40.72, -73.99, 1]}}).limit(20)
● Foursquare uses this to find nearby venues, tips, specials, and various other geolocated data.
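As a mental model for what these two queries return, the sketch below does the same thing over an in-memory array: rank documents by flat distance in degrees, optionally drop anything beyond a maximum distance, and keep the first few. It only models the result, not how MongoDB executes the query, and the `nearest` helper and sample venues are hypothetical.

```javascript
// In-memory model of "$near plus limit". Illustrative only.
function nearest(docs, center, { maxDistance = Infinity, limit = 20 } = {}) {
  const [lat, lng] = center;
  return docs
    // Flat-earth distance in degrees, as described in the slides.
    .map((doc) => ({
      doc,
      distance: Math.hypot(doc.latlng[0] - lat, doc.latlng[1] - lng),
    }))
    .filter(({ distance }) => distance <= maxDistance)
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit)
    .map(({ doc }) => doc);
}

// Hypothetical sample data.
const sampleVenues = [
  { name: "far east", latlng: [40.72, -72.5] },
  { name: "uptown", latlng: [41.0, -73.99] },
  { name: "right here", latlng: [40.72, -73.99] },
  { name: "two blocks", latlng: [40.74, -73.99] },
];

// Closest two venues, then everything within 1 degree.
console.log(nearest(sampleVenues, [40.72, -73.99], { limit: 2 }).map((v) => v.name));
console.log(nearest(sampleVenues, [40.72, -73.99], { maxDistance: 1 }).map((v) => v.name));
```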

99]}.72. closed: false. -73.venus.Complex Geospatial Queries ● If you have a compound geospatial index defined you can query on additional fields and still use the index: db.find({latlng: {$near: [40. keywordList: $all: ["nyc". closed: false} (because we generally don't want closed venues) ● Basic search: db. find({latlng: {$near: [40. -73.venus.99]}. "seminar"]}) .72.

Bounded geospatial queries
● Foursquare doesn't do much of this, but it's possible to find all of the items in a collection that are within a given circle or square:
    db.venues.find({latlng: {"$within": {"$circle": [[40, -72], 0.5]}}})
    db.venues.find({latlng: {"$within": {"$box": [[40, -72], [41, -73]]}}})
● Can be combined with complex geospatial queries that were demonstrated on last slide. In general though $near will be more useful/performant than $within.
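The two bounded shapes reduce to simple containment tests under the same flat earth model. The sketch below mirrors the argument shapes used in the queries above; the helper names are hypothetical, and sorting the box corners is a convenience of this sketch rather than a statement about what MongoDB accepts.

```javascript
// Containment tests in flat degree space. Illustrative only.

// circle is [[centerLat, centerLng], radiusInDegrees].
function withinCircle([lat, lng], [[centerLat, centerLng], radius]) {
  return Math.hypot(lat - centerLat, lng - centerLng) <= radius;
}

// box is two opposite corners; this sketch accepts them in either order.
function withinBox([lat, lng], [[lat1, lng1], [lat2, lng2]]) {
  return (
    lat >= Math.min(lat1, lat2) &&
    lat <= Math.max(lat1, lat2) &&
    lng >= Math.min(lng1, lng2) &&
    lng <= Math.max(lng1, lng2)
  );
}

console.log(withinCircle([40.3, -72.2], [[40, -72], 0.5]));
console.log(withinBox([40.5, -72.5], [[40, -72], [41, -73]]));
```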

MongoDB & Scala/Lift
● Lift has a generic ORM layer called Record for which there is a MongoDB implementation.
● Originally called scamongo, but as of Lift 2.0 M5 it's integrated into the core Lift codebase as lift-mongodb
● It's a very thin wrapper around the Java driver provided by 10gen
● It's new, so has some warts/oddities, but should improve fairly rapidly (foursquare is working on this)

A foursquare venue using lift-mongo
    class Venue extends MongoRecord[Venue] with MongoId[Venue] with GeolocationMongo[Venue] {
      object venuename extends StringField(this, 255)
      object address extends OptionalStringField(this, 50)
      object closed extends BooleanField(this)
      // etc
    }

Basic ORM operations work as you might expect
    // Create a record, set its fields, and save it
    val venue = Venue.createRecord
      .venuename("NYC Seminar & Conference Center")
      .address("71 W 23rd St")
      .city("New York").state("NY").zip("10010")
    venue.save

    // Build a query with the Java driver's QueryBuilder and run it
    val query = QueryBuilder.start(Venue.venuename.name)
      .is("Gramercy Tavern")
      .get()
    val venues = Venue.findAll(query)

PS: We're hiring
● All sorts of roles: engineering, operations, business.
See http://foursquare.com/jobs or come talk to me.