With a recent project I have been working on, we needed a way to be able to find venues based on user’s location. We needed a way to be able to zoom in and refine our search our zoom out and get a broader picture of what’s going on at a particular location. The other question is how can we do this easily and efficiently and use Dynamodb in the process. There are great gems that does this type of thing already Geocoder but in our case we can’t use it. We had to build some sort of solution on our own and what we found was that geohash fits our needs. Geohash is a way to be able to take any longitude and latitude and map it to a string you can check more of the details at wikipedia. Why this technique works well for us is that Dynamodb is a key-value store so we can use the hash buckets Geohash gives us to easily hold information about venues for a particular bucket. When I say bucket I mean the mapping from coordinates to a bounding box you can see this in action here.  Since I am using ruby and the dynamoid gem I’ll dive right into an example to show you how it works.

class GeoBucket
 include Dynamoid::Document
 table :name => :geo_bucket, :key => :geohash

 field :geohash, :string
 field :venue_ids, :set
end

What this class does is store the hash of the bucket and all venue_ids which belong to this bucket. I should also mention a bucket with the hash of dpwxp is contained inside the bucket with 1 lower resolution dpwx. This gives us a nice way to control what resolution of buckets we want to use and keep track of now for the venue class

class Venue
 include Dynamoid::Document
 table :name => :venue, :key => :venue_id
 field :venue_id, :string
 field :name, :string
 field :latitude, :float
 field :longitude, :float
 field :country, :string
 after_create :index_geohash
def index_geohash
 hash = GeoHash.encode(self.latitude, self.longitude)
 levelOne = GeoBucket.new(:geohash => hash[0..3])
 levelTwo = GeoBucket.new(:geohash => hash[0..4])
 levelThree = GeoBucket.new(:geohash => hash[0..5])
 levelOne.update({:return => :none}) do |t|
   t.add(:venue_ids => ["#{self.venue_id}"])
 end
 levelTwo.update({:return => :none}) do |t|
   t.add(:venue_ids => ["#{self.venue_id}"])
 end
 levelThree.update({:return => :none}) do |t|
   t.add(:venue_ids => ["#{self.venue_id}"])
 end
end
end

We first build the hash for the venues coordinates and then slice it so we are using a geohash resolution of 4, 5, and 6. We don’t want Dynamodb to return anything about our update call so we specify none. The nice thing about dynamodb is when adding an element to a set it makes sure that the set is unique so we aren’t storing redundant info. Now we have an efficient way to quickly lookup a bucket and see what venues lie within it’s boundaries. The question still remains given a user’s current location how can we lookup these venues around them? This is done easily by first mapping the current user’s location to a bucket and then checking the neighbouring buckets if our search radius overlaps them (ie when the user is zooming in and out of a map). Here I’m going to define a function within the GeoBucket class which given a current user’s location and a radius we select the proper resolution to use and find which buckets we should be pulling from. I am using the Geohash git because the official one’s fixes hasn’t been pushed to ruby gems and the Geocoder gem although if not needed to keep everything lightweight you could implement the bounding box calculation yourself.

class GeoBucket
 include Dynamoid::Document
 ...

# current_locations is a hash of {:lng => ..., :lat => ...} 
# and radius is specified in km
def self.get_buckets(current_location, radius = 0)
  resolution =
  if radius <= 1.6
   6
  elsif radius <= 9
   5
  else
   4
  end
  center_point = [current_location[:lat], current_location[:lng]]
  user_query = Geocoder::Calculations.bounding_box(center_point, radius)
  user_query = [[user_query[0],user_query[1]],[user_query[2],user_query[3]]]
  hash = GeoHash.new(current_location[:lat],current_location[:lng],resolution)
  neighbors = hash.neighbors
  neighbors_bbox = neighbors.map{|h| GeoHash.new(h).to_bbox}
  buckets = neighbors.select.with_index{|bucket,i| bounding_box_overlap(user_query,neighbors_bbox[i])}
  buckets += [hash.to_s]
  return GeoBucket.find_all(buckets)
 end
# Determines if two bounding boxes overlap each other
def self.bounding_box_overlap(bbox1, bbox2)
 return false if bbox1[0][1] > bbox2[1][1]
 return false if bbox1[1][1] < bbox2[0][1]
 return false if bbox1[1][0] < bbox2[0][0]
 return false if bbox1[0][0] > bbox2[1][0]
 return true
end
end

You can see that we first select the resolution depending on how big of a radius the user is searching and then proceed to check if the user’s query which is also a bounding box overlaps the current buckets neighbours. We now have a way to efficiently grab all venue_ids for a current location. Later we need some way to know which venues are more popular by simulating a checkin for when a user performs an action at a venue. This will be achieved with a mapreduce!