CRUD Is Not Spelled With An “S”

Advanced Searching in Rails
Steve Midgley
http://www.misuse.org/science
http://www.hutz.com
May 30th 2008

Why?

Who Should Be Here?
• Free text search isn’t enough.
• Want multi-column parameterized searching.
  – Buying/finding things with specific attributes
    • Date-driven
    • Price-driven
    • Categorized
• Building advanced searching is HARD.

Here

The Four Things You Have to Deal With
• POST/GET params representing Search Criteria
• Merging Search Criteria with Persistent Criteria
  – Session/backend, hidden input tags or on the URL line
• Converting Search Criteria to Search Rules to SQL
• Paginating the Search & Render

Advanced Search

Anatomy of a Search
[Diagram: HTTP Form POST/GET → Params Object. Params Object + Previous Criteria (Whatever Object) → Merge (Add/Merge/Replace/Delete) → Search Criteria (persist back to session) → Search Business Rules → SQL Query → Data Objects → Pagination → Send to Template]

Post

Converting HTTP Params to Search Criteria
• For clean URLs use POSTs. Make sure your URLs are distinct for core search options!
  http://mysite.com/findvacations/p/1/UnitedStates/California
  NOT
  http://mysite.com/controller/action?wtf=is&all=this
• Why: google-juice, page caching, LB splitting, happy customers, decoupled controller from URL logic

Converting HTTP Params to Search Criteria
• You are probably going to have to create custom input tags for your search.
  <input name="property[min_price]">
• Think of incoming params as associated with columns in your database
  Property[id] = 1092
  Property[min_price] = 299
  Region[id] = 11
  Region[ids][] = [11,22,44]

Merge

Merging Search Criteria to/from Session
[Diagram: HTTP Form POST/GET → Params Object. Params Object + Previous Criteria (Session Object) → Merge]

Merging Search Criteria to/from Session
• User searches for a set of Regions:
  Region[ids][] = [11,22,44]
• User wants to adjust that by adding one more region:
  Region[ids][] = [33]
• Our search criteria should now be: [11,22,33,44]
• This makes your UI more flexible.

Merging Search Criteria to/from Session, more
• Current criteria: [11,22,33,44]
• User wants to clear this set of parameters and add two:
  Region[ids][] = [--,55,66] → [55,66]
• We have to cope with single-item changes, especially if you use Ajax:
  remove a single item: [--55] → [66]
• Misuse.org/science => “deep_merge”
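The merge behaviour described on these two slides can be sketched in a few lines of plain Ruby. This is an illustration only, not the deep_merge library's API: the helper name merge_ids and the KNOCKOUT constant are hypothetical, and the "--" convention is taken from the slide examples.

```ruby
# Hypothetical helper illustrating the slide's merge semantics.
# It is NOT the deep_merge library; names here are made up.
KNOCKOUT = '--'

def merge_ids(previous, incoming)
  result = previous.map(&:to_s)
  incoming.map(&:to_s).each do |item|
    if item == KNOCKOUT
      result = []                               # bare "--": clear the stored set
    elsif item.start_with?(KNOCKOUT)
      result.delete(item[KNOCKOUT.length..-1])  # "--55": remove just that id
    else
      result << item unless result.include?(item) # plain id: add if missing
    end
  end
  result.map(&:to_i).sort
end

merge_ids([11, 22, 44], [33])                # add one region
merge_ids([11, 22, 33, 44], ['--', 55, 66])  # clear, then add two
merge_ids([55, 66], ['--55'])                # remove a single item
```

Keeping the merge in one small function makes it easy to test the add, clear and remove cases separately before wiring it to the session.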

Params + SQL

Converting Search Criteria to Search Rules to SQL
• This is the core of your search. You have to convert:
  Region[ids] = [33,44,55]
• Into SQL:
  region.id in (33,44,55)
• Doesn’t seem hard – but…

Converting Search Criteria to Search Rules to SQL
• What if you want to pass in this:
  Property[min_search_rate] = 245
• Into SQL:
  property.search_rate >= 245
• Might be equality or comparison. Could be an order by. Could force you to join in another table. Could even affect the output columns.
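One way to picture the criteria-to-SQL step is a small rule table keyed on the parameter name. The sketch below is an assumption for illustration: the min_/max_ prefixes, the ALLOWED whitelist and the function name criteria_to_where are hypothetical, and it only covers WHERE clauses (not joins, ORDER BY or output columns).

```ruby
# Hypothetical whitelist of searchable columns per table.
# Never interpolate column names straight from user params.
ALLOWED = { 'region' => %w[id], 'property' => %w[id search_rate] }.freeze

# Turns nested criteria into [where_clause, bind_values].
def criteria_to_where(criteria)
  clauses = []
  binds = []
  criteria.each do |table, fields|
    fields.each do |key, value|
      column, operator =
        case key.to_s
        when 'ids'          then ['id', 'IN']
        when /\Amin_(.+)\z/ then [$1, '>=']
        when /\Amax_(.+)\z/ then [$1, '<=']
        else [key.to_s, '=']
        end
      next unless ALLOWED.fetch(table.to_s, []).include?(column)

      if operator == 'IN'
        placeholders = (['?'] * value.length).join(',')
        clauses << "#{table}.id IN (#{placeholders})"
        binds.concat(value)
      else
        clauses << "#{table}.#{column} #{operator} ?"
        binds << value
      end
    end
  end
  [clauses.join(' AND '), binds]
end

where, binds = criteria_to_where(
  { 'region' => { 'ids' => [33, 44, 55] },
    'property' => { 'min_search_rate' => 245 } }
)
```

Returning the clause and the bind values separately leaves quoting to the database layer instead of building literal values into the string.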

Mechanical Advantage

More on Search Rules and SQL
• You want to store your SQL in an object so that you can pass it around.
  – This makes writing your Search Rules more modular
  – Doesn’t have to be session (maybe you want searches shared across users?)

Search Rules and SQL, code
# EZ Where code
cond = Caboose::EZ::Condition.new :my_table do
  foo == 'bar'
  baz <=> (1..5)
  id === [1, 2, 3, 5, 8]
  condition :my_other_table do
    fiz =~ '%faz%'
  end
end

SQL Tools
• There are three great tools out there written in Ruby / ActiveRecord for your needs:
  – EZ-Where
    • http://rubyforge.org/projects/ez-where/
  – Squirrel
    • https://svn.thoughtbot.com/plugins/squirrel/trunk
  – Sequel
    • http://code.google.com/p/ruby-sequel/

Order

Advanced Search SQL ORDER BY and CASE
• Many queries can be accomplished more effectively and logically with ORDER BY statements.
  – Float stuff to the top of the query instead of WHERE clausing it out of the query.
  – If you work with business guys, they will love you. Lets them manage (aka sell) fine-grained search placement, etc.

Advanced Search SQL SQL CASE Code
ORDER BY city_id <> 555
– WTF: city_id = 555 floats to the top
– Tip: In ANSI SQL false sorts before true

ORDER BY CASE city_id
  WHEN 555 THEN 1
  WHEN 342 THEN 2
  WHEN 111 THEN 3
  ELSE 4
END
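If the placement order comes from data (for example a list of promoted city ids), the CASE expression above can be generated rather than hand-written. A minimal sketch, assuming a hypothetical helper named order_by_priority and a column name that comes from your own code, never from user input:

```ruby
# Hypothetical helper: builds an ORDER BY CASE clause that floats the
# given ids to the top, in the order listed. The column name must be a
# trusted constant; ids are forced to integers before interpolation.
def order_by_priority(column, ids)
  whens = ids.each_with_index.map do |id, index|
    "WHEN #{Integer(id)} THEN #{index + 1}"
  end
  "ORDER BY CASE #{column} #{whens.join(' ')} ELSE #{ids.length + 1} END"
end

order_by_priority('city_id', [555, 342, 111])
```

Integer() raises on anything that is not a clean integer, which keeps stray strings out of the generated SQL.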

Paginating the Search
• Pagination is lame.
• If you do it half-ass you will hate yourself every morning.
• So do it right. Spend a couple of days getting it right.
• Build good tests.
• Try not to mess with it.
• I use the original Rails plugin. I don’t see what the fuss is. It seems to work fine if you just want basic fence-posting.

Paginating Code
# In the controller
@paginator = Paginator.new(self, @row_count, @rows_per_page, @cur_page)
@query.offset = @paginator.current.offset

<%# In the view %>
<%= @paginator.current.first_item %>
<%= @paginator.current.last_item %>
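The "fence-posting" the paginator does is simple arithmetic, and it is worth having tests around it. The sketch below is a standalone illustration, not the Rails paginator's code; the function name page_window and the returned hash keys are hypothetical.

```ruby
# Hypothetical stand-alone version of the fence-post arithmetic.
# Clamps the requested page into range and returns the window numbers.
def page_window(row_count, rows_per_page, page)
  page_count = [(row_count.to_f / rows_per_page).ceil, 1].max
  page = [[page, 1].max, page_count].min
  offset = (page - 1) * rows_per_page
  {
    page: page,
    page_count: page_count,
    offset: offset,                                  # rows to skip (SQL OFFSET)
    first_item: row_count.zero? ? 0 : offset + 1,    # 1-based, for display
    last_item: [offset + rows_per_page, row_count].min
  }
end

page_window(45, 10, 5)
```

The edge cases worth testing are the last partial page, an out-of-range page number, and an empty result set.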

Pruning

Beautiful URL’s
• Use Routes
• If you use a lot of GET params your URL line will suck.
• Hidden vars + Ajax = not beautiful
• Store search state in session or backend
  – Put a UI “key” for that search on the URL line:
    mysite.com/search?search_id=abc123
  – Tie that UI key to a hash key in session to store your params:
    session["searches"]["abc123"]
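The search-key idea can be sketched with a plain Hash standing in for the Rails session. The helper names store_search and load_search are hypothetical, and the key format (12 hex characters) is an arbitrary choice for the example.

```ruby
require 'securerandom'

# Hypothetical helpers; `session` is any Hash-like object.
# Stores the criteria under a short random key and returns that key,
# which is what goes on the URL line as search_id.
def store_search(session, criteria)
  search_id = SecureRandom.hex(6)
  (session['searches'] ||= {})[search_id] = criteria
  search_id
end

# Returns the stored criteria, or nil for an unknown key.
def load_search(session, search_id)
  session.fetch('searches', {})[search_id]
end

session = {}
search_id = store_search(session, { 'region' => { 'ids' => [11, 22] } })
load_search(session, search_id)
```

An unknown or expired key returns nil, so the controller can fall back to an empty search instead of raising.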

Search Routes
map.browse_city_page 'rentals/p/:page/city/:url_city/*url_regions', ….

mysite.com/rentals/p/1/city/San-Diego/United-States/California/SoCal

Persistent params to SQL
• Is this a controller thingy or a model thingy?
• Many options: I use a controller “module mix-in” (i.e. “acts_as_search_engine”).
• A Model based mix-in seems ok too.
• Key concept: Build SQL incrementally: pass around whatever SQL storage container you’ve got
  – Don’t try to do all your SQL builds in one method: that leads to spaghetti.
  – Be Modular

Controller Search Logic
module Search
  public

  def merge_params(params, session); ..; end
  def sql_assembler(sql_obj, criteria); ..; end

  protected

  def build_rate_sql(sql_obj, criteria); ..; end
  def build_sqft_sql(sql_obj, criteria); ..; end
  # etc...
end

# ...

class SearchController
  include Search

  def results
    criteria = merge_params(params, session)
    sql_assembler(sql_obj.new, criteria)
    Property.count_by_sql(sql_obj.count_sql)
    Property.find_by_sql(sql_obj.find_sql)
  end
end
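To make the "pass the SQL container around" idea concrete, here is a minimal runnable sketch. The SqlObj class, the properties table and the min_search_rate / min_sqft criteria keys are all assumptions for illustration; the slides do not show the author's actual SQL object.

```ruby
# Hypothetical SQL container: collects WHERE fragments and bind values.
class SqlObj
  attr_reader :binds

  def initialize(table)
    @table = table
    @wheres = []
    @binds = []
  end

  def where(clause, *values)
    @wheres << clause
    @binds.concat(values)
    self
  end

  def find_sql
    "SELECT * FROM #{@table}#{where_sql}"
  end

  def count_sql
    "SELECT COUNT(*) FROM #{@table}#{where_sql}"
  end

  private

  def where_sql
    @wheres.empty? ? '' : " WHERE #{@wheres.join(' AND ')}"
  end
end

# Each builder handles one rule and ignores criteria it does not know.
def build_rate_sql(sql_obj, criteria)
  rate = criteria['min_search_rate']
  sql_obj.where('search_rate >= ?', rate) if rate
end

def build_sqft_sql(sql_obj, criteria)
  sqft = criteria['min_sqft']
  sql_obj.where('sqft >= ?', sqft) if sqft
end

def sql_assembler(sql_obj, criteria)
  build_rate_sql(sql_obj, criteria)
  build_sqft_sql(sql_obj, criteria)
  sql_obj
end

sql = sql_assembler(SqlObj.new('properties'), { 'min_search_rate' => 245 })
sql.find_sql
```

Because the count and find queries come from the same container, the pagination row count cannot drift out of sync with the result query.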

Persisting a Search
• Store as individual elements
  – More coding, some pain, flexible
• Marshal your SQL object
  – Less coding, less pain, less flexible
• Marshal criteria object
  – Less coding, some pain, some flex
• Store as SQL clauses
  – Please don’t
• Version your searches in persistent layer in all cases
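A sketch of the "marshal the criteria object, and version it" option, using Ruby's built-in Marshal module. The names freeze_search, thaw_search and SEARCH_FORMAT_VERSION are hypothetical, and this assumes the blob lives in your own server-side store, since Marshal.load should not be run on untrusted data.

```ruby
# Hypothetical version number; bump it whenever the criteria layout changes.
SEARCH_FORMAT_VERSION = 2

# Serializes the criteria together with the format version.
def freeze_search(criteria)
  Marshal.dump({ 'version' => SEARCH_FORMAT_VERSION, 'criteria' => criteria })
end

# Returns the criteria, or nil if the blob was written by another version.
# Only call this on blobs from your own trusted store.
def thaw_search(blob)
  payload = Marshal.load(blob)
  return nil unless payload['version'] == SEARCH_FORMAT_VERSION
  payload['criteria']
end

blob = freeze_search({ 'region' => { 'ids' => [55, 66] } })
thaw_search(blob)
```

Returning nil for a stale version lets the caller drop or migrate old saved searches instead of feeding them to newer search rules.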

Optimize

Optimization, Rails
• Don’t optimize until you need to.
• Use data to optimize. Do not guess!
• Make a baseline of performance before you optimize.
• Rails is really a page template generator
  – Use page caching for common search results:
    http://www.misuse.org/science/2008/02/22/rails-page-caching-nginx-ssi-ajax-and-formposts/

Optimization, SQL
• It’s usually your SQL that’s wrong anyway
  – Watch your indices
  – NewRelic: 10 minutes instead of 10 hours
  – Compound indices are very powerful in some DBs.
  – LIMIT / OFFSET results (for god’s sake)
  – Analyze and profile with SQL backend tools:
    • EXPLAIN ANALYZE in Postgres
  – Talk with the listservs for your SQL server

Optimization, Hardware
• Get a real SQL server and ISP
  – I like EngineYard: great guys, solid architecture/hardware
• If your SQL box is hammered by your queries and your queries are not “dumb”, there are some tricks:
  – Convert result set to id list; store and iterate in session. More memory, less CPU.
  – Preload common searches into a warehouse. More disk, less CPU.
  – Page cache commonly returned pages
  – Use distinct URLs

There

Geographic Searching
• Use GIS or DB GIS extensions if you have to, but it can be easier to just make some assumptions: search areas are small and therefore the world is flat.
• High precision is often not that important.
• Following is some Ruby that calculates the distance between two points based on lat/long.

Geographic Searching, Ruby!
RADIUS_OF_EARTH_KM = 6366.71

def deg_to_rad(val)
  val * (Math::PI / 180)
end

# Great-circle distance in km between two lat/long points given in degrees
def km_distance(deg_lat1, deg_lng1, deg_lat2, deg_lng2)
  (Math::acos(
    Math::sin(deg_to_rad(deg_lat1)) * Math::sin(deg_to_rad(deg_lat2)) +
    Math::cos(deg_to_rad(deg_lat1)) * Math::cos(deg_to_rad(deg_lat2)) *
    Math::cos(deg_to_rad(deg_lng1) - deg_to_rad(deg_lng2))
  ) * RADIUS_OF_EARTH_KM)
end
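For the "world is flat" assumption itself, a common shortcut is the equirectangular approximation: treat latitude and longitude differences as x/y offsets on a plane, scaling longitude by the cosine of the mean latitude. This is a sketch of that technique, not something shown on the slides; the function name flat_km_distance is hypothetical and the radius value is copied from the code above.

```ruby
# Same earth radius value as the slide code, redefined so the block stands alone.
FLAT_RADIUS_KM = 6366.71

# Equirectangular ("flat earth") distance in km. Reasonable for small
# search areas; error grows with distance and near the poles.
def flat_km_distance(deg_lat1, deg_lng1, deg_lat2, deg_lng2)
  to_rad = Math::PI / 180
  mean_lat = (deg_lat1 + deg_lat2) / 2.0 * to_rad
  x = (deg_lng2 - deg_lng1) * to_rad * Math.cos(mean_lat)
  y = (deg_lat2 - deg_lat1) * to_rad
  Math.sqrt(x * x + y * y) * FLAT_RADIUS_KM
end

flat_km_distance(32.72, -117.16, 32.80, -117.10)
```

It avoids the acos call entirely, and the same x/y arithmetic can be expressed directly in SQL if you want the database to do the filtering.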

XKCD

PostgreSQL

Geographic Search, DB
• PostgreSQL is really, really great
• Built-in functions and indices to find all points within a polygon
• This makes rough geo-searching ridiculously fast
  – (the world is made flat but if your polygon is small relative to the surface of the earth, who cares?)
• Ara T Howard says “Divide the world into a flat grid, map features into a grid cell, use normal db indexing.”
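The grid idea in the quote can be sketched as follows: compute an integer cell for each point, store it in ordinary indexed columns, and search the cell containing the query point plus its neighbours. The cell size and the names grid_cell and neighbor_cells are assumptions for the example.

```ruby
# Hypothetical cell size in degrees; pick it to match your typical search radius.
CELL_DEG = 0.1

# Maps a lat/long to an integer [row, col] grid cell.
def grid_cell(lat, lng)
  [(lat / CELL_DEG).floor, (lng / CELL_DEG).floor]
end

# The cell containing the point plus its eight neighbours, so points
# just across a cell border are not missed.
def neighbor_cells(lat, lng)
  row, col = grid_cell(lat, lng)
  (-1..1).flat_map { |dr| (-1..1).map { |dc| [row + dr, col + dc] } }
end

grid_cell(32.72, -117.16)
neighbor_cells(32.72, -117.16)
```

With the row and column stored on each record, the lookup becomes a plain indexed query on two integer columns, and an exact distance check can then be run on the few rows that come back.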

Be Prepared

Gotchas
• ActiveRecord is a dog
  – ActiveRecord is not built for lots of objects. Find all the rows you need in SQL. Then pull only those into AR.
  – If you need to loop through rows use something like Hash Extension, which will pull down SQL data as hashes. You can then iterate quickly and convert the ones you want to AR objects as needed: http://enterpriserails.rubyforge.org/hash_e

Gotchas, page 2
• Managing the Browser Cache
  – Browser caching can screw up your search tool when the user uses the “back” button to return to a POST page.
  – They get a message along the lines of “Cache expired: click reload to post data again.”
  – Normally this is a good thing, but in this case you must tell Rails to tell the browser that caching is “OK” for these specific pages. You do that with this code in your controller action (I use a filter for this):
    expires_in 24.hours, :private => false

There is Always More
• Steve Midgley
• public@misuse.org
• www.misuse.org/science
  – GeoX: Simple Rails geocoding
  – MojoMagick: Simple Rails image tool
• www.hutz.com
• Happy Coding!
• Questions!