Presented by MySQL & O’Reilly Media, Inc.
Jim Starkey firstname.lastname@example.org
Why Falcon? Because the World is Changing!
- Hardware is evolving rapidly
- Customers need ACID transactions
  - Atomic – the books should balance
  - Consistent – the alternative is chaos
  - Isolated – preserve programmer’s sanity (sic)
  - Durable – who wants to lose data?
Where Hardware is going
- CPUs breed like rabbits – more sockets, more cores per socket, more threads per core
- Memory is bigger, faster, and cheaper
- Disks are bigger and cheaper but not much faster
- (Boxes are cheaper and more plentiful, but that’s a different story)
Where Applications are going
- Batch – dead!
- Timesharing – dead!
- Departmental computing – dead!
- Client server – fading fast
- Application servers for most of us
- Web services for the really big guys
The Database challenge
- Traditional challenge: Exhaust CPU, memory, and disk simultaneously
- Today’s challenge: Exhaust CPU and memory and avoid the disk
- Use memory (page cache) to avoid disk reads
- Use memory (record cache) to avoid page cache manipulation
- Use CPU to find the fastest path to a record
- Use CPU to minimize record size
- Synchronize most data structures with user-mode read/write locks
- Synchronize high-contention data structures with interlocked instructions
The Falcon architecture
- Incomplete in-memory database with disk backfill
- Multi-version concurrency control in memory
- Updates in memory until commit
- Group commits to a single serial log write
- Post-commit multi-threaded pipeline to move updates to disk
Incomplete in-memory database
- Selected records cached in memory
- Separate cache for disk pages
- Record cache hit is 15% of the cost of a page cache hit
- Record cache is more memory efficient than page cache
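The read path implied by the two caches can be sketched as follows. This is a minimal illustration, not Falcon's code: the class name `TwoLevelStore`, the fixed `records_per_page` layout, and the dictionary standing in for the disk are all assumptions made for the example.

```python
class TwoLevelStore:
    """Toy read path: record cache first, then page cache, then disk."""

    def __init__(self, disk_pages, records_per_page=4):
        # disk_pages maps page number -> list of records (a stand-in for the disk).
        self.disk_pages = disk_pages
        self.records_per_page = records_per_page  # hypothetical fixed layout
        self.page_cache = {}
        self.record_cache = {}
        self.stats = {"record_hits": 0, "page_hits": 0, "disk_reads": 0}

    def fetch(self, record_id):
        # Cheapest case: the record itself is already cached.
        if record_id in self.record_cache:
            self.stats["record_hits"] += 1
            return self.record_cache[record_id]

        # Otherwise locate the page that holds the record.
        page_no, slot = divmod(record_id, self.records_per_page)
        if page_no in self.page_cache:
            self.stats["page_hits"] += 1
        else:
            # Most expensive case: go to the (simulated) disk.
            self.stats["disk_reads"] += 1
            self.page_cache[page_no] = self.disk_pages[page_no]

        record = self.page_cache[page_no][slot]
        self.record_cache[record_id] = record
        return record
```

A second fetch of the same record never touches the page cache, which is the cost difference the slide refers to.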
Record Encoding - Cache Efficiency
- Records encoded by value, not declaration
- String “abc” occupies the same space in varchar(3) or varchar(4096)
- The number 7 is the same whether smallint, mediumint, int, bigint, decimal, or numeric
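To make "encoded by value, not declaration" concrete, here is a small sketch of a value-driven encoding. The byte layout (a one-byte tag, a length byte, then the payload) is invented for illustration and is not Falcon's actual record format; `encode_field` and its `declared_type` parameter are hypothetical names.

```python
def encode_int(value: int) -> bytes:
    """Encode an integer in as few bytes as its value needs (signed, big-endian)."""
    # One extra bit is reserved for the sign, hence bit_length() + 8 before dividing.
    length = max(1, (value.bit_length() + 8) // 8)
    return bytes([length]) + value.to_bytes(length, "big", signed=True)


def encode_str(value: str) -> bytes:
    """Encode a string as a length prefix followed by its UTF-8 bytes."""
    data = value.encode("utf-8")
    return encode_int(len(data)) + data


def encode_field(value, declared_type: str) -> bytes:
    """Encode a field; declared_type is deliberately ignored (assumption of this sketch)."""
    if isinstance(value, int):
        return b"I" + encode_int(value)
    if isinstance(value, str):
        return b"S" + encode_str(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```

In this sketch, `encode_field(7, "smallint")` and `encode_field(7, "bigint")` produce identical bytes, because only the value determines the size.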
Multi-Version Concurrency Control
- Update operations create new record versions
- New version is tagged with transaction id, points to old version
- System tracks which transactions should see which versions
- Readers don’t block writers
- Everyone sees a consistent view of the data
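The version chain described above can be sketched in a few lines. This is a simplified model under stated assumptions: visibility is decided by a set of transaction ids that had committed when the reader started (`committed_at_start`, a hypothetical parameter), and the names `RecordVersion` and `VersionedRecord` are made up for the example. Falcon's real visibility rules are not given in the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    value: str
    txn_id: int                               # transaction that created this version
    prior: Optional["RecordVersion"] = None   # pointer to the older version


class VersionedRecord:
    def __init__(self):
        self.head = None  # newest version first

    def update(self, txn_id, value):
        # An update never overwrites; it pushes a new version onto the chain.
        self.head = RecordVersion(value, txn_id, self.head)

    def read(self, reader_txn_id, committed_at_start):
        # Walk from newest to oldest until a version this reader may see.
        version = self.head
        while version is not None:
            if version.txn_id == reader_txn_id or version.txn_id in committed_at_start:
                return version.value
            version = version.prior
        return None  # no version is visible to this reader
```

A reader walks past versions written by transactions it should not see, so a concurrent writer never has to wait for it.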
Updates Are in Memory Until Commit
- Updates held in memory pending commit (well, usually)
- Index changes held in memory pending commit (same caveat)
- Verb rollback is dirt cheap
- Transaction rollback is dirt cheap
- Pending record updates flushed to serial log
- Pending index updates flushed to serial log
- Commit record written to serial log
- Serial log flushed to the oxide
- And the transaction is committed!
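The commit sequence above, and the reason rollback is cheap, can be illustrated with a toy transaction object. The log line format (`REC`, `IDX`, `COMMIT`) and the `Transaction` class are invented for this sketch, and group commit across transactions is not modeled.

```python
import io


class Transaction:
    """Toy transaction: changes stay in memory until commit."""

    def __init__(self, txn_id, log):
        self.txn_id = txn_id
        self.log = log              # any text file-like object standing in for the serial log
        self.pending_records = []   # (key, value) pairs not yet logged
        self.pending_index = []     # index keys not yet logged

    def update(self, key, value):
        self.pending_records.append((key, value))
        self.pending_index.append(key)

    def rollback(self):
        # Nothing has reached the log, so rollback just forgets the pending work.
        self.pending_records.clear()
        self.pending_index.clear()

    def commit(self):
        for key, value in self.pending_records:
            self.log.write(f"REC {self.txn_id} {key}={value}\n")
        for key in self.pending_index:
            self.log.write(f"IDX {self.txn_id} {key}\n")
        self.log.write(f"COMMIT {self.txn_id}\n")
        # With a real file, an os.fsync() on its descriptor would follow here.
        self.log.flush()
        self.pending_records.clear()
        self.pending_index.clear()
```

Used with an `io.StringIO()` as the log, a rolled-back transaction leaves the log empty, while a committed one leaves its record updates, index updates, and commit record in that order.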
Alas, Memory isn’t infinite, so
- Large transaction chills uncommitted data (flushes it to the log early)
- Chilled records can be thawed (fetched from the log)
- Scavenger garbage collects unloved records periodically
- When things get really bad, entire record chains flushed to backlog
- (Note: This is hard and we aren’t done.)
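Chilling and thawing can be pictured with the following sketch. It assumes a simple policy (chill everything once more than `max_in_memory` records are pending) and a byte-offset bookkeeping scheme; both are illustrative guesses, as are the names `ChillingTransaction`, `hot`, and `chilled`. The scavenger and the backlog are not modeled.

```python
import io


class ChillingTransaction:
    """Toy model of chilling uncommitted data to the log and thawing it back."""

    def __init__(self, log, max_in_memory=2):
        self.log = log                  # binary, seekable stand-in for the serial log
        self.max_in_memory = max_in_memory  # hypothetical memory budget, in records
        self.hot = {}                   # key -> value bytes still held in memory
        self.chilled = {}               # key -> (offset, length) of the value in the log

    def update(self, key, value: bytes):
        self.chilled.pop(key, None)     # a newer value supersedes any chilled copy
        self.hot[key] = value
        if len(self.hot) > self.max_in_memory:
            self.chill()

    def chill(self):
        # Write pending values to the log early and remember where they landed.
        self.log.seek(0, io.SEEK_END)
        for key, value in self.hot.items():
            offset = self.log.tell()
            self.log.write(value)
            self.chilled[key] = (offset, len(value))
        self.hot.clear()

    def read(self, key):
        if key in self.hot:
            return self.hot[key]
        # Thaw: fetch the value back from the log.
        offset, length = self.chilled[key]
        self.log.seek(offset)
        return self.log.read(length)
```

The transaction keeps working after a chill; reads of chilled records simply cost a trip to the log.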
- Transactions are ACID but not serializable
- Latency advantage disappears at saturation
- Very large transactions degrade performance
- Optimized for Web, not batch
- Runs like a memory database when data fits in cache
- Scales like a disk-based database when data doesn’t fit in cache
- Lowest possible latency for Web applications
- Absorbs huge spiky loads
- We generally benchmark against InnoDB (transactional engines)
- We use the DBT2 benchmark:
  - High contention
  - Write intensive – 40% of records touched are updated
  - Measures only performance at saturation
- DBT2 (we believe) is InnoDB’s best spot and Falcon’s worst
- 16- and 8-CPU systems: Falcon exceeds InnoDB performance
- 4-CPU systems: Falcon exceeds InnoDB performance for moderate to large numbers of threads
- 2-CPU systems: Rough parity, advantage to InnoDB
- 1-CPU systems: InnoDB wins
- Caveat: Results subject to change! Both systems are moving targets!!!
When should you use what?
- If you don’t need ACID, MyISAM is probably fastest
- For uniprocessors and small-memory systems, InnoDB is a good choice
- For large batch transactions, InnoDB may be the best match
- For multi-core systems and large numbers of threads, Falcon is probably best
- For the Web, Falcon is hard to beat