SCALE 19x logo
Los Angeles, CA
July 2022

Explaining EXPLAIN



  • Writing queries can be difficult, especially if they are complex, involving many joins, indexes, and other components
  • Once a query is written and sent to the database to be executed, the query planner needs to decide what's the best way to execute that query--what join order, join types, and which indexes to use will return the correct result the fastest.
  • The query planner, as good as it is, needs to make plans based on estimates, and sometimes the best plan isn't the fastest one
  • One of PostgreSQL's most powerful tools for developers and DBAs is EXPLAIN
    • Returns the best plan for a query
    • Completely based on statistics and costs
      • Statistics are gathered through DML activity, or through ANALYZE against a table
      • Costs are calculated based on estimated statistics and pre-determined costs in postgresql.conf
        • cpu_tuple_cost
        • seq_page_cost
        • random_page_cost
    • Does not run the query
    • Runs the query, contains statistics about what it took to actually run the query
      • Rather than report the estimated number of rows, it prints out the actual row count
      • Prints out actual time to run each node
      • With (BUFFERS), can see what was in the cache and what needed to be fetched from disk
    • Here, we can see whether the query planner's selected join types are actually good choices
    • We can also see if work_mem was set too low
    • (optional) Prepared statements and custom/generic plans
  • Why did the planner do that?
    • Failed to choose index
      • Foreign Keys
      • LIKE
      • Ordering/additional calculation on columns
    • Estimates might be off
      • Vacuum/Analyze
  • Auto-EXPLAIN
    • Useful for ORMs
    • Useful for troubleshooting slow queries that you'd expect to be fast
  • Other Tools
    • PEV
    • Depesz 
Los Angeles B
Thursday, July 28, 2022 - 14:30 to 15:30