Query 1B Rows in PostgreSQL >25x Faster with Squirrels!

5 min read
Tim Huang
Co-Founder of Squirrels Analytics

The One Billion Row Challenge has been making waves in the data engineering community lately. Originally created to test CSV parsing performance, the challenge involves processing a file containing 1 billion weather measurements to calculate basic temperature statistics for each city. In this post, I'll tackle a variation of this challenge using PostgreSQL and demonstrate how to achieve dramatic performance improvements using Squirrels.

The Challenge

The original One Billion Row Challenge focuses on raw CSV processing performance. For our variation, we'll:

  1. Load 1 billion rows into PostgreSQL with additional columns
  2. Query for city-level temperature statistics
  3. Create a Squirrels project to serve these analytics via REST API
  4. Demonstrate significant query performance improvements
  5. Show how to handle incremental data updates
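To make step 2 concrete, here is a minimal sketch of the kind of aggregation the challenge asks for: minimum, average, and maximum temperature per city. It uses Python's built-in sqlite3 module as a small stand-in for PostgreSQL so it runs anywhere, and the table name `weather_data`, the column names `city` and `temperature`, and the sample rows are all illustrative assumptions rather than the schema used in this post.

```python
import sqlite3

# Hypothetical sample data: (city, temperature) pairs standing in for the
# 1 billion measurements. The values are made up for illustration.
SAMPLE_ROWS = [
    ("Hamburg", 12.0),
    ("Hamburg", 8.0),
    ("Istanbul", 23.0),
    ("Istanbul", 6.0),
    ("Istanbul", 16.0),
]

# Assumed table and column names; the real schema may differ.
STATS_QUERY = """
    SELECT city,
           MIN(temperature) AS min_temp,
           AVG(temperature) AS avg_temp,
           MAX(temperature) AS max_temp
    FROM weather_data
    GROUP BY city
    ORDER BY city
"""


def city_stats(rows):
    """Load rows into an in-memory SQLite table and return per-city stats."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE weather_data (city TEXT, temperature REAL)")
        conn.executemany("INSERT INTO weather_data VALUES (?, ?)", rows)
        return conn.execute(STATS_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for city, min_temp, avg_temp, max_temp in city_stats(SAMPLE_ROWS):
        print(f"{city}: min={min_temp}, avg={avg_temp}, max={max_temp}")
```

The `GROUP BY` query itself is standard SQL, so the same statement shape should carry over to PostgreSQL. What changes at 1 billion rows is how long a full scan takes, which is the problem the rest of this post addresses.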