Skip to content

ASQL: SQL with Analytics Built In

Analytic SQL (ASQL) is a query language designed for analytics work. It adds built-in analytics features that SQL lacks—guaranteed complete results, cohort analysis, cross-dialect date handling—and uses a pipeline syntax that flows top-to-bottom.

Think of it as SQL with analytics superpowers. Write ASQL, get standard SQL for any dialect.

Status: Experimental. GitHub


The Problem with SQL for Analytics

SQL was designed in the 1970s for transactional systems—banking, inventory, order processing. It answers "what happened?" based on rows that exist. This works great for OLTP.

But analytics is different. When you're building dashboards, tracking trends, or calculating month-over-month metrics, SQL's design causes real problems:

Missing Data Points

SELECT month, SUM(amount) as revenue
FROM orders
GROUP BY month;
month revenue
Jan 1000
Feb 1500
Mar 800
Jun 1200

April and May are missing. Your chart line jumps. Your MoM calculation uses the wrong prior month. Your dashboard looks broken.

Every analytics team builds workarounds: date dimension tables, calendar CTEs, CROSS JOINs, post-processing in Python. It's tedious and error-prone.

Date Dialect Chaos

-- PostgreSQL
SELECT DATE_TRUNC('month', created_at) FROM orders;

-- BigQuery  
SELECT DATE_TRUNC(created_at, MONTH) FROM orders;

-- MySQL
SELECT DATE_FORMAT(created_at, '%Y-%m-01') FROM orders;

Same operation, different syntax per database. Copy-paste between dialects breaks.

Window Function Verbosity

Getting the most recent order per customer—a common analytics pattern:

SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY customer_id ORDER BY order_date DESC
    ) as rn
    FROM orders
) sub WHERE rn = 1;

This should be one line.


ASQL's Analytics Features

ASQL builds solutions to these problems into the language.

Guaranteed Complete Results

ASQL automatically fills gaps in your grouped data:

from orders
  group by month(order_date) (
    sum(amount) ?? 0 as revenue
  )
month revenue
Jan 1000
Feb 1500
Mar 800
Apr 0
May 0
Jun 1200

No dimension tables. No CTEs. No post-processing. ASQL ensures all expected time periods appear in your results—analytically correct by default.

See Guaranteed Groups for details.

Cross-Dialect Date Handling

Write once, run on any database:

where created_at > 7 days ago
where due_date < 30 days from now
select order_date + 14 days as delivery_date
Operation PostgreSQL BigQuery ASQL
Truncate to month DATE_TRUNC('month', d) DATE_TRUNC(d, MONTH) month(d)
Add days d + INTERVAL '7 days' DATE_ADD(d, INTERVAL 7 DAY) d + 7 days
Date literal DATE '2024-01-01' DATE '2024-01-01' @2024-01-01

ASQL compiles to the correct syntax for your target database.

One-Line Window Patterns

Common analytics operations become single lines:

-- Most recent order per customer
from orders
  per customer_id first by -order_date

-- Previous period comparison
from monthly_sales
  select month, revenue, prior(revenue) as prev_revenue

-- Running totals
from daily_sales
  select date, revenue, running_sum(revenue) as cumulative
ASQL SQL Equivalent
per X first by -date ROW_NUMBER() OVER (PARTITION BY X ORDER BY date DESC) = 1
prior(col) LAG(col, 1) OVER (...)
next(col) LEAD(col, 1) OVER (...)
running_sum(col) SUM(col) OVER (ROWS UNBOUNDED PRECEDING)
rolling_avg(col, 7) AVG(col) OVER (ROWS 6 PRECEDING)

Convention-Based Inference

ASQL infers relationships from naming patterns:

-- Auto-joins on user_id → users.id
from orders
  & users
  select orders.*, users.name

-- FK shorthand: department_id matches departments.id
from employees
  join departments on department_id

When your tables follow standard naming conventions (user_idusers.id), ASQL figures out the join conditions. No more typing the obvious.


Pipeline Syntax

ASQL queries flow top-to-bottom in execution order:

from sales
  where year(date) = 2024
  group by region (sum(amount) as revenue)
  where revenue > 1000
  order by -revenue
  limit 10

Compare to SQL's inside-out structure:

SELECT region, SUM(amount) as revenue  -- 5th
FROM sales                              -- 1st
WHERE YEAR(date) = 2024                 -- 2nd
GROUP BY region                         -- 3rd
HAVING SUM(amount) > 1000               -- 4th
ORDER BY revenue DESC                   -- 6th
LIMIT 10;                               -- 7th

This matters most for analytics work—dbt models, ELT pipelines, dashboards—where queries are written once and read many times. Linear flow makes code review, debugging, and refactoring significantly easier.

CTEs When You Need Them

When you need to reuse intermediate results, use stash as:

from users
  where is_active
  stash as active_users

  group by country (# as total)

Syntax Shortcuts

ASQL adds shorthand for patterns that are verbose in SQL.

Count

--                    -- COUNT(*)
-- users              -- COUNT(DISTINCT user_id)
#(distinct user_id)  -- COUNT(DISTINCT user_id) - explicit

Descending Sort

order by -revenue    -- ORDER BY revenue DESC

Natural Aggregates

These are equivalent:

sum(amount)
sum amount
sum_amount

All produce a column named sum_amount.


Column Operations

Exclude Columns

from users
  except password_hash, internal_notes

Rename

from users
  rename id as user_id

Replace Values

from users
  replace name with upper(name), salary with round(salary, 2)

Sampling

-- Fixed sample size
from orders sample 100

-- Percentage sample
from orders sample 10%

-- Stratified sampling (N per group)
from orders sample 100 per category

Pivot & Unpivot

-- Rows to columns
from sales
  pivot sum(amount) by category values ('Electronics', 'Clothing')

-- Columns to rows
from metrics
  unpivot jan, feb, mar into month, value

-- Expand arrays
from posts
  explode tags as tag

Multi-Dialect Output

ASQL uses SQLGlot for transpilation:

PostgreSQL, MySQL, SQLite, BigQuery, Snowflake, Redshift, DuckDB, Trino, Spark SQL, and more.


What ASQL Is

  • Analytics-first — Built-in features for dashboards, time series, and reporting
  • Analytically correct — Guaranteed complete results prevent missing data bugs
  • SQL-compatible — Familiar vocabulary, compiles to standard SQL
  • Dialect-portable — Write once, run on any database

What ASQL Is Not

  • Not a database
  • Not required—use the generated SQL directly if you prefer
  • Not production-ready (yet)

Getting Started

Documentation

Concepts

Understanding ASQL's design:

Syntax Guide

Reference

Development


Inspiration

  • SQLGlot — SQL parsing and transpilation
  • pandas — Method chaining, column operations
  • PRQL — Pipeline approach to SQL
  • dbt — Many ASQL features are built-in versions of dbt macro patterns
  • Kusto/KQL — Pipeline syntax at scale

⚠️ Work in Progress
ASQL is experimental. Syntax may change. Feedback welcome on GitHub.