
ASQL Tutorial

A hands-on guide to learning ASQL. Follow along step-by-step to master the language.

Prerequisites

This tutorial assumes you're familiar with basic SQL concepts like SELECT, FROM, WHERE, and GROUP BY. If you've written SQL queries before, you're ready to go.

Part 1: Your First ASQL Query

Start with FROM

Every ASQL query starts with from. No more jumping around to find where your data comes from:

from users

This compiles to SELECT * FROM users. Simple.

Add a Filter

Chain a where clause:

from users
  where status = "active"

Notice we indent with 2 spaces. This visual hierarchy shows the data flow.

Select Specific Columns

from users
  where status = "active"
  select name, email, created_at

Key insight: Read top-to-bottom. Data flows from users → filtered → projected.
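Since ASQL compiles to plain SQL, you can verify the target query directly. Here's a minimal sketch using Python's sqlite3 with a toy users schema (the table layout is assumed for illustration):

```python
import sqlite3

# In-memory database with an illustrative users table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT, status TEXT, created_at TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    ("Ada", "ada@example.com", "active", "2024-01-05"),
    ("Bob", "bob@example.com", "inactive", "2024-02-10"),
])

# The SQL the three-step ASQL pipeline compiles to:
# filter first, then project the named columns
rows = conn.execute(
    "SELECT name, email, created_at FROM users WHERE status = 'active'"
).fetchall()
# Only the active user remains
```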


Part 2: Aggregations

Count Rows

Use # for count:

from orders
  group by status (
    # as order_count
  )

The # symbol is just COUNT(*).

Multiple Aggregations

from orders
  group by customer_id (
    sum(amount) as total_spent,
    # as order_count,
    avg(amount) as avg_order
  )

Count Entities (New!)

When followed by a table name, # automatically counts distinct entities:

from orders
  group by status (
    # as total_orders,
    # users as unique_customers
  )

The # users becomes COUNT(DISTINCT user_id) — ASQL infers the primary key!
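The compiled form of both count shapes can be sketched in Python's sqlite3 (toy orders schema assumed; the inferred key is spelled out as user_id here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (status TEXT, user_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("shipped", 1), ("shipped", 1), ("shipped", 2), ("pending", 3),
])

# "# as total_orders" -> COUNT(*); "# users" -> COUNT(DISTINCT user_id)
rows = conn.execute("""
    SELECT status,
           COUNT(*) AS total_orders,
           COUNT(DISTINCT user_id) AS unique_customers
    FROM orders
    GROUP BY status
    ORDER BY status
""").fetchall()
# shipped has 3 orders but only 2 distinct customers
```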


Part 3: Sorting & Limiting

Descending Order

Use - prefix for descending:

from users
  order by -created_at

This is ORDER BY created_at DESC.

Combine with Limit

from orders
  group by customer_id (sum(amount) as total)
  order by -total
  limit 10

Top 10 customers by total spend.
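The equivalent SQL is easy to check end to end. A small sqlite3 sketch (toy orders data assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 10.0), (1, 5.0), (2, 30.0), (3, 1.0)])

# "order by -total" compiles to ORDER BY total DESC
rows = conn.execute("""
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
    ORDER BY total DESC
    LIMIT 10
""").fetchall()
# Customers ranked by total spend, highest first
```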


Part 4: Joins

Join Operators

ASQL uses symbolic operators:

| Operator | Type  | Meaning             |
|----------|-------|---------------------|
| `&`      | INNER | Both must match     |
| `&?`     | LEFT  | Right side optional |
| `?&`     | RIGHT | Left side optional  |
| `?&?`    | FULL  | Both optional       |

Example: Left Join

from users
  &? orders on users.id = orders.user_id
  select users.name, orders.amount

The ? marks the "maybe null" side.
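You can see the "maybe null" side in action by running the compiled LEFT JOIN. A sqlite3 sketch with assumed toy tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Bob")])
conn.execute("INSERT INTO orders VALUES (1, 42.0)")

# "&?" compiles to LEFT JOIN: Bob has no orders, so orders.amount is NULL
rows = conn.execute("""
    SELECT users.name, orders.amount
    FROM users LEFT JOIN orders ON users.id = orders.user_id
    ORDER BY users.name
""").fetchall()
```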

Multi-Table Joins

from orders
  & customers on orders.customer_id = customers.id
  & products on orders.product_id = products.id
  select customers.name, products.name, orders.amount

Part 5: Date Handling

Date Literals

Use @ prefix:

from orders
  where order_date >= @2024-01-01

Relative Dates

from orders
  where created_at > 7 days ago

from orders
  where due_date < 30 days from now
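Relative dates compile to an interval offset from the current date; the exact spelling varies by SQL dialect. A sketch of the "7 days ago" case in sqlite3, which spells the offset `date('now', '-7 days')`:

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, created_at TEXT)")
recent = (date.today() - timedelta(days=2)).isoformat()
old = (date.today() - timedelta(days=30)).isoformat()
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, recent), (2, old)])

# "created_at > 7 days ago" -> a rolling 7-day window
rows = conn.execute(
    "SELECT id FROM orders WHERE created_at > date('now', '-7 days')"
).fetchall()
# Only the 2-day-old order falls inside the window
```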

Date Truncation

Group by month:

from orders
  group by month(created_at) (
    sum_amount  -- No alias needed; the column is named sum_amount
  )

Part 6: Window Functions

Get Previous Value

from monthly_sales
  order by month
  select month, revenue, prior(revenue) as prev_month

Running Totals

from daily_sales
  order by date
  select date, amount, running_sum(amount) as cumulative
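Both shortcuts map onto standard SQL window functions, which you can run in sqlite3 (3.25+) with an assumed daily_sales schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (date TEXT, amount REAL)")
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)",
                 [("2024-01-01", 10.0), ("2024-01-02", 20.0), ("2024-01-03", 5.0)])

# prior(x) -> LAG(x); running_sum(x) -> SUM(x) over a running frame
rows = conn.execute("""
    SELECT date, amount,
           LAG(amount) OVER (ORDER BY date) AS prev,
           SUM(amount) OVER (ORDER BY date
                             ROWS UNBOUNDED PRECEDING) AS cumulative
    FROM daily_sales
    ORDER BY date
""").fetchall()
# prev is NULL on the first row; cumulative runs 10 -> 30 -> 35
```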

First/Last Per Group

Most recent order per customer:

from orders
  per customer_id first by -order_date

This keeps only the first row (ordered by date descending) for each customer.
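A common way to compile this pattern is a ROW_NUMBER() filter; here's a sqlite3 sketch of that translation (toy orders data assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "2024-01-01", 10.0), (1, "2024-03-01", 99.0), (2, "2024-02-01", 5.0),
])

# "per customer_id first by -order_date": rank rows newest-first per
# customer, then keep only rank 1
rows = conn.execute("""
    SELECT customer_id, order_date, amount FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY customer_id ORDER BY order_date DESC
        ) AS rn
        FROM orders
    ) WHERE rn = 1
    ORDER BY customer_id
""").fetchall()
# One row per customer: their most recent order
```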


Part 7: Sampling (New!)

Random Sample

Get 100 random rows:

from orders
  sample 100

Percentage Sample

Get ~10% of rows:

from orders
  sample 10%

Stratified Sample

100 random rows per category:

from products
  sample 100 per category
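Stratified sampling typically compiles to a random per-group ranking. A sketch in sqlite3 (sampling 2 per category over assumed toy data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, category TEXT)")
# Six products in category "a", four in "b"
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(i, "a" if i < 6 else "b") for i in range(10)])

# "sample 2 per category": shuffle within each category, keep 2
rows = conn.execute("""
    SELECT id, category FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY category ORDER BY RANDOM()
        ) AS rn
        FROM products
    ) WHERE rn <= 2
""").fetchall()
# Exactly 2 rows per category, chosen at random
```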

Part 8: Reshaping Data (New!)

Pivot: Rows to Columns

from sales
  pivot sum_amount by status values ("pending", "shipped", "delivered")
  group by customer_id

Creates columns pending, shipped, delivered with sum of amounts.
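In standard SQL, this pivot is conditional aggregation: one filtered SUM per listed value. A sqlite3 sketch with an assumed sales schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, status TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (1, "pending", 10.0), (1, "shipped", 20.0), (1, "shipped", 5.0),
])

# Each pivoted value becomes its own CASE-filtered SUM column
rows = conn.execute("""
    SELECT customer_id,
           SUM(CASE WHEN status = 'pending'   THEN amount ELSE 0 END) AS pending,
           SUM(CASE WHEN status = 'shipped'   THEN amount ELSE 0 END) AS shipped,
           SUM(CASE WHEN status = 'delivered' THEN amount ELSE 0 END) AS delivered
    FROM sales
    GROUP BY customer_id
""").fetchall()
# One row per customer with a column per status
```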

Unpivot: Columns to Rows

from quarterly_data
  unpivot q1, q2, q3, q4 into quarter, value

Turns wide data into long format.

Explode: Arrays to Rows

from posts
  explode tags as tag
  group by tag (# as post_count)

One row per array element.
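The expansion is a lateral join against the array elements; SQLite spells it json_each when tags are stored as JSON arrays. A sketch (storage format assumed):

```python
import sqlite3
import json

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, tags TEXT)")  # tags as JSON text
conn.executemany("INSERT INTO posts VALUES (?, ?)", [
    (1, json.dumps(["sql", "tips"])), (2, json.dumps(["sql"])),
])

# "explode tags as tag" -> one output row per array element, then count
rows = conn.execute("""
    SELECT je.value AS tag, COUNT(*) AS post_count
    FROM posts, json_each(posts.tags) AS je
    GROUP BY tag
    ORDER BY tag
""").fetchall()
# "sql" appears in both posts, "tips" in one
```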


Part 9: Column Operators (New!)

Exclude Columns

from users
  except password_hash, internal_notes

Rename Columns

from users
  rename id as user_id

Replace Values

from users
  replace name with upper(name), email with lower(email)
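Where the target dialect lacks SELECT * EXCEPT, all three operators can compile to an explicit column list. A combined sqlite3 sketch (toy users schema assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT, password_hash TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'ada', 'ADA@Example.com', 'x')")

# except drops password_hash, rename aliases id, replace rewrites values
rows = conn.execute("""
    SELECT id AS user_id,
           UPPER(name) AS name,
           LOWER(email) AS email
    FROM users
""").fetchall()
# password_hash never leaves the database
```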

Part 10: CTEs with stash

Save Intermediate Results

from orders
  where status = "completed"
  stash as completed_orders
  group by customer_id (sum(amount) as total)
  order by -total
  limit 10

The stash as creates a CTE and continues the pipeline.
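The compiled query is an ordinary WITH clause. A sqlite3 sketch of the pipeline above (toy orders data assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, status TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "completed", 10.0), (1, "completed", 20.0), (2, "cancelled", 99.0),
])

# "stash as completed_orders" -> a named CTE the rest of the query reads from
rows = conn.execute("""
    WITH completed_orders AS (
        SELECT * FROM orders WHERE status = 'completed'
    )
    SELECT customer_id, SUM(amount) AS total
    FROM completed_orders
    GROUP BY customer_id
    ORDER BY total DESC
    LIMIT 10
""").fetchall()
# The cancelled order never reaches the aggregation
```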

Top-Level Variables

from users
  where is_active
  stash as active_users

from active_users
  group by country (# as total)

Putting It All Together

Here's a complete analytics query:

from orders
  where created_at > 30 days ago
  where status = "completed"
  &? customers on orders.customer_id = customers.id
  group by customers.country (
    sum_amount,                    -- Total revenue (column: sum_amount)
    # as order_count,              -- Number of orders
    # customers as unique_customers,  -- Distinct customers
    avg_amount                     -- Average order value (column: avg_amount)
  )
  order by -sum_amount
  limit 10

This query:

  1. Filters to completed orders from the last 30 days
  2. Left joins with customers
  3. Groups by country
  4. Computes revenue, counts, and averages
  5. Sorts by revenue
  6. Takes the top 10


Quick Reference

| ASQL | SQL |
|------|-----|
| `from users` | `SELECT * FROM users` |
| `where x = 1` | `WHERE x = 1` |
| `order by -col` | `ORDER BY col DESC` |
| `#` | `COUNT(*)` |
| `# users` | `COUNT(DISTINCT user_id)` |
| `&` | `INNER JOIN` |
| `&?` | `LEFT JOIN` |
| `@2024-01-01` | `DATE '2024-01-01'` |
| `7 days ago` | `CURRENT_DATE - INTERVAL '7 days'` |
| `prior(col)` | `LAG(col)` |
| `running_sum(col)` | `SUM(col) OVER (ROWS UNBOUNDED PRECEDING)` |
| `sample 100` | `ORDER BY RANDOM() LIMIT 100` |
| `except col` | `SELECT * EXCEPT(col)` |

Exercises

Try writing these queries in the playground:

  1. Basic: Get the top 5 countries by user count
  2. Joins: Find orders with their customer names, sorted by amount
  3. Dates: Get monthly revenue for 2024
  4. Window: Add a running total to daily sales
  5. Sampling: Get 50 random products per category
  6. Pivot: Turn status counts into columns
  7. Cohort: Track monthly active users by signup cohort

Solutions

**1. Top 5 countries:**
from users
  group by country (# as user_count)
  order by -user_count
  limit 5
**2. Orders with customers:**
from orders
  &? customers on orders.customer_id = customers.id
  select orders.id, customers.name, orders.amount
  order by -orders.amount
**3. Monthly revenue:**
from orders
  where year(created_at) = 2024
  group by month(created_at) (
    sum_amount  -- Column named sum_amount, referenced in order by
  )
  order by -sum_amount
**4. Running total:**
from daily_sales
  order by date
  select date, amount, running_sum(amount) as cumulative
**5. Stratified sample:**
from products
  sample 50 per category
**6. Status pivot:**
from orders
  pivot # by status values ("pending", "shipped", "delivered")
**7. Cohort analysis:**
from events
  group by month(event_date) (count(distinct user_id) as active)
  cohort by month(users.signup_date)
This tracks monthly active users by their signup month, a query that would take 50+ lines in raw SQL!

Part 11: Cohort Analysis

Cohort analysis is one of the most powerful patterns in analytics, but it's notoriously complex in SQL. ASQL makes it simple.

What is Cohort Analysis?

Cohort analysis groups users by when they started (their "cohort") and tracks their behavior over time. For example:

- How many users from January 2024 are still active in March 2024?
- How does revenue from February signups compare to March signups?

Basic Cohort Query

from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)

This single query:

- Groups events by month
- Assigns users to cohorts based on signup month
- Calculates periods since signup
- Tracks active users per period

Output:

- cohort_month: When users signed up
- period: Months since signup (0 = signup month, 1 = first month after, etc.)
- active: Active users in that period
- cohort_size: Total users in the cohort

Revenue Cohorts

Track revenue by first purchase cohort:

from orders
group by month(order_date) (sum(total) as revenue)
cohort by month(customers.first_order_date)

With Period-over-Period Analysis

Add window functions for period-over-period comparisons:

from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)
select
  cohort_month,
  period,
  active,
  prior(active) as prev_period_active,
  active - prior(active) as change

Lifetime Value (LTV)

Calculate cumulative revenue and LTV:

from orders
group by month(order_date) (sum(total) as revenue)
cohort by month(customers.first_order_date)
select
  cohort_month,
  period,
  revenue,
  running_sum(revenue) as cumulative_revenue,
  running_sum(revenue) / cohort_size as ltv

Why Cohort Analysis Matters

SQL (50+ lines):

WITH user_cohorts AS (
    SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month
    FROM users
),
activity_months AS (
    SELECT user_id, DATE_TRUNC('month', event_date) AS activity_month
    FROM events
    GROUP BY 1, 2
),
cohort_sizes AS (
    SELECT cohort_month, COUNT(*) AS size
    FROM user_cohorts GROUP BY 1
),
cohort_activity AS (
    SELECT 
        uc.cohort_month,
        EXTRACT(YEAR FROM AGE(am.activity_month, uc.cohort_month)) * 12 +
        EXTRACT(MONTH FROM AGE(am.activity_month, uc.cohort_month)) AS period,
        COUNT(DISTINCT uc.user_id) AS active
    FROM user_cohorts uc
    JOIN activity_months am ON uc.user_id = am.user_id
    WHERE am.activity_month >= uc.cohort_month
    GROUP BY 1, 2
)
SELECT 
    ca.cohort_month, cs.size, ca.period,
    ca.active, ROUND(ca.active::numeric / cs.size * 100, 1) AS retention
FROM cohort_activity ca
JOIN cohort_sizes cs ON ca.cohort_month = cs.cohort_month
ORDER BY ca.cohort_month, ca.period;

ASQL (3 lines):

from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)

94% reduction in complexity! 🎉


Next Steps

You've learned the fundamentals of ASQL! Try the exercises above, and keep the Quick Reference handy as you write your own queries.

Happy querying! 🚀