ASQL Tutorial¶
A hands-on guide to learning ASQL. Follow along step-by-step to master the language.
Prerequisites¶
This tutorial assumes you're familiar with basic SQL concepts like SELECT, FROM, WHERE, and GROUP BY. If you've written SQL queries before, you're ready to go.
Part 1: Your First ASQL Query¶
Start with FROM¶
Every ASQL query starts with from. No more jumping around to find where your data comes from:
from users
This compiles to SELECT * FROM users. Simple.
Add a Filter¶
Chain a where clause:
from users
where status = "active"
Notice we indent with 2 spaces. This visual hierarchy shows the data flow.
Select Specific Columns¶
from users
where status = "active"
select name, email, created_at
Key insight: Read top-to-bottom. Data flows from users → filtered → projected.
Part 2: Aggregations¶
Count Rows¶
Use # for count:
from orders
group by status (
# as order_count
)
The # symbol is just COUNT(*).
Multiple Aggregations¶
from orders
group by customer_id (
sum(amount) as total_spent,
# as order_count,
avg(amount) as avg_order
)
Distinct Count (## or uniq)¶
Use double hash ## or uniq() to count distinct values:
from orders
group by status (
# as total_orders,
##user_id as unique_customers
)
The ##user_id becomes COUNT(DISTINCT user_id). You can also use uniq(user_id).
Part 3: Sorting & Limiting¶
Descending Order¶
Use - prefix for descending:
from users
order by -created_at
This is ORDER BY created_at DESC.
Combine with Limit¶
from orders
group by customer_id (sum(amount) as total)
order by -total
limit 10
Top 10 customers by total spend.
Part 4: Joins¶
Join Operators¶
ASQL uses symbolic operators:
| Operator | Type | Meaning |
|---|---|---|
& |
INNER | Both must match |
&? |
LEFT | Right side optional |
?& |
RIGHT | Left side optional |
?&? |
FULL | Both optional |
Example: Left Join¶
from users
&? orders on users.id = orders.user_id
select users.name, orders.amount
The ? marks the "maybe null" side.
Multi-Table Joins¶
from orders
& customers on orders.customer_id = customers.id
& products on orders.product_id = products.id
select customers.name, products.name, orders.amount
Part 5: Date Handling¶
Date Literals¶
Use @ prefix:
from orders
where order_date >= @2024-01-01
Relative Dates¶
from orders
where created_at > 7 days ago
from orders
where due_date < 30 days from now
Date Truncation¶
Group by month:
from orders
group by month(created_at) (
sum_amount -- No alias needed -
)
Part 6: Window Functions¶
Get Previous Value¶
from monthly_sales
order by month
select month, revenue, prior(revenue) as prev_month
Running Totals¶
from daily_sales
order by date
select date, amount, running_sum(amount) as cumulative
First/Last Per Group¶
Most recent order per customer:
from orders
per customer_id first by -order_date
This keeps only the first row (ordered by date descending) for each customer.
Part 7: Sampling (New!)¶
Random Sample¶
Get 100 random rows:
from orders
sample 100
Percentage Sample¶
Get ~10% of rows:
from orders
sample 10%
Stratified Sample¶
100 random rows per category:
from products
sample 100 per category
Part 8: Reshaping Data (New!)¶
Pivot: Rows to Columns¶
from sales
pivot sum_amount by status values ("pending", "shipped", "delivered")
group by customer_id
Creates columns pending, shipped, delivered with sum of amounts.
Unpivot: Columns to Rows¶
from quarterly_data
unpivot q1, q2, q3, q4 into quarter, value
Turns wide data into long format.
Explode: Arrays to Rows¶
from posts
explode tags as tag
group by tag (# as post_count)
One row per array element.
Part 9: Column Operators (New!)¶
Exclude Columns¶
from users
except password_hash, internal_notes
Rename Columns¶
from users
rename id as user_id
Replace Values¶
from users
replace name with upper(name), email with lower(email)
Part 10: CTEs with stash¶
Save Intermediate Results¶
from orders
where status = "completed"
stash as completed_orders
group by customer_id (sum(amount) as total)
order by -total
limit 10
The stash as creates a CTE and continues the pipeline.
Top-Level Variables¶
from users
where is_active
stash as active_users
from active_users
group by country (# as total)
Putting It All Together¶
Here's a complete analytics query:
from orders
where created_at > 30 days ago
where status = "completed"
&? customers on orders.customer_id = customers.id
group by customers.country (
sum_amount, -- Total revenue (column: sum_amount)
# as order_count, -- Number of orders
# customers as unique_customers, -- Distinct customers
avg_amount -- Average order value (column: avg_amount)
)
order by -sum_amount
limit 10
This query: 1. Filters to completed orders from last 30 days 2. Left joins with customers 3. Groups by country 4. Computes revenue, counts, and average 5. Sorts by revenue 6. Takes top 10 7. Selects final columns
Quick Reference¶
| ASQL | SQL |
|---|---|
from users |
SELECT * FROM users |
where x = 1 |
WHERE x = 1 |
order by -col |
ORDER BY col DESC |
# |
COUNT(*) |
##col or uniq(col) |
COUNT(DISTINCT col) |
& |
INNER JOIN |
&? |
LEFT JOIN |
@2024-01-01 |
DATE '2024-01-01' |
7 days ago |
CURRENT_DATE - INTERVAL '7 days' |
prior(col) |
LAG(col) |
running_sum(col) |
SUM(col) OVER (ROWS UNBOUNDED PRECEDING) |
sample 100 |
ORDER BY RANDOM() LIMIT 100 |
except col |
SELECT * EXCEPT(col) |
Next Steps¶
- Playground — Try these examples interactively
- Syntax Guide — Deep dive into each feature
- Examples — More real-world patterns
- Language Specification — Complete reference
Exercises¶
Try writing these queries in the playground:
- Basic: Get the top 5 countries by user count
- Joins: Find orders with their customer names, sorted by amount
- Dates: Get monthly revenue for 2024
- Window: Add a running total to daily sales
- Sampling: Get 50 random products per category
- Pivot: Turn status counts into columns
Solutions
**1. Top 5 countries:**from users
group by country (# as user_count)
order by -user_count
limit 5
**2. Orders with customers:**
from orders
&? customers on orders.customer_id = customers.id
select orders.id, customers.name, orders.amount
order by -orders.amount
**3. Monthly revenue:**
from orders
where year(created_at) = 2024
group by month(created_at) (
sum_amount -- Column named sum_amount, referenced in order by
)
order by -sum_amount
**4. Running total:**
from daily_sales
order by date
select date, amount, running_sum(amount) as cumulative
**5. Stratified sample:**
from products
sample 50 per category
**6. Status pivot:**
from orders
pivot # by status values ("pending", "shipped", "delivered")
**7. Cohort analysis:**
from events
join users
group by
month(users.signup_date) as cohort,
months_between(events.event_date, users.signup_date) as period
(count(distinct user_id) as active)
This tracks monthly active users by their signup month using alias reuse - no special syntax needed!
Part 8: Cohort Analysis¶
Cohort analysis is one of the most powerful patterns in analytics. In ASQL, it's straightforward using computed columns and alias reuse.
What is Cohort Analysis?¶
Cohort analysis groups users by when they started (their "cohort") and tracks their behavior over time. For example: - How many users from January 2024 are still active in March 2024? - How does revenue from February signups compare to March signups?
Basic Cohort Query¶
from events
join users
group by
month(users.signup_date) as cohort,
months_between(events.event_date, users.signup_date) as period
(
count(distinct user_id) as active
)
order by cohort, period
Output:
- cohort: When users signed up
- period: Months since signup (0 = signup month, 1 = first month after, etc.)
- active: Active users in that period
The key insight: ASQL's alias reuse lets you reference cohort immediately after defining it. No CTEs needed!
Retention with Cohort Size¶
To calculate retention percentages, add cohort size using a window function:
from events
join users
group by
month(users.signup_date) as cohort,
months_between(events.event_date, users.signup_date) as period
(
count(distinct user_id) as active
)
select *,
first(active) over cohort as cohort_size,
round(active::numeric / cohort_size * 100, 1) as retention_pct
order by cohort, period
Revenue Cohorts with LTV¶
Track revenue by first purchase cohort and calculate lifetime value:
from orders
join customers
group by
month(customers.first_order_date) as cohort,
months_between(orders.order_date, customers.first_order_date) as period
(
sum(total) as revenue,
count(distinct customer_id) as customers
)
select *,
first(customers) over cohort as cohort_size,
running_sum(revenue) over cohort / cohort_size as ltv
order by cohort, period
Period-over-Period Analysis¶
Add window functions for period-over-period comparisons:
from events
join users
group by
month(users.signup_date) as cohort,
months_between(events.event_date, users.signup_date) as period
(
count(distinct user_id) as active
)
select *,
prior(active) over cohort as prev_period_active,
active - prev_period_active as change
order by cohort, period
Why ASQL Makes This Simple¶
SQL (50+ lines):
WITH user_cohorts AS (
SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month
FROM users
),
activity_months AS (
SELECT user_id, DATE_TRUNC('month', event_date) AS activity_month
FROM events
GROUP BY 1, 2
),
cohort_sizes AS (
SELECT cohort_month, COUNT(*) AS size
FROM user_cohorts GROUP BY 1
),
cohort_activity AS (
SELECT
uc.cohort_month,
EXTRACT(YEAR FROM AGE(am.activity_month, uc.cohort_month)) * 12 +
EXTRACT(MONTH FROM AGE(am.activity_month, uc.cohort_month)) AS period,
COUNT(DISTINCT uc.user_id) AS active
FROM user_cohorts uc
JOIN activity_months am ON uc.user_id = am.user_id
WHERE am.activity_month >= uc.cohort_month
GROUP BY 1, 2
)
SELECT
ca.cohort_month, cs.size, ca.period,
ca.active, ROUND(ca.active::numeric / cs.size * 100, 1) AS retention
FROM cohort_activity ca
JOIN cohort_sizes cs ON ca.cohort_month = cs.cohort_month
ORDER BY ca.cohort_month, ca.period;
ASQL (computed columns + group by):
from events
join users
group by
month(users.signup_date) as cohort,
months_between(events.event_date, users.signup_date) as period
(count(distinct user_id) as active)
order by cohort, period
The simplicity comes from ASQL's alias reuse - you can reference computed columns immediately!
Next Steps¶
You've learned the fundamentals of ASQL! To dive deeper:
- Syntax Reference: See syntax documentation for detailed syntax guides
- Examples: Check out more examples for real-world patterns
- Specification: Read the full specification for complete language details
- Playground: Try queries interactively at play.analyticsql.com
Happy querying! 🚀