ASQL Tutorial¶
A hands-on guide to learning ASQL. Follow along step-by-step to master the language.
Prerequisites¶
This tutorial assumes you're familiar with basic SQL concepts like SELECT, FROM, WHERE, and GROUP BY. If you've written SQL queries before, you're ready to go.
Part 1: Your First ASQL Query¶
Start with FROM¶
Every ASQL query starts with from. No more jumping around to find where your data comes from:
from users
This compiles to SELECT * FROM users. Simple.
Add a Filter¶
Chain a where clause:
from users
where status = "active"
Notice we indent with 2 spaces. This visual hierarchy shows the data flow.
Select Specific Columns¶
from users
where status = "active"
select name, email, created_at
Key insight: Read top-to-bottom. Data flows from users → filtered → projected.
Part 2: Aggregations¶
Count Rows¶
Use # for count:
from orders
group by status (
# as order_count
)
The # symbol is just COUNT(*).
Multiple Aggregations¶
from orders
group by customer_id (
sum(amount) as total_spent,
# as order_count,
avg(amount) as avg_order
)
Count Entities (New!)¶
When followed by a table name, # automatically counts distinct entities:
from orders
group by status (
# as total_orders,
# users as unique_customers
)
The # users becomes COUNT(DISTINCT user_id) — ASQL infers the primary key!
Part 3: Sorting & Limiting¶
Descending Order¶
Use - prefix for descending:
from users
order by -created_at
This is ORDER BY created_at DESC.
Combine with Limit¶
from orders
group by customer_id (sum(amount) as total)
order by -total
limit 10
Top 10 customers by total spend.
Part 4: Joins¶
Join Operators¶
ASQL uses symbolic operators:
| Operator | Type | Meaning |
|---|---|---|
& |
INNER | Both must match |
&? |
LEFT | Right side optional |
?& |
RIGHT | Left side optional |
?&? |
FULL | Both optional |
Example: Left Join¶
from users
&? orders on users.id = orders.user_id
select users.name, orders.amount
The ? marks the "maybe null" side.
Multi-Table Joins¶
from orders
& customers on orders.customer_id = customers.id
& products on orders.product_id = products.id
select customers.name, products.name, orders.amount
Part 5: Date Handling¶
Date Literals¶
Use @ prefix:
from orders
where order_date >= @2024-01-01
Relative Dates¶
from orders
where created_at > 7 days ago
from orders
where due_date < 30 days from now
Date Truncation¶
Group by month:
from orders
group by month(created_at) (
sum_amount -- No alias needed -
)
Part 6: Window Functions¶
Get Previous Value¶
from monthly_sales
order by month
select month, revenue, prior(revenue) as prev_month
Running Totals¶
from daily_sales
order by date
select date, amount, running_sum(amount) as cumulative
First/Last Per Group¶
Most recent order per customer:
from orders
per customer_id first by -order_date
This keeps only the first row (ordered by date descending) for each customer.
Part 7: Sampling (New!)¶
Random Sample¶
Get 100 random rows:
from orders
sample 100
Percentage Sample¶
Get ~10% of rows:
from orders
sample 10%
Stratified Sample¶
100 random rows per category:
from products
sample 100 per category
Part 8: Reshaping Data (New!)¶
Pivot: Rows to Columns¶
from sales
pivot sum_amount by status values ("pending", "shipped", "delivered")
group by customer_id
Creates columns pending, shipped, delivered with sum of amounts.
Unpivot: Columns to Rows¶
from quarterly_data
unpivot q1, q2, q3, q4 into quarter, value
Turns wide data into long format.
Explode: Arrays to Rows¶
from posts
explode tags as tag
group by tag (# as post_count)
One row per array element.
Part 9: Column Operators (New!)¶
Exclude Columns¶
from users
except password_hash, internal_notes
Rename Columns¶
from users
rename id as user_id
Replace Values¶
from users
replace name with upper(name), email with lower(email)
Part 10: CTEs with stash¶
Save Intermediate Results¶
from orders
where status = "completed"
stash as completed_orders
group by customer_id (sum(amount) as total)
order by -total
limit 10
The stash as creates a CTE and continues the pipeline.
Top-Level Variables¶
from users
where is_active
stash as active_users
from active_users
group by country (# as total)
Putting It All Together¶
Here's a complete analytics query:
from orders
where created_at > 30 days ago
where status = "completed"
&? customers on orders.customer_id = customers.id
group by customers.country (
sum_amount, -- Total revenue (column: sum_amount)
# as order_count, -- Number of orders
# customers as unique_customers, -- Distinct customers
avg_amount -- Average order value (column: avg_amount)
)
order by -sum_amount
limit 10
This query: 1. Filters to completed orders from last 30 days 2. Left joins with customers 3. Groups by country 4. Computes revenue, counts, and average 5. Sorts by revenue 6. Takes top 10 7. Selects final columns
Quick Reference¶
| ASQL | SQL |
|---|---|
from users |
SELECT * FROM users |
where x = 1 |
WHERE x = 1 |
order by -col |
ORDER BY col DESC |
# |
COUNT(*) |
# users |
COUNT(DISTINCT user_id) |
& |
INNER JOIN |
&? |
LEFT JOIN |
@2024-01-01 |
DATE '2024-01-01' |
7 days ago |
CURRENT_DATE - INTERVAL '7 days' |
prior(col) |
LAG(col) |
running_sum(col) |
SUM(col) OVER (ROWS UNBOUNDED PRECEDING) |
sample 100 |
ORDER BY RANDOM() LIMIT 100 |
except col |
SELECT * EXCEPT(col) |
Next Steps¶
- Playground — Try these examples interactively
- Syntax Guide — Deep dive into each feature
- Examples — More real-world patterns
- Language Specification — Complete reference
Exercises¶
Try writing these queries in the playground:
- Basic: Get the top 5 countries by user count
- Joins: Find orders with their customer names, sorted by amount
- Dates: Get monthly revenue for 2024
- Window: Add a running total to daily sales
- Sampling: Get 50 random products per category
- Pivot: Turn status counts into columns
Solutions
**1. Top 5 countries:**from users
group by country (# as user_count)
order by -user_count
limit 5
**2. Orders with customers:**
from orders
&? customers on orders.customer_id = customers.id
select orders.id, customers.name, orders.amount
order by -orders.amount
**3. Monthly revenue:**
from orders
where year(created_at) = 2024
group by month(created_at) (
sum_amount -- Column named sum_amount, referenced in order by
)
order by -sum_amount
**4. Running total:**
from daily_sales
order by date
select date, amount, running_sum(amount) as cumulative
**5. Stratified sample:**
from products
sample 50 per category
**6. Status pivot:**
from orders
pivot # by status values ("pending", "shipped", "delivered")
**7. Cohort analysis:**
from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)
This tracks monthly active users by their signup month - a complex query that would take 50+ lines in SQL!
Part 8: Cohort Analysis¶
Cohort analysis is one of the most powerful patterns in analytics, but it's notoriously complex in SQL. ASQL makes it simple.
What is Cohort Analysis?¶
Cohort analysis groups users by when they started (their "cohort") and tracks their behavior over time. For example: - How many users from January 2024 are still active in March 2024? - How does revenue from February signups compare to March signups?
Basic Cohort Query¶
from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)
This single query: - Groups events by month - Assigns users to cohorts based on signup month - Calculates periods since signup - Tracks active users per period
Output:
- cohort_month: When users signed up
- period: Months since signup (0 = signup month, 1 = first month after, etc.)
- active: Active users in that period
- cohort_size: Total users in the cohort
Revenue Cohorts¶
Track revenue by first purchase cohort:
from orders
group by month(order_date) (sum(total) as revenue)
cohort by month(customers.first_order_date)
With Period-over-Period Analysis¶
Add window functions for period-over-period comparisons:
from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)
select
cohort_month,
period,
active,
prior(active) as prev_period_active,
active - prior(active) as change
Lifetime Value (LTV)¶
Calculate cumulative revenue and LTV:
from orders
group by month(order_date) (sum(total) as revenue)
cohort by month(customers.first_order_date)
select
cohort_month,
period,
revenue,
running_sum(revenue) as cumulative_revenue,
running_sum(revenue) / cohort_size as ltv
Why Cohort Analysis Matters¶
SQL (50+ lines):
WITH user_cohorts AS (
SELECT user_id, DATE_TRUNC('month', signup_date) AS cohort_month
FROM users
),
activity_months AS (
SELECT user_id, DATE_TRUNC('month', event_date) AS activity_month
FROM events
GROUP BY 1, 2
),
cohort_sizes AS (
SELECT cohort_month, COUNT(*) AS size
FROM user_cohorts GROUP BY 1
),
cohort_activity AS (
SELECT
uc.cohort_month,
EXTRACT(YEAR FROM AGE(am.activity_month, uc.cohort_month)) * 12 +
EXTRACT(MONTH FROM AGE(am.activity_month, uc.cohort_month)) AS period,
COUNT(DISTINCT uc.user_id) AS active
FROM user_cohorts uc
JOIN activity_months am ON uc.user_id = am.user_id
WHERE am.activity_month >= uc.cohort_month
GROUP BY 1, 2
)
SELECT
ca.cohort_month, cs.size, ca.period,
ca.active, ROUND(ca.active::numeric / cs.size * 100, 1) AS retention
FROM cohort_activity ca
JOIN cohort_sizes cs ON ca.cohort_month = cs.cohort_month
ORDER BY ca.cohort_month, ca.period;
ASQL (3 lines):
from events
group by month(event_date) (count(distinct user_id) as active)
cohort by month(users.signup_date)
94% reduction in complexity! 🎉
Next Steps¶
You've learned the fundamentals of ASQL! To dive deeper:
- Syntax Reference: See syntax documentation for detailed syntax guides
- Examples: Check out more examples for real-world patterns
- Specification: Read the full specification for complete language details
- Playground: Try queries interactively at play.analyticsql.com
Happy querying! 🚀