Cohort Analysis¶

Cohort analysis groups users by a shared characteristic (usually when they "started") and tracks their behavior over time. ASQL's features make cohort queries simple and readable.

What is Cohort Analysis?¶

Cohort analysis tracks the same entities over time periods relative to when they started. Key components:

Cohort: When each entity "started" (signup month, first purchase month, etc.)
Period: Time elapsed since their cohort start (0 = start, 1 = first period after, etc.)
Cohort size: How many entities in each cohort (for retention %)

The Simple Pattern¶

In ASQL, cohort analysis is just computed columns + GROUP BY:

from orders
join users
group by 
  month(users.signup_date) as cohort,
  months_between(orders.created_at, users.signup_date) as period
(
  count(distinct customer_id) as active
)

That's it. ASQL's alias reuse lets you reference cohort and period immediately after defining them.

Why This Works¶

Traditional SQL requires CTEs because you can't reference a computed column in the same SELECT:

-- This DOESN'T work in SQL:
SELECT 
  DATE_TRUNC('month', signup_date) AS cohort,
  DATEDIFF(month, cohort, order_date) AS period  -- ERROR!
FROM ...

ASQL's alias reuse handles this automatically - each alias becomes available to subsequent expressions.

Examples¶

Track monthly active users by their signup month:

from events
join users
group by 
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
order by cohort, period

Output columns: - cohort: Month users signed up - period: Months since signup (0 = signup month, 1 = first month after, etc.) - active: Active users in that period

Revenue Cohort Analysis¶

Track revenue by first purchase cohort:

from orders
join customers
group by 
  month(customers.first_order_date) as cohort,
  months_between(orders.order_date, customers.first_order_date) as period
(
  sum(total) as revenue,
  count(distinct customer_id) as customers
)
order by cohort, period

Weekly Cohorts¶

Track weekly active users with weekly periods:

from events
join users
group by 
  week(users.signup_date) as cohort,
  weeks_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
order by cohort, period

Retention with Cohort Size¶

To calculate retention percentage, you need cohort size. Use a window function:

from events
join users
group by 
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
select *,
  first(active) over cohort as cohort_size,
  round(active::numeric / cohort_size * 100, 1) as retention_pct
order by cohort, period

The first(active) over cohort gets the period 0 value (all users in cohort), which is the cohort size.

Cumulative Metrics (LTV)¶

Use running_sum() for lifetime value:

from orders
join customers
group by 
  month(customers.first_order_date) as cohort,
  months_between(orders.order_date, customers.first_order_date) as period
(
  sum(total) as revenue,
  count(distinct customer_id) as customers
)
select *,
  running_sum(revenue) over cohort as cumulative_revenue,
  first(customers) over cohort as cohort_size,
  running_sum(revenue) over cohort / cohort_size as ltv
order by cohort, period

Self-Cohort (First Activity as Cohort)¶

When the cohort date isn't in a separate table, use a subquery or window function to find each entity's first activity:

from (
  from orders
  select *,
    month(first(created_at) over customer_id) as cohort
)
group by 
  cohort,
  months_between(created_at, cohort) as period
(
  count(distinct customer_id) as active
)
order by cohort, period

Segmented Cohorts¶

Add segment dimensions to compare cohorts across segments:

from events
join users
group by 
  users.channel,
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
order by channel, cohort, period

Period Calculation Functions¶

ASQL provides date difference functions for period calculation:

Function	Returns
`months_between(date1, date2)`	Months between dates
`weeks_between(date1, date2)`	Weeks between dates
`days_between(date1, date2)`	Days between dates

Or use explicit arithmetic:

datediff('month', signup_date, event_date) as period

Gap Filling with Spine¶

For complete cohort matrices (showing 0 for periods with no activity), use spine by:

from events
join users
spine by 
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
order by cohort, period

The spine by ensures you get a row for every (cohort, period) combination.

Common Patterns¶

Retention Table¶

from events
join users
group by 
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
select cohort, period, active,
  first(active) over cohort as cohort_size,
  round(active::numeric / cohort_size * 100, 1) as retention_pct
where period between 0 and 12
order by cohort, period

Churn Analysis¶

from events
join users
group by 
  month(users.signup_date) as cohort,
  months_between(events.event_date, users.signup_date) as period
(
  count(distinct user_id) as active
)
select cohort, period, active,
  first(active) over cohort as cohort_size,
  cohort_size - active as churned,
  round((cohort_size - active)::numeric / cohort_size * 100, 1) as churn_pct
order by cohort, period