Statistical & Window Functions
Drizzle Cube supports advanced statistical functions and window functions for analytics workloads. These measures enable calculations like standard deviation, variance, percentiles, moving averages, and ranking functions.
Database Support
Section titled “Database Support”Statistical and window function support varies by database. Drizzle Cube handles this gracefully - unsupported functions return NULL with a warning rather than crashing your queries.
| Function Type | PostgreSQL | MySQL 8.0+ | SQLite |
|---|---|---|---|
| STDDEV (population/sample) | Full | Full | Not supported |
| VARIANCE (population/sample) | Full | Full | Not supported |
| PERCENTILE (median, P95, P99) | Full | Not supported | Not supported |
| Window Functions (LAG, LEAD, RANK, etc.) | Full | Full | SQLite 3.25+ |
Statistical Measures
Section titled “Statistical Measures”Standard Deviation
Section titled “Standard Deviation”Standard deviation measures the amount of variation in your data. Use stddev for population standard deviation or stddevSamp for sample standard deviation.
import { defineCube } from 'drizzle-cube/server'
const ordersCube = defineCube('Orders', { sql: (ctx) => ({ from: orders, where: eq(orders.organisationId, ctx.securityContext.organisationId) }),
measures: { // Population standard deviation stddevAmount: { name: 'stddevAmount', title: 'Order Amount Std Dev', type: 'stddev', sql: () => orders.amount, description: 'Standard deviation of order amounts (population)' },
// Sample standard deviation stddevSampAmount: { name: 'stddevSampAmount', title: 'Order Amount Std Dev (Sample)', type: 'stddevSamp', sql: () => orders.amount, description: 'Standard deviation of order amounts (sample)' },
// Using statisticalConfig for sample vs population configuredStddev: { name: 'configuredStddev', title: 'Configured Std Dev', type: 'stddev', sql: () => orders.amount, statisticalConfig: { useSample: true // Use sample standard deviation } } }})When to use which:
- Population (
stddev): When your data represents the entire population you’re analyzing - Sample (
stddevSamp): When your data is a sample of a larger population
Variance
Section titled “Variance”Variance measures the spread of data points. It’s the square of the standard deviation.
measures: { // Population variance varianceAmount: { name: 'varianceAmount', title: 'Order Amount Variance', type: 'variance', sql: () => orders.amount, description: 'Variance of order amounts (population)' },
// Sample variance varianceSampAmount: { name: 'varianceSampAmount', title: 'Order Amount Variance (Sample)', type: 'varianceSamp', sql: () => orders.amount, description: 'Variance of order amounts (sample)' },
// Using statisticalConfig configuredVariance: { name: 'configuredVariance', title: 'Configured Variance', type: 'variance', sql: () => orders.amount, statisticalConfig: { useSample: true } }}Percentiles
Section titled “Percentiles”Percentiles help you understand data distribution. Drizzle Cube provides convenient shortcuts for common percentiles plus a configurable percentile type.
measures: { // Median (P50) - middle value medianAmount: { name: 'medianAmount', title: 'Median Order Amount', type: 'median', sql: () => orders.amount, description: 'The middle value of all order amounts' },
// 95th percentile - useful for SLA monitoring p95ResponseTime: { name: 'p95ResponseTime', title: 'P95 Response Time', type: 'p95', sql: () => requests.responseTime, description: '95% of requests complete within this time' },
// 99th percentile - tail latency p99ResponseTime: { name: 'p99ResponseTime', title: 'P99 Response Time', type: 'p99', sql: () => requests.responseTime, description: '99% of requests complete within this time' },
// Custom percentile (any value 0-100) p75Amount: { name: 'p75Amount', title: '75th Percentile Amount', type: 'percentile', sql: () => orders.amount, statisticalConfig: { percentile: 75 // Custom percentile value } },
// Quartiles for box plots q1Amount: { name: 'q1Amount', title: 'Q1 (25th percentile)', type: 'percentile', sql: () => orders.amount, statisticalConfig: { percentile: 25 } },
q3Amount: { name: 'q3Amount', title: 'Q3 (75th percentile)', type: 'percentile', sql: () => orders.amount, statisticalConfig: { percentile: 75 } }}Window Functions (Post-Aggregation)
Section titled “Window Functions (Post-Aggregation)”Window functions perform calculations across rows of aggregated data. They’re useful for period-over-period comparisons, rankings, running totals, and trend analysis.
Important: Window functions in drizzle-cube operate on aggregated data. This means:
- Your base measure is first aggregated (e.g., SUM revenue grouped by month)
- The window function is then applied to those aggregated results (e.g., LAG to compare months)
This follows the typical analytics pattern where you want to compare aggregated values across time periods or categories.
Defining Post-Aggregation Window Functions
Section titled “Defining Post-Aggregation Window Functions”Window functions reference a base measure via windowConfig.measure and specify an operation for how to combine the current value with the window result:
measures: { // First, define the base aggregate measure totalRevenue: { name: 'totalRevenue', title: 'Total Revenue', type: 'sum', sql: () => orders.amount },
// Then define window functions that reference it revenueChange: { name: 'revenueChange', title: 'Revenue Change', type: 'lag', windowConfig: { measure: 'totalRevenue', // Reference the base measure operation: 'difference', // current - previous orderBy: [{ field: 'date', direction: 'asc' }] } }}Operations
Section titled “Operations”The operation property controls how the window function result is combined with the current value:
difference(default for lag/lead):current - window- Shows the changeratio:current / window- Shows relative sizepercentChange:((current - window) / window) * 100- Shows percentage changeraw(default for rank/rowNumber): Returns the window function result directly
LAG and LEAD
Section titled “LAG and LEAD”Compare aggregated values with previous or next periods:
measures: { // Base aggregate measure totalRevenue: { name: 'totalRevenue', type: 'sum', sql: () => orders.amount },
// Month-over-month revenue change revenueChange: { name: 'revenueChange', title: 'Revenue Change (vs Previous)', type: 'lag', windowConfig: { measure: 'totalRevenue', operation: 'difference', orderBy: [{ field: 'date', direction: 'asc' }] } },
// Previous period's revenue (raw value) previousRevenue: { name: 'previousRevenue', title: 'Previous Period Revenue', type: 'lag', windowConfig: { measure: 'totalRevenue', operation: 'raw', orderBy: [{ field: 'date', direction: 'asc' }] } },
// Percent change from previous period revenuePercentChange: { name: 'revenuePercentChange', title: 'Revenue % Change', type: 'lag', windowConfig: { measure: 'totalRevenue', operation: 'percentChange', orderBy: [{ field: 'date', direction: 'asc' }] } },
// Compare to next period (look ahead) changeToNext: { name: 'changeToNext', title: 'Change to Next Period', type: 'lead', windowConfig: { measure: 'totalRevenue', operation: 'difference', orderBy: [{ field: 'date', direction: 'asc' }] } }}Ranking Functions
Section titled “Ranking Functions”Rank aggregated values:
measures: { totalSales: { name: 'totalSales', type: 'sum', sql: () => orders.amount },
// Rank periods by total sales (1 = highest) salesRank: { name: 'salesRank', title: 'Sales Rank', type: 'rank', windowConfig: { measure: 'totalSales', operation: 'raw', orderBy: [{ field: 'totalSales', direction: 'desc' }] } },
// Rank with partitioning regionSalesRank: { name: 'regionSalesRank', title: 'Rank within Region', type: 'rank', windowConfig: { measure: 'totalSales', operation: 'raw', partitionBy: ['region'], orderBy: [{ field: 'totalSales', direction: 'desc' }] } }}Moving Averages and Running Totals
Section titled “Moving Averages and Running Totals”Calculate rolling aggregations on already-aggregated data:
measures: { dailyRevenue: { name: 'dailyRevenue', type: 'sum', sql: () => orders.amount },
// 7-period moving average movingAvg7Period: { name: 'movingAvg7Period', title: '7-Period Moving Avg', type: 'movingAvg', windowConfig: { measure: 'dailyRevenue', operation: 'raw', orderBy: [{ field: 'date', direction: 'asc' }], frame: { type: 'rows', start: 6, // 6 preceding periods end: 'current' } } },
// Running total (cumulative sum) runningTotal: { name: 'runningTotal', title: 'Running Total', type: 'movingSum', windowConfig: { measure: 'dailyRevenue', operation: 'raw', orderBy: [{ field: 'date', direction: 'asc' }], frame: { type: 'rows', start: 'unbounded', end: 'current' } } }}Querying Window Functions
Section titled “Querying Window Functions”Window functions are queried alongside their base measures:
const query = { measures: [ 'Orders.totalRevenue', // Base measure 'Orders.revenueChange', // Window function 'Orders.revenuePercentChange' ], timeDimensions: [{ dimension: 'Orders.createdAt', granularity: 'month', dateRange: 'last 12 months' }]}The base measure (totalRevenue) is automatically included if not explicitly requested.
Example: Complete Analytics Dashboard
Section titled “Example: Complete Analytics Dashboard”const analyticsCube = defineCube('Analytics', { sql: (ctx) => ({ from: orders, where: eq(orders.organisationId, ctx.securityContext.organisationId) }),
measures: { // Base aggregates totalRevenue: { name: 'totalRevenue', type: 'sum', sql: () => orders.amount }, orderCount: { name: 'orderCount', type: 'count', sql: () => orders.id },
// Period-over-period analysis revenueChange: { name: 'revenueChange', title: 'Revenue Change', type: 'lag', windowConfig: { measure: 'totalRevenue', operation: 'difference', orderBy: [{ field: 'date', direction: 'asc' }] } }, revenueGrowthPercent: { name: 'revenueGrowthPercent', title: 'Growth %', type: 'lag', windowConfig: { measure: 'totalRevenue', operation: 'percentChange', orderBy: [{ field: 'date', direction: 'asc' }] } },
// Rankings revenueRank: { name: 'revenueRank', title: 'Revenue Rank', type: 'rank', windowConfig: { measure: 'totalRevenue', operation: 'raw', orderBy: [{ field: 'totalRevenue', direction: 'desc' }] } },
// Cumulative analysis cumulativeRevenue: { name: 'cumulativeRevenue', title: 'Cumulative Revenue', type: 'movingSum', windowConfig: { measure: 'totalRevenue', operation: 'raw', orderBy: [{ field: 'date', direction: 'asc' }], frame: { type: 'rows', start: 'unbounded', end: 'current' } } } },
dimensions: { date: { name: 'date', type: 'time', sql: () => orders.createdAt } }})Window Configuration Reference
Section titled “Window Configuration Reference”The windowConfig object supports these options:
interface WindowConfig { // Required: Reference to the base aggregate measure measure: string
// How to combine with window result operation?: 'raw' | 'difference' | 'ratio' | 'percentChange' // Defaults: 'difference' for lag/lead, 'raw' for others
// Dimension references to partition by partitionBy?: string[]
// How to order rows within each partition orderBy?: Array<{ field: string direction: 'asc' | 'desc' }>
// For LAG/LEAD: number of periods to offset (default: 1) offset?: number
// For LAG/LEAD: default value when offset is out of bounds defaultValue?: any
// For NTILE: number of buckets nTile?: number
// Frame specification for moving aggregates frame?: { type: 'rows' | 'range' start: number | 'unbounded' end: number | 'current' | 'unbounded' }}Graceful Degradation
Section titled “Graceful Degradation”When a statistical function isn’t supported by your database, Drizzle Cube handles it gracefully:
- Logs a warning to the console:
[drizzle-cube] stddev not supported on sqlite, returning NULL - Returns NULL for that measure
- Continues executing the rest of the query
This means your queries won’t crash - they’ll just have NULL values for unsupported measures.
Checking Database Capabilities
Section titled “Checking Database Capabilities”You can check what your database supports:
import { createDatabaseExecutor } from 'drizzle-cube/server'
const executor = createDatabaseExecutor(db, schema)const capabilities = executor.databaseAdapter.getCapabilities()
console.log(capabilities)// {// supportsStddev: true,// supportsVariance: true,// supportsPercentile: true,// supportsWindowFunctions: true,// supportsFrameClause: true// }Conditional Measure Definitions
Section titled “Conditional Measure Definitions”You can conditionally define measures based on capabilities:
const capabilities = executor.databaseAdapter.getCapabilities()
const measures: Record<string, any> = { count: { name: 'count', type: 'count', sql: () => table.id }}
// Only add percentile measures if supportedif (capabilities.supportsPercentile) { measures.medianAmount = { name: 'medianAmount', type: 'median', sql: () => table.amount }}
const cube = defineCube('Orders', { sql: (ctx) => ({ from: table, where: eq(table.orgId, ctx.securityContext.organisationId) }), measures})Database-Specific SQL Generated
Section titled “Database-Specific SQL Generated”PostgreSQL
Section titled “PostgreSQL”-- STDDEVSELECT COALESCE(STDDEV_POP(amount), 0) AS "stddevAmount" FROM orders
-- VARIANCESELECT COALESCE(VAR_SAMP(amount), 0) AS "varianceAmount" FROM orders
-- PERCENTILESELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY response_time) AS "p95ResponseTime"
-- Window function (LAG)SELECT LAG(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY date ASC) AS "previousDayAmount"
-- Moving averageSELECT AVG(amount) OVER (ORDER BY date ASC ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS "movingAvg7Day"MySQL 8.0+
Section titled “MySQL 8.0+”-- STDDEV (uses IFNULL instead of COALESCE)SELECT IFNULL(STDDEV_POP(amount), 0) AS `stddevAmount` FROM orders
-- VARIANCESELECT IFNULL(VAR_SAMP(amount), 0) AS `varianceAmount` FROM orders
-- Window functions work the same as PostgreSQLSELECT LAG(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY date ASC) AS `previousDayAmount`SQLite
Section titled “SQLite”-- Statistical functions return NULL (not supported)SELECT MAX(NULL) AS "stddevAmount" FROM orders
-- Window functions ARE supported (SQLite 3.25+)SELECT LAG(amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY date ASC) AS "previousDayAmount"Use Cases
Section titled “Use Cases”SLA Monitoring
Section titled “SLA Monitoring”measures: { p50ResponseTime: { type: 'median', sql: () => requests.responseTime }, p95ResponseTime: { type: 'p95', sql: () => requests.responseTime }, p99ResponseTime: { type: 'p99', sql: () => requests.responseTime }, stddevResponseTime: { type: 'stddev', sql: () => requests.responseTime }}Financial Analysis
Section titled “Financial Analysis”measures: { avgDailyVolume: { type: 'avg', sql: () => trades.volume }, volatility: { type: 'stddev', sql: () => trades.priceChange }, movingAvg20Day: { type: 'movingAvg', sql: () => trades.price, windowConfig: { orderBy: [{ field: 'date', direction: 'asc' }], frame: { type: 'rows', start: 19, end: 'current' } } }}Sales Rankings
Section titled “Sales Rankings”measures: { salesRank: { type: 'rank', windowConfig: { partitionBy: ['region'], orderBy: [{ field: 'revenue', direction: 'desc' }] } }, percentileRank: { type: 'percentile', sql: () => sales.revenue, statisticalConfig: { percentile: 90 } }}Best Practices
Section titled “Best Practices”- Check database support before relying on statistical functions in production
- Use meaningful defaults for LAG/LEAD when rows might not exist
- Consider performance - window functions can be expensive on large datasets
- Test across databases if you support multiple database backends
- Document measures clearly since statistical measures can be complex
Next Steps
Section titled “Next Steps”- Learn about Calculated Measures for custom formulas
- Explore Time Dimensions for time-series analysis
- Review Performance tips for optimizing statistical queries