Joins

Joins in Drizzle Cube enable you to combine data from multiple tables and cubes, creating rich, interconnected analytics. There are two types of joins: table-level joins within individual cubes and cube-level joins for multi-cube queries.

Overview

Drizzle Cube’s join system leverages Drizzle ORM’s type-safe join capabilities to provide secure, performant data relationships. All joins maintain security context and prevent SQL injection through parameterized queries.

Table-Level Joins

Table-level joins occur within a single cube’s SQL definition, allowing you to join multiple database tables into one logical dataset.

Basic Table Join Structure

sql: (ctx: QueryContext<Schema>): BaseQueryDefinition => ({
  from: employees,
  joins: [
    {
      table: departments,
      on: and(
        eq(employees.departmentId, departments.id),
        eq(departments.organisationId, ctx.securityContext.organisationId)
      ),
      type: 'left'
    }
  ],
  where: eq(employees.organisationId, ctx.securityContext.organisationId)
})

Join Types

Left Join - Most common, includes all records from the main table:

joins: [
  {
    table: departments,
    on: eq(employees.departmentId, departments.id),
    type: 'left'
  }
]

Inner Join - Only records that exist in both tables:

joins: [
  {
    table: departments,
    on: eq(employees.departmentId, departments.id),
    type: 'inner'
  }
]

Right Join - Includes all records from the joined table:

joins: [
  {
    table: departments,
    on: eq(employees.departmentId, departments.id),
    type: 'right'
  }
]

Multi-Table Joins

Join multiple tables in a single cube:

export const productivityCube: Cube<Schema> = defineCube('Productivity', {
  title: 'Productivity Analytics',
  description: 'Employee productivity with department and project data',

  sql: (ctx: QueryContext<Schema>): BaseQueryDefinition => ({
    from: productivity,
    joins: [
      // Join to employees table
      {
        table: employees,
        on: and(
          eq(productivity.employeeId, employees.id),
          eq(employees.organisationId, ctx.securityContext.organisationId)
        ),
        type: 'left'
      },
      // Join to departments through employees
      {
        table: departments,
        on: and(
          eq(employees.departmentId, departments.id),
          eq(departments.organisationId, ctx.securityContext.organisationId)
        ),
        type: 'left'
      },
      // Join to projects table
      {
        table: projects,
        on: and(
          eq(productivity.projectId, projects.id),
          eq(projects.organisationId, ctx.securityContext.organisationId)
        ),
        type: 'left'
      }
    ],
    where: eq(productivity.organisationId, ctx.securityContext.organisationId)
  }),

  dimensions: {
    employeeName: {
      name: 'employeeName',
      title: 'Employee',
      type: 'string',
      sql: employees.name
    },
    departmentName: {
      name: 'departmentName',
      title: 'Department',
      type: 'string',
      sql: departments.name
    },
    projectName: {
      name: 'projectName',
      title: 'Project',
      type: 'string',
      sql: projects.name
    }
  }
  // ... measures
})

Security in Table Joins

Critical: Always include security context filtering in join conditions:

joins: [
  {
    table: departments,
    on: and(
      eq(employees.departmentId, departments.id),
      // REQUIRED: Security filtering on joined table
      eq(departments.organisationId, ctx.securityContext.organisationId)
    ),
    type: 'left'
  }
]

Cube-Level Joins

Cube-level joins define relationships between different cubes, enabling multi-cube queries and cross-cube analysis.

Basic Cube Join Structure

// In the Employees cube
joins: {
  Departments: {
    targetCube: () => departmentsCube,
    relationship: 'belongsTo',
    on: [
      { source: employees.departmentId, target: departments.id }
    ]
  }
}

// In the Productivity cube
joins: {
  Employees: {
    targetCube: () => employeesCube,
    relationship: 'belongsTo',
    on: [
      { source: productivity.employeeId, target: employees.id }
    ]
  },
  Departments: {
    targetCube: () => departmentsCube,
    relationship: 'belongsTo',
    on: [
      { source: productivity.employeeId, target: employees.id },
      { source: employees.departmentId, target: departments.id }
    ]
  }
}

Lazy Reference Pattern

Cube-level joins use lazy references with arrow functions () => cubeName to avoid circular dependency issues:

// Forward declarations prevent circular imports
let employeesCube: Cube<Schema>
let departmentsCube: Cube<Schema>

// Cubes can reference each other bidirectionally
employeesCube = defineCube('Employees', {
  // ... cube definition
  joins: {
    Departments: {
      targetCube: () => departmentsCube,  // Lazy reference
      relationship: 'belongsTo',
      on: [{ source: employees.departmentId, target: departments.id }]
    }
  }
})

departmentsCube = defineCube('Departments', {
  // ... cube definition
  joins: {
    Employees: {
      targetCube: () => employeesCube,  // Bidirectional reference
      relationship: 'hasMany',
      on: [{ source: departments.id, target: employees.departmentId }]
    }
  }
})

Key Benefits:

Type Safety: Full compile-time validation of cube references
No Circular Dependencies: Arrow functions delay resolution until runtime
Bidirectional Joins: Cubes can reference each other in both directions

Relationship Types

belongsTo - Many-to-one relationship:

// Employee belongs to Department
Departments: {
  targetCube: () => departmentsCube,
  relationship: 'belongsTo',
  on: [
    { source: employees.departmentId, target: departments.id }
  ]
}

hasMany - One-to-many relationship:

// Department has many Employees
Employees: {
  targetCube: () => employeesCube,
  relationship: 'hasMany',
  on: [
    { source: departments.id, target: employees.departmentId }
  ]
}

hasOne - One-to-one relationship:

// Employee has one Profile
UserProfiles: {
  targetCube: () => userProfilesCube,
  relationship: 'hasOne',
  on: [
    { source: employees.id, target: userProfiles.employeeId }
  ]
}

Advanced Cube Join Features

Multi-Column Joins - Join on multiple columns:

joins: {
  ProjectTasks: {
    targetCube: () => projectTasksCube,
    relationship: 'hasMany',
    on: [
      { source: projects.id, target: tasks.projectId },
      { source: projects.version, target: tasks.projectVersion }
    ]
  }
}

Custom Comparators - Use different comparison operators:

joins: {
  Activities: {
    targetCube: () => activitiesCube,
    relationship: 'hasMany',
    on: [
      { source: users.id, target: activities.userId },
      {
        source: users.createdAt,
        target: activities.timestamp,
        as: (source, target) => gte(target, source) // timestamp >= user.createdAt
      }
    ]
  }
}

Override SQL Join Type - Force specific SQL join behavior:

joins: {
  Departments: {
    targetCube: () => departmentsCube,
    relationship: 'belongsTo',
    on: [
      { source: employees.departmentId, target: departments.id }
    ],
    sqlJoinType: 'inner' // Override default 'inner' with 'left', 'right', etc.
  }
}

Multi-Cube Query Example

Query data from multiple cubes using cube joins:

const multiCubeQuery = {
  measures: [
    'Employees.count',           // From Employees cube
    'Departments.totalBudget',   // From Departments cube
    'Productivity.avgLinesOfCode' // From Productivity cube
  ],
  dimensions: [
    'Departments.name',          // Group by department
    'Employees.isActive'         // Split by active status
  ],
  timeDimensions: [{
    dimension: 'Productivity.date',
    granularity: 'month'
  }]
}

Advanced Join Patterns

Conditional Joins

Apply conditional logic in join conditions:

sql: (ctx: QueryContext<Schema>): BaseQueryDefinition => ({
  from: orders,
  joins: [
    {
      table: customers,
      on: and(
        eq(orders.customerId, customers.id),
        // Conditional join based on security context
        ctx.securityContext.userRole === 'admin'
          ? sql`true`
          : eq(customers.salesRepId, ctx.securityContext.userId),
        eq(customers.organisationId, ctx.securityContext.organisationId)
      ),
      type: 'left'
    }
  ],
  where: eq(orders.organisationId, ctx.securityContext.organisationId)
})

Self-Joins

Join a table to itself for hierarchical data:

sql: (ctx: QueryContext<Schema>): BaseQueryDefinition => ({
  from: employees,
  joins: [
    {
      table: alias(employees, 'manager'),
      on: and(
        eq(employees.managerId, sql`manager.id`),
        eq(sql`manager.organisationId`, ctx.securityContext.organisationId)
      ),
      type: 'left'
    }
  ],
  where: eq(employees.organisationId, ctx.securityContext.organisationId)
}),

dimensions: {
  name: {
    name: 'name',
    title: 'Employee Name',
    type: 'string',
    sql: employees.name
  },
  managerName: {
    name: 'managerName',
    title: 'Manager Name',
    type: 'string',
    sql: sql`manager.name`
  }
}

Complex Join Conditions

Use complex conditions for specialized joins:

joins: [
  {
    table: productivityTargets,
    on: and(
      eq(employees.id, productivityTargets.employeeId),
      eq(employees.departmentId, productivityTargets.departmentId),
      // Join on date range
      sql`${productivity.date} BETWEEN ${productivityTargets.startDate} AND ${productivityTargets.endDate}`,
      eq(productivityTargets.organisationId, ctx.securityContext.organisationId)
    ),
    type: 'left'
  }
]

Join Resolution and Path Finding

Drizzle Cube automatically resolves join paths between cubes using the JoinPathResolver:

// Automatic path finding from Productivity → Employees → Departments
const query = {
  measures: ['Productivity.totalLinesOfCode'],
  dimensions: ['Departments.name'] // Automatically resolves join path
}

Manual Join Path Control

Control join resolution explicitly:

// Force specific join path with multi-column conditions
joins: {
  Departments: {
    targetCube: () => departmentsCube,
    relationship: 'belongsTo',
    on: [
      // Multi-step join conditions
      { source: productivity.employeeId, target: employees.id },
      { source: employees.departmentId, target: departments.id }
    ],
    // Optional: override SQL join type
    sqlJoinType: 'left'
  }
}

Performance Optimization

Join Order Optimization

Structure joins for optimal performance:

// Good: Start with most selective table
sql: (ctx: QueryContext<Schema>): BaseQueryDefinition => ({
  from: activeEmployees, // Pre-filtered view
  joins: [
    {
      table: departments,
      on: eq(activeEmployees.departmentId, departments.id),
      type: 'left'
    }
  ],
  where: eq(activeEmployees.organisationId, ctx.securityContext.organisationId)
})

Index-Friendly Joins

Ensure join conditions use indexed columns:

joins: [
  {
    table: departments,
    // Good: uses indexed foreign key
    on: eq(employees.departmentId, departments.id),
    type: 'left'
  }
]

Testing Joins

describe('Cube Joins', () => {
  it('should join employees with departments', async () => {
    const query = {
      measures: ['Employees.count'],
      dimensions: ['Employees.departmentName']
    }

    const result = await semanticLayer.load(query, {
      organisationId: 'test-org'
    })

    const data = result.rawData()
    expect(data.every(row => row['Employees.departmentName'])).toBeTruthy()
  })

  it('should handle multi-cube queries', async () => {
    const query = {
      measures: [
        'Employees.count',
        'Productivity.avgLinesOfCode'
      ],
      dimensions: ['Departments.name']
    }

    const result = await semanticLayer.load(query, {
      organisationId: 'test-org'
    })

    const data = result.rawData()
    expect(data).toHaveLength(3) // 3 departments
    data.forEach(row => {
      expect(row['Employees.count']).toBeGreaterThan(0)
      expect(row['Productivity.avgLinesOfCode']).toBeGreaterThan(0)
    })
  })
})

Best Practices

Security First: Always include security context in join conditions
Performance: Use indexed columns for join conditions
Type Safety: Leverage Drizzle’s type system for join validation
Clarity: Use descriptive relationship names
Testing: Verify join correctness and security isolation
Documentation: Document complex join logic
Optimization: Structure joins for query performance

Common Patterns

Basic Foreign Key Join

joins: [
  {
    table: departments,
    on: eq(employees.departmentId, departments.id),
    type: 'left'
  }
]

Secure Multi-Table Join

joins: [
  {
    table: departments,
    on: and(
      eq(employees.departmentId, departments.id),
      eq(departments.organisationId, ctx.securityContext.organisationId)
    ),
    type: 'left'
  }
]

Cube-Level Relationship

joins: {
  Departments: {
    targetCube: () => departmentsCube,
    relationship: 'belongsTo',
    on: [
      { source: employees.departmentId, target: departments.id }
    ]
  }
}

Troubleshooting

Join Issues

Problem: Duplicate records in results Solution: Check for many-to-many relationships and use appropriate aggregation

Problem: Missing data after join Solution: Verify join type (left vs inner) and foreign key integrity

Problem: Security context not applied Solution: Ensure all joined tables include security filtering

Performance Issues

Problem: Slow join queries Solution: Add database indexes on join columns and optimize join order

Problem: Cartesian products Solution: Verify join conditions are specific enough

Next Steps

Learn about Security patterns for multi-tenant systems
Explore Cubes for complete cube definitions
Understand Dimensions and Measures
Review database indexing strategies for optimal join performance

Roadmap Ideas

Visual join relationship designer
Automatic join path optimization suggestions
Join performance analysis tools
Advanced relationship types (polymorphic, conditional)
Join validation and testing framework