Fact Table

A fact table is the central table in a Star Schema or Snowflake Schema, storing quantitative measures (metrics) and foreign keys to dimension tables. It represents the events or transactions being analyzed — each row is one measurable occurrence.

Structure

Column Type	Purpose	Examples
Foreign keys	Link to dimensions (who, what, where, when)	`date_key`, `product_key`, `store_key`
Measures	Numeric values to aggregate	`revenue`, `quantity`, `cost`, `duration`
Degenerate dimensions	Dimension attributes without their own table	`invoice_number`, `order_number`

Fact Types

Type	Grain	Example
Transaction	One row per event	Each sale, each click, each shipment
Periodic snapshot	One row per time period	Daily account balance, monthly inventory
Accumulating snapshot	One row per entity lifecycle	Order tracking (ordered → shipped → delivered)

Measure Types

Additive — Can be summed across all dimensions (revenue, quantity). Most common.
Semi-additive — Can be summed across some dimensions but not time (account balance — average over time, sum across accounts).
Non-additive — Cannot be summed (ratios, percentages). Must be computed from additive components.

Granularity

Granularity defines what one row in the fact table represents — the most important design decision:

Fine grain (e.g., individual transactions) → More detail, larger table, flexible drill-down
Coarse grain (e.g., monthly summaries) → Smaller table, faster aggregation, but can’t drill to detail

Factors: business requirements (what questions must be answerable?), query performance, storage costs, data availability/quality.

Factless Fact Tables

A fact table with no measures — it records the occurrence of events or relationships between dimensions:

CREATE TABLE enrollment_fact (
    student_id INT REFERENCES dim_student(id),
    course_id INT REFERENCES dim_course(id),
    term_id INT REFERENCES dim_term(id),
    PRIMARY KEY (student_id, course_id, term_id)
    -- No measure columns!
);

Enables questions like “How many CS majors take 4+ courses in Fall 2025?” without querying the live OLTP database. Key advantage: can have indexes on all columns since factless fact tables are refreshed in batch (unlike operational tables where many indexes would slow writes during active registration).

Sparsity

When not all dimension combinations have facts, the table becomes sparse (many nulls/zeros):

Design	Strategy	Trade-off
Sparse	Store only non-null records	Smaller table, more complex queries (outer joins, COALESCE)
Dense	Store all possible combinations with null placeholders	Simpler queries, much larger table
Hybrid	Dense for low-sparsity dimensions, sparse for high-sparsity	Balanced

One Big Table (OBT) Pattern

An alternative to star schema — denormalize everything into a single wide table (hundreds of columns):

Advantages: No joins, simpler queries, faster reads, easier data ingestion
Disadvantages: Massive redundancy, consistency challenges, adding new dimensions requires schema change, limited governance
Best for: Big data environments (Apache Hive, Spark, BigQuery) where joins are expensive across distributed nodes
Significant limitation: Cannot add new dimensions without schema migration (unless using schema-less NoSQL, but then SQL analytics aren’t possible)

Performance Optimization

Seven techniques for fast analytical queries:

Indexing — Composite indexes on frequently co-used dimension keys
Partitioning — Time-based (year/quarter/month) or geography-based; see Database Sharding and Partitioning
Aggregation tables — Separate pre-calculated summary tables at higher hierarchy levels
Materialized views — Pre-computed results of frequent complex queries
Query optimization — Use explain plans, filter early, proper join conditions
Caching — Redis/Memcached for frequently accessed results
Parallel processing — Distributed computing (Hadoop/Spark) for concurrent execution across partitions

Quartz 5

Explorer

Fact Table

Fact Table

Structure

Fact Types

Measure Types

Granularity

Factless Fact Tables

Sparsity

One Big Table (OBT) Pattern

Performance Optimization

Graph View

Table of Contents

Backlinks