OLAP

Online Analytical Processing (OLAP) is a category of data processing optimized for complex analytical queries over large datasets. OLAP systems support multidimensional analysis — slicing, dicing, drilling, and pivoting data across multiple dimensions — enabling business intelligence and decision support. OLAP contrasts with OLTP (Online Transaction Processing), which handles high-volume, short-lived read/write transactions.

OLTP vs OLAP

AspectOLTPOLAP
PurposeDay-to-day operationsAnalysis and reporting
QueriesSimple, predefined (INSERT, UPDATE)Complex, ad-hoc (aggregations, joins)
DataCurrent, operationalHistorical, consolidated
SchemaNormalized (3NF)Star Schema / Snowflake Schema
UsersApplication users, clerksAnalysts, executives
VolumeMany small transactionsFew large scans
Response timeMillisecondsSeconds to minutes
OptimizationWrite-optimizedRead-optimized

OLAP Operations

Slice

Select a single value for one dimension, producing a sub-cube with one fewer dimension.

Example: View all sales for Q1 2026 only across all products and regions.

Dice

Select specific values for two or more dimensions, producing a smaller sub-cube.

Example: Sales for Q1 2026 in North America for Electronics.

Drill-Down

Move from a higher level of aggregation to a lower level of detail.

Example: Year → Quarter → Month → Day

Roll-Up (Drill-Up)

Aggregate data by moving from detail to summary.

Example: Day → Month → Quarter → Year

Pivot (Rotate)

Reorient the data cube by swapping axes — e.g., showing products on rows and time on columns, then rotating to time on rows and regions on columns.

OLAP Architectures

MOLAP (Multidimensional OLAP)

Pre-computes and stores data in a specialized multidimensional array (cube). Fast query response but limited scalability and requires pre-computation.

ROLAP (Relational OLAP)

Stores data in a relational database using star or snowflake schemas. Queries are translated to SQL. More scalable but slower than MOLAP.

HOLAP (Hybrid OLAP)

Combines MOLAP for aggregated data with ROLAP for detail-level data. Balances performance and scalability.

The Data Cube

Conceptually, OLAP organizes data as an n-dimensional cube where:

  • Each dimension is an axis (Time, Product, Region, Customer)
  • Each cell contains a measure (revenue, quantity, profit)
  • Dimensions have hierarchies enabling drill-down/roll-up (Year → Quarter → Month)

SQL Support for OLAP

Modern SQL provides constructs for OLAP-style queries:

  • GROUP BY — Basic aggregation
  • Window Functions — Ranking, running totals, moving averages without collapsing rows
  • ROLLUP — Generates subtotals and grand totals across a hierarchy
  • CUBE — Generates all possible subtotal combinations
  • GROUPING SETS — Specifies exactly which groupings to compute
SELECT region, product, SUM(revenue)
FROM sales
GROUP BY ROLLUP (region, product);
-- Returns: (region, product), (region), (grand total)

OLAP Query Examples with Star Schema

Aggregating along dimensions (drill-up to category + year):

SELECT p.Category, t.Year, SUM(f.Sales_Amount) AS Total_Sales
FROM Sales_Fact f
JOIN Product_Dim p ON f.Product_ID = p.Product_ID
JOIN Time_Dim t ON f.Date_ID = t.Date_ID
GROUP BY p.Category, t.Year;

Slicing (filter to one dimension value):

SELECT p.Product_Name, t.Month, SUM(f.Sales_Amount) AS Total_Sales
FROM Sales_Fact f
JOIN Product_Dim p ON f.Product_ID = p.Product_ID
JOIN Time_Dim t ON f.Date_ID = t.Date_ID
WHERE p.Category = 'Electronics'
  AND t.Date BETWEEN '2025-01-01' AND '2025-12-31'
GROUP BY p.Product_Name, t.Month;

OLAP Cube Construction Steps

  1. Identify relevant Fact Table and dimension tables
  2. Select measures — numeric, additive values for the cube cells
  3. Select dimensions and their hierarchies (Year → Quarter → Month → Day)
  4. Define aggregation rules per hierarchy level (SUM for revenue, AVG for ratings)
  5. Load data and pre-calculate aggregations at each hierarchy level
  6. Optimize with indexes, partitions, and materialized views based on expected query patterns

Sources