Generalized Lookup Index - GIN
GIN (Generalized Inverted Index) is a PostgreSQL Index type designed for values that contain multiple elements — arrays, JSONB documents, full-text tsvector columns, and hstore key-value pairs. It is the most preferred text search index type.
How It Works
GIN builds an inverted index — a mapping from each distinct element (lexeme, array value, JSON key) to the set of rows that contain it. This is the same data structure used by search engines.
graph LR subgraph GIN_Index[GIN Index] A["database"] --> R1["row 3, row 7, row 12"] B["performance"] --> R2["row 7, row 15"] C["scaling"] --> R3["row 3, row 22"] end
Advantages
- Exact matches — Fast lookup for any element contained in a composite value
- Efficient multi-value search — Supports
@>(contains),@@(text search match),?(key exists) - Combinable — Multiple GIN conditions can be efficiently combined with AND/OR
Trade-offs
- Costly to insert/update — Each new row may require updating many index entries (one per element). Better to create the Index after the data is populated, or use
fastupdate(default) which batches index updates. - Larger index size — Can be significantly larger than B-Tree indexes due to the inverted structure
- No range queries — Not useful for
<,>,BETWEEN(use B-Tree for those)
Use Cases
| Data Type | Operator | Example |
|---|---|---|
tsvector | @@ | Full Text Search |
jsonb | @>, ?, ?& | JSON containment queries |
array | @>, <@, && | Array overlap/containment |
hstore | @>, ? | Key-value lookups |
Creating a GIN Index
-- Full text search
CREATE INDEX idx_articles_fts ON articles USING GIN (to_tsvector('english', body));
-- JSONB containment
CREATE INDEX idx_data_gin ON events USING GIN (metadata jsonb_path_ops);
-- Array overlap
CREATE INDEX idx_tags_gin ON posts USING GIN (tags);GIN vs GiST for Text Search
| Aspect | GIN | GiST |
|---|---|---|
| Lookup speed | Faster (exact) | Slower (lossy, requires recheck) |
| Build time | Slower | Faster |
| Index size | Larger | Smaller |
| Update cost | Higher | Lower |
| Best for | Read-heavy, static data | Write-heavy, frequently updated |