Pagination with Elasticsearch vs SQL
Introduction
Pagination is a core feature of search and listing experiences. Traditional SQL pagination (LIMIT/OFFSET) is simple but degrades at high offsets and can return inconsistent results as data changes. Elasticsearch offers alternative pagination models (search_after, scroll, point-in-time) that improve latency and consistency for large, sorted result sets. This guide compares approaches and outlines when to choose Elasticsearch over direct SQL.
When to Consider Elasticsearch for Pagination
- Search-heavy read patterns with flexible filtering and relevance scoring.
- Very large datasets where deep pagination is common (page > 100, offset in the tens of thousands).
- Need for full-text search, stemming, fuzzy matching, or custom scoring.
- Requirements to keep pagination stable while the underlying data is mutating.
- Multi-field sorting (e.g., relevance desc, created_at desc) with fast response times.
High-Level Architecture
graph TD
Client[Web/Mobile/App] --> API[API Gateway]
API --> SearchSvc[Search Service]
SearchSvc --> ES[(Elasticsearch Cluster)]
API --> DB[(SQL DB)]
- Search path (Elasticsearch): Used for search-driven listing with stable cursor-based pagination.
- SQL path: Used for transactional writes and small, strongly consistent lists (e.g., user-owned items).
- Data pipeline (not shown) feeds SQL changes into Elasticsearch via CDC or change streams.
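The indexing side of that pipeline can stay small. Below is a minimal sketch in Python, assuming a CDC consumer (Debezium, triggers, or change streams) hands over one row-level change event at a time; the `products` index name, the event shape, and the unsecured local cluster URL are illustrative assumptions, not part of the architecture above.

```python
# Minimal sketch of the SQL -> Elasticsearch ingestion step. The event shape,
# index name, and cluster URL are illustrative assumptions.
import requests

ES_URL = "http://localhost:9200"  # assumed local, unsecured cluster


def apply_change(event: dict) -> None:
    """Mirror a single row-level change into the search index."""
    doc_id = event["id"]
    if event["op"] == "delete":
        # Deleting a document that never reached the index returns 404; a real
        # consumer would log and continue.
        requests.delete(f"{ES_URL}/products/_doc/{doc_id}")
    else:
        # Insert or update: index the full row as the document.
        requests.put(
            f"{ES_URL}/products/_doc/{doc_id}",
            json={"title": event["title"], "created_at": event["created_at"]},
        ).raise_for_status()
```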
SQL Pagination Approaches
1) Offset/Limit (Simple, but Slow at High Offsets)
SELECT id, title, created_at
FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;
- Pros: Easiest to implement.
- Cons: Late pages require scanning and discarding many rows; latency grows with offset; unstable if rows are inserted/deleted mid-scan.
2) Keyset (Seek) Pagination (Recommended in SQL)
-- Use the last seen created_at/id as a cursor
SELECT id, title, created_at
FROM products
WHERE created_at < $last_seen_ts
ORDER BY created_at DESC
LIMIT 20;
- Pros: O(page_size) instead of O(offset); stable ordering; good for append-only timelines.
- Cons: Requires deterministic sort key (often compound: created_at, id); harder to jump to arbitrary pages.
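To make the tie-breaking concrete, here is a minimal keyset-pagination sketch in Python, assuming PostgreSQL and psycopg2; the table and columns follow the example above, and the `(created_at, id)` row-value comparison handles rows that share a timestamp. Connection details are placeholders.

```python
# Keyset pagination with a composite (created_at, id) cursor -- a sketch
# assuming PostgreSQL and psycopg2; table and columns follow the example above.
import psycopg2

conn = psycopg2.connect("dbname=shop")  # connection string is a placeholder


def fetch_page(conn, last_seen=None, page_size=20):
    """last_seen is (created_at, id) taken from the previous page's final row."""
    with conn.cursor() as cur:
        if last_seen is None:  # first page: no cursor yet
            cur.execute(
                """SELECT id, title, created_at FROM products
                   ORDER BY created_at DESC, id DESC LIMIT %s""",
                (page_size,),
            )
        else:
            cur.execute(
                """SELECT id, title, created_at FROM products
                   WHERE (created_at, id) < (%s, %s)
                   ORDER BY created_at DESC, id DESC LIMIT %s""",
                (*last_seen, page_size),
            )
        rows = cur.fetchall()
    # The next cursor is the sort key of the last row, or None when exhausted.
    next_cursor = (rows[-1][2], rows[-1][0]) if rows else None
    return rows, next_cursor
```

A composite index on `(created_at DESC, id DESC)` lets this run as an index range scan regardless of how deep the user has paged.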
Elasticsearch Pagination Approaches
1) from/size (Offset)
{
"from": 10000,
"size": 20,
"sort": [{ "created_at": "desc" }, { "id": "desc" }]
}
- Pros: Familiar, works for shallow pages.
- Cons: Expensive for deep pages; `index.max_result_window` (default 10,000) limits depth; re-sorting large windows hurts latency and memory.
2) search_after (Cursor-Based, Preferred)
{
"size": 20,
"sort": [{ "created_at": "desc" }, { "id": "desc" }],
"search_after": ["2025-01-01T00:00:00Z", "987654321"]
}
- Use the sort values of the last hit as the cursor.
- Pros: Stable ordering with per-page cost independent of depth; no deep window re-sorting; pairs well with point-in-time (PIT) to keep a consistent view.
- Cons: Cannot jump to arbitrary page numbers; clients must hold cursor tokens.
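As a concrete illustration of the cursor hand-off, the sketch below pages through an assumed `products` index over the REST API using the `requests` library; the cluster URL and the `match_all` query are placeholders.

```python
# Paging with search_after -- a sketch against an assumed local cluster and
# "products" index. The cursor for each page is simply the sort array of the
# previous page's last hit.
import requests

ES_URL = "http://localhost:9200"


def search_page(cursor=None, page_size=20):
    body = {
        "size": page_size,
        "sort": [{"created_at": "desc"}, {"id": "desc"}],
        "query": {"match_all": {}},
    }
    if cursor is not None:  # omit search_after on the first page
        body["search_after"] = cursor
    resp = requests.post(f"{ES_URL}/products/_search", json=body)
    resp.raise_for_status()
    hits = resp.json()["hits"]["hits"]
    next_cursor = hits[-1]["sort"] if hits else None  # sort values of last hit
    return hits, next_cursor


# Usage: walk forward page by page until the index is exhausted.
hits, cursor = search_page()
while cursor:
    hits, cursor = search_page(cursor)
```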
3) Point-in-Time (PIT) + search_after (Stable Snapshot)
POST /my-index/_pit?keep_alive=1m
{ }
POST /_search
{
"pit": { "id": "<pit-id>", "keep_alive": "1m" },
"size": 20,
"sort": [{ "created_at": "desc" }, { "id": "desc" }],
"search_after": ["2025-01-01T00:00:00Z", "987654321"]
}
- Pros: Stable snapshot across pages even while index mutates; ideal for user-facing pagination.
- Cons: PITs consume cluster resources; choose a sensible `keep_alive` and close unused PITs.
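The same calls translate directly into client code. The sketch below, again assuming a local cluster and a `products` index, opens a PIT, runs one page with it, and closes it when done, which is the cleanup step the Cons bullet refers to.

```python
# PIT lifecycle around a paginated search -- a sketch using the REST API,
# assuming the same local cluster and "products" index as above.
import requests

ES_URL = "http://localhost:9200"

# Open a point in time; its id pins a consistent view of the index.
pit_id = requests.post(
    f"{ES_URL}/products/_pit", params={"keep_alive": "1m"}
).json()["id"]

try:
    body = {
        "size": 20,
        "sort": [{"created_at": "desc"}, {"id": "desc"}],
        "pit": {"id": pit_id, "keep_alive": "1m"},  # note: no index in the URL
    }
    resp = requests.post(f"{ES_URL}/_search", json=body)
    resp.raise_for_status()
    hits = resp.json()["hits"]["hits"]
finally:
    # Release cluster resources as soon as the pagination session ends.
    requests.delete(f"{ES_URL}/_pit", json={"id": pit_id})
```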
4) Scroll API (Batch/Data Export)
- Designed for large batch exports and reindexing, not user-facing pagination.
- Holds resources on the cluster; avoid for interactive flows.
Comparison: Elasticsearch vs SQL for Pagination
| Dimension | Elasticsearch (search_after/PIT) | SQL (Keyset) |
|---|---|---|
| Deep pagination latency | Stable; no OFFSET scan | Stable; no OFFSET scan |
| Arbitrary page jumps | Not supported | Not supported (keyset) |
| Text search & relevance | Strong (analyzers, scoring) | Weak without extensions |
| Multi-field sort at scale | Efficient doc values | Depends on indexes; can be costly |
| Consistency | Snapshot isolation with PIT; eventual between ES and source | Strong within DB; exact source of truth |
| Operational complexity | Needs ingestion pipeline, index tuning | Simpler ops; fewer moving parts |
| Result stability under writes | PIT keeps view stable; without PIT may drift | Stable within transaction; may drift across pages without locks |
Design Recommendations
- Use Elasticsearch + search_after + PIT for search or browse experiences where users navigate many pages and expect stable ordering.
- Use SQL keyset pagination for transactional lists, admin tables, or wherever exact source-of-truth consistency is required and pagination depth stays shallow (a few hundred rows at most).
- Avoid offset-based pagination at large offsets in both systems; reserve it for shallow pages.
- Define a deterministic composite sort key (e.g., `created_at desc, id desc`) to prevent tie-related duplication or skipping.
- Set guardrails: cap maximum pages, provide "load more" rather than numeric page jumps, and expire cursors.
Operational Considerations (Elasticsearch)
- Index mappings: Store sortable fields as keyword/numeric/date with `doc_values` enabled (see the mapping sketch after this list).
- Hot-warm-cold tiers: Keep recently active data on hot nodes for faster pagination.
- Circuit breakers: Monitor heap usage; large `size` values increase memory pressure.
- Refresh vs. consistency: The default refresh interval (1s) is usually enough; PIT stabilizes reads across pages.
- Ingestion lag: Monitor CDC/ingest latency; stale data impacts user experience.
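For the mapping point above, here is a sketch of an index definition with sort-friendly field types; the field names mirror the examples in this guide and the cluster URL is a placeholder.

```python
# Illustrative mapping for the fields this guide sorts on: created_at as a
# date and id as a keyword, both with doc_values (the default) so sorting
# reads column-oriented data instead of heap-resident fielddata.
import requests

ES_URL = "http://localhost:9200"

mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},        # analyzed for full-text search
            "created_at": {"type": "date"},   # primary sort key
            "id": {"type": "keyword"},        # tie-breaker sort key
        }
    }
}
requests.put(f"{ES_URL}/products", json=mapping).raise_for_status()
```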
Example End-to-End Flow (Search Service)
- API receives request with optional cursor token.
- If no cursor, create a PIT (keep_alive ~1m) and run the initial search with `size` and `sort`.
- Return results plus:
  - `next_cursor`: the last hit's sort values and the PIT id.
  - `pit_keep_alive`: the recommended keep_alive for the client to reuse.
- On the next page, the client sends `next_cursor`; the service reuses the PIT and issues `search_after`.
- Close the PIT early when the user ends the session, or let it expire after the timeout.
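A compact sketch of that flow follows, with the cursor token encoded as base64 JSON carrying the PIT id and the last hit's sort values; the token format, index name, and cluster URL are illustrative assumptions, not a prescribed wire format.

```python
# Cursor-token handling for the flow above -- a sketch. The opaque token is
# base64-encoded JSON carrying the PIT id and the last hit's sort values.
import base64
import json
import requests

ES_URL = "http://localhost:9200"


def encode_cursor(pit_id, sort_values):
    return base64.urlsafe_b64encode(
        json.dumps({"pit": pit_id, "after": sort_values}).encode()
    ).decode()


def decode_cursor(token):
    return json.loads(base64.urlsafe_b64decode(token))


def next_page(cursor_token=None, page_size=20):
    if cursor_token is None:
        # First page: open a fresh PIT for this pagination session.
        pit_id = requests.post(
            f"{ES_URL}/products/_pit", params={"keep_alive": "1m"}
        ).json()["id"]
        after = None
    else:
        state = decode_cursor(cursor_token)
        pit_id, after = state["pit"], state["after"]

    body = {
        "size": page_size,
        "sort": [{"created_at": "desc"}, {"id": "desc"}],
        "pit": {"id": pit_id, "keep_alive": "1m"},
    }
    if after is not None:
        body["search_after"] = after
    hits = requests.post(f"{ES_URL}/_search", json=body).json()["hits"]["hits"]

    if not hits:
        # No more results: close the PIT and signal the end of the listing.
        requests.delete(f"{ES_URL}/_pit", json={"id": pit_id})
        return [], None
    return hits, encode_cursor(pit_id, hits[-1]["sort"])
```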
Checklist
- Use deterministic composite sort keys for pagination.
- Prefer `search_after` over `from`/`size` for deep pages.
- Pair `search_after` with PIT for stable views.
- Keep `size` reasonable (e.g., <= 100) to avoid heavy heap usage.
- Cap max pagination depth; offer "load more" UX instead of page numbers.
- Monitor ingest lag and PIT counts; expire unused PITs.
- Use SQL keyset pagination for transactional, strongly consistent lists.
- Avoid Scroll API for interactive pagination.