Scriptum

Full-text search with branching

Git-like branching for Apache Lucene. Fork a 100GB index in milliseconds by sharing immutable segment files instead of copying them. Time-travel queries, branch isolation, and safe experimentation on your search indices.

Why Scriptum

Zero-cost forking - Branch any index in a few ms regardless of size. Copies metadata, not data.
Structural sharing - Branches share immutable Lucene segments via copy-on-write overlay directories.
Time travel - Open readers at any historical commit point. Query past index states.
Full Lucene 10.x - Text search, KNN vectors, facets, highlighting - all branch-aware.
Apache-2.0 - Open source, permissive license.
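
To make "copies metadata, not data" and "time travel" concrete, here is a minimal in-memory sketch. It is not Scriptum's implementation: the class BranchSketch and all of its methods are hypothetical, and plain strings stand in for immutable Lucene segment files. It assumes a commit can be modeled as an immutable list of segment names, so a fork only needs to copy the list of commits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of branch metadata; segment names stand in for immutable files on disk. */
final class BranchSketch {
    // One immutable snapshot (list of segment names) per commit point.
    private final List<List<String>> commits = new ArrayList<>();
    // Segments added since the last commit.
    private final List<String> pending = new ArrayList<>();

    void addSegment(String name) {
        pending.add(name);
    }

    /** Records a new commit point (previous snapshot plus pending segments) and returns its index. */
    int commit() {
        List<String> next = new ArrayList<>();
        if (!commits.isEmpty()) {
            next.addAll(head());
        }
        next.addAll(pending);
        pending.clear();
        commits.add(List.copyOf(next));
        return commits.size() - 1;
    }

    /** Fork copies only the commit list (metadata); the snapshots themselves are shared. */
    BranchSketch fork() {
        BranchSketch child = new BranchSketch();
        child.commits.addAll(commits);
        return child;
    }

    /** Time travel: the segment set that was visible at a given commit. */
    List<String> segmentsAt(int commitIndex) {
        return commits.get(commitIndex);
    }

    /** The most recent commit on this branch. */
    List<String> head() {
        return commits.get(commits.size() - 1);
    }
}
```

In this model the cost of fork() depends on the number of commits, not on the size of the segments, which is the property the list above describes. A commit made on the fork never changes what head() returns on the parent.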

How it works

Scriptum extends Lucene with four components that enable copy-on-write branching:

BranchedDirectory - Overlay pattern: reads fall back to base, writes go to branch overlay.
BranchDeletionPolicy - Retains all commit points until explicit garbage collection.
BranchAwareMergePolicy - Prevents merging shared segments that would break other branches.
BranchIndexWriter - Main API for create, fork, commit, merge, and GC operations.

See LUCENE_EXTENSION.md for the full technical deep-dive.
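
The overlay pattern behind BranchedDirectory can be illustrated without Lucene. The sketch below is a simplified stand-in, not the actual class: OverlaySketch, its methods, and the use of in-memory maps in place of directories are all assumptions made for illustration. The mergeCandidates method is likewise only a guess at the idea behind BranchAwareMergePolicy, namely that only branch-private files are safe to rewrite.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory stand-in for an overlay directory; not Scriptum's BranchedDirectory. */
final class OverlaySketch {
    // Shared files; this class never writes to them.
    private final Map<String, String> base;
    // Branch-private files.
    private final Map<String, String> overlay = new HashMap<>();

    OverlaySketch(Map<String, String> base) {
        this.base = base;
    }

    /** Writes always land in the branch overlay. */
    void write(String name, String content) {
        overlay.put(name, content);
    }

    /** Reads prefer the overlay and fall back to the base. */
    String read(String name) {
        String own = overlay.get(name);
        return own != null ? own : base.get(name);
    }

    /** The branch sees the union of base and overlay file names. */
    Set<String> listAll() {
        Set<String> names = new TreeSet<>(base.keySet());
        names.addAll(overlay.keySet());
        return names;
    }

    /** Only branch-private files are offered for merging; shared base files are left alone. */
    Set<String> mergeCandidates() {
        return new TreeSet<>(overlay.keySet());
    }
}
```

Two OverlaySketch instances built over the same base map behave like two branches: each sees the shared files, and a write through one is invisible to the other.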

Clojure API

(require '[scriptum.core :as sc])

;; Create an index
(def writer (sc/create-index "/tmp/my-index"))

;; Add documents
(sc/add-doc writer {:title {:type :text :value "Hello World"}
                    :id    {:type :string :value "doc-1"}})
(sc/commit! writer "Initial commit")

;; Fork a branch (3-5ms regardless of index size)
(def experiment (sc/fork writer "experiment"))

;; Add to branch (doesn't affect main)
(sc/add-doc experiment {:title {:type :text :value "Branch only"}
                        :id    {:type :string :value "doc-2"}})
(sc/commit! experiment "Added experimental doc")

;; Main still has 1 doc, branch has 2
(count (sc/search writer {:match-all {}} 100))      ;; => 1
(count (sc/search experiment {:match-all {}} 100))  ;; => 2

;; Merge back when ready
(sc/merge-from! writer experiment)

Java API

import org.replikativ.scriptum.BranchIndexWriter;
import org.apache.lucene.document.*;
import java.nio.file.Path;

// Create an index
BranchIndexWriter main = BranchIndexWriter.create(
    Path.of("/tmp/my-index"), "main");

// Add documents
Document doc = new Document();
doc.add(new TextField("title", "Hello World", Field.Store.YES));
main.addDocument(doc);
main.commit("Initial commit");

// Fork in a few ms, regardless of index size
BranchIndexWriter feature = main.fork("experiment");

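// Branches evolve independently
Document anotherDoc = new Document();
anotherDoc.add(new TextField("title", "Branch only", Field.Store.YES));
feature.addDocument(anotherDoc);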
feature.commit("Feature work");

// Merge back
main.mergeFrom(feature);

When to use Scriptum vs Proximum

Scriptum

Full-text search with Lucene

Keyword search, facets, highlighting
Text analysis pipelines
Document-oriented indices
When you need Lucene's query language

Proximum

Vector similarity search

Embedding-based retrieval (RAG)
Semantic search
Faster parallelized insertion than Lucene HNSW
Advanced vector search features

Both have branching, snapshots, and time-travel. Choose based on your search workload.

Requirements

Java 21+ - Required for Lucene 10.x (Foreign Memory API, Vector API)
Lucene 10.3.2 - Pulled from Maven Central
Clojure 1.12.0+ - For the Clojure API

Install

Available on Clojars. See the GitHub repository for current version and installation instructions.

Maven/Gradle users: add the Clojars repository to your build configuration.