Scriptum
Full-text search with branching
Git-like branching for Apache Lucene. Fork a 100GB index by sharing immutable segment files. Time-travel queries, branch isolation, and safe experimentation on your search indices.
Why Scriptum
- Zero-cost forking - Branch any index in a few ms regardless of size. Copies metadata, not data.
- Structural sharing - Branches share immutable Lucene segments via copy-on-write overlay directories.
- Time travel - Open readers at any historical commit point. Query past index states.
- Full Lucene 10.x - Text search, KNN vectors, facets, highlighting - all branch-aware.
- Apache-2.0 - Open source, permissive license.
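To make "copies metadata, not data" concrete, here is a minimal sketch of a branch modelled as an append-only list of commits, where each commit is an immutable set of segment names. This is a simplified model for illustration only, not Scriptum's implementation: `BranchSketch` and all of its methods are hypothetical names, and real Lucene commit points carry much more than a set of names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of a branch; not part of Scriptum or Lucene. */
final class BranchSketch {
    // Every commit is an immutable set of segment names and is never rewritten.
    private final List<Set<String>> commits = new ArrayList<>();
    // Segments added since the last commit.
    private final Set<String> pending = new TreeSet<>();

    void addSegment(String name) {
        pending.add(name);
    }

    /** Records a new commit on top of the current head and returns its index. */
    int commit() {
        Set<String> snapshot = new TreeSet<>();
        if (!commits.isEmpty()) {
            snapshot.addAll(head());
        }
        snapshot.addAll(pending);
        pending.clear();
        commits.add(Set.copyOf(snapshot));
        return commits.size() - 1;
    }

    /** Forking copies the list of commit references only, never segment data. */
    BranchSketch fork() {
        BranchSketch child = new BranchSketch();
        child.commits.addAll(commits);
        return child;
    }

    /** Time travel: the segment set that was visible at a given commit. */
    Set<String> segmentsAt(int commit) {
        return commits.get(commit);
    }

    /** The segment set of the latest commit. */
    Set<String> head() {
        return commits.get(commits.size() - 1);
    }
}
```

In this model the cost of a fork grows with the number of commits rather than with the size of the segments, and a commit made on the fork leaves the parent's head untouched.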
How it works
Scriptum extends Lucene with four components that enable copy-on-write branching:
- BranchedDirectory - Overlay pattern: reads fall back to base, writes go to branch overlay.
- BranchDeletionPolicy - Retains all commit points until explicit garbage collection.
- BranchAwareMergePolicy - Prevents merging shared segments that would break other branches.
- BranchIndexWriter - Main API for create, fork, commit, merge, and GC operations.
See LUCENE_EXTENSION.md for the full technical deep-dive.
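The overlay pattern described for BranchedDirectory can be sketched with plain `java.nio.file` calls. This is an illustration under stated assumptions, not Scriptum's code: `OverlaySketch` and its methods are hypothetical, files are treated as small strings, and the real class presumably works against Lucene's `Directory` abstraction instead of raw paths.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical illustration of an overlay directory; not Scriptum's BranchedDirectory. */
final class OverlaySketch {
    private final Path base;    // shared with other branches, never written to
    private final Path overlay; // private to this branch

    OverlaySketch(Path base, Path overlay) throws IOException {
        this.base = base;
        this.overlay = Files.createDirectories(overlay);
    }

    /** Writes always land in the branch overlay, so the base stays untouched. */
    void write(String name, String content) throws IOException {
        Files.writeString(overlay.resolve(name), content, StandardCharsets.UTF_8);
    }

    /** Reads prefer the overlay and fall back to the shared base. */
    String read(String name) throws IOException {
        Path own = overlay.resolve(name);
        if (Files.exists(own)) {
            return Files.readString(own, StandardCharsets.UTF_8);
        }
        Path shared = base.resolve(name);
        if (Files.exists(shared)) {
            return Files.readString(shared, StandardCharsets.UTF_8);
        }
        throw new NoSuchFileException(name);
    }

    /** The branch sees the union of the base files and its own files. */
    Set<String> list() throws IOException {
        Set<String> names = new TreeSet<>();
        for (Path dir : new Path[] {base, overlay}) {
            try (Stream<Path> files = Files.list(dir)) {
                files.forEach(p -> names.add(p.getFileName().toString()));
            }
        }
        return names;
    }
}
```

Because the base is never modified, any number of overlays can point at the same base directory without seeing each other's writes.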
Clojure API
(require '[scriptum.core :as sc])
;; Create an index
(def writer (sc/create-index "/tmp/my-index"))
;; Add documents
(sc/add-doc writer {:title {:type :text :value "Hello World"}
                    :id    {:type :string :value "doc-1"}})
(sc/commit! writer "Initial commit")
;; Fork a branch (3-5ms regardless of index size)
(def experiment (sc/fork writer "experiment"))
;; Add to branch (doesn't affect main)
(sc/add-doc experiment {:title {:type :text :value "Branch only"}
                        :id    {:type :string :value "doc-2"}})
(sc/commit! experiment "Added experimental doc")
;; Main still has 1 doc, branch has 2
(count (sc/search writer {:match-all {}} 100)) ;; => 1
(count (sc/search experiment {:match-all {}} 100)) ;; => 2
;; Merge back when ready
(sc/merge-from! writer experiment)
Java API
import org.replikativ.scriptum.BranchIndexWriter;
import org.apache.lucene.document.*;
import java.nio.file.Path;
// Create an index
BranchIndexWriter main = BranchIndexWriter.create(
    Path.of("/tmp/my-index"), "main");
// Add documents
Document doc = new Document();
doc.add(new TextField("title", "Hello World", Field.Store.YES));
main.addDocument(doc);
main.commit("Initial commit");
// Fork in a few ms, regardless of index size
BranchIndexWriter feature = main.fork("experiment");
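// Branches evolve independently
Document anotherDoc = new Document();
anotherDoc.add(new TextField("title", "Branch only", Field.Store.YES));
feature.addDocument(anotherDoc);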
feature.commit("Feature work");
// Merge back
main.mergeFrom(feature);
When to use Scriptum vs Proximum
Scriptum
Full-text search with Lucene
- Keyword search, facets, highlighting
- Text analysis pipelines
- Document-oriented indices
- When you need Lucene's query language
Proximum
Vector similarity search
- Embedding-based retrieval (RAG)
- Semantic search
- Faster parallelized insertion than Lucene HNSW
- Advanced vector search features
Both have branching, snapshots, and time-travel. Choose based on your search workload.
Requirements
- Java 21+ - Required for Lucene 10.x (Foreign Memory API, Vector API)
- Lucene 10.3.2 - Pulled from Maven Central
- Clojure 1.12.0+ - For the Clojure API
Install
Available on Clojars. See the GitHub repository for current version and installation instructions.
Maven/Gradle users: add the Clojars repository to your build configuration.