versioned · fast · scalable

The Memory Model for Intelligence

Datahike is an open-source immutable database. Every transaction creates a snapshot you can query later, fork without copying data, or verify through the Merkle structure of the storage itself. We built it because production systems, auditable pipelines, and long-running agents need a database that remembers everything. Readers connect directly to S3 or the filesystem; no server is required for reads.
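A minimal Clojure sketch of the snapshot model (the store config and attribute names here are illustrative, not prescriptive):

```clojure
;; Every transaction yields a new, immutable, queryable snapshot.
(require '[datahike.api :as d])

(def cfg {:store {:backend :mem :id "demo"}
          :schema-flexibility :read   ; accept schemaless data for this sketch
          :keep-history? true})

(d/create-database cfg)
(def conn (d/connect cfg))

(d/transact conn [{:user/name "Ada"}])
(def before @conn)                    ; grab the current snapshot as a value
(d/transact conn [{:user/name "Grace"}])

;; The old snapshot still answers queries - no data was copied.
(d/q '[:find ?n :where [_ :user/name ?n]] before)
;; => #{["Ada"]}
(d/q '[:find ?n :where [_ :user/name ?n]] @conn)
;; => #{["Ada"] ["Grace"]}
```

Because a snapshot is just a value, holding onto one costs nothing extra; the store is shared structurally between versions.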

The ecosystem

One branching model across Datalog, SQL, vectors, and search. Fork your entire world-state in O(1).

Collaborate without infrastructure

A Datahike database is a value - an immutable snapshot you can hold, share, and query anywhere. Readers connect directly to storage: no server to start, no API to negotiate, no ETL pipeline to maintain. If two teams each expose an S3 bucket, you can join their databases in a single Datalog expression.

; Two teams, two S3 buckets - no servers, no ETL pipeline
(def catalog   (d/connect {:store {:backend :s3 :bucket "team-a"}}))
(def inventory (d/connect {:store {:backend :s3 :bucket "team-b"}}))

; Join across databases in a single Datalog expression
(d/q '[:find ?name ?stock
       :in $cat $inv
       :where [$cat ?p :product/sku  ?sku]
              [$cat ?p :product/name ?name]
              [$inv ?i :stock/sku    ?sku]
              [$inv ?i :stock/count  ?stock]]
  @catalog @inventory)
; => #{["Widget A" 142] ["Widget B" 88]}

Datalog natively supports multi-database joins via :in. Both values are immutable snapshots - no locking, no coordination required. Learn more →

Show me

Examples in Java. Also: Clojure, JavaScript, Python, C/C++ (libdatahike), CLI (dthk), Babashka pod, HTTP REST.

Connect

import datahike.java.*;
import java.util.*;

var cfg = Database.file("/tmp/db")
    .keepHistory(true)
    .build();
Datahike.createDatabase(cfg);
var conn = Datahike.connect(cfg);

Transact

Datahike.transact(conn, List.of(
    Map.of(":user/name", "Ada",
           ":user/email", "ada@example.com")));

Query

// Datalog query (EDN syntax)
var results = Datahike.q(
    "[:find ?e ?name :where [?e :user/name ?name]]",
    Datahike.deref(conn));
// => #{[1 "Ada"]}

Time-travel

// Query a past snapshot
var oldDb = Datahike.asOf(
    Datahike.deref(conn),
    Date.from(Instant.parse("2024-01-01T00:00:00Z")));
var history = Datahike.q(
    "[:find ?name :where [_ :user/name ?name]]",
    oldDb);

Full Datalog - joins, aggregates, pull expressions, rules.
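As a sketch of the aggregate and pull-expression forms (attribute names are illustrative), the query strings below can be passed unchanged to Datahike.q against a snapshot from Datahike.deref(conn):

```clojure
;; Aggregate: count the entities that have a :user/name
[:find (count ?e)
 :where [?e :user/name]]

;; Pull expression: return entities as maps instead of bare tuples
[:find (pull ?e [:user/name :user/email])
 :where [?e :user/name]]
```

Because queries are plain EDN data, the same forms work from every binding; only the host-language call around them changes.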

JavaScript / Node.js (beta)

Install: npm install datahike@next

const d = require('datahike');
const crypto = require('crypto');

const config = {
  store: {
    backend: ':memory',
    id: crypto.randomUUID()
  },
  'schema-flexibility': ':read'  // Allow schemaless data (use kebab-case)
};

await d.createDatabase(config);
const conn = await d.connect(config);
await d.transact(conn, [{ name: 'Alice' }]);
const db = await d.db(conn);  // db() is async for async backends
const results = await d.q('[:find ?n :where [?e :name ?n]]', db);
console.log(results);
// => [['Alice']]

TypeScript definitions included. Same Datalog queries, Promise-based API. Try in your browser →

Notes

Occasional writing on databases, immutability, and semantic search.

In production

Used by developers and government agencies who need data they can trust.

"Datahike is a foundational part of the Stub story - going from a rough prototype all the way to finding product-market fit, generating revenue, and raising capital. It's been a critical part of our journey, and if I had to do this all again, you best believe I'd use Datahike again."
Alex Oloo, Cofounder & CTO, Stub - accounting platform for 5,000+ SMBs across South Africa

The Swedish Public Employment Service has used Datahike in production since 2024 for the JobTech Taxonomy - 40,000+ labour market concepts (occupations, skills, education standards) accessed daily by thousands of caseworkers. In their evaluation, Datahike performed competitively in a benchmark against Datomic.

Arbetsförmedlingen, Swedish Public Employment Service - government production deployment

Heidelberg University built emotrack on Datahike - a longitudinal emotion tracking application for psychological research, capturing and querying time-series self-report data across study participants.

Heidelberg University, psychological research - emotion tracking application

Get started

Runs on the JVM. Distributed via Clojars.

Maven - add Clojars repository and dependency:

<!-- Enable Clojars -->
<repository>
  <id>clojars</id>
  <url>https://repo.clojars.org/</url>
</repository>

<!-- Datahike dependency -->
<dependency>
  <groupId>org.replikativ</groupId>
  <artifactId>datahike</artifactId>
  <version>LATEST</version>
</dependency>

Clojure CLI, Leiningen, Gradle, and JavaScript: see the README on GitHub. Or try it in your browser →

Work with us

Need help getting Datahike into production? We offer integration work, custom development, and support contracts.