About
Building together
Datahike is versioned data infrastructure: an immutable database, a columnar SQL engine, vector search, and full-text search - all sharing the same copy-on-write branching model. We're a growing community of systems builders and researchers, and we're looking for collaborators and early customers. The best way to join is through the open source work - high-quality contributions that benefit everyone. For commercial partnerships, reach out to contact@datahike.io.
Where Datahike comes from
This work started over a decade ago with Votorola, a liquid democracy project that needed a distributed memory model. That led me to Clojure, functional data structures, and eventually Datahike - an open-source Datalog database built on the insight that data should work like git: immutable snapshots, time travel, and zero-cost branching. Collaborators including Konrad Kühne and the broader replikativ community helped shape what it is today.
Who's building this
Christian Weilbach - Founder
I'm a researcher and systems builder focused on persistent memory and runtimes for intelligent systems. My PhD at the University of British Columbia was in Structured Amortized Variational Inference - building inference systems that accumulate evidence and maintain structured beliefs over time. That research directly shapes how I think about memory substrates for AI: data that versions, forks, and updates beliefs rather than overwriting state.
Before UBC, I completed a Master's at Heidelberg University in Dictionary Learning with Bayesian GANs, and a Bachelor's implementing online learning for Boltzmann machines on spiking neuron models. I also spent several years studying Philosophy and Cultural Anthropology - which gave me a different lens on knowledge representation and how systems encode meaning.
On the research side: papers at ICML, NeurIPS, AISTATS, and TMLR; a Google grant for work on Graphically Structured Diffusion Models; talks at Google DeepMind London, OpenAI San Francisco, and MILA Montreal. I co-organized the Clojure meetup in Mannheim-Heidelberg and the Machine Learning Meetup Rhein-Neckar.
On the applied side: among other projects, I consulted for Roam Research on their Datalog query backend and for pol.is on PCA modernization.
The common thread through all of it: how do you build systems that accumulate knowledge, reason with their own history, and remain inspectable over time? Datahike is part of my answer to that question.
Contributors
Many people, including our former team at Lambdaforge UG, contributed substantially to Datahike's early development, notably history indices, time travel, and temporal query support. This work was instrumental in achieving feature parity with Datomic's model while keeping Datahike open and distributed.
Today, Datahike is maintained independently and shaped by its broader open source community. Check out the full contributor list.
Join us
Build with us
Datahike is open source and we're actively looking for collaborators. Whether you're interested in distributed systems, programming languages, or AI infrastructure, there's room to contribute. We need help with documentation, integrations, storage backends, and pushing the query engine forward.
Work with us
I'm building a company on top of Datahike. If you need help getting it into production, we can help with integration, custom development, and support contracts.
Email: contact@datahike.io