If you’ve been using KuzuDB for graph queries, you’ve probably already seen the news: Kùzu Inc. was acquired by Apple in October 2025, and the KuzuDB GitHub repository was archived. The project is no longer actively maintained. Community forks like bighorn and LadybugDB are early-stage. You need a production path forward.
This post is about one option: ArcadeDB.
I want to be upfront about what ArcadeDB is and isn’t, because a genuine comparison is more useful than a sales pitch.
What KuzuDB was good at
KuzuDB was an embedded OLAP graph database — it ran in-process, used columnar storage, and was optimized for analytical graph queries over large datasets. It was MIT-licensed, written in C++, and callable from Python, Java, Node.js, and Rust. If you were running complex analytical graph traversals without a separate server process, Kuzu was well-suited to that.
What ArcadeDB is
ArcadeDB is a multi-model DBMS — graph, document, key/value, full-text search, time series, and vector embeddings in a single engine. It is not primarily an embedded analytical database. By default it runs as a server, though it can also run as an embedded Java library (client-less mode). It is written in Java (JVM required) and is fully open source under the Apache 2.0 license — not MIT, but Apache 2.0 is equally permissive for commercial and production use and has no restrictions that would worry enterprise procurement teams.
Where ArcadeDB makes sense as a Kuzu replacement
If your Kuzu use case was one of these, ArcadeDB is worth evaluating:
1. Knowledge graphs with mixed data types. You stored graph relationships alongside document properties and needed to query both in the same operation. ArcadeDB handles this natively — vertices and edges can carry full document structures. You can query with SQL (extended), Cypher, Gremlin, or GraphQL.
2. You need vector similarity search alongside graph traversal. Kuzu added vector index support relatively late. ArcadeDB has supported vector embeddings for longer and supports hybrid queries: graph traversal + vector similarity + full-text Lucene search in a single query. This is the GraphRAG pattern — retrieving semantically related context through graph relationships.
3. You were building a persistent multi-model store, not just analytics. Kuzu was OLAP-first. If your workload includes transactional writes (not just batch analytics), ArcadeDB is ACID with full transaction isolation and is designed for mixed OLTP + OLAP workloads.
4. You want Apache 2.0 with a public no-change commitment. The Kuzu situation is a good reminder that an MIT license doesn’t protect you from acquisition. Apache 2.0 doesn’t either — but ArcadeDB’s team has publicly committed to never changing the license. (See: Open Source Forever.) We built the database after SAP acquired OrientDB and changed its trajectory. We understand what it means to be abandoned by a vendor.
Where ArcadeDB is NOT a good Kuzu replacement
Be honest with yourself about these:
- You need a pure embedded C++ library with no JVM. ArcadeDB requires a JVM. If your stack is Rust, Go, or Python-native and you cannot tolerate a JVM dependency, ArcadeDB is not the right choice. Look at bighorn or GraphLite instead.
- Your workload is pure large-scale OLAP graph analytics. Kuzu’s columnar storage and worst-case-optimal join algorithms were optimized for analytical query patterns on very large graphs. ArcadeDB is strong at traversal performance (constant time, not affected by database size) but its architecture is different. Benchmark on your specific workload.
- You need a sub-10ms in-process call with zero network overhead. ArcadeDB’s client-server mode introduces network overhead. The embedded library mode reduces this, but the architecture is fundamentally different from Kuzu’s in-process design.
Getting started with ArcadeDB
If the tradeoffs work for you:
# Docker (fastest start)
docker run -p 2480:2480 -p 2424:2424 arcadedata/arcadedb
# Or download: https://arcadedb.com
ArcadeDB’s Cypher support is OpenCypher TCK-compliant. If you were using Cypher in Kuzu, your queries will run largely unchanged. SQL and Gremlin are also available.
A migration tool exists for OrientDB, but there is currently no direct Kuzu importer — you’ll need to re-import your data via the HTTP API or the Java client.
- Documentation: https://docs.arcadedb.com
- GitHub: https://github.com/ArcadeData/arcadedb
- Discord: https://discord.gg/arcadedb
Step-by-step migration guide: KuzuDB to ArcadeDB
This section walks you through the full migration process. The recommended path is: export from KuzuDB as CSV, translate the schema, then import into ArcadeDB.
Step 1: Export your KuzuDB database
KuzuDB’s EXPORT DATABASE command dumps everything — schema definitions and data files — into a directory.
import kuzu
db = kuzu.Database("./my_kuzu_db")
conn = kuzu.Connection(db)
# Export the full database as CSV with headers
conn.execute("EXPORT DATABASE '/tmp/kuzu_export' (format='csv', header=true)")
This creates:
/tmp/kuzu_export/
├── schema.cypher # CREATE NODE TABLE / CREATE REL TABLE statements
├── copy.cypher # COPY FROM statements (for re-import into Kuzu)
├── macro.cypher # Macro definitions (if any)
├── Person.csv # One CSV file per node table
├── Company.csv
├── Follows.csv # One CSV file per relationship table
└── WorksAt.csv
Open schema.cypher — this is your source of truth for the schema translation.
Step 2: Translate the schema to ArcadeDB
KuzuDB and ArcadeDB use different DDL syntax and type names. Here’s the mapping:
| KuzuDB | ArcadeDB |
|---|---|
| CREATE NODE TABLE | CREATE VERTEX TYPE |
| CREATE REL TABLE | CREATE EDGE TYPE |
| PRIMARY KEY | CREATE INDEX ... UNIQUE |
| INT64 | LONG |
| INT32 | INTEGER |
| INT16 | SHORT |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| STRING | STRING |
| BOOLEAN | BOOLEAN |
| DATE | DATE |
| TIMESTAMP | DATETIME |
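For scripted migrations, the mapping above can be restated as a small Python helper. This is just the table in code form; `KUZU_TO_ARCADE` and `translate_type` are names invented here, not part of either product:

```python
# KuzuDB column type -> ArcadeDB property type, restating the table above.
KUZU_TO_ARCADE = {
    "INT64": "LONG",
    "INT32": "INTEGER",
    "INT16": "SHORT",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "STRING": "STRING",
    "BOOLEAN": "BOOLEAN",
    "DATE": "DATE",
    "TIMESTAMP": "DATETIME",
}

def translate_type(kuzu_type: str) -> str:
    """Return the ArcadeDB type for a KuzuDB type, or raise if unmapped."""
    try:
        return KUZU_TO_ARCADE[kuzu_type.upper()]
    except KeyError:
        raise ValueError(f"No ArcadeDB mapping for KuzuDB type {kuzu_type!r}")
```

Failing loudly on an unmapped type is deliberate: Kuzu’s richer types (lists, structs) have no one-line equivalent and need a manual decision.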
KuzuDB schema example (from schema.cypher):
CREATE NODE TABLE Person(id INT64 PRIMARY KEY, name STRING, age INT64, registered DATE);
CREATE NODE TABLE Company(id INT64 PRIMARY KEY, name STRING, founded INT64);
CREATE REL TABLE Follows(FROM Person TO Person, since INT64);
CREATE REL TABLE WorksAt(FROM Person TO Company, role STRING, since INT64);
Translated ArcadeDB schema (SQL):
-- Create vertex types
CREATE VERTEX TYPE Person;
CREATE PROPERTY Person.id LONG;
CREATE PROPERTY Person.name STRING;
CREATE PROPERTY Person.age LONG;
CREATE PROPERTY Person.registered DATE;
CREATE INDEX ON Person (id) UNIQUE;
CREATE VERTEX TYPE Company;
CREATE PROPERTY Company.id LONG;
CREATE PROPERTY Company.name STRING;
CREATE PROPERTY Company.founded LONG;
CREATE INDEX ON Company (id) UNIQUE;
-- Create edge types
CREATE EDGE TYPE Follows;
CREATE PROPERTY Follows.since LONG;
CREATE EDGE TYPE WorksAt;
CREATE PROPERTY WorksAt.role STRING;
CREATE PROPERTY WorksAt.since LONG;
You can execute these statements via the ArcadeDB console, the HTTP API, or Studio.
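If you have many node tables, the translation itself can be scripted. Below is a sketch that converts one CREATE NODE TABLE statement from schema.cypher into the ArcadeDB DDL shown above. It assumes the inline `col TYPE PRIMARY KEY` form from the example; nested types or separate PRIMARY KEY clauses still need manual translation, and `node_table_to_ddl` is a hypothetical helper name:

```python
import re

# KuzuDB -> ArcadeDB type mapping (from the table in Step 2)
TYPE_MAP = {"INT64": "LONG", "INT32": "INTEGER", "INT16": "SHORT",
            "FLOAT": "FLOAT", "DOUBLE": "DOUBLE", "STRING": "STRING",
            "BOOLEAN": "BOOLEAN", "DATE": "DATE", "TIMESTAMP": "DATETIME"}

def node_table_to_ddl(stmt: str) -> list[str]:
    """Translate one Kuzu CREATE NODE TABLE statement into ArcadeDB SQL DDL."""
    m = re.match(r"CREATE NODE TABLE (\w+)\s*\((.*)\)\s*;?\s*$", stmt.strip())
    if not m:
        raise ValueError(f"Unrecognized statement: {stmt!r}")
    name, columns = m.group(1), m.group(2)
    ddl = [f"CREATE VERTEX TYPE {name}"]
    primary_key = None
    for col in columns.split(","):
        parts = col.strip().split()
        col_name, col_type = parts[0], TYPE_MAP[parts[1].upper()]
        ddl.append(f"CREATE PROPERTY {name}.{col_name} {col_type}")
        # Detect the inline 'PRIMARY KEY' marker after the type
        if "PRIMARY" in (p.upper() for p in parts[2:]):
            primary_key = col_name
    if primary_key:
        ddl.append(f"CREATE INDEX ON {name} ({primary_key}) UNIQUE")
    return ddl
```

Running it on the Person line from the example schema yields exactly the five statements written out by hand above.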
Step 3: Start ArcadeDB and create the database
# Start ArcadeDB with Docker
docker run -d --name arcadedb \
-p 2480:2480 -p 2424:2424 \
-e JAVA_OPTS="-Darcadedb.server.rootPassword=your_password" \
arcadedata/arcadedb
Create the database and execute the schema:
# Create a new database via the HTTP API
curl -X POST "http://localhost:2480/api/v1/server" \
  -u root:your_password \
  -H "Content-Type: application/json" \
  -d '{"command": "CREATE DATABASE mydb"}'
Then execute each schema statement:
# Execute schema DDL
curl -X POST "http://localhost:2480/api/v1/command/mydb" \
-u root:your_password \
-H "Content-Type: application/json" \
-d '{"language": "sql", "command": "CREATE VERTEX TYPE Person"}'
Repeat for each type and property definition — or use a script (see Step 4).
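Repeating curl by hand gets tedious, so here is one way to script it with only the standard library. The statement splitting is naive (it splits on semicolons, which is fine for DDL), and `post_command` needs the server from the previous step to be running; the URL, credentials, and helper names mirror the curl examples and are otherwise assumptions:

```python
import base64
import json
import urllib.request

ARCADE_URL = "http://localhost:2480/api/v1/command/mydb"
AUTH = ("root", "your_password")

# DDL from Step 2, as one script string (shortened here for illustration)
SCHEMA_SQL = """
CREATE VERTEX TYPE Person;
CREATE PROPERTY Person.id LONG;
CREATE INDEX ON Person (id) UNIQUE;
"""

def split_statements(sql_text: str) -> list[str]:
    """Split a DDL script into single statements (naive split on ';';
    fine for DDL, not for statements containing literal semicolons)."""
    return [s.strip() for s in sql_text.split(";") if s.strip()]

def post_command(sql: str) -> None:
    """POST one SQL command to the ArcadeDB HTTP API (needs a running server)."""
    token = base64.b64encode("{}:{}".format(*AUTH).encode()).decode()
    req = urllib.request.Request(
        ARCADE_URL,
        data=json.dumps({"language": "sql", "command": sql}).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
        method="POST",
    )
    urllib.request.urlopen(req)

# With the server up: for stmt in split_statements(SCHEMA_SQL): post_command(stmt)
```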
Step 4: Import node/vertex data from CSV
Import vertices first, edges second. For each node table exported from KuzuDB, load the CSV into ArcadeDB.
Option A: Using the HTTP API with a script
import csv
import requests

ARCADEDB_URL = "http://localhost:2480/api/v1/command/mydb"
AUTH = ("root", "your_password")

def execute_sql(sql):
    resp = requests.post(ARCADEDB_URL, auth=AUTH,
                         json={"language": "sql", "command": sql})
    resp.raise_for_status()
    return resp.json()

# Import Person vertices
with open("/tmp/kuzu_export/Person.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        sql = f"""INSERT INTO Person SET id = {row['id']},
                  name = '{row['name'].replace("'", "''")}',
                  age = {row['age']},
                  registered = '{row['registered']}'"""
        execute_sql(sql)

# Import Company vertices
with open("/tmp/kuzu_export/Company.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        sql = f"""INSERT INTO Company SET id = {row['id']},
                  name = '{row['name'].replace("'", "''")}',
                  founded = {row['founded']}"""
        execute_sql(sql)

print("Vertices imported.")
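Building SQL by string interpolation, as above, is fragile against quoting. The HTTP command API also accepts named parameters, which sidesteps escaping entirely; the sketch below builds one parameterized payload per CSV row (verify the exact `params` field against the HTTP API docs for your server version):

```python
import csv
import io

def insert_payload(type_name: str, row: dict) -> dict:
    """Build a parameterized INSERT payload for the ArcadeDB HTTP command API.

    Named parameters (:id, :name, ...) replace the manual quote-escaping in
    the script above; the 'params' field name should be checked against the
    HTTP API documentation for your ArcadeDB version.
    """
    columns = ", ".join(f"{k} = :{k}" for k in row)
    return {
        "language": "sql",
        "command": f"INSERT INTO {type_name} SET {columns}",
        "params": dict(row),
    }

# Example with an in-memory CSV standing in for Person.csv
sample = io.StringIO("id,name,age,registered\n1,O'Brien,42,2020-01-01\n")
rows = list(csv.DictReader(sample))
payload = insert_payload("Person", rows[0])
```

Each payload can then be POSTed with `requests.post(ARCADEDB_URL, auth=AUTH, json=payload)` exactly as in the script above.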
Option B: Using ArcadeDB’s SQL IMPORT DATABASE (for simple cases)
If your CSV has a straightforward structure, you can use ArcadeDB’s built-in importer:
IMPORT DATABASE file:///tmp/kuzu_export/Person.csv
WITH (vertexType = 'Person', commitEvery = 5000)
Option C: Using batch inserts for better performance
For large datasets, batch your inserts inside a single transaction:
curl -X POST "http://localhost:2480/api/v1/begin/mydb" -u root:your_password
# ... send multiple INSERT commands ...
curl -X POST "http://localhost:2480/api/v1/commit/mydb" -u root:your_password
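One way to organize the batching is to chunk the rows and join each chunk’s INSERTs into a single script, sending one script per begin/commit pair. A minimal sketch (helper names are invented here; values are embedded naively, so escape strings or use parameters in real use):

```python
def chunks(rows: list, size: int):
    """Yield successive chunks of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]

def batch_script(type_name: str, rows: list) -> str:
    """Join one INSERT per row into a single script, so each batch can be
    sent between a begin and a commit call as shown above.

    Values are embedded via repr() for brevity; real imports should escape
    strings properly or use named parameters instead."""
    stmts = []
    for row in rows:
        cols = ", ".join(f"{k} = {v!r}" for k, v in row.items())
        stmts.append(f"INSERT INTO {type_name} SET {cols}")
    return ";\n".join(stmts)
```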
Step 5: Import relationship/edge data from CSV
KuzuDB relationship CSV files include _from_ and _to_ columns containing the primary key values of the source and target nodes. Use these to create edges in ArcadeDB.
# Import Follows edges
with open("/tmp/kuzu_export/Follows.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        sql = f"""CREATE EDGE Follows
                  FROM (SELECT FROM Person WHERE id = {row['Person.id']})
                  TO (SELECT FROM Person WHERE id = {row['Person.id_1']})
                  SET since = {row['since']}"""
        execute_sql(sql)

# Import WorksAt edges
with open("/tmp/kuzu_export/WorksAt.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        sql = f"""CREATE EDGE WorksAt
                  FROM (SELECT FROM Person WHERE id = {row['Person.id']})
                  TO (SELECT FROM Company WHERE id = {row['Company.id']})
                  SET role = '{row['role'].replace("'", "''")}',
                      since = {row['since']}"""
        execute_sql(sql)

print("Edges imported.")
Important: The exact column names in Kuzu’s exported CSV depend on your schema. Check the CSV headers before writing the import script. Kuzu typically exports relationship CSVs with columns like from, to, or TableName.primaryKey for the source and target references.
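Since the column names vary, it is worth inspecting the headers programmatically before writing the import loop. The sketch below assumes (a heuristic, not a Kuzu guarantee) that the first two columns of a relationship CSV reference the source and target node keys and everything after them is a relationship property:

```python
import csv
import io

def rel_csv_headers(f):
    """Return (endpoint_columns, property_columns) from a relationship CSV.

    Heuristic, not a Kuzu guarantee: the first two columns reference the
    source and target node primary keys; the rest are edge properties.
    Always eyeball the actual headers before trusting this split.
    """
    headers = next(csv.reader(f))
    return headers[:2], headers[2:]

# Example against an in-memory stand-in for Follows.csv
sample = io.StringIO("Person.id,Person.id_1,since\n1,2,2015\n")
endpoints, props = rel_csv_headers(sample)
```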
Step 6: Create indexes for query performance
After all data is loaded, create indexes on properties you’ll use for lookups:
-- Already created unique indexes on primary keys in Step 2
-- Add additional indexes for common query patterns
CREATE INDEX ON Person (name);
CREATE INDEX ON Company (name);
Step 7: Validate the migration
Run validation queries to confirm data integrity:
-- Check vertex counts match KuzuDB
SELECT count(*) FROM Person;
SELECT count(*) FROM Company;
-- Check edge counts
SELECT count(*) FROM Follows;
SELECT count(*) FROM WorksAt;
-- Spot-check a known record
SELECT * FROM Person WHERE id = 1;
-- Validate a graph traversal
SELECT expand(out('Follows')) FROM Person WHERE name = 'Alice';
Compare these results against your KuzuDB queries:
// In Kuzu (for reference)
MATCH (p:Person) RETURN count(p);
MATCH ()-[f:Follows]->() RETURN count(f);
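To make the comparison mechanical rather than eyeball-based, collect both sets of counts into dicts and diff them. The function name below is invented for this sketch; populate `kuzu` from the MATCH ... RETURN count(...) queries and `arcade` from the SELECT count(*) queries above:

```python
def count_mismatches(kuzu_counts: dict, arcade_counts: dict) -> list:
    """Return human-readable entries for every type whose counts differ.

    A type missing from one side shows up as a mismatch against None,
    which catches tables that were skipped entirely during the import.
    """
    problems = []
    for type_name in sorted(set(kuzu_counts) | set(arcade_counts)):
        k = kuzu_counts.get(type_name)
        a = arcade_counts.get(type_name)
        if k != a:
            problems.append(f"{type_name}: kuzu={k} arcadedb={a}")
    return problems

# Example: one vertex went missing during import
kuzu = {"Person": 100, "Company": 20, "Follows": 340}
arcade = {"Person": 100, "Company": 19, "Follows": 340}
```

An empty result means every vertex and edge type survived the migration with matching counts.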
Step 8: Migrate your application queries
If you were using Cypher in KuzuDB, most queries will work in ArcadeDB with minimal changes. ArcadeDB’s Cypher support is OpenCypher TCK-compliant.
Queries that work as-is:
MATCH (p:Person)-[:Follows]->(friend)
WHERE p.name = 'Alice'
RETURN friend.name;
What to watch for:
- Kuzu-specific functions (e.g., list_extract, struct_extract) don’t exist in ArcadeDB. Replace them with ArcadeDB equivalents or standard Cypher.
- Kuzu’s recursive path queries (MATCH (a)-[:Follows*1..3]->(b)) work in ArcadeDB Cypher, but performance characteristics may differ for very deep traversals.
- Schema queries like CALL show_tables() are Kuzu-specific. In ArcadeDB, use SELECT FROM schema:types or the Studio UI.
- COPY FROM/TO does not exist in ArcadeDB Cypher. Use SQL IMPORT/EXPORT or the HTTP API instead.
You can also use ArcadeDB’s SQL dialect, which supports graph traversal operators:
-- ArcadeDB SQL with graph extensions
SELECT out('Follows').name AS friends
FROM Person
WHERE name = 'Alice';
Still evaluating?
If you’re not sure ArcadeDB is the right fit, the honest answer is: try it on your actual workload. Graph database performance is highly workload-specific. The Discord community is active and the team responds quickly to technical questions.