
We’re happy to announce that langchain-arcadedb is now available on PyPI as an official, standalone integration package. If you’re building AI applications with LangChain and want a graph database that handles documents, vectors, and graphs in a single engine, ArcadeDB is ready to go.
```bash
pip install langchain-arcadedb
```
That’s it. You’re ready to build Graph RAG pipelines.
What Is Graph RAG?
Traditional RAG (Retrieval-Augmented Generation) searches a flat vector store for relevant text chunks and feeds them to an LLM. This works well for factual Q&A but falls apart when questions require understanding relationships between entities.
Graph RAG adds a knowledge graph to the pipeline. Instead of just retrieving text, the system can traverse relationships — “Alice works at Acme, which is headquartered in Rome” — and provide the LLM with structured context that a vector search alone would miss.
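To see why multi-hop traversal matters, here is a toy illustration in plain Python (no database, hypothetical entities): a flat text lookup can return the chunk mentioning Alice, but only following relationships answers a question about her employer's location.

```python
# Toy knowledge graph: (entity, relationship) -> entity.
# Entities and relationships are hypothetical examples.
edges = {
    ("Alice", "WORKS_AT"): "Acme",
    ("Acme", "HEADQUARTERED_IN"): "Rome",
}

def hop(entity, relation):
    """Follow a single relationship edge from an entity."""
    return edges.get((entity, relation))

# Two hops answer "Where is Alice's employer headquartered?" --
# context a single flat vector lookup would miss.
employer = hop("Alice", "WORKS_AT")
city = hop(employer, "HEADQUARTERED_IN")
print(f"Alice works at {employer}, which is headquartered in {city}")
```

In a real Graph RAG pipeline the traversal is a Cypher query over the knowledge graph, but the principle is the same: structured hops compose facts that no single text chunk contains.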
This is where ArcadeDB shines. As a multi-model database, it can store your knowledge graph and your documents and your vector embeddings in the same engine. No need to glue together three different databases.
How It Works
The langchain-arcadedb package provides ArcadeDBGraph, a graph store that implements the LangChain GraphStore protocol. It connects to ArcadeDB via the native Bolt protocol using the standard Neo4j Python driver — no custom client needed.
Under the hood, it replaces all APOC procedures (which are Neo4j-specific) with pure Cypher queries, making it fully compatible with ArcadeDB’s 97.8% OpenCypher support.
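As an illustrative example of that substitution (the package's internal queries may differ), a Neo4j-specific APOC procedure for schema discovery can be replaced with portable OpenCypher:

```python
# Illustrative only: an APOC schema call (Neo4j-specific) vs. a
# pure-Cypher alternative for listing node labels. The actual queries
# used internally by langchain-arcadedb may differ.
apoc_schema_query = "CALL apoc.meta.data()"
pure_cypher_labels = "MATCH (n) RETURN DISTINCT labels(n) AS labels"

# Pure Cypher keeps the integration portable across OpenCypher engines.
assert "apoc" not in pure_cypher_labels
```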
Connect to ArcadeDB
```python
from langchain_arcadedb import ArcadeDBGraph

graph = ArcadeDBGraph(
    url="bolt://localhost:7687",
    username="root",
    password="playwithdata",
    database="mydb",
)

# Schema is auto-detected from your data
print(graph.get_schema)
```
The graph store automatically introspects your database to discover node labels, relationship types, and property types. This schema is then used as context for LLM-powered Cypher generation.
Natural Language Queries with GraphCypherQAChain
This is where things get interesting. GraphCypherQAChain takes a natural language question, generates a Cypher query, runs it against your graph, and returns a human-readable answer:
```python
from langchain_arcadedb import ArcadeDBGraph
from langchain_neo4j import GraphCypherQAChain
from langchain_openai import ChatOpenAI

graph = ArcadeDBGraph(
    url="bolt://localhost:7687",
    username="root",
    password="playwithdata",
    database="mydb",
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True,
    allow_dangerous_requests=True,  # required by recent langchain-neo4j versions
)

answer = chain.invoke({"query": "Who does Alice know?"})
print(answer["result"])
```
The chain will:
- Read the graph schema (node types, relationship types, properties)
- Generate a Cypher query: `MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(b) RETURN b.name`
- Execute it against ArcadeDB
- Format the results into a natural language answer
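The steps above can be sketched end-to-end with stubbed components (the stub functions below are hypothetical stand-ins; the real chain delegates these calls to your LLM and to ArcadeDBGraph over Bolt):

```python
# Minimal sketch of the GraphCypherQAChain flow with stubbed components.
# stub_llm and stub_graph_query are hypothetical stand-ins, not real APIs.

def stub_llm(prompt: str) -> str:
    # A real LLM generates Cypher from the schema + question,
    # then later formats query results into a natural language answer.
    if "Generate Cypher" in prompt:
        return "MATCH (a:Person {name: 'Alice'})-[:KNOWS]->(b) RETURN b.name"
    return "Alice knows Bob."

def stub_graph_query(cypher: str) -> list:
    # A real graph store executes the Cypher against ArcadeDB.
    return [{"b.name": "Bob"}]

def qa_chain(question: str, schema: str) -> str:
    cypher = stub_llm(f"Schema: {schema}\nGenerate Cypher for: {question}")
    rows = stub_graph_query(cypher)
    return stub_llm(f"Question: {question}\nResults: {rows}\nAnswer:")

answer = qa_chain("Who does Alice know?", "Person -KNOWS-> Person")
print(answer)
```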
Because ArcadeDBGraph satisfies the same protocol as Neo4jGraph, it works as a drop-in replacement — no changes to your chain logic.
Import Knowledge Graphs
You can also build your knowledge graph programmatically using graph documents:
```python
from langchain_arcadedb import ArcadeDBGraph, GraphDocument, Node, Relationship
from langchain_core.documents import Document

graph = ArcadeDBGraph(
    url="bolt://localhost:7687",
    username="root",
    password="playwithdata",
    database="mydb",
)

nodes = [
    Node(id="alice", type="Person", properties={"name": "Alice", "role": "Engineer"}),
    Node(id="bob", type="Person", properties={"name": "Bob", "role": "Manager"}),
    Node(id="acme", type="Company", properties={"name": "Acme Corp"}),
]

relationships = [
    Relationship(source=nodes[0], target=nodes[1], type="KNOWS"),
    Relationship(source=nodes[0], target=nodes[2], type="WORKS_AT"),
    Relationship(source=nodes[1], target=nodes[2], type="WORKS_AT"),
]

doc = GraphDocument(
    nodes=nodes,
    relationships=relationships,
    source=Document(page_content="Alice and Bob work at Acme Corp"),
)

graph.add_graph_documents([doc], include_source=True)
```
Nodes are merged by id using MERGE operations, so importing the same document twice won’t create duplicates. Relationships are grouped by type for efficient batched insertion.
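The MERGE-by-id behavior can be modeled in a few lines of plain Python (a toy upsert, not the package's implementation): matching on the id means a repeated import updates properties in place rather than creating a second node.

```python
# Toy model of MERGE-by-id semantics: importing twice upserts, never duplicates.
store: dict = {}

def merge_node(node_id: str, properties: dict) -> None:
    # MERGE matches on id; properties are applied either way (upsert).
    store.setdefault(node_id, {}).update(properties)

for _ in range(2):  # simulate importing the same document twice
    merge_node("alice", {"name": "Alice", "role": "Engineer"})
    merge_node("bob", {"name": "Bob", "role": "Manager"})

assert len(store) == 2  # still two nodes after the second import
```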
Starting ArcadeDB with Bolt
If you don’t have ArcadeDB running yet, the fastest way is Docker:
```bash
docker run --rm -p 2480:2480 -p 7687:7687 \
  -e JAVA_OPTS="-Darcadedb.server.plugins=Bolt:com.arcadedb.bolt.BoltProtocolPlugin" \
  -e arcadedb.server.rootPassword=playwithdata \
  arcadedata/arcadedb:latest
```
Then create a database:
```bash
curl -X POST http://localhost:2480/api/v1/server \
  -d '{"command":"create database mydb"}' \
  -u root:playwithdata \
  -H "Content-Type: application/json"
```
Port 7687 is the Bolt protocol endpoint that langchain-arcadedb connects to. Port 2480 gives you access to the ArcadeDB Studio web interface for visual exploration.
Configuration
Connection parameters can be passed directly or through environment variables:
| Parameter | Environment Variable | Default |
|---|---|---|
| `url` | `ARCADEDB_URI` | `bolt://localhost:7687` |
| `username` | `ARCADEDB_USERNAME` | `root` |
| `password` | `ARCADEDB_PASSWORD` | `playwithdata` |
| `database` | `ARCADEDB_DATABASE` | (empty) |
This makes it straightforward to configure in CI/CD pipelines or containerized deployments without hardcoding credentials.
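For example, resolving the settings from the environment with the documented defaults looks like this (a sketch only; the package reads these variables itself, so the explicit lookups are optional):

```python
import os

# Resolve connection settings from the environment, falling back to
# the documented defaults.
url = os.environ.get("ARCADEDB_URI", "bolt://localhost:7687")
username = os.environ.get("ARCADEDB_USERNAME", "root")
password = os.environ.get("ARCADEDB_PASSWORD", "playwithdata")
database = os.environ.get("ARCADEDB_DATABASE", "")

# graph = ArcadeDBGraph(url=url, username=username,
#                       password=password, database=database)
```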
Why ArcadeDB for Graph RAG?
There are several graph databases that work with LangChain. Here’s why ArcadeDB stands out:
Multi-model by design. ArcadeDB isn’t just a graph database — it natively supports documents, key-value, time-series, and vector data in the same engine. You don’t need a separate vector store for embeddings or a separate document store for raw content.
Open source, Apache 2.0, forever. No license bait-and-switch. ArcadeDB is and will remain fully open source under the Apache 2.0 license. You can deploy it anywhere — on-premises, cloud, embedded — without licensing concerns.
Bolt protocol compatibility. ArcadeDB speaks the same Bolt protocol as Neo4j, which means the entire ecosystem of drivers and tools works out of the box. The langchain-arcadedb package uses the standard neo4j Python driver — battle-tested and well-maintained.
97.8% OpenCypher compatibility. Your existing Cypher queries work. The langchain-arcadedb package uses pure Cypher for everything — schema introspection, document import, relationship traversal — no proprietary procedures required.
Lightweight and fast. ArcadeDB is written in Java and can run embedded (no server needed) or as a standalone server. It has a small memory footprint and delivers high performance on standard benchmarks.
Part of a Growing Ecosystem
The langchain-arcadedb package joins our growing family of AI framework integrations:
- llama-index-graph-stores-arcadedb — LlamaIndex PropertyGraphStore integration for graph-based RAG
- langchain4j-community-arcadedb — Java LangChain4J embedding store with native HNSW vector indexing
Whether you’re working in Python or Java, with LangChain or LlamaIndex, ArcadeDB has an integration ready for your AI pipeline.
Get Started
Install the package:
```bash
pip install langchain-arcadedb
```
Check out the full source code and documentation on GitHub:
- GitHub: github.com/ArcadeData/langchain-arcadedb
- PyPI: pypi.org/project/langchain-arcadedb
- ArcadeDB Docs: docs.arcadedb.com
Have questions or feedback? Join our Discord community or open an issue on GitHub. We’d love to hear what you’re building.