We’re excited to announce a significant upgrade to ArcadeDB’s SQL parser that will become the default in version 26.2.1. After extensive development and testing, we’ve successfully migrated our SQL parser from JavaCC to ANTLR - a change that brings substantial performance improvements and sets the stage for easier SQL language extensions in the future.

What Changed?
Important: We replaced only the parser, not the SQL engine. The entire query execution engine, optimizer, and all SQL semantics remain exactly the same. We’ve simply changed the front-end component that reads and parses your SQL queries from JavaCC to ANTLR.
Think of it this way: the SQL engine is like the engine in a car - it hasn’t changed. We’ve only upgraded the fuel injection system (the parser) that feeds it. The same proven execution engine processes your queries, using the same optimization strategies, and producing the same results. This approach ensures very high compatibility with existing queries while delivering the benefits of modern parsing technology.
Why ANTLR Over JavaCC?
ANTLR (ANother Tool for Language Recognition) brings several advantages that align perfectly with ArcadeDB’s goals:
1. Modern Language Recognition
ANTLR is the industry standard for building parsers, actively maintained, and used by major projects worldwide. It provides more sophisticated parsing capabilities and better error recovery mechanisms than JavaCC.
2. Easier Grammar Maintenance
The ANTLR grammar syntax is more readable and maintainable. This means when we need to add new SQL features or extend the language, we can do so with greater confidence and speed.
3. Better Tooling and Community Support
ANTLR comes with excellent debugging tools, visualization capabilities, and a large community. This makes it easier to identify and fix parsing issues when they arise.
4. Future-Proof Architecture
By adopting ANTLR, we’re aligning with modern compiler and language design practices, ensuring ArcadeDB’s SQL implementation can evolve smoothly as new requirements emerge.
Performance: The Numbers Speak for Themselves
Beyond maintainability, the new parser delivers impressive performance gains. Here are the benchmark results comparing the JavaCC and ANTLR implementations:
| Query | JavaCC (us) | ANTLR (us) | Diff (us) | Winner |
|---|---|---|---|---|
| Simple SELECT | 150 | 89 | -61 | ANTLR |
| Complex SELECT (AND/OR) | 732 | 202 | -530 | ANTLR |
| SQL Many Parenthesis | 304 | 38 | -266 | ANTLR |
| MATCH Query 1 | 263 | 75 | -188 | ANTLR |
| MATCH Query 2 | 333 | 81 | -252 | ANTLR |
| Mixed SQL (10 cmds) | 983 | 180 | -803 | ANTLR |
| TOTAL | 2765 | 665 | -2100 | ANTLR |
Key Takeaways:
- Overall, ANTLR is 75.9% faster across our benchmark suite
The pattern is clear: the more complex your queries, the more you’ll benefit from the new parser. For real-world applications with diverse query patterns, the ANTLR implementation delivers substantially better performance.
Extending SQL Just Got Easier
One of the most exciting aspects of this change is how much easier it will be to extend ArcadeDB’s SQL dialect going forward. The ANTLR grammar is more declarative and modular, meaning:
-
Faster feature development - new SQL keywords and syntax can be added with less risk of breaking existing functionality
-
Better syntax validation - more precise error messages when queries have syntax issues
-
Cleaner codebase - easier for contributors to understand and enhance the SQL implementation
This foundation will enable us to respond more quickly to community feature requests and keep ArcadeDB’s SQL implementation at the cutting edge.
Migration and Compatibility
Starting with version 26.1.1, the ANTLR parser will be the default SQL parser in ArcadeDB.
The good news: compatibility is very high. Because we only changed the parser (not the execution engine), your existing queries should work exactly as before. Both parsers produce the same Abstract Syntax Tree (AST) that feeds into the same, unchanged SQL execution engine. This means:
- All SQL syntax supported by the JavaCC parser is supported by ANTLR
- Query semantics and behavior remain identical
- Execution plans and optimizations are unchanged
- Results are exactly the same
However, if you do encounter any issues during the transition, you can easily switch back to the original JavaCC parser:
-Darcadedb.sql.parserImplementation=javacc
Simply add this JVM property when starting ArcadeDB to use the legacy parser. This fallback option will remain available while we ensure a smooth transition for all users.
We Need Your Feedback!
This is a significant change, and while we’ve conducted extensive testing, real-world usage always reveals edge cases we might have missed. If you encounter any parsing errors, unexpected behavior, or performance issues with the new parser, please let us know:
- GitHub Issues: https://github.com/ArcadeData/arcadedb/issues
- Discord Community: Join our Discord channel for real-time discussions
Your feedback is invaluable in making ArcadeDB better for everyone. When reporting issues, please include:
- The SQL query that caused the problem
- Expected vs. actual behavior
- Whether switching to JavaCC resolves the issue
Looking Forward
This parser migration represents more than just a technical upgrade - it’s an investment in ArcadeDB’s future. With ANTLR as our parsing foundation, we’re better positioned to:
- Implement advanced SQL syntax and language features more rapidly
- Extend SQL dialect support with greater confidence
- Maintain high code quality as the project grows
- Respond to community needs with greater agility
And because the proven SQL execution engine remains unchanged, you get these benefits without sacrificing the stability and reliability you depend on.
We’re excited about what this enables for ArcadeDB’s roadmap and grateful to everyone who contributed to making this migration successful.
Thank you for being part of the ArcadeDB community. Here’s to faster queries and a more flexible future!