Boost Your Database Performance with the New 75.9% Faster SQL Parser

We’re excited to announce a significant upgrade to ArcadeDB’s SQL parser that will become the default in version 26.2.1. After extensive development and testing, we’ve successfully migrated our SQL parser from JavaCC to ANTLR - a change that brings substantial performance improvements and sets the stage for easier SQL language extensions in the future.

What Changed?

Important: We replaced only the parser, not the SQL engine. The entire query execution engine, optimizer, and all SQL semantics remain exactly the same. We’ve simply changed the front-end component that reads and parses your SQL queries from JavaCC to ANTLR.

Think of it this way: the SQL engine is like the engine in a car - it hasn’t changed. We’ve only upgraded the fuel injection system (the parser) that feeds it. The same proven execution engine processes your queries, using the same optimization strategies, and producing the same results. This approach ensures very high compatibility with existing queries while delivering the benefits of modern parsing technology.

Why ANTLR Over JavaCC?

ANTLR (ANother Tool for Language Recognition) brings several advantages that align perfectly with ArcadeDB’s goals:

1. Modern Language Recognition

ANTLR is one of the most widely used parser generators: it is actively maintained and powers major projects worldwide. It provides more sophisticated parsing capabilities and better error recovery mechanisms than JavaCC.

2. Easier Grammar Maintenance

The ANTLR grammar syntax is more readable and maintainable. This means when we need to add new SQL features or extend the language, we can do so with greater confidence and speed.

3. Better Tooling and Community Support

ANTLR comes with excellent debugging tools, visualization capabilities, and a large community. This makes it easier to identify and fix parsing issues when they arise.

4. Future-Proof Architecture

By adopting ANTLR, we’re aligning with modern compiler and language design practices, ensuring ArcadeDB’s SQL implementation can evolve smoothly as new requirements emerge.

Performance: The Numbers Speak for Themselves

Beyond maintainability, the new parser delivers impressive performance gains. Here are the benchmark results comparing the JavaCC and ANTLR implementations:

| Query | JavaCC (µs) | ANTLR (µs) | Diff (µs) | Winner |
|---|---|---|---|---|
| Simple SELECT | 150 | 89 | -61 | ANTLR |
| Complex SELECT (AND/OR) | 732 | 202 | -530 | ANTLR |
| SQL Many Parentheses | 304 | 38 | -266 | ANTLR |
| MATCH Query 1 | 263 | 75 | -188 | ANTLR |
| MATCH Query 2 | 333 | 81 | -252 | ANTLR |
| Mixed SQL (10 cmds) | 983 | 180 | -803 | ANTLR |
| TOTAL | 2765 | 665 | -2100 | ANTLR |

Key Takeaways:

  • Overall, ANTLR cuts total parsing time by 75.9% across our benchmark suite (2765 µs down to 665 µs, roughly 4.2x faster)

The pattern is clear: the more complex your queries, the more you’ll benefit from the new parser. For real-world applications with diverse query patterns, the ANTLR implementation delivers substantially better performance.
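
The headline figure can be re-derived from the table. The sketch below is not part of the ArcadeDB codebase, and the class and method names are purely illustrative. Using only the published timings, it computes the share of parse time saved and the equivalent speedup factor for each row and for the total.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper, not part of ArcadeDB: re-derives the percentages from the benchmark table.
public class ParserBenchmarkMath {

    /** Share of parse time saved, in percent, when a query goes from javaccMicros to antlrMicros. */
    public static double percentSaved(long javaccMicros, long antlrMicros) {
        return 100.0 * (javaccMicros - antlrMicros) / javaccMicros;
    }

    /** Speedup factor: how many times faster the ANTLR timing is than the JavaCC timing. */
    public static double speedup(long javaccMicros, long antlrMicros) {
        return (double) javaccMicros / antlrMicros;
    }

    public static void main(String[] args) {
        // {JavaCC, ANTLR} timings in microseconds, copied from the benchmark table.
        Map<String, long[]> rows = new LinkedHashMap<>();
        rows.put("Simple SELECT", new long[] {150, 89});
        rows.put("Complex SELECT (AND/OR)", new long[] {732, 202});
        rows.put("SQL Many Parentheses", new long[] {304, 38});
        rows.put("MATCH Query 1", new long[] {263, 75});
        rows.put("MATCH Query 2", new long[] {333, 81});
        rows.put("Mixed SQL (10 cmds)", new long[] {983, 180});

        long totalJavacc = 0;
        long totalAntlr = 0;
        for (Map.Entry<String, long[]> row : rows.entrySet()) {
            long javacc = row.getValue()[0];
            long antlr = row.getValue()[1];
            totalJavacc += javacc;
            totalAntlr += antlr;
            System.out.printf("%-24s %5.1f%% less time (%.1fx)%n",
                row.getKey(), percentSaved(javacc, antlr), speedup(javacc, antlr));
        }
        System.out.printf("%-24s %5.1f%% less time (%.1fx)%n",
            "TOTAL", percentSaved(totalJavacc, totalAntlr), speedup(totalJavacc, totalAntlr));
    }
}
```

The total works out to (2765 - 665) / 2765, about 75.9% less parse time. The per-row values range from about 41% for the simple SELECT to 87.5% for the heavily parenthesized query, which is the pattern described above.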

Extending SQL Just Got Easier

One of the most exciting aspects of this change is how much easier it will be to extend ArcadeDB’s SQL dialect going forward. The ANTLR grammar is more declarative and modular, meaning:

  • Faster feature development - new SQL keywords and syntax can be added with less risk of breaking existing functionality

  • Better syntax validation - more precise error messages when queries have syntax issues

  • Cleaner codebase - easier for contributors to understand and enhance the SQL implementation

This foundation will enable us to respond more quickly to community feature requests and keep ArcadeDB’s SQL implementation at the cutting edge.

Migration and Compatibility

Starting with version 26.2.1, the ANTLR parser will be the default SQL parser in ArcadeDB.

The good news: compatibility is very high. Because we only changed the parser (not the execution engine), your existing queries should work exactly as before. Both parsers produce the same Abstract Syntax Tree (AST) that feeds into the same, unchanged SQL execution engine. This means:

  • All SQL syntax supported by the JavaCC parser is supported by ANTLR
  • Query semantics and behavior remain identical
  • Execution plans and optimizations are unchanged
  • Results are exactly the same

However, if you do encounter any issues during the transition, you can easily switch back to the original JavaCC parser:

-Darcadedb.sql.parserImplementation=javacc

Simply add this JVM property when starting ArcadeDB to use the legacy parser. This fallback option will remain available while we ensure a smooth transition for all users.
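
Because the switch is an ordinary JVM system property, a quick way to confirm that the flag actually reaches the JVM is to read it back with the standard `System.getProperty` call. The class below is a hypothetical diagnostic, not an ArcadeDB API. It only reports what was passed on the command line and makes no assumption about how or when ArcadeDB itself reads the value.

```java
// Hypothetical diagnostic, not part of ArcadeDB: reports which parser was requested via -D.
public class ParserFallbackCheck {

    // Property name taken from the JVM flag shown above.
    static final String PARSER_PROPERTY = "arcadedb.sql.parserImplementation";

    /** Returns the value passed with -D, or "(default)" when the flag is absent. */
    public static String requestedParser() {
        return System.getProperty(PARSER_PROPERTY, "(default)");
    }

    public static void main(String[] args) {
        System.out.println("SQL parser requested: " + requestedParser());
    }
}
```

On Java 11 or later, the file can be run directly with `java -Darcadedb.sql.parserImplementation=javacc ParserFallbackCheck.java`. Note that the `-D` option must come before the class or file name, otherwise the JVM treats it as a program argument rather than a system property.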

We Need Your Feedback!

This is a significant change, and while we’ve conducted extensive testing, real-world usage always reveals edge cases we might have missed. If you encounter any parsing errors, unexpected behavior, or performance issues with the new parser, please let us know.

Your feedback is invaluable in making ArcadeDB better for everyone. When reporting issues, please include:

  • The SQL query that caused the problem
  • Expected vs. actual behavior
  • Whether switching to JavaCC resolves the issue

Looking Forward

This parser migration represents more than just a technical upgrade - it’s an investment in ArcadeDB’s future. With ANTLR as our parsing foundation, we’re better positioned to:

  • Implement advanced SQL syntax and language features more rapidly
  • Extend SQL dialect support with greater confidence
  • Maintain high code quality as the project grows
  • Respond to community needs with greater agility

And because the proven SQL execution engine remains unchanged, you get these benefits without sacrificing the stability and reliability you depend on.

We’re excited about what this enables for ArcadeDB’s roadmap and grateful to everyone who contributed to making this migration successful.

Thank you for being part of the ArcadeDB community. Here’s to faster queries and a more flexible future!