Real-time Analytics

Ingest millions of events per second, analyze with time bucketing and graph traversal, and visualize in Grafana — all in a single database.

Your Analytics Stack Is Too Complex

Industry surveys paint a consistent picture: 70% of data leaders say their current data stack is too complex, 85% cite tool integration as their top challenge, and over 63% spend more than a full day per week on maintenance instead of delivering insights.

The reason is architectural: real-time analytics requires capabilities that no single-model database provides. So teams assemble a stack:

  • InfluxDB or TimescaleDB for time-series metrics
  • Neo4j or Neptune for dependency graphs and topology
  • Elasticsearch for log search and full-text queries
  • MongoDB for configuration documents and metadata
  • Kafka to synchronize data between all of them
  • Grafana to stitch dashboards across multiple data sources

Six systems. Six APIs. Six operational surfaces. Data flowing through sync pipelines that add latency and introduce consistency gaps. Engineers spending a third of their time jumping between tools.

ArcadeDB collapses this into one database. Native time-series ingestion, graph traversal, vector search, document storage, full-text indexing, and a built-in Grafana data source — all accessible from SQL, Cypher, or Gremlin.

Graph + Time Series in One Query

[Diagram] Building "HQ" → has_floor → Floors 1F and 2F → installed → sensors: Temp (22.5°C, normal), HVAC, Power (4.2 kW), Smoke (ALERT). Traverse graph + query time series = context.

Define a Time-Series Type

Create a high-performance time-series type with tags for filtering, fields for measurements, and automatic data management:

CREATE TIMESERIES TYPE SensorReading
  TIMESTAMP ts PRECISION NANOSECOND
  TAGS (
    sensor_id STRING,
    location  STRING,
    floor     STRING
  )
  FIELDS (
    temperature DOUBLE,
    humidity    DOUBLE,
    pressure    DOUBLE
  )
  SHARDS 16
  RETENTION 90 DAYS
  COMPACTION INTERVAL 30s

Shards are distributed across CPU cores for zero-contention parallel writes. Retention policies and compaction run automatically.
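A common way such sharding works, sketched below in Python: hash a series' tag set to pick a shard, so every sample from the same series lands on the same core and writers never contend. The exact routing key ArcadeDB uses is an implementation detail; this is a generic illustration, not its internals.

```python
import hashlib

SHARDS = 16  # matches the SHARDS 16 clause above

def shard_for(tags: dict) -> int:
    """Route a series to a shard via a stable hash of its sorted tags."""
    key = ",".join(f"{k}={tags[k]}" for k in sorted(tags))
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % SHARDS

# The same series always maps to the same shard,
# regardless of tag ordering in the input
s = shard_for({"sensor_id": "s-A", "location": "hq"})
assert s == shard_for({"location": "hq", "sensor_id": "s-A"})
assert 0 <= s < SHARDS
```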

A Purpose-Built Time-Series Engine

ArcadeDB's time-series engine is not an afterthought bolted onto a graph database. It's a dedicated storage layer designed from the ground up for high-throughput ingestion and analytical queries.

Storage architecture:

  • Shard-per-core parallelism: Each shard maintains independent mutable and sealed storage layers, enabling zero-contention writes across all CPU cores
  • Delta-of-delta timestamps: 96% compression on timestamp columns
  • Gorilla XOR encoding: ~1.37 bytes per floating-point sample
  • Dictionary + RLE tags: 10-100x compression on tag columns
  • Block-level statistics: Per-block min/max values enable fast-path aggregation without decompression
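The compression figures above come from classic time-series encodings. A minimal Python sketch of the idea behind delta-of-delta timestamp encoding (illustrative, not ArcadeDB's actual implementation): for regularly spaced samples, the second-order deltas are almost all zero, which is why timestamp columns compress so well.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as (first, first_delta, second-order deltas).

    For a regular cadence the second-order deltas are nearly all zero,
    so each one can be stored in a handful of bits.
    """
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], dods

# 1-second cadence (in ms) with one 2-second gap
first, d0, dods = delta_of_delta([1000, 2000, 3000, 5000, 6000])
# dods == [0, 1000, -1000]: zeros except where the cadence breaks
```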

Query acceleration:

  • Binary search on sealed blocks: O(log B) block selection for time range queries
  • Lazy column decompression: Only decode the columns your query actually reads
  • Early termination: Stop scanning the moment the timestamp range is exhausted
  • Streaming iterator: Constant memory regardless of dataset size — O(shardCount × blockSize), not O(totalRows)
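The O(log B) block selection and early termination described above can be sketched with a binary search over each sealed block's start timestamp. This is a simplified model of block metadata, not ArcadeDB's internal layout:

```python
import bisect

def select_blocks(block_starts, block_ends, t_from, t_to):
    """Return indices of sealed blocks overlapping [t_from, t_to].

    block_starts is sorted, so binary search finds the first candidate;
    scanning stops as soon as a block starts past t_to (early termination).
    """
    i = bisect.bisect_left(block_starts, t_from)
    # The previous block may still overlap t_from
    if i > 0 and block_ends[i - 1] >= t_from:
        i -= 1
    hits = []
    for j in range(i, len(block_starts)):
        if block_starts[j] > t_to:
            break  # all remaining blocks are out of range
        hits.append(j)
    return hits

starts = [0, 100, 200, 300]
ends   = [99, 199, 299, 399]
assert select_blocks(starts, ends, 150, 250) == [1, 2]
```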

Ingest Data Your Way

ArcadeDB supports multiple ingestion protocols, so you can connect your existing data pipeline without rewriting anything:

InfluxDB Line Protocol

The de facto standard for time-series ingestion. Compatible with Telegraf and its 200+ input plugins — meaning you can collect metrics from virtually any system (servers, databases, containers, cloud services, IoT devices) and send them directly to ArcadeDB without writing code. If you're migrating from InfluxDB, just change the endpoint URL.

SQL Batch INSERT

Standard SQL inserts with batch support. Ideal for application-level ingestion from backend services, ETL pipelines, or any SQL-compatible client.

Java Embedded API

The fastest ingestion path. Write directly to the time-series engine from JVM applications with columnar batch appends — bypassing SQL parsing entirely for maximum throughput.

HTTP REST API

JSON-based HTTP endpoints for any language or platform. Query results are returned as structured JSON, ready for consumption by web applications, microservices, or analytics tools.

InfluxDB Line Protocol

Send metrics from Telegraf or any ILP-compatible agent with a simple HTTP POST:

# POST /api/v1/ts/{db}/write?precision=ns
# (each record is one line: measurement,tags fields timestamp)

SensorReading,sensor_id=s-A,location=hq temperature=22.5,humidity=65.0,pressure=1013.25 1708430400000000000
SensorReading,sensor_id=s-B,location=dc1 temperature=19.1,humidity=70.0 1708430400000000000
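If you're not using Telegraf, the same lines are easy to produce from any language. A minimal Python sketch that formats line protocol and POSTs it to the write endpoint shown above (the `mydb` database name and `localhost:2480` host are placeholders for your deployment):

```python
import urllib.request

def to_ilp(measurement, tags, fields, ts_ns):
    """Format one InfluxDB Line Protocol record:
    measurement,tag=v,... field=v,... timestamp"""
    tag_str = ",".join(f"{k}={v}" for k, v in tags.items())
    field_str = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},{tag_str} {field_str} {ts_ns}"

line = to_ilp(
    "SensorReading",
    {"sensor_id": "s-A", "location": "hq"},
    {"temperature": 22.5, "humidity": 65.0, "pressure": 1013.25},
    1708430400000000000,
)

def write(lines, db="mydb", host="http://localhost:2480"):
    # One POST can carry many newline-separated records
    req = urllib.request.Request(
        f"{host}/api/v1/ts/{db}/write?precision=ns",
        data="\n".join(lines).encode(),
        method="POST",
    )
    return urllib.request.urlopen(req)

# write([line])  # uncomment against a running server
```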

SQL Batch INSERT

INSERT INTO SensorReading
  (ts, sensor_id, location, temperature, humidity, pressure)
VALUES
  ('2026-02-20T10:00:00Z', 's-A', 'hq', 22.5, 65.0, 1013.25),
  ('2026-02-20T10:00:01Z', 's-A', 'hq', 22.6, 64.8, 1013.20)

Time Bucketing

-- Hourly temperature summary
SELECT
  time_bucket('1h', ts) AS hour,
  sensor_id,
  avg(temperature) AS avg_temp,
  max(temperature) AS max_temp,
  percentile(temperature, 0.99) AS p99_temp,
  count(*) AS samples
FROM SensorReading
WHERE ts BETWEEN '2026-02-19' AND '2026-02-20'
  AND sensor_id = 's-A'
GROUP BY hour, sensor_id
ORDER BY hour

Rate of Change Detection

-- Requests per second per service
SELECT
  time_bucket('5m', ts) AS window,
  service_id,
  rate(request_count) AS requests_per_sec,
  percentile(latency_ms, 0.99) AS p99_latency
FROM ServiceMetrics
WHERE ts > now() - INTERVAL '1h'
GROUP BY window, service_id

Gap Filling with Interpolation

-- Fill missing data points
SELECT
  time_bucket('1m', ts) AS minute,
  interpolate(temperature, 'linear', ts) AS temp_filled
FROM SensorReading
WHERE ts BETWEEN '2026-02-20T10:00:00Z' AND '2026-02-20T11:00:00Z'
GROUP BY minute

Analytical Queries Built for Time-Series

ArcadeDB's time-series engine provides specialized SQL functions designed for temporal analysis — going far beyond standard aggregates:

Time Bucketing

The time_bucket() function aggregates data into configurable intervals: seconds, minutes, hours, days, weeks, or months. Combine it with standard aggregates (avg, min, max, count, sum) for dashboards and rollups.
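The bucketing itself is just floor arithmetic on the epoch timestamp. An illustrative Python equivalent of what `time_bucket('1h', ts)` computes for each row:

```python
def time_bucket(width_s, epoch_s):
    """Floor a timestamp to the start of its bucket."""
    return epoch_s - (epoch_s % width_s)

# 10:42:17 falls into the 10:00:00 hourly bucket
assert time_bucket(3600, 10 * 3600 + 42 * 60 + 17) == 10 * 3600
```

Grouping rows by this floored value, then applying `avg`/`max`/`count` per group, is exactly what the rollup queries above do.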

Rate of Change

The rate() function computes per-second velocity of counters with Prometheus-style counter reset detection. Essential for metrics like requests/second, errors/minute, or bytes/second.
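Counter reset detection matters because monotonic counters restart at zero when a process restarts. A hedged Python sketch of the idea (ArcadeDB's exact semantics may differ in detail, e.g. in extrapolation at window edges):

```python
def rate(samples):
    """Per-second rate of a monotonic counter over (t_seconds, value)
    samples, Prometheus-style: a decrease is treated as a counter
    reset, contributing the new value rather than a negative delta."""
    increase = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        increase += (v1 - v0) if v1 >= v0 else v1  # reset: restart from 0
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed

# Counter resets between t=20 and t=30 (100 -> 5)
s = [(0, 0), (10, 50), (20, 100), (30, 5), (40, 55)]
# increase = 50 + 50 + 5 + 50 = 155 over 40 s -> 3.875/s
assert abs(rate(s) - 3.875) < 1e-9
```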

Percentiles

The percentile() function computes p50, p95, p99 latency distributions within each time bucket — the gold standard for SLA monitoring.
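For intuition, here is the exact, sort-based nearest-rank definition in Python. Production engines typically approximate percentiles with streaming sketches instead of sorting every bucket, so treat this as the reference semantics, not the implementation:

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest value whose rank
    covers fraction p of the sorted data."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

latencies = list(range(1, 101))  # 1..100 ms
assert percentile(latencies, 0.99) == 99
assert percentile(latencies, 0.50) == 50
```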

Gap Filling

The interpolate() function fills missing data points using linear, previous-value, or next-value interpolation. Critical for sensors that report intermittently.
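The idea behind linear gap filling in a few lines of Python: each missing bucket gets the value on the straight line between its nearest known neighbours (illustrative sketch, assuming gaps are interior, not leading or trailing):

```python
def fill_linear(series):
    """series: list of (t, value-or-None); fill interior None gaps
    linearly between the nearest known neighbours."""
    out = list(series)
    for i, (t, v) in enumerate(out):
        if v is not None:
            continue
        # nearest known neighbours on each side
        lo = next(j for j in range(i, -1, -1) if out[j][1] is not None)
        hi = next(j for j in range(i, len(out)) if out[j][1] is not None)
        (t0, v0), (t1, v1) = out[lo], out[hi]
        out[i] = (t, v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out

filled = fill_linear([(0, 20.0), (1, None), (2, None), (3, 26.0)])
assert [v for _, v in filled] == [20.0, 22.0, 24.0, 26.0]
```

Previous-value and next-value modes simply copy `v0` or `v1` instead of interpolating.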

Correlation & Moving Averages

correlate() computes the Pearson correlation between two series. moving_avg() smooths noisy signals over configurable windows. delta() computes the difference between first and last values in a bucket.
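Both are standard statistics. An illustrative Python version of the math behind these functions (hypothetical helper names; `correlate()` is Pearson's r, `moving_avg()` here uses a trailing window that shrinks at the start of the series):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def moving_avg(values, window):
    """Trailing moving average; windows are shorter near the start."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Two sensors tracking each other perfectly -> r == 1
assert abs(pearson([20, 21, 22, 23], [40, 42, 44, 46]) - 1.0) < 1e-9
assert moving_avg([1.0, 2.0, 3.0, 4.0], 2) == [1.0, 1.5, 2.5, 3.5]
```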

Window Functions

lag(), lead(), row_number(), and rank() enable lookahead/lookbehind operations for change detection and sequential analysis.

Continuous Aggregates & Automatic Downsampling

Raw data is invaluable for short-term analysis, but storing every nanosecond-precision sample for years is prohibitively expensive. ArcadeDB solves this with two complementary features:

Continuous Aggregates

Pre-compute common rollups (hourly averages, daily max values) and keep them incrementally updated as new data arrives. Dashboard queries hit the pre-computed aggregates instead of scanning raw data, turning multi-second queries into sub-millisecond lookups.

Downsampling Policies

Define automatic tiered retention: full resolution for the last 7 days, hourly granularity for the last 30 days, daily granularity beyond that. Storage reduction of 90% or more with no manual intervention.

Retention Policies

Automatically expire old data after a configurable duration. A background scheduler runs every 60 seconds, applying retention and downsampling policies transparently.

Continuous Aggregates

-- Pre-compute hourly rollups
CREATE CONTINUOUS AGGREGATE hourly_temps AS
SELECT
  time_bucket('1 hour', ts) AS hour,
  sensor_id,
  avg(temperature) AS avg_temp,
  max(temperature) AS max_temp,
  min(temperature) AS min_temp
FROM SensorReading
GROUP BY hour, sensor_id

-- Incrementally refresh
REFRESH CONTINUOUS AGGREGATE hourly_temps

-- Dashboard queries hit precomputed data
SELECT * FROM hourly_temps
WHERE hour > now() - INTERVAL '7d'

Automatic Downsampling

-- Tiered downsampling policy:
--   full resolution → 7 days
--   hourly averages → 30 days
--   daily averages  → beyond
ALTER TIMESERIES TYPE SensorReading
  ADD DOWNSAMPLING POLICY
    AFTER 7 DAYS GRANULARITY 1 HOUR
    AFTER 30 DAYS GRANULARITY 1 DAY

Reduces storage by 90%+ while preserving long-term trends for capacity planning and historical analysis.
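The 90%+ figure is simple arithmetic. A worked example for one sensor reporting every second over a year (illustrative cadence, ignoring per-sample compression):

```python
# Samples kept per sensor over 1 year under the tiered policy above,
# assuming a 1 sample/second raw cadence
raw      = 7 * 86_400      # full resolution for the last 7 days
hourly   = 23 * 24         # one rollup/hour for days 8-30
daily    = 365 - 30        # one rollup/day beyond 30 days
kept     = raw + hourly + daily

full_res = 365 * 86_400    # keeping a full year of raw samples

reduction = 1 - kept / full_res
# ~98% fewer samples than keeping everything raw
assert reduction > 0.98
```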

Grafana Endpoints

ArcadeDB exposes a native Grafana-compatible HTTP API:

Health check:
GET /api/v1/ts/{db}/grafana/health

Type discovery:
GET /api/v1/ts/{db}/grafana/metadata

Query (DataFrame format):
POST /api/v1/ts/{db}/grafana/query

Latest value lookup:
GET /api/v1/ts/{db}/latest
  ?type=SensorReading
  &tag=sensor_id:s-A

Works with the Grafana Infinity data source plugin — no custom plugin development required. All aggregation is pushed down to ArcadeDB for maximum performance.

Native Grafana Integration

Grafana is the industry standard for observability dashboards, used by millions of engineers worldwide. ArcadeDB integrates natively with Grafana through a dedicated HTTP API that returns data in Grafana's DataFrame format.

This means you can build rich dashboards — time-series charts, gauges, tables, heatmaps, alerts — all powered by ArcadeDB queries, without writing a custom data source plugin.

  • Zero-config connection: Point Grafana's Infinity data source at ArcadeDB's Grafana endpoints
  • Automatic type discovery: Grafana auto-discovers available time-series types and their fields
  • Server-side aggregation: Time bucketing, percentiles, and rate calculations run in ArcadeDB, not in Grafana — following Grafana's own best practice of pushing work to the data source
  • Mixed queries: Combine time-series metrics with graph topology data and document metadata in the same dashboard
  • Alerting: Use Grafana's native alerting on ArcadeDB queries for threshold, anomaly, and trend-based alerts

Time Series Tells You When. Graphs Tell You Why.

A time-series database can tell you that server CPU spiked at 14:32. But it can't tell you why. Was it a downstream database connection pool exhaustion? A deployment that rolled out to the wrong cluster? A cascading failure triggered by a failing network switch?

Answering "why" requires understanding relationships — which services depend on which infrastructure, which deployment affects which servers, which network segment feeds which rack. That's a graph problem.

ArcadeDB lets you traverse a dependency graph and aggregate time-series data for the nodes you discover, all in a single query. No cross-database JOINs. No application-layer merging. No sync pipelines.

  • Root-cause analysis: Traverse from a failing service upstream through dependencies to find the originating failure
  • Impact analysis: Traverse downstream from a failing component to find all affected services and their current metrics
  • Correlated anomalies: Find all sensors in the same building/zone/rack whose readings deviated simultaneously
  • Topology-aware alerting: Alert on an aggregated metric across a group of graph-connected devices, not just individual thresholds

Graph + Time-Series Query

Traverse the building's sensor network and aggregate temperature readings for all discovered sensors:

-- Find all sensors in a building via graph,
-- then aggregate their time-series readings
SELECT
  sensor.name,
  avg(ts.temperature) AS avg_temp,
  max(ts.temperature) AS max_temp,
  count(*) AS samples
FROM (
  TRAVERSE out('InstalledIn')
  FROM (SELECT FROM Building WHERE name = 'HQ')
  WHILE $depth <= 3
) AS sensor
WHERE sensor.@type = 'Sensor'
  AND ts.ts BETWEEN '2026-02-19' AND '2026-02-20'
TIMESERIES sensor -> SensorReading AS ts
GROUP BY sensor.name

Cypher + Time Series

// Impact analysis: find all services downstream of
// a failing node and check their current request rates
MATCH (failing:Server {id: $serverId})
  <-[:RUNS_ON*1..3]-(svc:Service)
RETURN svc.name,
  ts.rate(svc, 'ServiceMetrics', 'request_count',
          now() - duration('PT10M'), now()) AS current_rps,
  ts.last(svc, 'ServiceMetrics', 'error_count') AS errors

One query: traverse the service dependency graph, then pull real-time metrics for every affected service.

Industry Applications

  • IoT & Smart Buildings: Sensor monitoring, predictive maintenance, energy optimization across device hierarchies
  • Infrastructure Observability: Service dependency mapping, cascading failure detection, SLA monitoring
  • Manufacturing: Production line monitoring, quality control, supply chain analytics
  • Fintech: Transaction monitoring, market data analysis, risk dashboards
  • E-commerce: Clickstream analytics, real-time inventory, conversion funnels
  • Telecommunications: Network performance monitoring, cell tower analytics, capacity planning
  • Energy & Utilities: Grid monitoring, consumption forecasting, outage correlation

Built for Real-World Complexity

Real-time analytics rarely involves just one data type. An IoT platform needs time-series metrics and a device topology graph and configuration documents and full-text search across maintenance logs. An observability stack needs request metrics and service dependency maps and deployment histories and incident reports.

ArcadeDB handles all of these natively:

  • Time series: Sensor readings, transaction streams, system metrics with nanosecond precision
  • Graph: Device topology, service dependencies, supply chain networks, organizational hierarchies
  • Documents: Configuration files, maintenance logs, deployment manifests, device metadata
  • Vectors: Anomaly detection via behavioral embeddings, similar-incident search across historical data
  • Full-text search: Search across unstructured logs, incident reports, and documentation

All queryable from SQL, Cypher, or Gremlin. All in a single database. All consistent, all real-time.

Why ArcadeDB for Real-Time Analytics

Building a production analytics platform typically requires assembling a constellation of specialized tools: a time-series database for metrics, a graph database for topology, a search engine for logs, a document store for configurations, and sync pipelines to keep everything consistent.

ArcadeDB replaces this entire stack:

  • One database: Time series, graph, documents, vectors, and full-text search in a single engine
  • Zero sync pipelines: No Kafka, no CDC, no eventual consistency. Data is available across all models the moment it's written
  • Multiple ingestion protocols: InfluxDB Line Protocol, SQL, Java API, HTTP REST
  • Grafana-native: Built-in Grafana-compatible endpoints — no custom plugin required
  • Three query languages: SQL, Cypher, Gremlin — use whichever fits your team
  • Automatic data lifecycle: Retention policies, downsampling, and continuous aggregates run transparently
  • Telegraf compatible: InfluxDB Line Protocol support gives instant access to 200+ Telegraf input plugins

Apache 2.0 — Forever

Analytics infrastructure is foundational. Changing the database underneath it is one of the most expensive engineering projects a team can undertake. You need to trust that the database you choose today won't change its license tomorrow, restrict your deployment options, or demand commercial fees once you're locked in. ArcadeDB is Apache 2.0 forever. Deploy it on-premise, in your cloud, embedded in a commercial product, or as a managed service — the license will never change.

Platform Comparison

Capability               Typical Stack            ArcadeDB
-----------------------  -----------------------  ----------
Time-series engine       InfluxDB / TimescaleDB   Built-in
Graph traversal          Neo4j / Neptune          Built-in
Vector search            Pinecone / Weaviate      Built-in
Full-text search         Elasticsearch            Built-in
Document storage         MongoDB                  Built-in
Grafana integration      Per-database plugin      Native API
InfluxDB Line Protocol   InfluxDB only            Supported
Data sync pipelines      Kafka / CDC              Not needed
License                  Mixed / proprietary      Apache 2.0

Key Results:

  • 2M+ events/sec ingestion rate
  • 45% improvement in predictive maintenance accuracy
  • 70% reduction in infrastructure complexity
  • Real-time alerting in <100ms
  • 3 databases consolidated into 1

Client Success Story

"Our smart building platform monitors 50,000+ sensors across industrial facilities. Before ArcadeDB, we used separate databases for time-series data and device relationships, causing significant complexity and latency. Now we ingest 2M+ events per second and can instantly correlate anomalies across related devices. Maintenance predictions improved by 45%."

— CTO, Industrial IoT Solutions Provider
(Company identity protected by NDA)

Ready to Simplify Your Analytics Stack?

Replace your constellation of specialized databases with a single, unified engine. Time series, graph, vectors, documents, full-text search, and Grafana — all in one Apache 2.0 database.