
Qdrant Driver Report: January 2026

January 1, 2026 snapshot for qail-qdrant and the official qdrant-client.

Looking for capability docs instead of benchmark data? Use /qdrant. For all driver pages, use /drivers.

Headline: 4.00x batch/sequential ratio under 50 concurrent searches

  • Single-query ratio: 1.17x (140us vs 164us)
  • Pool ratio: 1.46x (16.2ms vs 23.6ms)
  • HTTP/2 batch ratio: 4.00x (4.8ms vs 19.0ms)

Single-Query Search

1,000 sequential searches on localhost.

Driver              Latency/query   Throughput     Relative to official client
qail-qdrant gRPC    140.3us         7,126 ops/s    1.17x
Official client     164.0us         6,096 ops/s    baseline
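
The throughput and ratio columns can be cross-checked from the latency column alone. The sketch below assumes throughput for a sequential run is simply the reciprocal of mean latency; the function names are hypothetical and are not part of qail-qdrant or the benchmark harness.

```rust
/// Sequential throughput (ops/s) implied by a mean per-query latency in microseconds.
fn ops_per_sec(latency_us: f64) -> f64 {
    1_000_000.0 / latency_us
}

/// How many times faster the candidate is than the baseline (both in the same unit).
fn speedup(baseline_us: f64, candidate_us: f64) -> f64 {
    baseline_us / candidate_us
}
```

Calling speedup(164.0, 140.3) gives roughly 1.169, which rounds to the reported 1.17x. ops_per_sec(140.3) and ops_per_sec(164.0) land within about 2 ops/s of the reported 7,126 and 6,096, which is consistent with the published latencies being rounded.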

Implementation notes

  • Buffer pooling uses .split() rather than .clone() on the hot path.
  • The transport path talks to h2 directly rather than routing through a heavier wrapper.
  • Protobuf tags are pre-computed before the request loop.
  • Vector copies were reduced to a single memcpy for the 1536-float case.
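
To illustrate the last two notes, here is a minimal sketch of what a pre-computed protobuf tag and a single bulk copy of a float vector could look like. It is not taken from the qail-qdrant source: the field number, constant names, and function names are all assumptions made for illustration. It relies only on the standard protobuf wire format, where a tag is (field_number << 3) | wire_type and a packed repeated float field is a length-delimited run of little-endian f32 values.

```rust
// Hypothetical field number for a packed `repeated float` vector field;
// the real field numbers are not given in this report.
const VECTOR_FIELD: u32 = 1;
// Protobuf wire type 2: length-delimited.
const WIRE_TYPE_LEN: u32 = 2;

// The tag is a compile-time constant, so nothing is recomputed per request.
// Field numbers below 16 produce a tag that fits in one byte.
const VECTOR_TAG: u8 = ((VECTOR_FIELD << 3) | WIRE_TYPE_LEN) as u8;

/// Append `v` as a protobuf base-128 varint.
fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Encode a packed float field: tag, payload length in bytes, then the payload.
fn encode_vector(buf: &mut Vec<u8>, vector: &[f32]) {
    buf.push(VECTOR_TAG);
    put_varint(buf, (vector.len() * 4) as u64);
    if cfg!(target_endian = "little") {
        // On little-endian targets the in-memory bytes already match the wire
        // format, so the whole vector is appended with one bulk copy.
        // SAFETY: f32 has no padding bytes, u8 has alignment 1, and the byte
        // length covers exactly the elements of `vector`.
        let bytes = unsafe {
            std::slice::from_raw_parts(vector.as_ptr() as *const u8, vector.len() * 4)
        };
        buf.extend_from_slice(bytes);
    } else {
        // Portable fallback: convert each float to little-endian bytes.
        for x in vector {
            buf.extend_from_slice(&x.to_le_bytes());
        }
    }
}
```

For a 1536-float vector the payload is 6,144 bytes, so under these assumptions the encoded field is one tag byte, a two-byte length varint, and a single 6,144-byte copy.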

HTTP/2 Batch Search

50 queries sent concurrently over a single connection.

Approach            Total time   Per query   Relative to sequential
HTTP/2 pipelined    4.8ms        95us        4.00x
Sequential          19.0ms       380us       baseline

Interpretation

In this harness, sending the 50 requests as one HTTP/2 batch cut the amortized per-query time from 380us to 95us. Treat the number as a result of the transport pattern under this specific workload, not as a general claim about every vector search path.
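
The per-query figures are amortized values, that is, total batch time divided by the number of queries. The sketch below shows that arithmetic with hypothetical helper names that are not part of the benchmark harness.

```rust
/// Amortized per-query time in microseconds for `queries` requests that
/// together took `total_ms` milliseconds.
fn per_query_us(total_ms: f64, queries: u32) -> f64 {
    total_ms * 1_000.0 / f64::from(queries)
}

/// Ratio of the sequential figure to the batched figure (same unit for both).
fn batch_speedup(sequential: f64, batched: f64) -> f64 {
    sequential / batched
}
```

per_query_us(19.0, 50) gives 380us, matching the table. The rounded 4.8ms total gives about 96us and a ratio near 3.96x, while the per-query figures (380us over 95us) give exactly 4.00x. The published 95us and 4.00x therefore presumably come from the unrounded measurements, though the report does not say so.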

Reproduce Results

git clone https://github.com/qail-io/qail.git
cd qail/qdrant

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
python3 examples/seed_qdrant.py
cargo run --example fair_benchmark --release
cargo run --example batch_benchmark --release