Vector Database Benchmarks
Cosdata's open-source HNSW vector database delivers the highest query throughput (RPS) of the four systems compared below while holding precision at 0.97. These results come from indexing DbPedia's 1M-record, 1536-dimension dataset, using the same methodology as Qdrant's benchmarks.
Vector DB | Indexing Time (min) | RPS | Precision | p50 (ms) | p95 (ms) |
---|---|---|---|---|---|
Cosdata | 16.32 | 1758 | .97 | 6.61 | 7.87 |
Qdrant | 24.43 | 1238 | .99 | 3.54 | 4.95 |
Weaviate | 13.94 | 1142 | .97 | 4.99 | 7.16 |
Elasticsearch | 83.72 | 716 | .98 | 22.10 | 72.53 |
Benchmark Methodology
Our benchmarks were conducted using the following methodology:
- Dataset: DbPedia's 1M record dataset with 1536-dimensional vectors
- Hardware: All tests were run on identical hardware (8 vCPUs, 32GB RAM)
- Metrics measured: Indexing time, requests per second (RPS), precision, and p50/p95 query latency
- Each test was run 5 times and the average results are reported
- Methodology aligned with Qdrant's benchmarks for fair comparison
The results show that Cosdata's HNSW implementation delivers the highest RPS of the systems tested (1758, about 42% above Qdrant's 1238) while maintaining 0.97 precision. Our implementation is particularly optimized for high-throughput scenarios where query performance is critical.
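To make the reported metrics concrete, here is a minimal sketch of how RPS, precision, and p50/p95 latency could be derived from raw per-query measurements. It is not the benchmark harness itself: the function names, the nearest-rank percentile method, and the definition of precision as the fraction of true nearest neighbors returned are all assumptions made for illustration.

```python
import math
from typing import Sequence


def nearest_rank_percentile(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile using the nearest-rank method (an assumed choice)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def mean_precision(retrieved: Sequence[Sequence[int]],
                   expected: Sequence[Sequence[int]]) -> float:
    """Average, over queries, of the share of expected neighbor IDs that were returned."""
    per_query = [
        len(set(got) & set(want)) / len(want)
        for got, want in zip(retrieved, expected)
    ]
    return sum(per_query) / len(per_query)


def summarize_run(latencies_ms: Sequence[float],
                  wall_clock_s: float,
                  retrieved: Sequence[Sequence[int]],
                  expected: Sequence[Sequence[int]]) -> dict:
    """Collapse one benchmark run into the columns used in the table above."""
    return {
        # Throughput is completed queries divided by total elapsed wall-clock time.
        "rps": len(latencies_ms) / wall_clock_s,
        "precision": mean_precision(retrieved, expected),
        "p50_ms": nearest_rank_percentile(latencies_ms, 50),
        "p95_ms": nearest_rank_percentile(latencies_ms, 95),
    }
```

Averaging the dictionaries from five such runs would correspond to the "run 5 times and average" step described in the methodology.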
Performance Factors
Cosdata's superior performance can be attributed to several key factors:
- Optimized HNSW Implementation: Our implementation includes custom optimizations for graph construction and search algorithms
- SIMD Acceleration: Extensive use of SIMD instructions for distance calculations
- Memory Efficiency: Careful memory layout and management to maximize cache efficiency
- Parallel Processing: Effective utilization of multi-core architectures
- Optimized Data Structures: Custom data structures designed for minimal overhead
Cosdata Configuration Benchmarks
We've conducted additional benchmarks with different Cosdata HNSW configurations to demonstrate the flexibility and performance characteristics of our implementation:
- Dataset: 1 million text embeddings (768 dimensions) from Hugging Face
- Hardware: x86 machine, 4C/8T, 32 GB RAM
- Metrics: Total insertion time, Average Recall@5, Requests Per Second (RPS)
Configuration | Indexing Time | Recall@5 | RPS |
---|---|---|---|
ef_construction: 128, ef_search: 128, neighbors_count: 16, layer_0_neighbors_count: 32 | 554.09 sec (9.23 min) | 97.60% | 2274.31 |
ef_construction: 64, ef_search: 128, neighbors_count: 16, layer_0_neighbors_count: 32 | 518.23 sec (8.64 min) | 98.20% | 2242.33 |
ef_construction: 64, ef_search: 64, neighbors_count: 16, layer_0_neighbors_count: 32 | 468.05 sec (7.80 min) | 95.60% | 2621.02 |
ef_construction: 32, ef_search: 32, neighbors_count: 16, layer_0_neighbors_count: 32 | 422.81 sec (7.05 min) | 94.80% | 2959.85 |
Key Observations:
- Lower `ef_construction` and `ef_search` values result in faster indexing times and higher RPS
- Higher `ef_construction` and `ef_search` values generally provide better recall, although in these runs `ef_construction: 64` with `ef_search: 128` scored slightly higher (98.20%) than `ef_construction: 128` with the same `ef_search` (97.60%)
- The configuration with `ef_construction: 64, ef_search: 128` offers an excellent balance between speed and accuracy
- For maximum throughput, the configuration with `ef_construction: 32, ef_search: 32` delivers nearly 3,000 RPS
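One way to act on these trade-offs is to fix a minimum acceptable recall and pick the fastest configuration that meets it. The sketch below does that using the four measured rows from the table above; the `Run` class and the `fastest_meeting_recall` helper are hypothetical names introduced for illustration and are not part of Cosdata's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Run:
    ef_construction: int
    ef_search: int
    indexing_s: float
    recall_at_5: float
    rps: float


# Measured rows copied from the configuration table
# (neighbors_count: 16 and layer_0_neighbors_count: 32 in every run).
RUNS = [
    Run(128, 128, 554.09, 0.9760, 2274.31),
    Run(64, 128, 518.23, 0.9820, 2242.33),
    Run(64, 64, 468.05, 0.9560, 2621.02),
    Run(32, 32, 422.81, 0.9480, 2959.85),
]


def fastest_meeting_recall(runs: Sequence[Run], min_recall: float) -> Optional[Run]:
    """Return the highest-RPS run whose Recall@5 is at least min_recall, or None."""
    eligible = [r for r in runs if r.recall_at_5 >= min_recall]
    return max(eligible, key=lambda r: r.rps) if eligible else None


choice = fastest_meeting_recall(RUNS, min_recall=0.95)
if choice is not None:
    print(f"ef_construction={choice.ef_construction}, ef_search={choice.ef_search}, rps={choice.rps}")
```

These numbers apply only to the 768-dimension dataset and hardware listed above, so the same selection should be repeated on measurements from your own workload.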
Try Cosdata HNSW Today
Experience the performance benefits of Cosdata HNSW in your own applications. Our open-source implementation is available on GitHub.