TL;DR – After side‑by‑side testing on a 500 GB/7‑day workload, VictoriaLogs cut query latencies by 94 %, shrank storage by ≈40 %, and used < 50 % of the CPU & RAM we previously allocated to Loki. This post explains why we switched.
Background & Requirements
Truefoundry helps developers run multi‑tenant ML workloads on Kubernetes.
Developers need:
- Fast, ad‑hoc search
- High ingestion rate
- Live tailing for debugging
- Minimal operational overhead – single‑binary deployments preferred.
- Resource‑efficient operation on a 4 vCPU / 16 GiB RAM node
- High compression ratio on stored logs
- Block storage preferred over S3, to reduce overhead and latency
Loki served us well initially, but as volume grew we saw >30 s search latencies and high I/O amplification. This triggered an evaluation of VictoriaLogs.
What Is Loki?
Loki is Grafana Labs' log aggregation system that stores logs in compressed chunks accompanied by an index built from labels (key–value pairs). Queries are expressed in LogQL and rely heavily on label filters followed by line filtering.
- Strengths: tight Grafana integration, cheap index, horizontally scalable.
- Limitations for us: label‑only index means expensive full‑scan regex searches; high chunk compaction I/O; Go GC overhead at high ingestion.
What Is VictoriaLogs?
VictoriaLogs is a log database by the VictoriaMetrics team. It uses columnar LSM‑style storage with per‑field indices, SIMD‑accelerated search, and SQL‑like LogsQL syntax.
- Strengths: full‑text index on all tokens; single‑binary; very low memory footprint; fast cold‑cache scans.
- Trade‑offs: smaller ecosystem, fewer built‑in integrations (we pipe data via Vector).
Benchmark Methodology
Query Suite
- Stats – total lines for a filter in the last 24 h.
- Needle in a Haystack – 3–4 static lines such as "[UNIQUE-STATIC-LOG] ID=abc123 XYZ" in a log‑heavy namespace over 7 days.
- Negative – string that does not exist (forces full scan) over 7 days.
🔍 Query Performance
1. Stats Query (log count over 24h)
Purpose: Total log lines from app="servicefoundry-server"
2. Needle-in-Haystack Query (7 days, ~500 GB)
Purpose: Search for a unique static log line in the truefoundry namespace
3. App Restart Log Match (7 days) (Extra query to verify Step 2)
Purpose: Search for the known restart pattern ":3000" in a small subset of logs (targeting a single shard)
Result sets were verified to be identical.
4. Non-Existence Log Match (7 days)
Purpose: Search for non-existing log, triggering a full data search
Result sets were verified to be identical.
When processing the full 500 GB of data, Loki became unstable: its resources were exhausted and query responses stalled.
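For readers who want to reproduce a similar suite, the four queries above could be expressed roughly as follows. This is a sketch, not the exact queries from our runs: the host names, the label and field names (app, namespace), and the non‑existent search string are assumptions, and VictoriaLogs field names in particular depend on how the ingestion pipeline maps Kubernetes metadata. The only API surface used is Loki's /loki/api/v1/query_range endpoint and VictoriaLogs' /select/logsql/query endpoint.

```python
from urllib.parse import urlencode

# Hypothetical base URLs (default ports); adjust to your deployment.
LOKI_URL = "http://loki:3100"
VLOGS_URL = "http://victorialogs:9428"

NEEDLE = "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"
MISSING = "string-that-never-appears"  # placeholder for the negative test

# LogQL: a label selector narrows the streams, then |= filters lines.
# The label names are assumptions about how the logs were labelled.
LOKI_QUERIES = {
    "stats": 'sum(count_over_time({app="servicefoundry-server"}[24h]))',
    "needle": '{namespace="truefoundry"} |= "' + NEEDLE + '"',
    "restart": '{app="servicefoundry-server"} |= ":3000"',
    "negative": '{namespace="truefoundry"} |= "' + MISSING + '"',
}

# LogsQL: _time bounds the range, quoted strings are phrase filters.
# The field names are assumptions and depend on the ingestion pipeline.
VLOGS_QUERIES = {
    "stats": '_time:24h app:"servicefoundry-server" | stats count() as total',
    "needle": '_time:7d namespace:"truefoundry" "' + NEEDLE + '"',
    "restart": '_time:7d app:"servicefoundry-server" ":3000"',
    "negative": '_time:7d namespace:"truefoundry" "' + MISSING + '"',
}


def loki_range_url(query, end_s, days):
    """Build a query_range URL; Loki accepts nanosecond Unix timestamps."""
    params = {
        "query": query,
        "start": (end_s - days * 86400) * 10**9,
        "end": end_s * 10**9,
        "limit": 100,
    }
    return f"{LOKI_URL}/loki/api/v1/query_range?{urlencode(params)}"


def vlogs_url(query):
    """Build a VictoriaLogs query URL; the time range lives in the query."""
    return f"{VLOGS_URL}/select/logsql/query?{urlencode({'query': query})}"
```

Note the structural difference: every Loki log query needs a label selector before the line filter, while in LogsQL the time filter, field filters, and phrase filter sit side by side in one expression. The Loki stats entry is a metric query, so it is usually sent to the instant query endpoint rather than query_range.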
Loki vs VictoriaLogs: Results at a Glance
Our evaluation focused on three dimensions that matter day‑to‑day for platform engineers:
- How fast can we get answers?
- How many resources does that speed cost?
- How stable is the experience under real load?
Query Performance
Why the gap? VictoriaLogs maintains a per‑token index, so even regex‑like scans are index‑assisted. Loki, by contrast, filters line‑by‑line after a label query, which devolves to a brute‑force scan when the label set is broad.
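The difference can be illustrated with a toy model. The sketch below is not how either system is implemented internally; it only contrasts a linear scan (every line is checked) with a token index that narrows the search to a few candidate lines before verifying them. The sample log lines and function names are made up for illustration.

```python
import re
from collections import defaultdict

LOGS = [
    "GET /healthz 200",
    "server listening on :3000",
    "[UNIQUE-STATIC-LOG] ID=abc123 XYZ",
    "GET /api/v1/models 500",
]


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def brute_force(lines, phrase):
    """Check every line; cost grows with total log volume."""
    return [i for i, line in enumerate(lines) if phrase in line]


def build_token_index(lines):
    """Map each token to the set of line numbers that contain it."""
    index = defaultdict(set)
    for i, line in enumerate(lines):
        for tok in tokenize(line):
            index[tok].add(i)
    return index


def indexed_search(lines, index, phrase):
    """Intersect the token sets for the phrase, then verify the candidates."""
    tokens = tokenize(phrase)
    if not tokens:
        return brute_force(lines, phrase)
    candidates = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(i for i in candidates if phrase in lines[i])


index = build_token_index(LOGS)
print(indexed_search(LOGS, index, "ID=abc123"))
print(indexed_search(LOGS, index, "no-such-entry"))
```

In this toy version a search for a missing string returns immediately, because at least one of its tokens has no entry in the index; the brute‑force version has to read every line to reach the same answer. The toy index only handles phrases made of whole tokens, so a partial‑token substring would still need a scan.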
🌪️ · Ingestion Performance
We also stress-tested ingestion with 120 replicas of our flog generator.
The results were eye-opening:

Loki:

Loki spikes to 3–4 vCPU, nears its 8 GiB limit, and shows throttling under the same workload.
VictoriaLogs:

VictoriaLogs stays comfortably below its 4 vCPU / 8 GiB limits, even during ingestion bursts.
👉 Key takeaway: VictoriaLogs delivered 3× higher ingestion speed while consuming 72% less CPU and 87% less memory than Loki.
2 · Resource Footprint (7‑day retention)
Loki

Memory used: consistently 6–7 GB RAM
Peak CPU usage: 3 vCPU
VictoriaLogs

Memory used: 800–900 MB
Peak CPU usage: 1.1 vCPU
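To put the footprint figures side by side, a relative saving can be computed as (before - after) / before. The sketch below applies that formula to the numbers listed in this section; using the midpoints of the memory ranges is an assumption made here for illustration.

```python
def percent_reduction(before, after):
    """Relative saving of `after` versus `before`, in percent."""
    return (before - after) / before * 100


# Midpoints of the memory ranges above (in GB), and the peak CPU figures.
loki_mem_gb, vlogs_mem_gb = 6.5, 0.85
loki_cpu, vlogs_cpu = 3.0, 1.1

print(f"memory: {percent_reduction(loki_mem_gb, vlogs_mem_gb):.1f}% less")
print(f"cpu:    {percent_reduction(loki_cpu, vlogs_cpu):.1f}% less")
```

With these inputs the memory saving comes out at roughly 87% and the peak CPU saving at roughly 63%. Picking different endpoints of the ranges shifts the result by a point or two.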
3 · Real‑World Load (Locust 2‑min run @ 10 users, ramp‑up of 2)
Both systems received similar queries, with random limits and random time ranges so that repeated requests could not be served from cache.
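One way to randomize requests like this is sketched below. The limit values, the window sizes, and the function name are assumptions chosen for illustration, not the parameters of our actual Locust script; the function would be called once per request inside a load‑test task.

```python
import random


def random_query_params(rng, now_s, max_days=7):
    """Pick a random limit and time window so repeated requests differ."""
    limit = rng.choice([10, 50, 100, 500, 1000])
    window_s = rng.randint(300, 24 * 3600)  # between 5 minutes and 24 hours
    earliest_start = now_s - max_days * 86400
    latest_start = now_s - window_s
    start_s = rng.randint(earliest_start, latest_start)
    return {"limit": limit, "start": start_s, "end": start_s + window_s}


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
for _ in range(3):
    print(random_query_params(rng, now_s=1_700_000_000))
```

Because the window always ends at or before the current time and starts inside the retention period, every generated request is valid while still hitting a different slice of data.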
VictoriaLogs

Loki

📌 Despite handling 36% higher RPS, VictoriaLogs showed lower p95 and tail latencies, with a 3.6× faster p99, which suggests its indexing model holds under pressure.
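For readers less familiar with percentile metrics: p95 and p99 are the latencies that 95% and 99% of requests stay at or below, so they capture the slow tail that averages hide. A minimal sketch using the nearest‑rank method follows; the latency values are made up, and Locust's own reporting may compute percentiles slightly differently.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, with two slow outliers.
latencies_ms = [120, 135, 150, 160, 180, 200, 240, 300, 900, 2500]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

In this made‑up sample the median is 180 ms while p95 and p99 both land on the 2500 ms outlier, which is why tail percentiles are the more telling numbers under load.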

This test reinforced our decision: VictoriaLogs isn't just faster in theory—it scales better under stress in production-like workloads.
TL;DR of the Numbers
- 70–94 % faster across common search patterns.
- ≈40 % smaller on disk with the same retention window.
- Half the compute, freeing a full vCPU and ~1–2 GiB RAM on our smallest nodes.
Bottom line: For a log‑heavy, search‑centric use case, VictoriaLogs lets us answer questions in seconds instead of minutes while cutting infra costs.
Key Findings
- Full‑text index matters – VictoriaLogs’ per‑token index eliminates brute‑force line filtering.
- Storage layout – columnar + LSM vastly reduces on‑disk size and disk seeks.
- Memory efficiency – we freed ~2 GiB RAM per node, allowing denser scheduling.
- Operational simplicity – both are single‑binary, but VictoriaLogs needed zero custom tuning to hit these numbers.
Conclusion
For workload profiles heavy on ad‑hoc text search, VictoriaLogs provided order‑of‑magnitude faster queries and material cost savings. Loki remains an excellent choice when tight Grafana integration and label‑first queries dominate, but VictoriaLogs is now our default for high‑ingestion, developer‑centric clusters.
