Victorialogs vs Loki - Benchmarking Results

July 9, 2025
TL;DR – After side‑by‑side testing on a 500 GB/7‑day workload, VictoriaLogs cut query latencies by up to 94 %, shrank storage by ≈40 %, and used < 50 % of the CPU & RAM we previously allocated to Loki. This post explains why we switched.

Background & Requirements

Truefoundry helps developers run multi‑tenant ML workloads on Kubernetes.
Developers need:

  • Fast, ad‑hoc search
  • High ingestion throughput
  • Live tailing for debugging
  • Minimal operational overhead – single‑binary deployments preferred
  • Resource‑efficient operation on a 4 vCPU / 16 GiB RAM node
  • High compression ratio on stored logs
  • Block storage preferred over S3 (object storage) to reduce overhead and latency

Loki served us well initially, but as volume grew we saw >30 s search latencies and high I/O amplification. This triggered an evaluation of VictoriaLogs.

What Is Loki?

Loki is Grafana Labs’ log aggregation system that stores logs in compressed chunks accompanied by an index built from labels (key–value pairs). Queries are expressed in LogQL and rely heavily on label filters followed by line filtering.

  • Strengths: tight Grafana integration, cheap index, horizontally scalable.
  • Limitations for us: label‑only index means expensive full‑scan regex searches; high chunk compaction I/O; Go GC overhead at high ingestion.

What Is VictoriaLogs?

VictoriaLogs is a log database by the VictoriaMetrics team. It uses columnar LSM‑style storage with per‑field indices, SIMD‑accelerated search and a SQL‑like LogsQL query syntax.

  • Strengths: full‑text index on all tokens; single‑binary; very low memory footprint; fast cold‑cache scans.
  • Trade‑offs: smaller ecosystem, fewer built‑in integrations (we pipe data via Vector).

Benchmark Methodology

  • Requests/Memory: 4 vCPU, 8 GiB RAM – identical for both systems, QoS: Guaranteed
  • Log generator: flog pushing 65 MB/s to Vector → Loki and Vector → VictoriaLogs
  • Data set: ~500 GB over 7 days; mixed duplicate & unique lines; 20 namespaces, 40 apps
  • Retention: 7 days
  • Clients: Locust 2.27.1 (10 virtual users, sustained 43 RPS to /select/logsql/query) and Grafana
  • Queries tested: Stats, Needle in a Haystack, Regex, Negative (details below)
  • Caching: block cache disabled for both systems to simulate cold reads; pods were restarted to ensure this
  • Index tweaks: Loki defaults; VictoriaLogs defaults
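
For context, here is a minimal sketch of what pushing log lines into VictoriaLogs looks like. It assumes the newline-delimited JSON ingestion endpoint (/insert/jsonline) and the _msg/_time/_stream_fields conventions, with a hypothetical in-cluster host; in the benchmark itself, ingestion was handled by flog and Vector rather than a script like this.

```python
import json
import random
import time

import requests

VLOGS_INSERT = "http://victorialogs:9428/insert/jsonline"  # hypothetical host/port
NAMESPACES = [f"ns-{i}" for i in range(20)]  # 20 namespaces, as in the data set
APPS = [f"app-{i}" for i in range(40)]       # 40 apps

def build_batch(n: int = 1000) -> bytes:
    """Build n newline-delimited JSON log lines."""
    lines = []
    for _ in range(n):
        lines.append(json.dumps({
            "_time": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "_msg": f"GET /api/v1/items 200 {random.randint(1, 500)}ms",
            "namespace": random.choice(NAMESPACES),
            "app": random.choice(APPS),
        }))
    return ("\n".join(lines) + "\n").encode()

resp = requests.post(
    VLOGS_INSERT,
    # _stream_fields tells VictoriaLogs which fields identify a log stream
    params={"_stream_fields": "namespace,app"},
    data=build_batch(),
    timeout=30,
)
resp.raise_for_status()
```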

Query Suite

  1. Stats – total line count for a filter over the last 24 h.
  2. Needle in a Haystack – a unique static line, [UNIQUE-STATIC-LOG] ID=abc123 XYZ, occurring only 3–4 times in a log-heavy namespace, searched over 7 days.
  3. Negative – a string that does not exist anywhere (forces a full scan), over 7 days.
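
For reference, a minimal timing sketch of this suite against both systems' HTTP query APIs is shown below. The hosts, ports and the Loki /loki/api/v1/query_range endpoint are assumptions about a typical deployment (the VictoriaLogs /select/logsql/query path is the one used in the benchmark), and the LogsQL variants fold the time window into a _time: filter.

```python
import time

import requests

# Hypothetical in-cluster endpoints; adjust to your own deployment.
LOKI = "http://loki:3100/loki/api/v1/query_range"
VLOGS = "http://victorialogs:9428/select/logsql/query"

# (LogQL for Loki, LogsQL for VictoriaLogs) pairs mirroring the query suite.
SUITE = {
    "stats (24h)": (
        'sum(count_over_time({app="servicefoundry-server"}[24h]))',
        '_time:24h {app="servicefoundry-server"} | stats count()',
    ),
    "needle (7d)": (
        '{namespace="truefoundry", app!="grafana"} |= "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"',
        '_time:7d {namespace="truefoundry", app!="grafana"} "[UNIQUE-STATIC-LOG] ID=abc123 XYZ"',
    ),
    "negative (7d)": (
        '{namespace="truefoundry"} |= "non-existent log line"',
        '_time:7d {namespace="truefoundry"} "non-existent log line"',
    ),
}

def timed_get(url: str, params: dict) -> float:
    """Issue a GET and return wall-clock latency in seconds."""
    t0 = time.monotonic()
    requests.get(url, params=params, timeout=300).raise_for_status()
    return time.monotonic() - t0

for name, (logql, logsql) in SUITE.items():
    loki_s = timed_get(LOKI, {"query": logql, "since": "7d"})
    vlogs_s = timed_get(VLOGS, {"query": logsql})
    print(f"{name:14s} loki={loki_s:6.2f}s victorialogs={vlogs_s:6.2f}s")
```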

🔍 Query Performance

1. Stats Query (log count over 24h)

Purpose: Count the total log lines from app="servicefoundry-server" over the last 24 h

  • Loki: sum(count_over_time({app="servicefoundry-server"}[24h])) → 2.5 s
  • VictoriaLogs: {app="servicefoundry-server"} | stats count() → 1.5 s

2. Needle-in-Haystack Query (7 days, ~500 GB)

Purpose: Search for a unique static log line in the truefoundry namespace

  • Loki: {namespace="truefoundry", app!="grafana"} |= "[UNIQUE-STATIC-LOG] ID=abc123 XYZ" → 12 s
  • VictoriaLogs: {namespace="truefoundry", app!="grafana"} "[UNIQUE-STATIC-LOG] ID=abc123 XYZ" → ~900 ms

3. App Restart Log Match (7 days) – an extra query to cross-check query 2

Purpose: Search for the known restart pattern ":3000" in a small subset of logs (targeting a single shard)
Result sets were verified to be identical.

  • Loki: {app="servicefoundry-server"} |= ":3000" → ~2.2 s
  • VictoriaLogs: {app="servicefoundry-server"} ":3000" → ~2.2 s

4. Non-Existence Log Match (7 days)

Purpose: Search for a log line that does not exist, forcing a full scan of the data
Result sets were verified to be identical.

Against the full 500 GB data set, Loki struggled: its resources were saturated and the query never returned, so we also re-ran the comparison on a 300 GB subset.

  • Loki (500 GB): {namespace="truefoundry"} |= "non-existent log line" → Timeout
  • VictoriaLogs (500 GB): {namespace="truefoundry"} "non-existent log line" → 2.2 s
  • Loki (300 GB): {namespace="truefoundry"} |= "non-existent log line" → 2.6 s
  • VictoriaLogs (300 GB): {namespace="truefoundry"} "non-existent log line" → 266 ms

Loki vs VictoriaLogs: Results at a Glance

Our evaluation focused on three dimensions that matter day‑to‑day for platform engineers:

  1. How fast can we get answers?
  2. How many resources does that speed cost?
  3. How stable is the experience under real load?

Query Performance

  • Stats (24 h): Loki 2.5 s vs VictoriaLogs 1.5 s → 40 % faster
  • Needle (500 GB): Loki 12 s vs VictoriaLogs ~1 s → 12× faster
  • Pattern ":3000" (7 d): Loki 2.2 s vs VictoriaLogs 2.2 s → same result
  • Negative (7 d, 300 GB): Loki 2.6 s vs VictoriaLogs 266 ms → 10× faster

Why the gap? VictoriaLogs maintains a per‑token index, so even regex‑like scans are index‑assisted. Loki, by contrast, filters line‑by‑line after a label query, which devolves to a brute‑force scan when the label set is broad.
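
To make the difference concrete, here is a toy sketch in Python (deliberately simplified, and not how either system is actually implemented) contrasting brute-force line filtering with a per-token inverted index:

```python
# Toy illustration only – not VictoriaLogs' or Loki's actual data structures.
from collections import defaultdict

# Pretend corpus: mostly repetitive lines with one rare "needle".
logs = [
    "GET /healthz 200",
    "[UNIQUE-STATIC-LOG] ID=abc123 XYZ",
    "POST /api/v1/deploy 500",
] * 100_000

# Brute force: touch every line, which is what a broad label match degrades to.
brute_hits = [i for i, line in enumerate(logs) if "ID=abc123" in line]

# Index-assisted: build a token -> line-ids map once at ingestion time,
# then answer the query with a direct lookup instead of a full scan.
index = defaultdict(list)
for i, line in enumerate(logs):
    for token in line.split():
        index[token].append(i)

indexed_hits = index["ID=abc123"]
assert brute_hits == indexed_hits
```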

🌪️ Ingestion Performance

We also stress-tested ingestion with 120 replicas of our flog generator.
The results were eye-opening:

  • Peak ingestion: Loki 20 MB/s vs VictoriaLogs 66 MB/s → 3× higher throughput
  • vCPU usage: Loki 4 vCPUs (100 % throttled) vs VictoriaLogs 2 vCPUs peak → ≥50 % reduction
  • Memory usage: Loki ≈4 GiB vs VictoriaLogs ≈1.3 GiB → ~3× lower

Loki:

Loki spikes to 3–4 vCPU, nears its 8 GiB limit and shows throttling under the same workload.

VictoriaLogs:

VictoriaLogs stays comfortably below its 4 vCPU / 8 GiB limits even during ingestion bursts.

👉 Key takeaway: VictoriaLogs delivered 3× higher ingestion throughput while consuming 72 % less CPU and 87 % less memory than Loki.

2 · Resource Footprint (7‑day retention)

Loki

Memory used: consistently 6–7 GiB
CPU peak: 3 vCPU

VictoriaLogs

Memory used: 800–900 MB
CPU peak: 1.1 vCPU

  • Storage: Loki 501 GiB vs VictoriaLogs 318 GiB → 37 % less
  • Memory: Loki 6–7 GiB steady vs VictoriaLogs 0.6–2 GiB → 33–80 % less
  • CPU peak: Loki 4 vCPU (throttled) vs VictoriaLogs 1.1 vCPU → 73 % less

3 · Real‑World Load (Locust, 2‑minute run @ 10 users, ramp‑up rate of 2 users/s)

Queries were similar to the suite above, with randomized limits and time ranges so that results were not served from warm caches.
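
A condensed Locust sketch of that run is below. The /select/logsql/query path matches the benchmark setup, while the host, the example query mix and the _time/limit randomization are illustrative assumptions; the Loki run used its /loki/api/v1/query_range endpoint analogously.

```python
import random

from locust import HttpUser, task, between

class LogQueryUser(HttpUser):
    # Roughly matches the benchmark run: 10 users spawned at 2/s for 2 minutes,
    # e.g. `locust --headless -u 10 -r 2 -t 2m --host http://victorialogs:9428`.
    wait_time = between(0.1, 0.5)

    @task
    def random_search(self):
        # Randomize the time window and result limit so repeated queries
        # keep missing warm caches.
        window = random.choice(["1h", "6h", "24h", "7d"])
        limit = random.choice([100, 500, 1000])
        query = f'_time:{window} {{namespace="truefoundry"}} "error" | limit {limit}'
        self.client.get(
            "/select/logsql/query",
            params={"query": query},
            name="/select/logsql/query",  # group all variants under one stat
        )
```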

VictoriaLogs

Loki

📌 Despite handling 36 % higher RPS, VictoriaLogs showed lower p95 and tail latencies, with a 3.6× faster p99, proving its indexing model holds up under pressure.
This test reinforced our decision: VictoriaLogs isn't just faster in theory; it scales better under stress in production-like workloads.

TL;DR of the Numbers

  • 70–94 % faster across common search patterns.
  • ≈40 % smaller on disk with the same retention window.
  • Half the compute, freeing a full vCPU and ~1–2 GiB RAM on our smallest nodes.

Bottom line: For a log‑heavy, search‑centric use case, VictoriaLogs lets us answer questions in seconds instead of minutes while cutting infra costs.

Key Findings

  1. Full‑text index matters – VictoriaLogs’ per‑token index eliminates brute‑force line filtering.
  2. Storage layout – columnar + LSM vastly reduces on‑disk size and disk seeks.
  3. Memory efficiency – we freed ~2 GiB RAM per node, allowing denser scheduling.
  4. Operational simplicity – both are single‑binary, but VictoriaLogs needed zero custom tuning to hit these numbers.

Conclusion

For workload profiles heavy on ad‑hoc text search, VictoriaLogs provided order‑of‑magnitude faster queries and material cost savings. Loki remains an excellent choice when tight Grafana integration and label‑first queries dominate, but VictoriaLogs is now our default for high‑ingestion, developer‑centric clusters.
