
If, after news like this, you are struggling with observability solutions that require object storage for production setups (e.g. Thanos, Loki, Mimir, Tempo), then try alternatives without this requirement, such as VictoriaMetrics, VictoriaLogs and VictoriaTraces. They scale to petabytes of data on regular block storage, and they provide higher performance and availability than systems that depend on manually managed object storage such as MinIO.

Why does a database for logs need a non-trivial dependency on object storage in the first place?

Object storage has advantages over regular block storage if it is managed by a cloud provider and has a proven track record of durability, availability and "infinite" storage space at low cost, such as S3 at Amazon or GCS at Google.

Object storage has zero advantages over regular block storage if you run it yourself:

- It doesn't provide "infinite" storage space - you need to regularly monitor and manually add new physical storage to the object storage.

- It doesn't provide high durability and availability. It has lower availability than regular locally attached block storage because of the complicated coordination of the object storage state between storage nodes over the network. It usually has lower durability than cloud-provided object storage. If some data is corrupted or lost on the underlying hardware, there is little chance a DIY object storage system will recover it properly and automatically.

- It is more expensive because of higher overhead (and, probably, half-baked replication) compared to locally attached block storage.

- It is slower than locally attached block storage because of much higher network latency. The latency difference is roughly 1000x: ~100ms for object storage vs ~0.1ms for local block storage.

- It is much harder to configure, operate and troubleshoot than block storage.

So I'd recommend taking a look at other databases for logs that do not require object storage for large-scale production setups. For example, VictoriaLogs. It scales to hundreds of terabytes of logs on a single node, and it scales to petabytes of logs in cluster mode. Both modes are open source and free to use.

Disclaimer: I'm the core developer of VictoriaLogs.


> Object storage has zero advantages over regular block storage if you run it yourself

Worth adding: this depends on what's using your block storage / object storage. For Loki specifically, there are known edge cases with large object counts on block storage (this isn't related to object size or disk space) - it isn't something I've encountered and probably never will be, but they are documented.

For an application I had written myself, I can see clearly that block storage is going to trump object storage for all self-hosted use cases, but for third-party software I'm merely administering, I have less control over its quirks, and those pros vs. cons are much less clear cut.


Initially I was just following recommendations blindly - I've never run Loki off-cloud before, so my typical approach to learning a system is to start with defaults and tweak/add/remove components as I learn it. Grafana's docs use object storage everywhere, so it's a lot easier when you're aligned with them: you can rely more heavily on config parity.

While I try to avoid complexity, idiomatic approaches have their advantages; it's always a trade-off.

That said, my first instinct when I saw minio's status was to use file storage, but the rustfs setup has been pretty painless so far. I might still remove it, we'll see.


You can expose Unikernel application metrics in Prometheus text exposition format at a `/metrics` HTTP endpoint and collect them with Prometheus or any other collector that can scrape Prometheus-compatible targets. Alternatively, you can push metrics from the Unikernel to a centralized metrics database for further investigation. Both pull-based and push-based metrics collection are supported by popular client libraries for metrics such as https://github.com/VictoriaMetrics/metrics .
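For example, here is a minimal Go sketch using that library to expose a counter at `/metrics` (the metric name and port are illustrative):

```go
package main

import (
	"net/http"

	"github.com/VictoriaMetrics/metrics"
)

// requestsTotal is an illustrative counter incremented on every handled request.
var requestsTotal = metrics.NewCounter("app_requests_total")

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.Inc()
		w.Write([]byte("hello"))
	})

	// Pull-based collection: expose all registered metrics
	// in Prometheus text exposition format at /metrics.
	http.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		metrics.WritePrometheus(w, true)
	})

	http.ListenAndServe(":8080", nil)
}
```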

You can emit logs from the Unikernel app and send them to a centralized database for logs via the syslog protocol (or any other protocol) for further analysis. See, for example, how to set up collecting logs via the syslog protocol with VictoriaLogs - https://docs.victoriametrics.com/victorialogs/data-ingestion...
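As a rough sketch, sending an RFC 5424 syslog line over TCP from Go looks like this (the listener address is an assumption - point it at wherever your log database accepts syslog):

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	// Hypothetical address of a syslog TCP listener.
	conn, err := net.Dial("tcp", "logs.example.com:514")
	if err != nil {
		log.Fatalf("cannot connect to syslog endpoint: %s", err)
	}
	defer conn.Close()

	// RFC 5424 layout: <priority>version timestamp hostname app-name procid msgid structured-data msg
	// Priority 14 = facility "user" (1) * 8 + severity "info" (6).
	fmt.Fprintf(conn, "<14>1 %s my-unikernel my-app - - - %s\n",
		time.Now().Format(time.RFC3339), "application started")
}
```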

You can expose various debug endpoints via HTTP in the Unikernel application for debugging assistance. For example, if the application is written in Go, it is recommended to expose endpoints for collecting CPU, memory and goroutine profiles from the running application.
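In Go this is mostly a matter of importing net/http/pprof, which registers the standard profiling endpoints on the default mux (the port below is arbitrary):

```go
package main

import (
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// CPU profile:    go tool pprof http://localhost:6060/debug/pprof/profile
	// Heap profile:   go tool pprof http://localhost:6060/debug/pprof/heap
	// Goroutine dump: curl http://localhost:6060/debug/pprof/goroutine?debug=2
	http.ListenAndServe(":6060", nil)
}
```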


There is no need for an operating system to run Unikernels. Every Unikernel includes the parts of an operating system needed for interacting with the underlying hardware. So Unikernels can run on bare metal if they know how to interact with the underlying hardware (i.e. if they have drivers for that hardware). Usually Unikernels are targeted at virtual machines, because virtual machines have unified virtualised hardware. This allows running the same Unikernel on virtual machines across multiple cloud providers, since they have similar virtual hardware.


Coroot is the future of observability


Could you say more about your experience and what you like about it?


It monitors all the applications in your network and automatically detects the most common issues with them. It also collects metrics, traces, logs and CPU profiles for the monitored applications, so you can quickly investigate the root cause of various issues if needed.

I like that Coroot works out of the box without the need for complicated configuration.


The number of active users at StackOverflow started dropping in the middle of 2020, i.e. a long time before the ChatGPT release at the end of 2022.

https://data.stackexchange.com/stackoverflow/revision/192836...


Nice article about the usefulness of wide events! It's a pity it doesn't name open-source solutions optimized for wide events, such as VictoriaLogs.


Try open-source databases specially designed for traces, such as Grafana Tempo or VictoriaTraces. They can handle ingestion rates of hundreds of thousands of trace spans per second on a regular laptop.


What is the purpose of MinIO, SeaweedFS and similar object storage systems? They lack the durability guarantees provided by S3 and GCS. They lack the "infinite" storage promise of S3 and GCS. They lack the "infinite" bandwidth of S3 and GCS. And they are more expensive than other storage options, unlike S3 and GCS.


We use it because we are already running our own k8s clusters in our datacenters, we have large storage requirements for tools that have native S3 integration, and running our own minio clusters in the same datacenter as the tools that generate and consume that data is a lot faster and cheaper than using S3.

For example, we were running a 20 node k8s cluster for our Cortex (distributed Prometheus) install, monitoring about 30k servers around the world, and it was generating a bit over a TB of data a day. It was a lot more cost effective and performant to create a minio cluster for that data than to use S3.

Also, you can get durability with minio via multi-cluster replication.


Consider migrating to VictoriaMetrics and saving on storage and operations costs. You also won't need MinIO, since it stores data on the local filesystem (i.e. on regular persistent volumes). See real-world reports from happy users who saved costs on large-scale Prometheus-compatible monitoring - https://docs.victoriametrics.com/victoriametrics/casestudies...


I can't imagine switching at this point. We spent quite a while building up our Cortex and Minio infrastructure management, as well as our alerting and inventory management systems, and it is all very stable right now. We don't really touch it anymore, it just hums along.

We have already worked through all the pain points and it all works smoothly. No reason to change something that isn't a problem.


I haven't used it in a while, but it used to be great as a test double for s3


S3 is a widely supported API schema, so if you need something on-prem, you use these.
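For illustration, this is roughly what that looks like from Go with the minio-go client (the endpoint, credentials and bucket are placeholders); the same calls work unchanged against AWS S3 or any other S3-compatible server:

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Placeholder endpoint/credentials for a self-hosted S3-compatible server
	// (MinIO, SeaweedFS, Garage, Ceph radosgw, ...).
	client, err := minio.New("minio.internal:9000", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: false,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upload a small object; only the endpoint and credentials would change
	// when pointing the same code at AWS S3 instead.
	data := strings.NewReader("hello")
	_, err = client.PutObject(context.Background(), "my-bucket", "hello.txt",
		data, int64(data.Len()), minio.PutObjectOptions{})
	if err != nil {
		log.Fatal(err)
	}
}
```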


But what's the point of using these DIY object storage systems when they do not provide the durability and other important guarantees provided by S3 and GCS?


When you want just the API for compatibility, I guess?

Self-hosted S3 clones with actual durability guarantees exist, but the only properly engineered open source choices are Ceph + radosgw (single-region, though) or Garage (global replication based on last-writer-wins CRDT conflict resolution).


It's great for a prototype which doesn't need to store a huge amount of data; you can run it on the same VM as a node server behind Cloudflare and get a fairly reliable setup going.


Minio allows you to have an S3-like interface when you have your own servers and storage.


MinIO also allows losing your data, since it doesn't provide high durability guarantees, unlike S3 and GCS.


> we did, I think the entire journey was 7 years long, communicated many times, over at least 6 major releases. maintaining dashboards in two languages increased complexity, whilst reducing compatibility, and gave a very large security surface to be worried about. we communicated clearly, provided migration tools, put it in release notes, updated docs, repeated it at conferences and on community calls

If this migration turned out to be so painful, why did you decide to finish it (and make users unhappy) instead of cancelling it at an early stage? What are the benefits of this migration?

