We have a three-node etcd cluster. The leader (265ab714481be3b3) repeatedly fails to send a snapshot to the follower adb3a32b44be819c. The leader's log is as follows:
{"level":"warn","ts":"2026-03-03T12:31:17.794522Z","caller":"etcdserver/snapshot_merge.go:72","msg":"failed to send database snapshot to writer","size":"274 MB","error":"EOF"}
{"level":"warn","ts":"2026-03-03T12:31:17.794889Z","caller":"rafthttp/snapshot_sender.go:102","msg":"failed to send database snapshot","snapshot-index":1427265,"remote-peer-id":"adb3a32b44be819c","bytes":486880304,"size":"487 MB","error":"ioutil: short read"}
{"level":"info","ts":"2026-03-03T12:31:17.795075Z","caller":"etcdserver/server.go:2218","msg":"sent merged snapshot","from":"265ab714481be3b3","to":"adb3a32b44be819c","bytes":486880304,"size":"487 MB","took":"1.66390738s"}
{"level":"info","ts":"2026-03-03T12:31:18.130443Z","caller":"etcdserver/server.go:2200","msg":"sending merged snapshot","from":"265ab714481be3b3","to":"adb3a32b44be819c","bytes":486880304,"size":"487 MB"}
{"level":"info","ts":"2026-03-03T12:31:18.130728Z","caller":"rafthttp/snapshot_sender.go:84","msg":"sending database snapshot","snapshot-index":1427265,"remote-peer-id":"adb3a32b44be819c","bytes":486880304,"size":"487 MB"}
{"level":"warn","ts":"2026-03-03T12:31:19.871649Z","caller":"etcdserver/snapshot_merge.go:72","msg":"failed to send database snapshot to writer","size":"274 MB","error":"EOF"}
{"level":"warn","ts":"2026-03-03T12:31:19.872006Z","caller":"rafthttp/snapshot_sender.go:102","msg":"failed to send database snapshot","snapshot-index":1427265,"remote-peer-id":"adb3a32b44be819c","bytes":486880304,"size":"487 MB","error":"ioutil: short read"}
{"level":"info","ts":"2026-03-03T12:31:19.872133Z","caller":"etcdserver/server.go:2218","msg":"sent merged snapshot","from":"265ab714481be3b3","to":"adb3a32b44be819c","bytes":486880304,"size":"487 MB","took":"1.74165973s"}
{"level":"info","ts":"2026-03-03T12:31:20.129962Z","caller":"etcdserver/server.go:2200","msg":"sending merged snapshot","from":"265ab714481be3b3","to":"adb3a32b44be819c","bytes":486880304,"size":"487 MB"}
{"level":"info","ts":"2026-03-03T12:31:20.130151Z","caller":"rafthttp/snapshot_sender.go:84","msg":"sending database snapshot","snapshot-index":1427265,"remote-peer-id":"adb3a32b44be819c","bytes":486880304,"size":"487 MB"}
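The retry loop in the log above is easier to follow when only the warnings are extracted. The following is a minimal sketch, not part of etcd: the function name `summarize_snapshot_failures` is hypothetical, and the two sample lines are copied from the log above. It relies only on the JSON field names that appear in those lines (`level`, `ts`, `caller`, `msg`, `size`, `error`).

```python
import json


def summarize_snapshot_failures(lines):
    """Collect (ts, caller, size, error) for every snapshot-related warning.

    Hypothetical helper for reading etcd JSON logs; not an etcd API.
    """
    failures = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON log line
        if entry.get("level") == "warn" and "snapshot" in entry.get("msg", ""):
            failures.append(
                (entry.get("ts"), entry.get("caller"), entry.get("size"), entry.get("error"))
            )
    return failures


# Two lines copied from the log above.
sample = [
    '{"level":"warn","ts":"2026-03-03T12:31:17.794522Z","caller":"etcdserver/snapshot_merge.go:72","msg":"failed to send database snapshot to writer","size":"274 MB","error":"EOF"}',
    '{"level":"warn","ts":"2026-03-03T12:31:17.794889Z","caller":"rafthttp/snapshot_sender.go:102","msg":"failed to send database snapshot","snapshot-index":1427265,"remote-peer-id":"adb3a32b44be819c","bytes":486880304,"size":"487 MB","error":"ioutil: short read"}',
]

for failure in summarize_snapshot_failures(sample):
    print(failure)
```

Run against the full log, this shows the same pair of warnings on each attempt: one reporting 274 MB with `EOF`, one reporting 487 MB with `ioutil: short read`.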
At the same time, the etcd database file on disk is only 274079744 bytes (about 274 MB), much smaller than the 487 MB reported in the log:
-rw-r----- 1 3001 2000 9235 Mar 3 14:50 0000000000000012-0000000000154a8e.snap
-rw-r----- 1 3001 2000 9235 Mar 3 14:53 0000000000000012-0000000000155e1b.snap
-rw-r----- 1 3001 2000 9235 Mar 3 15:08 0000000000000014-0000000000158eb8.snap
-rw-r----- 1 3001 2000 9234 Mar 3 15:12 0000000000000015-000000000015a241.snap
-rw-r----- 1 3001 2000 9235 Mar 3 15:15 0000000000000015-000000000015b5d3.snap
-rw------- 1 3001 2000 274079744 Mar 3 15:08 db
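For anyone reading the listing above, a short sketch that decodes the `.snap` file names and puts the two sizes side by side. It assumes etcd's usual naming of raft snapshot files as `<term>-<index>.snap` with both fields in hexadecimal. The helper names `parse_snap_name` and `to_mb` are hypothetical, and the byte counts are copied from the listing and the log.

```python
def parse_snap_name(name):
    """Split '<term>-<index>.snap' into (term, index), both parsed as hex."""
    stem = name.rsplit(".", 1)[0]
    term_hex, index_hex = stem.split("-")
    return int(term_hex, 16), int(index_hex, 16)


def to_mb(num_bytes):
    """Decimal megabytes, the unit that matches the sizes printed in the log."""
    return num_bytes / 1_000_000


DB_FILE_BYTES = 274079744    # size of the `db` file in the listing above
SNAPSHOT_BYTES = 486880304   # "bytes" field of the merged snapshot in the log

snap_files = [
    "0000000000000012-0000000000154a8e.snap",
    "0000000000000012-0000000000155e1b.snap",
    "0000000000000014-0000000000158eb8.snap",
    "0000000000000015-000000000015a241.snap",
    "0000000000000015-000000000015b5d3.snap",
]

for name in snap_files:
    term, index = parse_snap_name(name)
    print(f"{name}: term={term} index={index}")

print(f"db file on disk:      {to_mb(DB_FILE_BYTES):.1f} MB")
print(f"snapshot size in log: {to_mb(SNAPSHOT_BYTES):.1f} MB")
print(f"difference:           {SNAPSHOT_BYTES - DB_FILE_BYTES} bytes")
```

The on-disk size rounds to the same 274 MB that appears in the `failed to send database snapshot to writer` warning, while the sender announces 487 MB.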
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://111.9.0.26:2379 | 265ab714481be3b3 |  3.5.11 |  487 MB |      true |      false |        23 |    1427238 |            1427238 |        |
| https://111.9.0.27:2379 | adb3a32b44be819c |  3.5.11 |  414 MB |     false |      false |        20 |    1412792 |            1412792 |        |
| https://111.9.0.28:2379 | 7cec064e78275691 |  3.5.11 |  476 MB |     false |      false |        23 |    1427238 |            1427238 |        |
+-------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
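In the status output above, the member adb3a32b44be819c (the target of the failing snapshot transfer in the log) is behind the leader in both raft term and raft index. A minimal sketch that computes the gap from the table values follows. The `members` list is transcribed by hand from the table, and `lag_behind_leader` is a hypothetical helper, not an etcdctl feature.

```python
# Values transcribed from the endpoint status table above.
members = [
    {"id": "265ab714481be3b3", "is_leader": True, "raft_term": 23, "raft_index": 1427238},
    {"id": "adb3a32b44be819c", "is_leader": False, "raft_term": 20, "raft_index": 1412792},
    {"id": "7cec064e78275691", "is_leader": False, "raft_term": 23, "raft_index": 1427238},
]


def lag_behind_leader(rows):
    """Return per-follower term and index distance from the leader."""
    leader = next(row for row in rows if row["is_leader"])
    return {
        row["id"]: {
            "term_lag": leader["raft_term"] - row["raft_term"],
            "index_lag": leader["raft_index"] - row["raft_index"],
        }
        for row in rows
        if not row["is_leader"]
    }


for member_id, lag in lag_behind_leader(members).items():
    print(member_id, lag)
```

Only adb3a32b44be819c shows a non-zero lag; the other follower is level with the leader.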
What did you expect to happen?
The leader sends the snapshot successfully, and the cluster runs normally.
How can we reproduce it (as minimally and precisely as possible)?
N/A
Anything else we need to know?
No response
Etcd version (please run commands below)
3.5.11 (all three members, per the endpoint status table above)