Append is not cancel safe #52

If `append_record` is not polled to completion, for instance because it is wrapped in a tokio timeout or because the task running it is cancelled,
we can end up in a corrupted state.
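To make the mechanism concrete, here is a minimal std-only sketch of what "not polled to its end" means. It is not the mrecordlog code: `append`, `YieldOnce` and `NoopWaker` are hypothetical names, and the await point stands in for whatever async operation the real function awaits. The future is polled once and then dropped, which is roughly what a timeout does when it fires.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns `Pending` on the first poll and `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Hypothetical append: records each step it completes.
async fn append(steps: Rc<RefCell<Vec<&'static str>>>) {
    steps.borrow_mut().push("disk");
    // Hypothetical await point (assumption: some async operation sits here).
    YieldOnce(false).await;
    steps.borrow_mut().push("ram");
}

/// Polls the future once, then drops it, as a cancelled task would.
fn poll_once_then_drop() -> Vec<&'static str> {
    let steps = Rc::new(RefCell::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut fut = Box::pin(append(steps.clone()));
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    drop(fut); // cancellation: the code after the await never runs

    let completed = steps.borrow().clone();
    completed
}
```

In this sketch only the "disk" step is recorded: dropping the future silently skips everything after the await point.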


The logic of `append_record` is roughly as follows:
```
- get next record position by looking at the last record in RAM.
- write on disk
- (C) write on RAM
```

If the task is stopped in the middle of (C), what is on disk no longer matches what is in RAM.
In particular, the next append will use a record position that might already be on disk.
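The position reuse can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the actual implementation: `ToyLog` and its fields are hypothetical, "disk" and "RAM" are plain vectors of positions, and cancellation is simulated by a flag that skips the RAM update entirely.

```rust
/// Hypothetical stand-in for the log: both vectors hold record positions.
#[derive(Default)]
struct ToyLog {
    disk: Vec<u64>,
    ram: Vec<u64>,
}

impl ToyLog {
    /// The next position is derived from the last record in RAM only.
    fn next_position(&self) -> u64 {
        self.ram.last().map_or(0, |last| last + 1)
    }

    /// Mimics the three steps of the append. `cancelled` simulates the
    /// future being dropped after the disk write but before the RAM write.
    fn append(&mut self, cancelled: bool) -> u64 {
        let position = self.next_position();
        self.disk.push(position);
        if cancelled {
            return position;
        }
        self.ram.push(position);
        position
    }
}

/// Runs one normal append, one cancelled append, then another normal one,
/// and returns the positions that ended up on "disk".
fn demo_duplicate_position() -> Vec<u64> {
    let mut log = ToyLog::default();
    log.append(false); // position 0 on disk and in RAM
    log.append(true); // position 1 on disk only
    log.append(false); // RAM still ends at 0, so position 1 is written again
    log.disk
}
```

In this model the disk ends up with position 1 written twice, which is the kind of inconsistency a reload would flag.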

When we reload the mrecordlog from disk, this is identified as corruption.
This has been observed in production.

A second case, also observed, is a straight panic.
Here the cancellation is assumed to have happened after we appended the record metas
and before we populated the `concatenated_records` rolling buffer.

```
self.record_metas.push(record_meta);
self.concatenated_records.extend(payload);
```

The panic reported is:

```
2024-02-28T23:41:54Z app[7816406b969758] iad [info]thread 'tokio-runtime-worker' panicked at /usr/local/cargo/git/checkouts/mrecordlog-34aad39ce3e0e659/bc6a998/src/mem/queue.rs:87:46:
2024-02-28T23:41:54Z app[7816406b969758] iad [info]slice index starts at 928 but ends at 0
```
