Duplicated records when row is updated between transactions #381

@leticiarina

What happens?

We started seeing duplicated rows when extracting data from PostgreSQL and noticed that it happens when a row is created or updated while the different transactions are starting, which seems similar to what was reported in #163.

After seeing this change, I checked our logs with the flag pg_debug_show_queries=true and confirmed that all connections use BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY, as expected.

However, I noticed that SET TRANSACTION SNAPSHOT 'some-snapshot-id' is never called, and neither is the query expected to run before it: SELECT pg_is_in_recovery(), pg_export_snapshot(), (select count(*) from pg_stat_wal_receiver).

This query works when executed directly in our database and provides the following result:

SELECT pg_is_in_recovery(), pg_export_snapshot(), (select count(*) from pg_stat_wal_receiver);
 pg_is_in_recovery | pg_export_snapshot  | count 
-------------------+---------------------+-------
 f                 | 0000102C-00400680-1 |     0
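The log check described above could be automated along these lines. This is only a sketch: it assumes the statements printed by pg_debug_show_queries have already been collected into a list of SQL strings (the capture step and the exact log format are not covered here), and the helper names `summarize_queries` and `shares_snapshot` are hypothetical.

```python
import re

# Patterns for the three statements of interest in the query log.
BEGIN_RE = re.compile(r"\bBEGIN\b", re.IGNORECASE)
EXPORT_RE = re.compile(r"pg_export_snapshot\s*\(", re.IGNORECASE)
IMPORT_RE = re.compile(r"SET\s+TRANSACTION\s+SNAPSHOT\s+'([^']+)'", re.IGNORECASE)


def summarize_queries(queries):
    """Count transaction starts, snapshot exports and snapshot imports."""
    summary = {"begin": 0, "export": 0, "import": 0, "snapshot_ids": set()}
    for query in queries:
        if BEGIN_RE.search(query):
            summary["begin"] += 1
        if EXPORT_RE.search(query):
            summary["export"] += 1
        match = IMPORT_RE.search(query)
        if match:
            summary["import"] += 1
            summary["snapshot_ids"].add(match.group(1))
    return summary


def shares_snapshot(queries):
    """Return True if the log looks like one exporter plus N-1 importers.

    Assumption: with N transactions, one connection exports a snapshot and
    every other connection imports that same snapshot id.
    """
    s = summarize_queries(queries)
    return (
        s["export"] >= 1
        and s["import"] >= s["begin"] - 1
        and len(s["snapshot_ids"]) <= 1
    )
```

For a log like ours, which contains four BEGIN statements and no snapshot statements, `shares_snapshot` returns False.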

I suspect that, for some reason, the code is not reaching the part where it calls pg_export_snapshot, and that this affects the results of the read even though the isolation level is correct. Our workflow uses pg_connection_limit=4.
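To illustrate why a missing shared snapshot could produce duplicates even under REPEATABLE READ, here is a toy model. It is not the extension's code or PostgreSQL's real visibility logic: it assumes each connection scans a separate physical range of the table, that an update writes a new row version at the end of the table, and that a snapshot is a simple counter. All names (`ToyHeap`, `parallel_read`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleVersion:
    row_id: int
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that superseded it, if any


class ToyHeap:
    """Append-only table: an update keeps the old version and adds a new one."""

    def __init__(self):
        self.versions = []
        self.clock = 0

    def _next_txn(self):
        self.clock += 1
        return self.clock

    def insert(self, row_id, value):
        self.versions.append(TupleVersion(row_id, value, self._next_txn()))

    def update(self, row_id, value):
        txn = self._next_txn()
        for version in self.versions:
            if version.row_id == row_id and version.xmax is None:
                version.xmax = txn  # old version ends at this transaction
        self.versions.append(TupleVersion(row_id, value, txn))

    def snapshot(self):
        return self.clock

    def scan(self, start, stop, snapshot):
        """Return the versions in slots [start, stop) visible to the snapshot."""
        return [
            (v.row_id, v.value)
            for v in self.versions[start:stop]
            if v.xmin <= snapshot and (v.xmax is None or v.xmax > snapshot)
        ]


def parallel_read(heap, ranges, snapshots):
    """Concatenate what each worker sees in its own range with its own snapshot."""
    rows = []
    for (start, stop), snap in zip(ranges, snapshots):
        rows.extend(heap.scan(start, stop, snap))
    return rows


heap = ToyHeap()
heap.insert(1, "a")
heap.insert(2, "b")
snap_before = heap.snapshot()  # first connection starts here
heap.update(1, "a2")           # row 1 is updated in between
snap_after = heap.snapshot()   # second connection starts here

ranges = [(0, 2), (2, 3)]
mixed = parallel_read(heap, ranges, [snap_before, snap_after])
shared = parallel_read(heap, ranges, [snap_before, snap_before])
print(mixed)   # [(1, 'a'), (2, 'b'), (1, 'a2')]  row 1 appears twice
print(shared)  # [(1, 'a'), (2, 'b')]
```

In this model each worker's view is internally consistent, yet the combined result contains row 1 twice because the two workers disagree on which version is current. Giving both workers the same snapshot removes the duplicate.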

To Reproduce

This is a bit hard to reproduce, since the row has to be updated between the starts of the concurrent transactions (we see the issue a couple of times per week), but we're happy to test some code in our workflows.
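Because the issue only shows up occasionally, a check like the following could flag affected extracts while testing. It is a sketch that assumes the extracted rows are available as dictionaries and that the table has a unique key column, called `id` here for illustration; `duplicated_keys` is a hypothetical helper.

```python
from collections import Counter


def duplicated_keys(rows, key="id"):
    """Return {key_value: occurrences} for key values seen more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# Example: the same id extracted twice with different contents.
extracted = [
    {"id": 1, "value": "a"},
    {"id": 2, "value": "b"},
    {"id": 1, "value": "a2"},
]
print(duplicated_keys(extracted))  # {1: 2}
```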

OS:

Linux

PostgreSQL Version:

17.5

DuckDB Version:

1.4.1

DuckDB Client:

Python

Full Name:

Rina Sakurai

Affiliation:

Mentimeter

Have you tried this on the latest main branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
