Memory Leak in PostgreSQL Extension While Exporting Tables to Parquet #274

@vineetver

Description

What happens?

Memory usage keeps increasing for the whole duration of the export.
It seems that the results of the PostgreSQL queries are not being released or cleaned up properly.

To Reproduce

I am using DuckDB to export tables from my local PostgreSQL database to Parquet files. However, memory usage increases significantly during the export, which suggests a memory leak. Below is the code I am using:

import duckdb

con = duckdb.connect(database='my_database.duckdb')

con.install_extension("postgres_scanner")
con.load_extension("postgres_scanner")
con.sql("SET memory_limit = '20GB';")
con.sql("SET threads TO 3;")
con.sql("SET enable_progress_bar = true;")
con.sql("""
    ATTACH 'dbname=** user=** host=127.0.0.1 password=**' AS db (TYPE POSTGRES, READ_ONLY);
""")

all_tables = con.sql("SHOW ALL tables;").fetchdf()
tables = all_tables['name'].to_list()

for table in tables:
    con.execute(f"COPY db.public.{table} TO '{table}.parquet' (FORMAT PARQUET);")
    print(f"Table {table} copied to {table}.parquet")

con.close()
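
To show how much memory grows per table, the loop above could be wrapped in a small logger. The following is only a sketch and not part of the original reproduction: the helper names `rss_mib` and `run_with_memory_log` are hypothetical, and it assumes a Linux host where `/proc/self/statm` is readable (with a fallback to the peak value from `resource.getrusage` elsewhere).

```python
import os
import resource
import sys


def rss_mib():
    """Return this process's resident set size in MiB.

    Assumption: on Linux the second field of /proc/self/statm is the
    number of resident pages. Where /proc is missing, fall back to the
    peak RSS (KiB on Linux, bytes on macOS), which never decreases.
    """
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20
    except FileNotFoundError:
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        return peak / (2**20 if sys.platform == "darwin" else 2**10)


def run_with_memory_log(names, export_one):
    """Call export_one(name) for every name and record RSS after each call."""
    log = []
    for name in names:
        export_one(name)
        log.append((name, rss_mib()))
    return log


# Hypothetical usage with the connection and table list from the script above:
# log = run_with_memory_log(
#     tables,
#     lambda t: con.execute(f"COPY db.public.{t} TO '{t}.parquet' (FORMAT PARQUET);"),
# )
# for name, mib in log:
#     print(f"{name}: {mib:.0f} MiB resident")
```

If the logged value rises after every table and never drops back, that would be consistent with memory not being released between exports; a value that plateaus would point more towards caching up to `memory_limit`.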

OS:

Ubuntu, x86_64

DuckDB Version:

1.1.3

DuckDB Client:

Python

Hardware:

VM: 32 GB RAM, 8 Core

Full Name:

Vineet Verma

Affiliation:

Harvard University

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

No - I cannot easily share my data sets due to their large size

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
