Skip to content

After COPY DATABASE FROM <pg>, the types from pg are improperly formatted in EXPORT commands #429

@NickCrews

Description

@NickCrews

What happens?

I have some custom enum and composite types in postgres. When I copy that entire database over to a duckdb instance, some things get garbled. First, let me show what I'm doind.

Repro using the @duckdb/node-api and @electric-sql/pglite-socket nodejs packages. mkdir repro && cd repro && pnpm init && pnpm add @duckdb/node-api==1.5.1-r.1 @electric-sql/pglite==0.4.3 @electric-sql/pglite-socket==0.1.3 and then node repro.js for the following js:

import { rm, mkdir } from 'node:fs/promises';
import duckdb from '@duckdb/node-api';
import { PGlite } from '@electric-sql/pglite';
import { PGLiteSocketServer } from '@electric-sql/pglite-socket';
import { join } from 'node:path';


async function create(repro_dir) {
    await rm(repro_dir, { recursive: true, force: true });
    await mkdir(repro_dir, { recursive: true });
    const path = join(repro_dir, 'dst.duckdb');

    const pgdb = await PGlite.create();
    pgdb.exec(`
            CREATE TYPE my_status_enum AS ENUM ('active', 'inactive');
            CREATE TYPE my_address AS (
                street TEXT,
                city TEXT,
                zip TEXT
            );
            CREATE TABLE t (id INTEGER, status my_status_enum, address my_address);
            INSERT INTO t VALUES (1, 'active', ROW('123 Main St', 'Anytown', '12345')), (2, 'inactive', ROW('456 Elm St', 'Othertown', '67890')), (3, NULL, NULL);
        `
    );
    const server = new PGLiteSocketServer({db: pgdb, host: 'localhost', port: 5432});
    await server.start();

    try {
        const dst = await duckdb.DuckDBInstance.create(path);
        const dstConn = await dst.connect();

        console.log('Attaching PGlite Postgres catalog and running COPY FROM DATABASE...');
        await dstConn.run('INSTALL postgres;');
        await dstConn.run('LOAD postgres;');
        await dstConn.run(
            'ATTACH \'host=localhost port=5432 dbname=postgres user=postgres sslmode=disable\' AS pg (TYPE postgres);'
        );

        const catalogReader = await dstConn.runAndReadAll('SELECT current_database();');
        const targetCatalog = String(catalogReader.getRows()[0][0]).replace(/"/g, '""');
        await dstConn.run(`COPY FROM DATABASE pg TO "${targetCatalog}";`);
        await dstConn.run(`EXPORT DATABASE '${join(repro_dir, 'postcreate')}' (FORMAT CSV);`);
        const results = await dstConn.runAndReadAll('SELECT * FROM duckdb_types WHERE type_name ILIKE \'my_%\';');
        console.log('Data in duckdb_types table after COPY FROM DATABASE:');
        console.table(results.getRows());
    } finally {
        await pgdb.close();
        await server.stop();
    }
};

async function reconnect(repro_dir){
    const path = join(repro_dir, 'dst.duckdb');
    console.log('Attempting to reopen destination database...');
    try {
        const instance = await duckdb.DuckDBInstance.create(path);
        const con = await instance.connect();
        await con.run(`EXPORT DATABASE '${join(repro_dir, 'after_reopen')}' (FORMAT CSV);`);
        const results = await con.runAndReadAll('SELECT * FROM duckdb_types WHERE type_name ILIKE \'my_%\';');
        console.log('Data in duckdb_types table after reopening database:');
        console.table(results.getRows());
        console.log('OK reopen succeeded');
    } catch (error) {
        console.log(`FAIL reopen failed: ${error instanceof Error ? error.message : String(error)}`);
    }
}

async function main() {
    const cmd = process.argv[2];
    const dir = process.argv[3] || 'repro';

    if (cmd === 'create') {
        await create(dir);
    } else if (cmd === 'reopen') {
        await reconnect(dir);
    } else {
        console.error(`Unknown command: ${cmd}`);
        process.exit(1);
    }
}

main();

After this runs, if I look at the repro/postcreate/schema.sql file, which was created from the EXPORT command, then I get this:

CREATE SCHEMA public;
CREATE TYPE public.my_address AS my_address
CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' );
CREATE TABLE public.t(id INTEGER, status ENUM('active', 'inactive'), address STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR));

Note how:

  • CREATE TYPE public.my_address AS my_address isn't legal. If you actually tried to IMPORT from this directory you would get errors.
  • CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' ); is legal and worked.
  • The CREATE TABLE statement decided to redefine the column types with anonymous types, instead of using the stored types.

What I would expect:

CREATE SCHEMA public;
CREATE TYPE public.my_address AS STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR)
CREATE TYPE public.my_status_enum AS ENUM ( 'active', 'inactive' );
CREATE TABLE public.t(id INTEGER, status public.my_status_enum, address public.my_address);

Or, if that's not possible, then it would be less ideal, but workable if this just used the anonymous types and didn't generate illegal CREATE TYPE statements:

CREATE SCHEMA public;
CREATE TABLE public.t(id INTEGER, status ENUM('active', 'inactive'), address STRUCT(street VARCHAR, city VARCHAR, zip VARCHAR));

I have not tested on main for any of the packages.

To Reproduce

see above.

OS:

macOS

PostgreSQL Version:

17

DuckDB Version:

1.5.1

DuckDB Client:

nodejs

Full Name:

Nick Crews

Affiliation:

Ship Creek Group

Have you tried this on the latest main branch?

  • I have NOT done this.

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions