
Add interconnect regression tests#1

Closed
MutableFire wants to merge 43 commits into main from gp_interconnect_stats

Conversation

@MutableFire

Owner

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@github-actions github-actions Bot left a comment


Hi, @MutableFire welcome!🎊 Thanks for taking the effort to make our project better! 🙌 Keep making such awesome contributions!

nathan-bossart and others added 29 commits June 3, 2026 16:38
This omission allowed table owners to create statistics in any
schema, potentially leading to unexpected naming conflicts.  For
ALTER TABLE commands that require re-creating statistics objects,
skip this check in case the user has since lost CREATE on the
schema.  The addition of a second parameter to CreateStatistics()
breaks ABI compatibility, but we are unaware of any impacted
third-party code.

Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12817
Backpatch-through: 13
Several functions could overflow their size calculations, when presented
with very large inputs from remote and/or untrusted locations, and then
allocate buffers that were too small to hold the intended contents.

Switch from int to size_t where appropriate, and check for overflow
conditions when the inputs could have plausibly originated outside of
the libpq trust boundary. (Overflows from within the trust boundary are
still possible, but these will be fixed separately.) A version of
add_size() is ported from the backend to assist with code that performs
more complicated concatenation.
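
As a rough illustration of the overflow checks described above, here is a minimal sketch. The helper names `checked_add_size` and `concat_buffer_size` are hypothetical and are not libpq's actual `add_size()`. They only show the general pattern of detecting a `size_t` wraparound before allocating.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not libpq's actual add_size()): add two sizes and
 * report overflow instead of silently wrapping around.
 */
bool
checked_add_size(size_t a, size_t b, size_t *result)
{
    if (a > SIZE_MAX - b)
        return false;           /* a + b would exceed SIZE_MAX */
    *result = a + b;
    return true;
}

/*
 * Compute the buffer size for two concatenated strings plus a terminator,
 * failing cleanly when the total does not fit in size_t.
 */
bool
concat_buffer_size(size_t len1, size_t len2, size_t *total)
{
    size_t      tmp;

    return checked_add_size(len1, len2, &tmp) &&
        checked_add_size(tmp, 1, total);
}
```

A caller would allocate only when `concat_buffer_size()` returns true, and treat false as an out-of-memory style error.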

Reported-by: Aleksey Solovev (Positive Technologies)
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12818
Backpatch-through: 13
pgp_pub_decrypt_bytea() was missing a safeguard for the session key
length read from the message data, that can be given in input of
pgp_pub_decrypt_bytea().  This can result in the possibility of a buffer
overflow for the session key data, when the length specified is longer
than PGP_MAX_KEY, which is the maximum size of the buffer where the
session data is copied to.
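
The kind of safeguard that was missing can be sketched as follows. `DemoSession`, `DEMO_MAX_KEY`, and `demo_store_session_key` are hypothetical stand-ins; the real pgcrypto structures and the value of PGP_MAX_KEY may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pgcrypto's PGP_MAX_KEY; the real value may differ. */
#define DEMO_MAX_KEY 32

typedef struct DemoSession
{
    unsigned char sess_key[DEMO_MAX_KEY];
    size_t      sess_key_len;
} DemoSession;

/*
 * Copy a session key whose length was read from untrusted message data.
 * Returns 0 on success, -1 if the claimed length would overflow sess_key.
 */
int
demo_store_session_key(DemoSession *s, const unsigned char *key, size_t len)
{
    if (len > DEMO_MAX_KEY)
        return -1;              /* reject before touching the buffer */
    memcpy(s->sess_key, key, len);
    s->sess_key_len = len;
    return 0;
}
```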

A script able to rebuild the message and key data that can trigger the
overflow is included in this commit, based on some contents provided by
the reporter, heavily edited by me.  A SQL test is added, based on the
data generated by the script.

Reported-by: Team Xint Code as part of zeroday.cloud
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2005
Backpatch-through: 14
While EUC_CN supports only 1- and 2-byte sequences (CS0, CS1), the
mb<->wchar conversion functions allow 3-byte sequences beginning SS2,
SS3.

Change pg_encoding_max_length() to return 3, not 2, to close a
hypothesized buffer overrun if a corrupted string is converted to wchar
and back again in a newly allocated buffer.  We might reconsider that in
master (ie harmonizing in a different direction), but this change seems
better for the back-branches.

Also change pg_euccn_mblen() to report SS2 and SS3 characters as having
length 3 (following the example of EUC_KR).  Even though such characters
would not pass verification, it's remotely possible that invalid bytes
could be used to compute a buffer size for use in wchar conversion.
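
A length function along these lines might look like the sketch below. The name `demo_euccn_mblen` and the `DEMO_SS2` / `DEMO_SS3` macros are illustrative, and the actual pg_euccn_mblen() may be structured differently.

```c
/* Single-shift lead bytes used by EUC encodings. */
#define DEMO_SS2 0x8e
#define DEMO_SS3 0x8f

/*
 * Sketch of a length function in the spirit of the change described above:
 * SS2/SS3 lead bytes are reported as 3-byte characters, other high-bit
 * bytes as 2-byte characters, and ASCII as 1 byte.
 */
int
demo_euccn_mblen(const unsigned char *s)
{
    if (*s == DEMO_SS2 || *s == DEMO_SS3)
        return 3;
    if (*s & 0x80)
        return 2;
    return 1;
}
```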

Security: CVE-2026-2006
Backpatch-through: 14
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
When converting multibyte to pg_wchar, the UTF-8 implementation would
silently ignore an incomplete final character, while the other
implementations would cast a single byte to pg_wchar, and then repeat
for the remaining byte sequence.  While it didn't overrun the buffer, it
was surely garbage output.

Make all encodings behave like the UTF-8 implementation.  A later change
for master only will convert this to an error, but we choose not to
back-patch that behavior change on the off-chance that someone is
relying on the existing UTF-8 behavior.
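
To make the "ignore an incomplete final character" behavior concrete, here is a sketch for a simplified two-byte encoding. The type `demo_wchar` and the function `demo_mb2wchar` are hypothetical; the real conversion routines handle more cases.

```c
#include <stddef.h>

typedef unsigned int demo_wchar;    /* hypothetical stand-in for pg_wchar */

/*
 * Convert a buffer in a simplified two-byte encoding (a high-bit lead byte
 * means a 2-byte character) to wide characters.  An incomplete final
 * character is silently dropped instead of being emitted byte by byte.
 * Returns the number of wide characters written; "to" must have room for
 * len + 1 entries.
 */
size_t
demo_mb2wchar(const unsigned char *from, size_t len, demo_wchar *to)
{
    size_t      count = 0;

    while (len > 0 && *from != '\0')
    {
        if (*from & 0x80)
        {
            if (len < 2)
                break;          /* incomplete final character: ignore it */
            *to++ = ((demo_wchar) from[0] << 8) | from[1];
            from += 2;
            len -= 2;
        }
        else
        {
            *to++ = *from++;
            len--;
        }
        count++;
    }
    *to = 0;                    /* terminate the output */
    return count;
}
```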

Security: CVE-2026-2006
Backpatch-through: 14
Author: Thomas Munro <thomas.munro@gmail.com>
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
A corrupted string could cause code that iterates with pg_mblen() to
overrun its buffer.  Fix, by converting all callers to one of the
following:

1. Callers with a null-terminated string now use pg_mblen_cstr(), which
raises an "illegal byte sequence" error if it finds a terminator in the
middle of the sequence.

2. Callers with a length or end pointer now use either
pg_mblen_with_len() or pg_mblen_range(), for the same effect, depending
on which of the two seems more convenient at each site.

3. A small number of cases pre-validate a string, and can use
pg_mblen_unbounded().

The traditional pg_mblen() function and COPYCHAR macro still exist for
backward compatibility, but are no longer used by core code and are
hereby deprecated.  The same applies to the t_isXXX() functions.
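
The idea behind the bounded variants can be sketched like this. `demo_mblen_range` is a hypothetical simplification for UTF-8 only. It returns -1 where the functions named above would raise an error, so that the example stays self-contained.

```c
#include <stddef.h>

/* Length implied by a UTF-8 lead byte (simplified; no validation). */
static int
demo_utf8_len(unsigned char lead)
{
    if ((lead & 0x80) == 0)
        return 1;
    if ((lead & 0xe0) == 0xc0)
        return 2;
    if ((lead & 0xf0) == 0xe0)
        return 3;
    if ((lead & 0xf8) == 0xf0)
        return 4;
    return 1;                   /* stray continuation or invalid byte */
}

/*
 * Hypothetical bounded variant: returns the character length, or -1 when
 * the character would run past "end".
 */
int
demo_mblen_range(const unsigned char *s, const unsigned char *end)
{
    int         len;

    if (s >= end)
        return -1;              /* nothing left to read */
    len = demo_utf8_len(*s);
    if ((size_t) (end - s) < (size_t) len)
        return -1;              /* sequence is cut off by the buffer end */
    return len;
}

/* Count characters without ever reading past "end"; -1 on truncation. */
int
demo_count_chars(const unsigned char *s, const unsigned char *end)
{
    int         n = 0;

    while (s < end)
    {
        int         len = demo_mblen_range(s, end);

        if (len < 0)
            return -1;
        s += len;
        n++;
    }
    return n;
}
```

An unbounded loop of the form `s += mblen(s)` would step past `end` on the truncated input; the bounded loop stops first.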

Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
A security patch changed them today, so close the coverage gap now.
Test that buffer overrun is avoided when pg_mblen*() requires more
than the number of bytes remaining.

This does not cover the calls in dict_thesaurus.c or in dict_synonym.c.
That code is straightforward.  To change that code's input, one must
have access to modify installed OS files, so low-privilege users are not
a threat.  Testing this would likewise require changing installed
share/postgresql/tsearch_data, which was enough of an obstacle to not
bother.

Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
pgp_sym_decrypt() and pgp_pub_decrypt() will raise such errors, while
bytea variants will not.  The existing "dat3" test decrypted to non-UTF8
text, so switch that query to bytea.

The long-term intent is for type "text" to always be valid in the
database encoding.  pgcrypto has long been known as a source of
exceptions to that intent, but a report about exploiting invalid values
of type "text" brought this module to the forefront.  This particular
exception is straightforward to fix, with reasonable effect on user
queries.  Back-patch to v14 (all supported versions).

Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
Author: shihao zhong <zhong950419@gmail.com>
Reviewed-by: cary huang <hcary328@gmail.com>
Discussion: https://postgr.es/m/CAGRkXqRZyo0gLxPJqUsDqtWYBbgM14betsHiLRPj9mo2=z9VvA@mail.gmail.com
Backpatch-through: 14
Security: CVE-2026-2006
These data types are represented like full-fledged arrays, but
functions that deal specifically with these types assume that the
array is 1-dimensional and contains no nulls.  However, there are
cast pathways that allow general oid[] or int2[] arrays to be cast
to these types, allowing these expectations to be violated.  This
can be exploited to cause server memory disclosure or SIGSEGV.
Fix by installing explicit checks in functions that accept these
types.
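
A check of this shape could look roughly like the following. `DemoArray` is a hypothetical, heavily reduced stand-in for the server's array representation, and the real checks report an error rather than returning a flag.

```c
#include <stdbool.h>

/* Hypothetical, simplified array header; not PostgreSQL's ArrayType. */
typedef struct DemoArray
{
    int         ndim;           /* number of dimensions */
    bool        has_nulls;      /* true if any element is NULL */
} DemoArray;

/*
 * Return true only if the array satisfies what vector-specific code
 * assumes: exactly one dimension and no NULL elements.
 */
bool
demo_vector_is_valid(const DemoArray *a)
{
    return a->ndim == 1 && !a->has_nulls;
}
```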

Reported-by: Altan Birler <altan.birler@tum.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2003
Backpatch-through: 14
An upcoming patch requires this cache so that it can track updates
in the pg_extension catalog.  So far though, the EXTENSIONOID cache
only exists in v18 and up (see 490f869).  We can add it in older
branches without an ABI break, if we are careful not to disturb the
numbering of existing syscache IDs.

In v16 and before, that just requires adding the new ID at the end
of the hand-assigned enum list, ignoring our convention about
alphabetizing the IDs.  But in v17, genbki.pl enforces alphabetical
order of the IDs listed in MAKE_SYSCACHE macros.  We can fake it
out by calling the new cache ZEXTENSIONOID.
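
For the hand-assigned case (v16 and before), the effect of appending at the end can be illustrated with a toy enum. The `DEMO_` names and the three-entry list are hypothetical; the real list is far longer.

```c
/* Hypothetical, abbreviated cache-ID list; the real list is much longer. */
enum DemoSysCacheId
{
    DEMO_AGGFNOID = 0,
    DEMO_AMNAME,
    DEMO_TYPEOID,
    /* every existing ID above keeps its numeric value */
    DEMO_EXTENSIONOID           /* new ID appended last, out of alphabetical order */
};
```

Inserting the new entry in its alphabetical position would instead shift every later ID by one, which is what breaks already-compiled extensions.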

Note that adding a syscache does change the required contents of the
relcache init file (pg_internal.init).  But that isn't problematic
since we blow those away at postmaster start for other reasons.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14-17
Selectivity estimators come in two flavors: those that make specific
assumptions about the data types they are working with, and those
that don't.  Most of the built-in estimators are of the latter kind
and are meant to be safely attachable to any operator.  If the
operator does not behave as the estimator expects, you might get a
poor estimate, but it won't crash.

However, estimators that do make datatype assumptions can malfunction
if they are attached to the wrong operator, since then the data they
get from pg_statistic may not be of the type they expect.  This can
rise to the level of a security problem, even permitting arbitrary
code execution by a user who has the ability to create SQL objects.

To close this hole, establish a rule that built-in estimators are
required to protect themselves against being called on the wrong type
of data.  It does not seem practical however to expect estimators in
extensions to reach a similar level of security, at least not in the
near term.  Therefore, also establish a rule that superuser privilege
is required to attach a non-built-in estimator to an operator.
We expect that this restriction will have little negative impact on
extensions, since estimators generally have to be written in C and
thus superuser privilege is required to create them in the first
place.

This commit changes the privilege checks in CREATE/ALTER OPERATOR
to enforce the rule about superuser privilege, and fixes a couple
of built-in estimators that were making datatype assumptions without
sufficiently checking that they're valid.

Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
While the preceding commit prevented such attachments from occurring
in future, this one aims to prevent further abuse of any already-
created operator that exposes _int_matchsel to the wrong data types.
(No other contrib module has a vulnerable selectivity estimator.)

We need only check that the Const we've found in the query is indeed
of the type we expect (query_int), but there's a difficulty: as an
extension type, query_int doesn't have a fixed OID that we could
hard-code into the estimator.

Therefore, the bulk of this patch consists of infrastructure to let
an extension function securely look up the OID of a datatype
belonging to the same extension.  (Extension authors have requested
such functionality before, so we anticipate that this code will
have additional non-security uses, and may soon be extended to allow
looking up other kinds of SQL objects.)

This is done by first finding the extension that owns the calling
function (there can be only one), and then thumbing through the
objects owned by that extension to find a type that has the desired
name.  This is relatively expensive, especially for large extensions,
so a simple cache is put in front of these lookups.
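
The lookup-plus-cache arrangement, together with the type check on the Const, might be sketched as below. Everything here is hypothetical: the `Demo` names, the OID value 16500, and the one-entry cache. Cache invalidation, which a real implementation would need, is left out.

```c
#include <stdbool.h>
#include <string.h>

typedef unsigned int DemoOid;   /* hypothetical stand-in for Oid */
#define DEMO_INVALID_OID 0

int         demo_slow_lookups = 0;  /* counts expensive lookups, for illustration */

/* Stand-in for the expensive walk over an extension's member objects. */
static DemoOid
demo_slow_type_lookup(const char *typname)
{
    demo_slow_lookups++;
    /* 16500 is an arbitrary made-up OID for the demo */
    return strcmp(typname, "query_int") == 0 ? 16500 : DEMO_INVALID_OID;
}

/* One-entry cache in front of the slow lookup. */
DemoOid
demo_cached_type_lookup(const char *typname)
{
    static char cached_name[64];
    static DemoOid cached_oid = DEMO_INVALID_OID;

    if (cached_oid != DEMO_INVALID_OID && strcmp(cached_name, typname) == 0)
        return cached_oid;      /* cache hit: skip the expensive walk */
    cached_oid = demo_slow_type_lookup(typname);
    strncpy(cached_name, typname, sizeof(cached_name) - 1);
    cached_name[sizeof(cached_name) - 1] = '\0';
    return cached_oid;
}

/* The estimator proceeds only when the constant has the expected type. */
bool
demo_const_type_ok(DemoOid consttype)
{
    return consttype == demo_cached_type_lookup("query_int");
}
```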

Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
RAT check exception added for the PG16 section, as it effectively comes
from a PG16 backport
liburing is the Linux-only userspace API for the io_uring kernel
interface (Linux 5.1+). It is not available on macOS or *BSD.

PAX already has a sync fallback path (SyncFastIO using pread(2)) and
LocalFile::ReadBatch picks between the two via IOUringFastIO::available().
This commit lets the build proceed without liburing:

  * configure: AC_CHECK_LIB(uring) no longer aborts when missing; PAX
    falls back to SyncFastIO at runtime.
  * fast_io.h / fast_io.cc: wrap the IOUringFastIO class declaration
    and methods with #ifdef __linux__ so they only exist where the
    header is available.
  * local_file_system.cc: gate the IOUringFastIO::available() branch
    with #ifdef __linux__; non-Linux unconditionally uses SyncFastIO.
ic_udpifc.c uses the HZ macro (kernel timer frequency) for RTO
calculations. On Linux glibc, HZ is exposed via <sys/param.h>
(typically 100). macOS's <sys/param.h> does not define HZ.

Define HZ=100 as a fallback when the platform headers don't provide
it. The exact value only affects retransmit pacing constants
(UDP_RTO_MIN, TIME_TICK); 100 matches the Linux default.
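
The fallback amounts to a guarded define, roughly as follows. The helper `demo_ticks_to_ms` is hypothetical and only shows HZ being used in a timing calculation.

```c
#include <sys/param.h>          /* may define HZ on some platforms */

#ifndef HZ
#define HZ 100                  /* fallback matching the common Linux value */
#endif

/* Hypothetical helper: convert a tick count to milliseconds using HZ. */
long
demo_ticks_to_ms(long ticks)
{
    return ticks * 1000L / HZ;
}
```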
src/backend/gporca/gporca.mk forced -Werror -Wextra -Wpedantic on top
of the project CXXFLAGS via 'override CXXFLAGS := ...'. clang reports
many more warnings than gcc under -Wextra/-Wpedantic, which then
become errors and break the macOS build (unused-but-set-variable,
inconsistent-missing-override, ...).

Keep -fno-omit-frame-pointer (used by ORCA's backtracing); drop the
strict warning flags. The base project CXXFLAGS still includes -Wall
and per-feature -Werror= flags.
With --enable-shared-postgres-backend (default) the libpostgres.so
recipe filters out main/main.o, but other backend objects still
reference symbols defined in main.o (progname, etc.). On Linux the
default linker behaviour permits undefined references in a shared
library; on macOS, ld -dynamiclib rejects them with 'Undefined
symbols ... _progname'.

Pass -Wl,-undefined,dynamic_lookup only when PORTNAME=darwin so those
symbols resolve at load time against the postgres executable that
loads libpostgres.so. The Linux behaviour is unchanged.
Several gcc / GNU-ld specific bits in the PAX cmake setup break under
Apple clang and Apple ld. Make them conditional on Linux.

  * CMakeLists.txt: gate -no-pie / -Wl,--allow-multiple-definition /
    -fno-access-control / -Wno-pmf-conversions (gcc-only) behind
    if(NOT APPLE). Replace gcc-only warning disables (-Wno-clobbered,
    -Wno-sized-deallocation, -Wno-parameter-name) with
    -Wno-unknown-warning-option on APPLE, plus a set of
    -Wno-error= demotions for clang-stricter diagnostics
    (inconsistent-missing-override, overloaded-virtual,
    sometimes-uninitialized, unused-private-field, format,
    mismatched-tags, pessimizing-move, unused-but-set-variable,
    deprecated-copy, unused-result).
  * Makefile: pass -DBUILD_GTEST=OFF to the inner cmake (googletest's
    own flags clash with clang). Drop SHLIB_LINK += -luuid on darwin
    (uuid_* is in libSystem). Move the ifneq to after the include of
    Makefile.global so PORTNAME is actually defined. Cope with cmake
    naming the artefact libpax.so on every platform now (see below).
  * src/cpp/CMakeLists.txt: skip the standalone libpaxformat target on
    macOS. It links to backend symbols (write_stderr, ...) directly
    and isn't needed to load PAX inside postgres.
  * pax.cmake: on APPLE, build pax as a MODULE (Mach-O bundle, what
    PG extensions are) instead of SHARED, and link with
    -Wl,-undefined,dynamic_lookup -Wl,-bundle_loader,<postgres>. This
    is the standard PG extension pattern; it guarantees that backend
    globals (e.g. process_shared_preload_libraries_in_progress) have
    one shared instance between postgres and pax.so. Also gate -luring
    on Linux; pull abseil deps via pkg-config on macOS (Homebrew's
    protobuf v22+ split them into separate libs); replace the
    Linux-only $ORIGIN INSTALL_RPATH and -Wl,--enable-new-dtags with
    @loader_path on macOS.
  * pax_format.cmake: same Linux-only treatment for -luuid / -luring
    and the abseil pkg-config.
Compile-time fixes to let PAX build with Apple clang against
Homebrew's modern protobuf (v22+).

  * file_system.h: typedef off64_t = off_t on macOS. glibc exposes
    off64_t for 32-bit programs opting into 64-bit file offsets;
    macOS's off_t is already 64-bit so there is no separate symbol.
  * pax_encoding_utils.{h,cc}: change BuildHistogram/ZigZagBuffers
    signatures from int64_t* to PG's int64* (long). On macOS x86_64
    int64_t is 'long long', distinct from 'long' for overload
    resolution even though both are 64-bit. Using PG's int64
    (consistently 'long' on every supported port) keeps callers from
    PG-typed buffers (e.g. DataBuffer<int64>::StartT) working on both
    platforms.
  * pax_delta_encoding.cc: replace 'uint8_t bit_widths[var] = {0};'
    (a gcc VLA-with-initializer extension that clang rejects) with
    std::vector<uint8_t>.
  * proto_wrappers.h: PG's c.h defines Min/Max macros and
    xlog_internal.h defines IsPowerOf2; abseil (pulled in by modern
    protobuf headers) declares identifiers with the same names.
    #undef them around the protobuf include, then restore PG's
    definitions afterwards.
  * protobuf_stream.{h,cc}: protobuf v22 removed the
    google::protobuf::int64 typedef. Use int64_t directly (which is
    the type of the base ZeroCopy{Output,Input}Stream::ByteCount()
    override anyway).
paxformat is the standalone PAX file reader meant to be linked by
external tools. The previous commit 'macOS: PAX cmake portability'
skipped it on APPLE because it references PG backend functions
(write_stderr, xlog_check_consistency_hook, ...) that have no
libpostgres.so to satisfy them on macOS.

Build it after all by deferring those undefined symbols to load time
with -Wl,-undefined,dynamic_lookup, just like Linux's default ld
behaviour for shared libraries. The smoke-test executable
paxformat_test uses the same flag and drops the explicit 'postgres'
link library (there is no libpostgres.so to link against on macOS).

paxformat is a regular dylib (SHARED) here — not a bundle — because
the test executable links against it at link time.
The prototype in src/include/utils/numeric.h declared the function as
returning 'const bool', which conflicts with the definition in
numeric.c that returns plain 'bool'. gcc tolerates this (const on a
non-pointer return is silently meaningless), but clang flags it as
'conflicting types' and the build fails on macOS.

Introduced by 'apache#392 Export numeric interface to public'.

Match the .c definition: plain 'bool'.
PG 16 upstream commit b55f62a ('Unify DLSUFFIX on Darwin') changed
DLSUFFIX on macOS from .so to .dylib so the suffix would match both
linkable shared libraries and dlopen'd modules. Cloudberry, however,
has many places that still hard-code '$libdir/foo.so' — the catalog
SQL bootstrap scripts, cdb_init.d, and a number of expected/*.out
files. When PG sees an explicit '.so' suffix in a library reference
it does NOT re-append DLSUFFIX, so a .so / .dylib divergence breaks
the catalog bootstrap (FATAL: could not access file 'foo').

Restore the pre-PG16 behaviour of DLSUFFIX=.so on darwin (via
src/template/darwin) so all those hard-coded references resolve.

Two follow-on adjustments are needed:

  * configure: the Python-shared-library probe builds a candidate path
    as '$python_libdir/lib$ldlibrary$DLSUFFIX'. macOS Python ships
    its shared lib as .dylib regardless of what DLSUFFIX is set to for
    modules; without a .dylib fallback the probe fails with
    'could not find shared library for Python'. Try .dylib alongside
    DLSUFFIX on darwin.

  * src/test/regress/GNUmakefile: a handful of install/uninstall
    lines hard-coded '.so' for the test-helper modules
    (regress, test_hook, query_info_hook_test). Replace with
    $(DLSUFFIX) so they keep working regardless of the value.
After this commit, plain 'make' (without any CUSTOM_COPT= on the
command line) builds cleanly on macOS with Apple clang. Previously
users had to remember a long override.

The warning categories that needed demoting (darwin only):

  -Wuninitialized           — clang flags spots upstream gcc accepts.
  -Wgnu-variable-sized-type-not-at-end
                            — clang-only; fires on PG catalog headers
                              like pg_task.h with inline struct-plus-
                              trailing-text declarations.
  -Wunused-function         — clang flags static functions never
                              referenced in the TU (a few exist in
                              currently-disabled code paths, e.g.
                              ic_udpifc.c).
  -Wdeprecated-non-prototype
                            — clang-only; flags K&R-style 'foo()'
                              forward declarations. Where the
                              mismatch is a real bug we fix it
                              inline; this demotion covers the rest.

The block is gated to PORTNAME=darwin so the Linux gcc build path
is unchanged: -Werror remains in effect for all categories there.
The forward declaration at line 838 was

    static void initSndBufferPool();

which under C's old K&R rules means 'function with unspecified
arguments' — but the actual definition takes a SendBufferPool *:

    static void
    initSndBufferPool(SendBufferPool *p)

and the single caller at line 3677 passes &snd_buffer_pool. gcc
silently accepts the mismatch; Apple clang correctly rejects it
under -Wdeprecated-non-prototype. C2x will reject it on every
compiler. Match the forward declaration to the definition.
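
In isolation, the corrected shape looks like this. The `SendBufferPool` fields below are hypothetical placeholders, and the function body is reduced to a trivial reset.

```c
/* Hypothetical, reduced pool type; the real SendBufferPool has other fields. */
typedef struct SendBufferPool
{
    int         count;
    int         maxCount;
} SendBufferPool;

/* Full prototype: the compiler can now check every call site. */
static void initSndBufferPool(SendBufferPool *p);

static void
initSndBufferPool(SendBufferPool *p)
{
    p->count = 0;
    p->maxCount = -1;
}
```

With the empty-parentheses form, a call such as `initSndBufferPool()` with no argument would also have compiled under the old rules; with the prototype it is a compile-time error.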
A duct-tape fix for this was already added in fc8aab8, though the redo
path was not patched there. Copy the same guard into the body of the
redoDistributedCommitRecord function.
Previously, MERGE ... WHEN MATCHED THEN UPDATE SET <dist_key> = ...
was rejected with "cannot update column in merge with distributed
column". This commit adds support by extending the SplitMerge node
to handle distribution key updates via DELETE + INSERT routing.

Key changes:

Planner (preptlist.c, createplan.c, setrefs.c, cdbpath.c):
- Detect distribution key modification in MERGE UPDATE actions and
  set merge_need_split_update flag
- Add all target table columns to subplan targetlist so SplitMerge
  can project complete rows for INSERT
- Expand UPDATE action targetlists to include all columns (not just
  SET columns) using expand_insert_targetlist
- Build SplitMerge targetlist in N+M+1 format: N target table columns
  + M subplan columns + 1 DMLAction column
- Always use root table action lists (not per-partition adjusted lists)
  to ensure hashAttnos match root attribute numbers
- Add set_splitmerge_tlist_references for proper OUTER_VAR conversion

Executor (nodeSplitMerge.c, nodeModifyTable.c):
- SplitMerge splits MATCHED UPDATE into DELETE + INSERT tuple pair,
  each routed to the correct segment via hash computation
- NOT MATCHED rows get PASSTHROUGH action for normal ExecMerge processing
- ModifyTable handles DMLAction-tagged tuples from SplitMerge output
- Support lazy partition routing for CMD_MERGE DML_INSERT

Refactoring (nodeSplitMerge.c):
- Extract computeTargetSegment(), SwitchResultRelForPartition(),
  BuildRootUpdateTupleDesc() helper functions
- Define SPLITMERGE_ACTION_PASSTHROUGH constant
- Remove dead code and consolidate duplicated logic
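
The DELETE + INSERT routing described under "Executor" above can be pictured with a toy model. All names here are hypothetical, and the modulo hash is only a placeholder for the real distribution hashing.

```c
/* Hypothetical action tags; the real DMLAction values may differ. */
typedef enum DemoAction
{
    DEMO_DELETE,
    DEMO_INSERT
} DemoAction;

typedef struct DemoRoutedTuple
{
    DemoAction  action;
    int         dist_key;       /* distribution key value carried by the tuple */
    int         segment;        /* segment the tuple is routed to */
} DemoRoutedTuple;

/* Toy hash routing; the real hash computation is more involved. */
int
demo_target_segment(int dist_key, int nsegments)
{
    return (int) ((unsigned int) dist_key % (unsigned int) nsegments);
}

/*
 * Split one UPDATE of the distribution key into a DELETE routed by the
 * old key and an INSERT routed by the new key.  Writes two tuples to out.
 */
void
demo_split_update(int old_key, int new_key, int nsegments,
                  DemoRoutedTuple out[2])
{
    out[0].action = DEMO_DELETE;
    out[0].dist_key = old_key;
    out[0].segment = demo_target_segment(old_key, nsegments);

    out[1].action = DEMO_INSERT;
    out[1].dist_key = new_key;
    out[1].segment = demo_target_segment(new_key, nsegments);
}
```

Because the old and new key values can hash to different segments, the two halves of the pair may be sent to different places, which is why a plain in-place UPDATE is not enough.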
Uncomment the two test blocks that verify view behavior when pg_depend
entries are corrupted (DELETE FROM pg_depend + ALTER TABLE + ROLLBACK).

These were previously disabled because DELETE FROM pg_depend only
affects the coordinator in a distributed environment. The fix uses
allow_segment_DML GUC combined with a helper function marked
EXECUTE ON ALL SEGMENTS to delete from segment catalogs as well.
lss602726449 and others added 9 commits June 8, 2026 15:40
ORCA now correctly supports runtime filter pushdown, so uncomment
the previously disabled test block that verifies pushdown behavior
with optimizer on.
Restore register_dirty_segment() in mdcreate() and
register_dirty_segment_ao() in ao_insert_replay() that were removed
during PG16 merge, ensuring newly created or written segments are
properly fsync'd at the next checkpoint.

Fix two bugs that cause checkpointer PANIC on standby when processing
AO fsync requests for truncated/dropped files:

1. aosyncfiletag(): return -1 instead of elog(ERROR) when file cannot
   be opened, matching mdsyncfiletag() behavior.

2. ao_truncate_replay(): send SYNC_FORGET_REQUEST after truncating an
   AO segment file to cancel previously registered fsync requests.

Add register_forget_request_ao() as the AO counterpart to
register_forget_request(), using SYNC_HANDLER_AO.
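
The "return -1 instead of raising an error" convention from item 1 can be sketched with plain POSIX calls. `demo_sync_file` is a hypothetical function, not the actual aosyncfiletag().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical sync callback: fsync the file at "path".  A file that
 * cannot be opened (for example, because it was already dropped) is
 * reported with -1 and errno set, so the caller decides how to react.
 */
int
demo_sync_file(const char *path)
{
    int         fd = open(path, O_RDWR);
    int         rc;
    int         save_errno;

    if (fd < 0)
        return -1;              /* report failure instead of raising an error */
    rc = fsync(fd);
    save_errno = errno;         /* close() may overwrite errno */
    close(fd);
    errno = save_errno;
    return rc;
}
```

Returning a status lets the caller treat a missing file as ignorable when a forget request for it is pending, rather than escalating straight away.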
After a gang loss (e.g. QE terminated by pg_terminate_backend), the QD
assigns a new gp_session_id via GpDropTempTables(), but the QD's
pg_stat_activity entry (MyBEEntry->st_session_id) was never updated
because pgstat_report_sessionid() was commented out and the function
was removed during the PG16 merge.

This caused pg_stat_activity.sess_id on QD to become stale, diverging
from the actual gp_session_id. Any query that joins QD's
pg_stat_activity with segment pg_stat_activity on sess_id would fail
to match, making gp_sync_lc_gucs test flaky.

Restore pgstat_report_sessionid() following the existing
pgstat_report_resgroup() pattern, and uncomment the call in
GpDropTempTables(). Update the gp_sync_lc_gucs expected output since
the second pg_terminate_backend now correctly finds and terminates QEs.
…rror down

The vacuum_progress tests simulate mirror failure during vacuum to test
worker process change behavior. Previously, after stopping the mirror and
resuming the walsender, the tests relied on passive FTS detection which
could time out in CI. Add wait_for_mirror_down() helper that actively
triggers FTS probe scans until mirror is confirmed down, ensuring the
FTS version change propagates to QD and gang gets properly reset.

Also update vacuum_progress_column expected output for post-cleanup
phase progress values, which now reflect actual work done by the new
vacuum worker (heap_blks_vacuumed, index_vacuum_count) after the
worker change is reliably triggered.
Two related changes that together make PAX aux toasting match the
rest of the Cloudberry / Postgres tree:

1. src/backend/catalog/toasting.c

   Drop the Cloudberry-only branch that routed TOAST for
   pg_ext_aux parents back into pg_ext_aux:

       else if (IsExtAuxNamespace(rel->rd_rel->relnamespace))
           namespaceid = PG_EXTAUX_NAMESPACE;

   PAX aux tables (pg_pax_blocks_<oid>) now get a normal TOAST
   companion in pg_toast, the same as every other heap.

2. contrib/pax_storage/.../pax_aux_table.cc

   Aux inherits the parent's persistence for PERMANENT / UNLOGGED
   verbatim, but TEMP is clamped down to PERMANENT.  Background:
   the aux always lives in pg_ext_aux (not in pg_temp_<N>), so a
   TEMP-persistence row in pg_ext_aux ends up mis-classified by
   RELATION_IS_OTHER_TEMP — relcache.c sets rd_islocaltemp=false
   because pg_ext_aux is not a temp namespace, and then
   reindex_index() (or any catalog walk that touches the aux)
   bails with "cannot reindex temporary tables of other
   sessions".  Clamping TEMP→PERMANENT avoids the mis-trigger;
   the trade-off is that the aux of a TEMP PAX table outlives
   the session, which is acceptable given the long-standing
   FIXME in the same file ("temporary table in aux namespace is
   not supported yet").

Resulting layout:

  PERMANENT pax_tab    aux    in pg_ext_aux (p)
                       toast  in pg_toast (p)
                       idx    in pg_toast (p)

  UNLOGGED  u_pax      aux    in pg_ext_aux (u)
                       toast  in pg_toast (u)
                       idx    in pg_toast (u)

  TEMP      pax_tmp    parent in pg_temp_<N> (t)
                       aux    in pg_ext_aux (p)   <-- clamped
                       toast  in pg_toast (p)
                       idx    in pg_toast (p)
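
The persistence rule behind this layout can be written as a small function. This is a sketch: `demo_aux_persistence` and the `DEMO_` macros are hypothetical names, using the same p / u / t codes as the layout above.

```c
/* Persistence codes as shown in the layout above: p, u, t. */
#define DEMO_PERSISTENCE_PERMANENT 'p'
#define DEMO_PERSISTENCE_UNLOGGED  'u'
#define DEMO_PERSISTENCE_TEMP      't'

/*
 * Hypothetical helper: pick the aux table's persistence from its parent's.
 * PERMANENT and UNLOGGED pass through; TEMP is clamped to PERMANENT.
 */
char
demo_aux_persistence(char parent)
{
    if (parent == DEMO_PERSISTENCE_TEMP)
        return DEMO_PERSISTENCE_PERMANENT;
    return parent;
}
```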

Verified:
  - CREATE TABLE / UNLOGGED / TEMP USING pax all produce the
    layout above.
  - INSERT round-trip works for all three persistence modes.
Commit e2d4ef8 (the fix for CVE-2017-7484) added security checks
to the selectivity estimation functions to prevent them from running
user-supplied operators on data obtained from pg_statistic if the user
lacks privileges to select from the underlying table. In cases
involving inheritance/partitioning, those checks were originally
performed against the child RTE (which for plain inheritance might
actually refer to the parent table). Commit 553d2ec then extended
that to also check the parent RTE, allowing access if the user had
permissions on either the parent or the child. It turns out, however,
that doing any checks using the child RTE is incorrect, since
securityQuals is set to NULL when creating an RTE for an inheritance
child (whether it refers to the parent table or the child table), and
therefore such checks do not correctly account for any RLS policies or
security barrier views. Therefore, do the security checks using only
the parent RTE. This is consistent with how RLS policies are applied,
and the executor's ACL checks, both of which use only the parent
table's permissions/policies. Similar checks are performed in the
extended stats code, so update that in the same way, centralizing all
the checks in a new function.

In addition, note that these checks by themselves are insufficient to
ensure that the user has access to the table's data because, in a
query that goes via a view, they only check that the view owner has
permissions on the underlying table, not that the current user has
permissions on the view itself. In the selectivity estimation
functions, there is no easy way to navigate from underlying tables to
views, so add permissions checks for all views mentioned in the query
to the planner startup code. If the user lacks permissions on a view,
a permissions error will now be reported at planner-startup, and the
selectivity estimation functions will not be run.

Checking view permissions at planner-startup in this way is a little
ugly, since the same checks will be repeated at executor-startup.
Longer-term, it might be better to move all the permissions checks
from the executor to the planner so that permissions errors can be
reported sooner, instead of creating a plan that won't ever be run.
However, such a change seems too far-reaching to be back-patched.

Back-patch to all supported versions. In v13, there is the added
complication that UPDATEs and DELETEs on inherited target tables are
planned using inheritance_planner(), which plans each inheritance
child table separately, so that the selectivity estimation functions
do not know that they are dealing with a child table accessed via its
parent. Handle that by checking access permissions on the top parent
table at planner-startup, in the same way as we do for views. Any
securityQuals on the top parent table are moved down to the child
tables by inheritance_planner(), so they continue to be checked by the
selectivity estimation functions.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
Security: CVE-2025-8713
Maliciously-crafted object names could achieve SQL injection during
restore.  CVE-2012-0868 fixed this class of problem at the time, but
later work reintroduced three cases.  Commit
bc8cd50 (back-patched to v11+ in
2023-05 releases) introduced the pg_dump case.  Commit
6cbdbd9 (v12+) introduced the two
pg_dumpall cases.  Move sanitize_line(), unchanged, to dumputils.c so
pg_dumpall has access to it in all supported versions.  Back-patch to
v13 (all supported versions).
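
The general idea of sanitizing a name before writing it into a single-line context can be sketched as follows. This is not pg_dump's actual sanitize_line(); `demo_sanitize_line` is a hypothetical function, and it assumes the risk comes from newlines or carriage returns embedded in the name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of "str" with every newline and carriage return
 * replaced by a space, so the text cannot end the current line early.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *
demo_sanitize_line(const char *str)
{
    size_t      len = strlen(str);
    char       *result = malloc(len + 1);
    size_t      i;

    if (result == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        result[i] = (str[i] == '\n' || str[i] == '\r') ? ' ' : str[i];
    result[len] = '\0';
    return result;
}
```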

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Backpatch-through: 13
Security: CVE-2025-8715
A malicious server could inject psql meta-commands into plain-text
dump output (i.e., scripts created with pg_dump --format=plain,
pg_dumpall, or pg_restore --file) that are run at restore time on
the machine running psql.  To fix, introduce a new "restricted"
mode in psql that blocks all meta-commands (except for \unrestrict
to exit the mode), and teach pg_dump, pg_dumpall, and pg_restore to
use this mode in plain-text dumps.

While at it, encourage users to only restore dumps generated from
trusted servers or to inspect it beforehand, since restoring causes
the destination to execute arbitrary code of the source superusers'
choice.  However, the client running the dump and restore needn't
trust the source or destination superusers.

Reported-by: Martin Rakhmanov
Reported-by: Matthieu Denais <litezeraw@gmail.com>
Reported-by: RyotaK <ryotak.mail@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Security: CVE-2025-8714
Backpatch-through: 13
@MutableFire MutableFire force-pushed the gp_interconnect_stats branch 3 times, most recently from 8d37957 to efee6dd Compare June 16, 2026 08:20
Add a gpcontrib debug extension that rejects full scans on partitioned
tables when partition pruning is ineffective.

The extension is built only with enable_debug_extensions=yes and supports
Planner and ORCA plans, including Append, MergeAppend, PartitionSelector,
and DynamicScan cases.
@reshke reshke force-pushed the gp_interconnect_stats branch from 8ab7115 to 2e6b87d Compare June 18, 2026 09:42
@MutableFire MutableFire force-pushed the gp_interconnect_stats branch 2 times, most recently from ceef651 to 8dce88c Compare June 18, 2026 14:07
@MutableFire MutableFire force-pushed the gp_interconnect_stats branch from b8ddf34 to fdbd304 Compare June 18, 2026 14:14