From 275ce96037e4734ee20814b00924c97980be5264 Mon Sep 17 00:00:00 2001
From: Abdeltoto
Date: Sat, 25 Apr 2026 20:49:44 -0400
Subject: [PATCH] docs: clarify parallel validation scope

Made-with: Cursor
---
 docs/console/validate.md          | 4 ++++
 docs/framework/inquiry.md         | 2 +-
 docs/guides/validating-data.md    | 9 ++++++++-
 frictionless/console/common.py    | 2 +-
 frictionless/resource/resource.py | 3 ++-
 5 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/docs/console/validate.md b/docs/console/validate.md
index 4b93563309..9478bb8df4 100644
--- a/docs/console/validate.md
+++ b/docs/console/validate.md
@@ -14,3 +14,7 @@ With `validate` command you can validate your tabular files (indivisual or the w
 ```bash script tabs=CLI
 frictionless validate table.csv invalid.csv
 ```
+
+The `--parallel` option enables multiprocessing for validation jobs that contain multiple
+independent resources or tasks, such as Data Packages and Inquiries. It does not split validation
+of a single file into multiple processes.
diff --git a/docs/framework/inquiry.md b/docs/framework/inquiry.md
index cd44bf0bdf..bb36377414 100644
--- a/docs/framework/inquiry.md
+++ b/docs/framework/inquiry.md
@@ -30,7 +30,7 @@ Tasks in the Inquiry accept the same arguments written in camelCase as the corre
 frictionless validate capital.inquiry-example.yaml
 ```
 
-At first sight, it's no clear why such a construct exists but when your validation workflow gets complex, the Inquiry can provide a lot of flexibility and power. Last but not least, the Inquiry will use multiprocessing if there are more than 1 task provided.
+At first sight, it's no clear why such a construct exists but when your validation workflow gets complex, the Inquiry can provide a lot of flexibility and power. If the `parallel` flag is provided, Inquiry validation can use multiprocessing to run independent tasks concurrently; it does not split validation of a single file/resource across multiple processes.
 
 ## Reference
 
diff --git a/docs/guides/validating-data.md b/docs/guides/validating-data.md
index 7d0d8d119a..a8b35021b1 100644
--- a/docs/guides/validating-data.md
+++ b/docs/guides/validating-data.md
@@ -176,6 +176,11 @@ print(report)
 
 As we can see, the result is in a similar format to what we have already seen, and shows errors as we expected: we have one invalid resource and one valid resource.
 
+> Package validation can use multiprocessing if the `parallel` flag is provided. This runs
+> independent resources in the package concurrently; it does not split validation of a single
+> file/resource across multiple processes. Parallel execution is also disabled when foreign keys
+> are used, because those checks can depend on multiple resources.
+
 ## Validating an Inquiry
 
 > The Inquiry is an advanced concept mostly used by software integrators. For example, under the hood, Frictionless Framework uses inquiries to implement client-server validation within the built-in API. Please skip this section if this information feels unnecessary for you.
@@ -208,7 +213,9 @@ print(report)
 
 At first sight, it might not be clear why such a construct exists, but when your validation workflow gets complex, the Inquiry can provide a lot of flexibility and power.
 
-> The Inquiry will use multiprocessing if there is the `parallel` flag provided. It might speed up your validation dramatically especially on a 4+ cores processor.
+> The Inquiry will use multiprocessing if there is the `parallel` flag provided. This runs
+> independent inquiry tasks concurrently; it does not split validation of a single file/resource
+> across multiple processes.
 
 ## Validation Report
 
diff --git a/frictionless/console/common.py b/frictionless/console/common.py
index 36cac41d88..cc719392a7 100644
--- a/frictionless/console/common.py
+++ b/frictionless/console/common.py
@@ -266,7 +266,7 @@
 
 parallel = Option(
     default=None,
-    help="Enable multiprocessing",
+    help="Enable multiprocessing for package/inquiry validation",
 )
 
 output_path = Option(
diff --git a/frictionless/resource/resource.py b/frictionless/resource/resource.py
index 707a6124ae..7f7198d19d 100644
--- a/frictionless/resource/resource.py
+++ b/frictionless/resource/resource.py
@@ -611,7 +611,8 @@ def validate(
             checklist: a Checklist object
             name: limit validation to one resource (if applicable)
             on_row: callbacke for every row
-            paraller: allow parallel validation (multiprocessing)
+            parallel: accepted for API compatibility; resource validation itself
+                is not split across multiple processes
             limit_rows: limit amount of rows to this number
             limit_errors: limit amount of errors to this number
 