Skip to content

[ENH] V1 -> V2 Migration - Flows (module)#1609

Open
Omswastik-11 wants to merge 301 commits intoopenml:mainfrom
Omswastik-11:flow-migration-stacked
Open

[ENH] V1 -> V2 Migration - Flows (module)#1609
Omswastik-11 wants to merge 301 commits intoopenml:mainfrom
Omswastik-11:flow-migration-stacked

Conversation

@Omswastik-11
Copy link
Copy Markdown
Contributor

@Omswastik-11 Omswastik-11 commented Jan 8, 2026

Fixes #1601

added a Create method in FlowAPI for publishing flow but not refactored with old publish . (Needs discussion on this)
Added tests using fake_methods so that we can test without local V2 server . I have tested the FlowsV2 methods (get and exists ) and delete and list were not implemented in V2 server so I skipped them .

@Omswastik-11 Omswastik-11 changed the title [ENH] V1 -> V2 Migration [ENH] V1 -> V2 Migration - Flows (module) Jan 8, 2026
@Omswastik-11 Omswastik-11 marked this pull request as ready for review January 8, 2026 15:09
@geetu040 geetu040 mentioned this pull request Jan 9, 2026
18 tasks
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 43.11927% with 62 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.67%. Comparing base (e653ef6) to head (d0d8f17).

Files with missing lines Patch % Lines
openml/_api/resources/flow.py 44.18% 48 Missing ⚠️
openml/flows/flow.py 26.66% 11 Missing ⚠️
openml/flows/functions.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1609      +/-   ##
==========================================
- Coverage   54.67%   54.67%   -0.01%     
==========================================
  Files          63       63              
  Lines        5108     5149      +41     
==========================================
+ Hits         2793     2815      +22     
- Misses       2315     2334      +19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done overall.

@Omswastik-11 Omswastik-11 force-pushed the flow-migration-stacked branch from 614411f to 36184e5 Compare January 14, 2026 14:56
@Omswastik-11 Omswastik-11 requested a review from geetu040 January 14, 2026 15:10
@Omswastik-11
Copy link
Copy Markdown
Contributor Author

Hi @geetu040 !! Can you review the pre-commit failure It was due to merge conflicts more specifically for tasks . Should I change it on my branch ?

@geetu040
Copy link
Copy Markdown
Collaborator

Hi @geetu040 !! Can you review the pre-commit failure It was due to merge conflicts more specifically for tasks . Should I change it on my branch ?

can you try again, sync your branch with mine? It should be fixed now.

@Omswastik-11
Copy link
Copy Markdown
Contributor Author

I think that due to conflicts it did not synced properly . I have to revert it manually

Copy link
Copy Markdown
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice overall, few changes needed. I'll look at the tests later when the implementation is final.

geetu040 and others added 2 commits January 21, 2026 13:36
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
@Omswastik-11 Omswastik-11 requested a review from geetu040 January 21, 2026 18:25
@Omswastik-11 Omswastik-11 marked this pull request as draft January 27, 2026 07:30
Copilot AI review requested due to automatic review settings March 23, 2026 18:45
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +64 to +65
data = {"name": name, "external_version": external_version, "api_key": self._http.api_key}
xml_response = self._http.post("flow/exists", data=data).text
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV1API.exists() calls HTTPClient.post() with the default use_api_key=True, which raises OpenMLAuthenticationError when no API key is configured. The legacy flow/exists endpoint worked without an API key (it only included api_key when present), so this is a behavior regression for anonymous users. Consider calling self._http.post(..., use_api_key=False) and omit api_key from the payload (or make auth optional).

Suggested change
data = {"name": name, "external_version": external_version, "api_key": self._http.api_key}
xml_response = self._http.post("flow/exists", data=data).text
data = {"name": name, "external_version": external_version}
if self._http.api_key:
data["api_key"] = self._http.api_key
xml_response = self._http.post("flow/exists", data=data, use_api_key=False).text

Copilot uses AI. Check for mistakes.
Comment on lines +210 to +217
if not (isinstance(name, str) and len(name) > 0):
raise ValueError("Argument 'name' should be a non-empty string")
if not (isinstance(external_version, str) and len(external_version) > 0):
raise ValueError("Argument 'version' should be a non-empty string")

try:
response = self._http.get(f"flows/exists/{name}/{external_version}/")
result = response.json()
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV2API.exists() interpolates name and external_version directly into the URL path. These values can contain characters that require URL escaping, which can break requests or cause ambiguous routing. Quote/escape each path segment (e.g., urllib.parse.quote) or send them as query parameters if supported by the v2 endpoint.

Copilot uses AI. Check for mistakes.
import xmltodict

import openml
import openml._api_calls
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openml._api_calls is imported but no longer used in this module after switching to the new backend APIs. This will fail linting/static checks; please remove the unused import.

Suggested change
import openml._api_calls

Copilot uses AI. Check for mistakes.
from openml._api_calls import _perform_api_call
from openml.testing import SimpleImputer, TestBase
import requests
from openml.testing import SimpleImputer, TestBase, create_request_response
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import: create_request_response is imported but not used anywhere in this test file. Please remove it to keep the test module clean and avoid lint failures.

Suggested change
from openml.testing import SimpleImputer, TestBase, create_request_response
from openml.testing import SimpleImputer, TestBase

Copilot uses AI. Check for mistakes.
Comment on lines +92 to +105
mock_request.assert_called_once_with(
method="POST",
url=openml.config.server + "flow/exists",
params={},
data={
"name": flow_name,
"external_version": external_version,
"api_key": test_apikey_v1,
},
headers=openml.config._HEADERS,
files=None,
)


Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts that FlowV1API.exists() sends api_key inside the POST body. If FlowV1API.exists() is adjusted to not require authentication (and rely on HTTPClient to include api_key only when configured), this assertion will become incorrect. Consider relaxing the assertion to not require api_key for the exists call (or explicitly set use_api_key=False in the implementation and test for that behavior).

Suggested change
mock_request.assert_called_once_with(
method="POST",
url=openml.config.server + "flow/exists",
params={},
data={
"name": flow_name,
"external_version": external_version,
"api_key": test_apikey_v1,
},
headers=openml.config._HEADERS,
files=None,
)
mock_request.assert_called_once()
_, kwargs = mock_request.call_args
assert kwargs["method"] == "POST"
assert kwargs["url"] == openml.config.server + "flow/exists"
assert kwargs["params"] == {}
assert kwargs["headers"] == openml.config._HEADERS
assert kwargs["files"] is None
assert kwargs["data"]["name"] == flow_name
assert kwargs["data"]["external_version"] == external_version

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +50
ignore_cache : bool, default=False
Whether to ignore the cache. If ``true`` this will download and overwrite the flow xml
even if the requested flow is already cached.

Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring around ignore_cache/caching behavior still refers to downloading/overwriting a flow.xml file in cache_directory/flows/{id}. Since get_flow() now delegates to HTTPClient caching (request-path-based cache entries), please update the docstring wording to match the new cache mechanism (or avoid specifying the on-disk layout).

Copilot uses AI. Check for mistakes.
flow_exists_mock.return_value = False
).encode()
mock_request.return_value = response
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_exists() is documented/typed to return int | bool (with False indicating non-existence). This test sets flow_exists_mock.return_value = None, which is outside the contract and can mask type/logic issues. Please return False here to match the real API behavior and keep the test faithful to production semantics.

Suggested change
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
flow_exists_mock.return_value = False # Flow doesn't exist yet, so try to publish

Copilot uses AI. Check for mistakes.
>>> openml.flows.delete_flow(23) # doctest: +SKIP
"""
return openml.utils._delete_entity("flow", flow_id)
return openml._backend.flow.delete(flow_id) # type: ignore
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This # type: ignore on the return from openml._backend.flow.delete(flow_id) looks unnecessary now that FlowAPI inherits delete() from ResourceAPI. If type checking is complaining, it may be better to fix the underlying type (e.g., ensure FlowAPI exposes delete with the right signature) rather than suppressing it here.

Suggested change
return openml._backend.flow.delete(flow_id) # type: ignore
return openml._backend.flow.delete(flow_id)

Copilot uses AI. Check for mistakes.
Comment on lines 17 to 19
FLOWS_CACHE_DIR_NAME = "flows"


Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLOWS_CACHE_DIR_NAME is now unused after removing the filesystem flow cache helpers. Please remove it (and any related unused imports) to avoid lint/static-analysis failures.

Suggested change
FLOWS_CACHE_DIR_NAME = "flows"

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings March 24, 2026 17:10
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +220 to +222
except (OpenMLServerError, KeyError):
# v2 returns 404 when flow doesn't exist, which raises OpenMLServerError
return False
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV2API.exists() currently catches OpenMLServerError and returns False, which will also hide real failures (e.g., authentication/authorization problems, 500s) by reporting "flow does not exist". Only treat the specific "not found" case as non-existence (e.g., catch OpenMLServerNoResult / an OpenMLServerException with the v2 not-found code or 404) and re-raise all other exceptions.

Suggested change
except (OpenMLServerError, KeyError):
# v2 returns 404 when flow doesn't exist, which raises OpenMLServerError
return False
except OpenMLServerError as err:
# v2 returns HTTP 404 when the flow doesn't exist; only treat that case
# as "not found" and propagate all other server errors.
if getattr(err, "code", None) == 404:
return False
raise

Copilot uses AI. Check for mistakes.
Comment on lines +497 to +499
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before untagging.")
openml._backend.flow.untag(self.flow_id, tag)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as push_tag(): raising ValueError for an unpublished flow changes the public exception contract compared to OpenMLBase.remove_tag() (which raises ObjectNotPublishedError). Align the exception type/message (or delegate to the base implementation) for consistency across entities.

Copilot uses AI. Check for mistakes.
flow_exists_mock.return_value = False
).encode()
mock_request.return_value = response
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_exists_mock is typed/used as int | bool across the codebase; returning None here can mask type/logic issues (e.g., code paths that explicitly check is False). Prefer returning False to represent non-existence.

Suggested change
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
flow_exists_mock.return_value = False # Flow doesn't exist yet, so try to publish

Copilot uses AI. Check for mistakes.
Comment on lines 457 to 458
openml.config.start_using_configuration_for_example()
content_file = test_files_directory / "mock_responses" / "flows" / "flow_delete_not_owned.xml"
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_using_configuration_for_example() emits a warning and mutates global config state; this test suite already uses the with_server autouse fixture to switch to test servers. Consider removing this call (and the other occurrences below) to avoid unnecessary warnings/global state changes, or ensure you restore the prior config via stop_using_configuration_for_example() in a finally block.

Copilot uses AI. Check for mistakes.
def test_delete_flow_with_run(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_with_run(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
):
flow_v2.publish(path="flow", files={"description": "<flow/>"})


Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_v1.get(...) here triggers a real HTTP request (and the test output depends on live server state). Please mark this as @pytest.mark.test_server() or mock the V1 GET response so this comparison test is deterministic/offline.

Suggested change
@pytest.mark.test_server()

Copilot uses AI. Check for mistakes.
Comment on lines +485 to +488
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before tagging.")
openml._backend.flow.tag(self.flow_id, tag)

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenMLBase.push_tag() raises ObjectNotPublishedError when the entity has no server id. This override raises ValueError instead, which is an observable API behavior change for flows. Consider preserving the exception type/message (e.g., raise openml.exceptions.ObjectNotPublishedError here, or delegate to the base implementation) so tagging an unpublished flow behaves consistently with other OpenML entities.

Copilot uses AI. Check for mistakes.
def test_delete_subflow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_subflow(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
def test_delete_flow_success(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_success(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
@pytest.mark.xfail(reason="failures_issue_1544", strict=False)
def test_delete_unknown_flow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
def test_delete_unknown_flow(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
flow_id: int,
reinstantiate: bool = False, # noqa: FBT002
strict_version: bool = True, # noqa: FBT002
ignore_cache: bool = False, # noqa: FBT002
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not recommend adding this parameter: ignore_cache

Suggested change
ignore_cache: bool = False, # noqa: FBT002

ignore_cache : bool, default=False
Whether to ignore the cache. If ``true`` this will download and overwrite the flow xml
even if the requested flow is already cached.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as said above

Suggested change

"""
flow_id = int(flow_id)
flow = _get_flow_description(flow_id)
flow = openml._backend.flow.get(flow_id, reset_cache=ignore_cache)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as said above

Suggested change
flow = openml._backend.flow.get(flow_id, reset_cache=ignore_cache)
flow = openml._backend.flow.get(flow_id)


return _create_flow_from_xml(flow_xml)
return cast("OpenMLFlow", new_flow)
return cast("OpenMLFlow", flow)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for explicit cast now

Suggested change
return cast("OpenMLFlow", flow)
return new_flow
return flow

if not (isinstance(name, str) and len(name) > 0):
raise ValueError("Argument 'name' should be a non-empty string")
if not (isinstance(name, str) and len(external_version) > 0):
if not (isinstance(external_version, str) and len(external_version) > 0):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if not (isinstance(external_version, str) and len(external_version) > 0):
if not (isinstance(name, str) and len(external_version) > 0):

@mock.patch.object(requests.Session, "delete")
def test_delete_flow_not_owned(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_not_owned(mock_request, test_files_directory, test_apikey_v1):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. there is no need for openml.config.start_using_configuration_for_example()
  2. keep test_server_v1 in params
  3. you have removed the check for url

@mock.patch.object(requests.Session, "delete")
def test_delete_flow_with_run(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_with_run(mock_request, test_files_directory, test_apikey_v1):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@mock.patch.object(requests.Session, "delete")
def test_delete_subflow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_subflow(mock_request, test_files_directory, test_apikey_v1):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@mock.patch.object(requests.Session, "delete")
def test_delete_flow_success(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_success(mock_request, test_files_directory, test_apikey_v1):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@mock.patch.object(requests.Session, "delete")
@mock.patch.object(requests.Session, "request")
@pytest.mark.xfail(reason="failures_issue_1544", strict=False)
def test_delete_unknown_flow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

The server-assigned flow id.
"""
path = "flow"
return super().publish(path, files)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not needed

Suggested change
return super().publish(path, files)

Comment on lines -441 to +449
super().publish()
assert self.flow_id is not None # for mypy
flow_id = self.flow_id

file_elements = self._get_file_elements()
if "description" not in file_elements:
file_elements["description"] = self._to_xml()

# Use openml._backend.flow.publish which internally calls ResourceV1.publish
flow_id = openml._backend.flow.publish(path="flow", files=file_elements)
self.flow_id = flow_id
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, this is how it's expected to work

Tag to attach to the flow.
"""
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before tagging.")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ObjectNotPublishedError as in _tag_openml_base

Tag to remove from the flow.
"""
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before untagging.")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use ObjectNotPublishedError as in _tag_openml_base

Copilot AI review requested due to automatic review settings March 31, 2026 14:20
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

openml/flows/functions.py:67

  • get_flow’s docstring still describes the old per-flow XML cache (writing to cache_directory/flows/{flow_id}/flow.xml and raising OpenMLCacheException). The implementation now delegates to openml._backend.flow.get and uses HTTPClient caching, so the documented side effects/raised exceptions are no longer accurate; please update the docstring to match current behavior (and describe the new cache location/semantics if relevant).
    """Fetch an OpenMLFlow by its server-assigned ID.

    Queries the OpenML REST API for the flow metadata and returns an
    :class:`OpenMLFlow` instance. If the flow is already cached locally,
    the cached copy is returned. Optionally the flow can be re-instantiated
    into a concrete model instance using the registered extension.

    Parameters
    ----------
    flow_id : int
        The OpenML flow id.
    reinstantiate : bool, optional (default=False)
        If True, convert the flow description into a concrete model instance
        using the flow's extension (e.g., sklearn). If conversion fails and
        ``strict_version`` is True, an exception will be raised.
    strict_version : bool, optional (default=True)
        When ``reinstantiate`` is True, whether to enforce exact version
        requirements for the extension/model. If False, a new flow may
        be returned when versions differ.

    ignore_cache : bool, default=False
        Whether to ignore the cache. If ``true`` this will download and overwrite the flow xml
        even if the requested flow is already cached.

    Returns
    -------
    OpenMLFlow
        The flow object with metadata; ``model`` may be populated when
        ``reinstantiate=True``.

    Raises
    ------
    OpenMLCacheException
        When cached flow files are corrupted or cannot be read.
    OpenMLServerException
        When the REST API call fails.

    Side Effects
    ------------
    - Writes to ``openml.config.cache_directory/flows/{flow_id}/flow.xml``
      when the flow is downloaded from the server.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +43 to +66
def test_flow_v1_get(flow_v1):
flow = flow_v1.get(flow_id=1)
_assert_flow_shape(flow)


def test_flow_v1_list(flow_v1):
limit = 5
flows_df = flow_v1.list(limit=limit)

assert len(flows_df) == limit
assert "id" in flows_df.columns
assert "name" in flows_df.columns
assert "version" in flows_df.columns
assert "external_version" in flows_df.columns
assert "full_name" in flows_df.columns
assert "uploader" in flows_df.columns


def test_flow_v1_list_with_offset(flow_v1):
limit = 5
flows_df = flow_v1.list(limit=limit, offset=10)

assert len(flows_df) == limit

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests call the real test server (flow_v1.get/list) but are missing the @pytest.mark.test_server() marker that other API tests use. Without the marker they’ll run in offline/unit-only environments and fail due to network dependency; either mark them as test_server or mock Session.request like the other tests in this file.

Copilot uses AI. Check for mistakes.
response = requests.Response()
response.status_code = 200
response._content = (
"<oml:upload_flow>\n" " <oml:id>1</oml:id>\n" "</oml:upload_flow>"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mocked upload XML in this test uses the oml: prefix but does not declare the xmlns:oml namespace, which makes the XML invalid and will cause xmltodict.parse() (used by ResourceV1API.publish) to fail with an “unbound prefix” parse error. Use a valid response payload (e.g., include xmlns:oml="http://openml.org/openml" like other tests do, or reuse create_request_response fixtures).

Suggested change
"<oml:upload_flow>\n" " <oml:id>1</oml:id>\n" "</oml:upload_flow>"
'<oml:upload_flow xmlns:oml="http://openml.org/openml">\n'
" <oml:id>1</oml:id>\n"
"</oml:upload_flow>"

Copilot uses AI. Check for mistakes.
from openml._api_calls import _perform_api_call
from openml.testing import SimpleImputer, TestBase
import requests
from openml.testing import SimpleImputer, TestBase, create_request_response
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_request_response is imported here but not used anywhere in this test module after the refactor. Please drop the unused import to keep the test file clean and avoid lint failures.

Suggested change
from openml.testing import SimpleImputer, TestBase, create_request_response
from openml.testing import SimpleImputer, TestBase

Copilot uses AI. Check for mistakes.
Comment on lines 455 to +471
@@ -466,15 +467,15 @@ def test_delete_flow_not_owned(mock_delete, test_files_directory, test_server_v1
):
openml.flows.delete_flow(40_000)

flow_url = test_server_v1 + "flow/40000"
assert flow_url == mock_delete.call_args.args[0]
assert test_apikey_v1 == mock_delete.call_args.kwargs.get("params", {}).get("api_key")
assert mock_request.call_args.kwargs.get("method") == "DELETE"
assert test_apikey_v1 == mock_request.call_args.kwargs.get("params", {}).get("api_key")
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These delete_flow unit tests now only assert the HTTP method and api_key, but they no longer assert the request URL/path. Given other HTTPClient tests assert the full URL, this weakens coverage and could miss regressions in endpoint construction; also, start_using_configuration_for_example() emits warnings and is redundant here since the autouse fixture already configures test servers. Prefer asserting the url kwarg and using openml.config.use_test_servers()/existing fixtures instead of start_using_configuration_for_example().

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings March 31, 2026 17:45
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

openml/flows/functions.py:59

  • The get_flow docstring still describes the legacy per-flow XML cache (openml.config.cache_directory/flows/{flow_id}/flow.xml) and lists OpenMLCacheException, but get_flow now delegates to openml._backend.flow.get(...) which uses the HTTP client cache instead. Please update the docstring (side effects + raises section) to match the new caching/error behavior.
    Queries the OpenML REST API for the flow metadata and returns an
    :class:`OpenMLFlow` instance. If the flow is already cached locally,
    the cached copy is returned. Optionally the flow can be re-instantiated
    into a concrete model instance using the registered extension.

    Parameters
    ----------
    flow_id : int
        The OpenML flow id.
    reinstantiate : bool, optional (default=False)
        If True, convert the flow description into a concrete model instance
        using the flow's extension (e.g., sklearn). If conversion fails and
        ``strict_version`` is True, an exception will be raised.
    strict_version : bool, optional (default=True)
        When ``reinstantiate`` is True, whether to enforce exact version
        requirements for the extension/model. If False, a new flow may
        be returned when versions differ.

    Returns
    -------
    OpenMLFlow
        The flow object with metadata; ``model`` may be populated when
        ``reinstantiate=True``.

    Raises
    ------
    OpenMLCacheException
        When cached flow files are corrupted or cannot be read.
    OpenMLServerException
        When the REST API call fails.

    Side Effects
    ------------
    - Writes to ``openml.config.cache_directory/flows/{flow_id}/flow.xml``
      when the flow is downloaded from the server.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +226 to +229
@staticmethod
def _convert_v2_to_v1_format(v2_json: dict[str, Any]) -> dict[str, dict]:
"""Convert v2 JSON response to v1 XML-dict format for OpenMLFlow._from_dict().

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_convert_v2_to_v1_format is annotated as returning dict[str, dict], but the returned structure contains non-dict values (e.g., strings, lists). This type annotation is incorrect and can break static type checking; consider using dict[str, Any] or dict[str, dict[str, Any]] (or a dedicated TypedDict) instead.

Copilot uses AI. Check for mistakes.
TestBase._mark_entity_for_removal("flow", flow.flow_id, flow.name)
TestBase.logger.info(f"collected from {__file__.split('/')[-1]}: {flow.flow_id}")


Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blank line contains trailing whitespace. Please remove the whitespace to satisfy common linters (e.g., W293) and keep diffs clean.

Suggested change

Copilot uses AI. Check for mistakes.
model = sklearn.ensemble.RandomForestClassifier()
flow = self.extension.model_to_flow(model)
api_call_mock.return_value = (

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This blank line contains trailing whitespace. Please remove the whitespace to satisfy common linters (e.g., W293) and keep diffs clean.

Copilot uses AI. Check for mistakes.
get_flow_mock.return_value = flow

flow.publish()
# Not collecting flow_id for deletion since this is a test for failed upload
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says this is a test for a “failed upload”, but the mocked response is a successful <oml:upload_flow> with an id and flow.publish() is expected to succeed initially. Please update/remove the comment to reflect the actual test behavior.

Suggested change
# Not collecting flow_id for deletion since this is a test for failed upload
# The first publish is expected to succeed; cleanup is handled after the error scenario

Copilot uses AI. Check for mistakes.
@Omswastik-11 Omswastik-11 requested a review from geetu040 March 31, 2026 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ENH] V1 → V2 API Migration - flows

7 participants