Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions ajet/backbone/main_verl.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
from torch.utils.data import Dataset as TorchDataset

# Create training and validation datasets.
from ajet.backbone.warm_up import warm_up_process
from ajet.task_reader import RouterTaskReader, task_to_standard_dataset
Comment on lines 29 to 31
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The import of warm_up_process is currently placed between the comment # Create training and validation datasets. and the actual dataset-related imports, which is confusing. Furthermore, for Ray remote classes like TaskRunner, it is a common practice to keep imports local to the methods where they are used to keep the actor definition lightweight and avoid unnecessary side effects during module import on the head node. This import should be moved inside the run method.

Suggested change
# Create training and validation datasets.
from ajet.backbone.warm_up import warm_up_process
from ajet.task_reader import RouterTaskReader, task_to_standard_dataset
# Create training and validation datasets.
from ajet.task_reader import RouterTaskReader, task_to_standard_dataset

from ajet.utils.process_dataset import create_rl_sampler
from ajet.utils.core_env_vars import get_runtime_env
Expand Down Expand Up @@ -116,6 +117,7 @@ def run(self, config):
from loguru import logger
from omegaconf import OmegaConf
from verl.utils.fs import copy_to_local
warm_up_process(config)
Comment on lines 117 to +120
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The warm_up_process function uses configuration values such as experiment_dir and experiment_name to initialize logging. These values often contain Hydra interpolations (e.g., ${hydra:runtime.output_dir}) that must be resolved before they can be used as valid file system paths. Calling warm_up_process before OmegaConf.resolve(config) may lead to incorrect log paths or initialization failures. It is recommended to resolve the configuration first. Additionally, importing warm_up_process locally here maintains consistency with other imports in this method. Note that the existing call to OmegaConf.resolve(config) at line 124 will become redundant and should be removed.

Suggested change
from loguru import logger
from omegaconf import OmegaConf
from verl.utils.fs import copy_to_local
warm_up_process(config)
from loguru import logger
from omegaconf import OmegaConf
from verl.utils.fs import copy_to_local
from ajet.backbone.warm_up import warm_up_process
OmegaConf.resolve(config)
warm_up_process(config)


logger.info(f"TaskRunner hostname: {socket.gethostname()}, PID: {os.getpid()}")
pprint(OmegaConf.to_container(config, resolve=True))
Expand Down
Loading