[FEATURE] support model.from_pretrained without the need of init distributed #20

@Jiaxin-Wen

Description

from model_center.layer import CPM1
CPM1.from_pretrained("cpm1-large")

This currently does not work unless the distributed environment has been initialized first, because the function check_web_and_convert_path calls bmt.rank() or bmt.print_rank() to prevent every process from downloading the checkpoint in a multi-GPU scenario.

While ModelCenter is mainly designed to support distributed training, I think it is still important to support such a common code snippet.
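
One possible direction, as a rough sketch only: wrap the rank lookup so that it falls back to rank 0 when BMTrain has not been initialized. The helper names `safe_rank` and `should_download` are hypothetical and not part of ModelCenter or BMTrain, and the sketch assumes that bmt.rank() raises an exception when init_distributed has not been called, which I have not verified across versions.

```python
from typing import Callable, Optional


def safe_rank(rank_fn: Optional[Callable[[], int]] = None) -> int:
    """Return the process rank, or 0 if no distributed context is available.

    rank_fn is a hypothetical injection point; by default the sketch tries
    bmt.rank and treats any failure as "single process".
    """
    if rank_fn is None:
        try:
            import bmtrain as bmt  # may be missing or fail to load
            rank_fn = bmt.rank
        except Exception:
            return 0
    try:
        return rank_fn()
    except Exception:
        # Assumption: an uninitialized BMTrain raises here.
        return 0


def should_download(rank_fn: Optional[Callable[[], int]] = None) -> bool:
    """Only the rank-0 process (or a lone process) downloads the checkpoint."""
    return safe_rank(rank_fn) == 0
```

With something like this, check_web_and_convert_path could ask should_download() instead of calling bmt.rank() directly, so a single-process script would behave like rank 0 while multi-GPU runs keep the current behavior.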
