[Audio Codec] Lhotse data loading updates and fixes#15742

Open
rfejgin wants to merge 12 commits into
NVIDIA-NeMo:mainfrom
rfejgin:codec_32k

@rfejgin
Collaborator

@rfejgin rfejgin commented Jun 1, 2026

  1. Move random segment selection out of Lhotse and into our dataset class, AudioCodecLhotseDataset. Lhotse's built-in equivalent (truncate_duration) operates on the parent recording, which is not the behavior we want.

  2. Switch from batch_duration to batch_size for specifying the training batch size. In our setting the two are equivalent because every item in a batch has the same fixed length, and batch_size is clearer now that segment selection happens in the dataset class.
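
For readers unfamiliar with the idea in item 1, here is a minimal sketch of random fixed-length segment selection done at the dataset level. It is not the code in this PR: `select_random_segment` and its parameters are hypothetical names, it works on a plain sequence of samples rather than Lhotse objects, and zero-padding items shorter than the segment is an assumption.

```python
import random
from typing import List, Optional, Sequence


def select_random_segment(
    samples: Sequence[float],
    segment_len: int,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a window of exactly `segment_len` samples drawn from `samples`.

    Hypothetical helper for illustration only; not part of NeMo or Lhotse.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    rng = rng or random.Random()
    n = len(samples)
    if n <= segment_len:
        # Assumption: short items are kept whole and zero-padded on the right.
        return list(samples) + [0.0] * (segment_len - n)
    # Pick a start so the window lies entirely inside this item's own samples.
    start = rng.randint(0, n - segment_len)  # upper bound is inclusive
    return list(samples[start:start + segment_len])


if __name__ == "__main__":
    item = [float(i) for i in range(100)]
    segment = select_random_segment(item, segment_len=10, rng=random.Random(0))
    print(len(segment), segment[0], segment[-1])
```

Because every returned segment has the same length, a batch of batch_size items always contains batch_size * segment_len samples, which is the equivalence item 2 relies on.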

rfejgin added 7 commits May 1, 2026 17:45
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
…length

Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Need to set truncate duration in both the dataset and the loader; otherwise Lhotse's tracking of batch duration will be incorrect.

Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jun 1, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions Bot added the TTS label Jun 1, 2026
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
@rfejgin
Collaborator Author

rfejgin commented Jun 2, 2026

/ok to test c9ac8c9

rfejgin added 4 commits June 1, 2026 17:43
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
@rfejgin rfejgin changed the title from "Codec 32k" to "[Audio Codec] Fix segment selection in data loader" Jun 2, 2026
@rfejgin rfejgin changed the title from "[Audio Codec] Fix segment selection in data loader" to "[Audio Codec] Lhotse data loading updates and fixes" Jun 2, 2026
@rfejgin rfejgin marked this pull request as ready for review June 2, 2026 22:36
@rfejgin rfejgin requested review from blisc and rlangman June 2, 2026 22:37
@rfejgin rfejgin marked this pull request as draft June 2, 2026 22:52
@rfejgin rfejgin marked this pull request as ready for review June 2, 2026 23:01
1 participant