
[Model] Implement LoRA support for Qwen3ASRForConditionalGeneration#37247

Merged
vadiklyutiy merged 10 commits into vllm-project:main from petern48:lora_Qwen3ASRForConditionalGeneration
Apr 10, 2026

Conversation

@petern48
Contributor

@petern48 petern48 commented Mar 17, 2026

Purpose

This PR adds LoRA support for Qwen3ASRForConditionalGeneration model.

For this to work for the audio tower, I had to make a few additional changes:

  • Implement get_num_mm_encoder_tokens()
  • Replace some nn.Linear layers with ReplicatedLinear along the audio tower path.
  • Qwen3ASR seems to be our first model with a tower but no connector. In gpu_model_runner.py, I found that the hasattr(self.model, "get_num_mm_connector_tokens") check was improperly evaluating to True due to inheritance, despite the model not implementing get_num_mm_connector_tokens(). This incorrectly sent us down that code path and produced an error. I've modified the condition to check whether connector actually exists in the mapping.
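The inheritance pitfall described in the last bullet can be reproduced in a few lines. This is only an illustrative sketch: the class name, `mm_modules` attribute, and `"connector"` key below are stand-ins, not vLLM's actual internals.

```python
# hasattr() on a method inherited from a base interface is True even when the
# subclass never overrides it, so membership in an explicit module mapping is
# the safer way to detect whether a connector actually exists.

class SupportsMultiModal:
    def get_num_mm_connector_tokens(self) -> int:
        raise NotImplementedError


class Qwen3ASRLike(SupportsMultiModal):
    # Tower-only model: the connector method is inherited but never
    # overridden, and no "connector" entry is registered in its mapping.
    mm_modules = {"tower": object()}


model = Qwen3ASRLike()

# The old-style check wrongly passes because of inheritance:
assert hasattr(model, "get_num_mm_connector_tokens")

# A mapping-based check correctly reports that no connector exists:
has_connector = "connector" in model.mm_modules
assert not has_connector
```

The mapping-based check avoids the false positive without requiring every tower-only model to explicitly delete or shadow the inherited method.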

Fixes #37223

Test Plan

I tested with the following public adapter available on Hugging Face: ha0yuan/Qwen3-ASR-LoRa-ChineseAviation-Tiny.

vllm serve Qwen/Qwen3-ASR-1.7B \
  --enable-lora \
  --enable-tower-connector-lora \
  --lora-modules aviation=ha0yuan/Qwen3-ASR-LoRa-ChineseAviation-Tiny \
  --port 8000

I also double-checked that the adapters are properly shown when querying the /v1/models endpoint:

curl localhost:8000/v1/models | jq .
{
  "object": "list",
  "data": [
    {
      "id": "Qwen/Qwen3-ASR-1.7B",
      "object": "model",
      ...
      "root": "Qwen/Qwen3-ASR-1.7B",
      ...
    },
    {
      "id": "medpl",
       ...
      "root": "AleksanderObuchowski/Qwen3-ASR-1.7B-med-pl-lora-decoder-only",
      ...
    }
  ]
}

Then I used a Python script to load a .wav file and query the /v1/chat/completions endpoint. Specifically, I used this audio file as input.
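A minimal sketch of the kind of client script described above, assuming the OpenAI-compatible chat-completions API with the "input_audio" content type. The adapter name "aviation" matches the --lora-modules flag in the serve command; the WAV bytes and endpoint URL here are placeholders, and the actual script used in testing may differ.

```python
import base64
import json


def build_transcription_request(wav_bytes: bytes, model: str = "aviation") -> dict:
    """Build a /v1/chat/completions payload carrying a base64-encoded WAV clip.

    `model` can be the base model id or an adapter name registered via
    --lora-modules, which is how the LoRA path gets exercised.
    """
    audio_b64 = base64.b64encode(wav_bytes).decode("utf-8")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "input_audio",
                        "input_audio": {"data": audio_b64, "format": "wav"},
                    }
                ],
            }
        ],
    }


# Placeholder bytes; a real script would read them from the .wav file on disk.
payload = build_transcription_request(b"RIFF....WAVE", model="aviation")
body = json.dumps(payload)  # POST this to http://localhost:8000/v1/chat/completions
```

Sending the same payload twice, once with the base model id and once with the adapter name, is enough to compare the two transcriptions.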

Test Result

Before this PR, the server would error with
ValueError: Qwen3ASRForConditionalGeneration does not support LoRA yet.

After this change, the server starts up properly, and I successfully queried the /v1/chat/completions endpoint.

Querying both the raw model and the adapter, I verified that the output differs when the LoRA adapter is enabled. The outputs are below. (Notice that the raw model transcribes numbers as digits (e.g. 9 and 10), while the adapter transcribes them as words ("nine" and "ten").)

== no LoRA ==
language English<asr_text>November the 10th, Wednesday, 9 p.m. I'm standing in a dark alley. After waiting several hours, the time has come. A woman with long dark hair approaches. I have to act, and fast, before she realizes what has happened. I must find out.

== with LoRA ==
language English<asr_text>November the tenth, Wednesday, nine p.m. I'm standing in a dark alley. After waiting several hours, the time has come. A woman with long dark hair approaches. I have to act, and fast, before she realizes what has happened. I must find out.
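As a quick sanity check that the two transcripts above differ only where the adapter spells out numbers, a word-by-word comparison confirms it. This snippet is illustrative and not part of the PR; the strings are truncated to the first sentence of each transcript.

```python
base = ("language English<asr_text>November the 10th, Wednesday, 9 p.m. "
        "I'm standing in a dark alley.")
lora = ("language English<asr_text>November the tenth, Wednesday, nine p.m. "
        "I'm standing in a dark alley.")

# Pair up corresponding words and keep only the positions that changed.
diffs = [(b, l) for b, l in zip(base.split(), lora.split()) if b != l]
assert diffs == [("10th,", "tenth,"), ("9", "nine")]
```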

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Peter Nguyen <petern0408@gmail.com>
@mergify mergify Bot added the qwen Related to Qwen models label Mar 17, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds LoRA support for the Qwen3ASRForConditionalGeneration model. The changes introduce the SupportsLoRA interface and define the necessary attributes (packed_modules_mapping, embedding_modules, lora_skip_prefixes) to correctly apply LoRA adapters to the language model portion of the model, while skipping the audio tower. My review of these changes did not identify any issues.
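The attributes the review mentions can be sketched as follows. The attribute names (packed_modules_mapping, lora_skip_prefixes) come from the review comment above, but the specific module names and prefixes below are assumptions based on typical Qwen-style decoder layouts, not a copy of the merged code.

```python
# LoRA targets the fused projections as groups of stacked sub-layers, so the
# adapter's per-sub-layer weights can be packed into one fused module.
packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}

# Prefixes for submodules that LoRA should skip entirely; here, the audio
# tower, so adapters only apply to the language-model portion.
lora_skip_prefixes = ["audio_tower."]


def is_lora_target(module_name: str) -> bool:
    """Return False for modules under any skipped prefix."""
    return not any(module_name.startswith(p) for p in lora_skip_prefixes)


assert not is_lora_target("audio_tower.layers.0.linear")
assert is_lora_target("language_model.layers.0.qkv_proj")
```

Skipping the tower by prefix is what lets a decoder-only adapter load cleanly even though the model also contains audio-encoder weights.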

Signed-off-by: Peter Nguyen <petern0408@gmail.com>
@mergify
Contributor

mergify Bot commented Mar 17, 2026

Documentation preview: https://vllm--37247.org.readthedocs.build/en/37247/

@mergify mergify Bot added the documentation Improvements or additions to documentation label Mar 17, 2026
@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, a small and essential subset of CI tests that quickly catches errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@petern48 petern48 marked this pull request as ready for review March 17, 2026 03:13
@petern48 petern48 requested a review from sighingnow as a code owner March 17, 2026 03:13
@Isotr0py Isotr0py requested a review from jeejeelee March 17, 2026 06:00
Comment thread vllm/model_executor/models/qwen3_asr.py Outdated
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Previously, the hasattr(self.model, "get_num_mm_connector_tokens") condition would evaluate to True for this case due to inheritance, despite the method not being overridden.

Signed-off-by: Peter Nguyen <petern0408@gmail.com>
@petern48 petern48 requested a review from njhill as a code owner March 17, 2026 19:11
@mergify mergify Bot added the v1 label Mar 17, 2026
@petern48 petern48 requested a review from jeejeelee March 18, 2026 23:11
@jeejeelee
Collaborator

Have you tested this PR with the real LoRA adapter?

@petern48
Contributor Author

@jeejeelee Yes, I have. I listed the exact public adapter I used in the PR description. I also just linked the data I used and the text output it generated, verifying that querying the adapter successfully leads to output that is different from the raw base model.

@mergify
Contributor

mergify Bot commented Mar 24, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @petern48.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Mar 24, 2026
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
@jeejeelee
Collaborator

Please fix the pre-commit failure, thank you

@petern48
Contributor Author

Error: PR must have the 'ready' label or the author must have at least 4 merged PRs (found 0).

@jeejeelee I think you just need to add the ready label. #37544 introduced a change that prevents pre-commit from running by default. I have the pre-commit setup locally, so it should already be formatted correctly. It was also already passing before, and all I've done since then is merge with main. Could you add it, please?

@DarkLight1337 DarkLight1337 added verified Run pre-commit for new contributors without triggering other tests ready ONLY add when PR is ready to merge/full CI is needed and removed verified Run pre-commit for new contributors without triggering other tests labels Apr 9, 2026
@petern48 petern48 requested a review from vadiklyutiy as a code owner April 9, 2026 18:03
@vadiklyutiy vadiklyutiy merged commit 8d0f908 into vllm-project:main Apr 10, 2026
64 checks passed
@petern48 petern48 deleted the lora_Qwen3ASRForConditionalGeneration branch April 10, 2026 15:21
wojciech-wais pushed a commit to wojciech-wais/vllm that referenced this pull request Apr 13, 2026
whk-lab pushed a commit to whk-lab/vllm that referenced this pull request Apr 23, 2026
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Apr 27, 2026
…llm-project#37247)

Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Avinash Singh <avinashsingh.rcoem@gmail.com>

Labels

  • documentation Improvements or additions to documentation
  • qwen Related to Qwen models
  • ready ONLY add when PR is ready to merge/full CI is needed
  • v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Add LoRA support for Qwen3ASRForConditionalGeneration

4 participants