[REQUEST] Enabling support for vision draft models. #384

@MoetasimR

Problem

I am running a quantized version of Pixtral Large and want to use a smaller variant as the draft model, but I can't load the smaller variant's vision modules. As a result, I can only run inference on text, not on images.

Solution

Add a config option to enable this feature. I strongly suspect this is low-hanging fruit.

Alternatives

No response

Explanation

I imagine this will be a much-needed feature, since multimodal inference is generally slower than text-only inference.

Examples

No response

Additional context

No response

Acknowledgements

  • I have looked for similar requests before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will make my requests politely.
