Conversation
pwilkin
left a comment
There doesn't seem to be any support for structured outputs.
@pwilkin What does
@pwilkin Actually, I have another idea. I could further improve the chat template to recognize formatted tool names from MCP servers (e.g., However, this would require, as an example, mapping How can I implement this kind of custom transformation using the new PEG parser?
```jinja
{#- ========== Workaround for llama.cpp crashing ========== #}
{%- for message in messages %}
{%- if message.role == "assistant" %}
{%- if message.tool_calls | length == 0 %}
{%- set fake_function = namespace(name='fake_name', arguments='{}') %}
{%- set fake_function = namespace(function=fake_function) %}
{%- set message.tool_calls = [fake_function, fake_function] %}
{%- endif %}
{%- endif %}
{%- endfor %}
{#- ========== Workaround for llama.cpp crashing ========== #}
```
Was this fixed? If not, someone should look at it.
This part adds fake functions to assistant messages, which prevents llama-server from crashing.
The crash seems to occur here: llama.cpp/common/jinja/caps.cpp, lines 228 to 252 (commit 740a447).
It assumes that the message list passed into the chat template is immutable. However, in llama.cpp’s Jinja engine, a reference is passed to the template rather than a copy, which makes it effectively mutable.
You can create a custom mapper and add another chat format. See chat-peg-parser.cpp; you can likely inherit the one there. Definitely an interesting model... Is it not possible to hardcode the server name to something like "localhost" for better compatibility?
Yes, that is possible. However, this model is fine-tuned to use specific server names such as Mapping everything to a generic server would in practice require additional reasoning tokens during inference, and the results would not be as good as when using the original format.
You don't even need to create a custom mapper, since for the analysis I made a tagged mapper that can be used out-of-the-box for this :) See the parser usages in
Basically this:

````cpp
if (has_response_format) {
    auto response_format = p.rule("response-format",
        p.content(p.schema(p.json(), "response-format-schema", inputs.json_schema)));
    return ctx.reasoning_parser + p.space() + p.choice({
        p.literal("```json") + p.space() + response_format + p.space() + p.literal("```"),
        response_format
    }) + p.end();
}
````
Oh... I know what you mean. I'll implement it later. Let me convert this PR to a draft before I fully implement it.
@pwilkin Mind taking another look? |
@pwilkin Is the current implementation good to merge now?
Yeah, almost good - please add proper tests to
@pwilkin Done |
@aldehir care to take a look? |
Aight, going to run CI and merge if green.
The MiroThinker series v1.0–v1.7 (and likely every version before v2.0) uses an MCP-style tool call:
It requires the MCP server name to be included in the system prompt, which makes it impossible for the autoparser to work with it.