[NeurIPS 2025] Thinkless: LLM Learns When to Think
Updated Sep 26, 2025 - Python
Parameter-efficient fine-tuning of several Qwen3-family models on the Mult-It dataset, comparing multiple approaches.
Hybrid Reasoning Policy Optimization (HRPO): a research prototype for hybrid latent reasoning with RL.
Local AI workbench for embeddings, summarization, and OpenAI Agent SDK–compatible workflows. Supports Gemma models, GPT-OSS tool-calling, hardware acceleration, caching, and rate limiting, plus cloud-offloaded, persona-driven summarization through Gemini.