Voice2Code is a macOS-focused voice-to-instruction refiner for developer workflows.
It is built for a simple workflow:
- dictate or paste rough text into any macOS text field
- select the text
- trigger a Quick Action
- replace the selection with a cleaner instruction
Voice input is fast, but spoken engineering text is usually noisy:
- filler words
- misrecognized technical terms
- weak structure
- unclear action / condition / scope boundaries
Voice2Code keeps the interaction local and lightweight, then uses an LLM to turn rough text into something you can actually reuse in engineering workflows.
- Cross-app Quick Action
  Works through AI提纯指令.workflow, so the same trigger can be used across macOS text fields.
- Minimal app shell
  Voice2Code.app is only a small control shell for setup, provider selection, network config, and runtime entry.
- Structured refinement core
  The Python refiner core keeps the current architecture focused on:
  - two-stage intent + generation
  - bilingual contracts
  - provider-neutral execution
flowchart LR
A["Selected text"] --> B["Quick Action"]
B --> C["Preprocess + RequestContext"]
C --> D["Intent Router"]
D --> E["Resolved Contract"]
E --> F["Prompt Assembler"]
F --> G["LLM Generation"]
G --> H["Parser / Validator"]
H --> I["Replace selected text"]
This is the core value of the project:
- keep the entry point simple
- keep routing minimal
- keep generation contract-driven
- return a directly usable result back into the current text field
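The pipeline in the flowchart above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: every function name, the filler-word list, and the keyword-based routing stub are assumptions made for the example, and the LLM call is passed in as a plain callable so the sketch runs offline.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict

# Illustrative filler-word pattern; the real preprocessing rules are not shown in this README.
FILLERS = re.compile(r"\b(?:um+|uh+|er+|you know)\b[,]?\s*", re.IGNORECASE)


@dataclass
class RequestContext:
    raw_text: str
    cleaned_text: str


def preprocess(selected_text: str) -> RequestContext:
    """Strip filler words and collapse whitespace."""
    cleaned = FILLERS.sub("", selected_text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return RequestContext(raw_text=selected_text, cleaned_text=cleaned)


def route_intent(ctx: RequestContext) -> Dict[str, str]:
    """Stage 1: decide only the minimum routing fields (keyword stub, not an LLM call)."""
    scene = "bugfix" if "bug" in ctx.cleaned_text.lower() else "general"
    return {"main_scene": scene, "structure_mode": "plain"}


def assemble_prompt(ctx: RequestContext, intent: Dict[str, str]) -> str:
    """Stage 2 input: a small prompt built from the routing fields and the cleaned text."""
    header = f"[scene={intent['main_scene']}] [structure={intent['structure_mode']}]"
    return f"{header}\nRewrite as a clear instruction:\n{ctx.cleaned_text}"


def validate(output: str) -> str:
    """Deterministic check only: trim and reject empty output, no semantic rewriting."""
    result = output.strip()
    if not result:
        raise ValueError("empty LLM output")
    return result


def refine(selected_text: str, generate: Callable[[str], str]) -> str:
    """Run the whole pipeline; the result would replace the selected text."""
    ctx = preprocess(selected_text)
    intent = route_intent(ctx)
    prompt = assemble_prompt(ctx, intent)
    return validate(generate(prompt))
```

Passing `generate` as a parameter keeps the sketch independent of any provider, so a fake generator can stand in for the LLM during testing.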
flowchart LR
A["RequestContext"] --> B["IntentAnalysisResult"]
B --> C1["main_scene"]
B --> C2["structure_mode"]
D["forced_rewrite_strategy"] --> C3["rewrite_id"]
subgraph R["Resolved Contract"]
C1 --> R1["scene_policies"]
C2 --> R2["structure_policies"]
C3 --> R3["rewrite_policies"]
end
G["global_contract"] --> H["Prompt Assembler"]
R1 --> H
R2 --> H
R3 --> H
I["runtime_context"] --> H
J["glossary_result"] --> H
K["user_input"] --> H
H --> L["PromptBundle"]
L --> M["LLM output"]
This is where Voice2Code is better than a simple single-prompt rewrite tool:
- stage 1 only decides the minimum routing fields
- stage 2 does minimal dynamic assembly, not full template stuffing
- local code does deterministic parsing and validation, instead of semantic over-correction
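As a rough illustration of "minimal dynamic assembly", the sketch below resolves a contract from the routing fields and builds a `PromptBundle`. The field names follow the flowchart above, but the policy tables, their wording, and the `assemble` function are hypothetical placeholders, since the real contract content is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical policy tables, keyed by the routing fields from stage 1.
GLOBAL_CONTRACT = "Output only the refined instruction."
SCENE_POLICIES = {
    "bugfix": "Describe the defect, expected behavior, and scope.",
    "general": "State the requested action clearly.",
}
STRUCTURE_POLICIES = {
    "plain": "Return one concise paragraph.",
    "steps": "Return a numbered list of steps.",
}
REWRITE_POLICIES = {"strict": "Preserve every technical term verbatim."}


@dataclass(frozen=True)
class IntentAnalysisResult:
    main_scene: str
    structure_mode: str


@dataclass(frozen=True)
class PromptBundle:
    system: str
    user: str


def assemble(
    intent: IntentAnalysisResult,
    user_input: str,
    forced_rewrite_strategy: Optional[str] = None,
    runtime_context: str = "",
    glossary_result: Sequence[str] = (),
) -> PromptBundle:
    """Include only the policies selected by routing, not every template."""
    parts = [
        GLOBAL_CONTRACT,
        SCENE_POLICIES[intent.main_scene],
        STRUCTURE_POLICIES[intent.structure_mode],
    ]
    if forced_rewrite_strategy is not None:
        # The forced strategy acts as the rewrite_id that selects a rewrite policy.
        parts.append(REWRITE_POLICIES[forced_rewrite_strategy])
    if runtime_context:
        parts.append(f"Context: {runtime_context}")
    if glossary_result:
        parts.append("Glossary: " + ", ".join(glossary_result))
    return PromptBundle(system="\n".join(parts), user=user_input)
```

Because each optional input is appended only when present, the prompt grows with what the request needs rather than carrying every policy on every call.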
Current delivery shape:
Quick Action + Voice2Code.app
Current release goal:
- stable local delivery
- not a fully packaged notarized macOS app
Current provider state:
- Gemini is the primary release baseline
- OpenAI is integrated and minimally validated
- Doubao is integrated in code but still needs real-key validation
Build the current installer locally:
python3 scripts/build_dist.py
Main references:
The current installer flow is intentionally simplified into two stages:
- install confirmation
- initialization window
  - provider selection
  - direct / proxy choice
  - API key input
  - connectivity test
  - automatic refinement smoke test
  - in-window completion state
The successful path no longer opens a third standalone completion dialog.
Top-level folders:
- config/: runtime configuration
- docs/: architecture, PRD, implementation and closeout docs
- scripts/: build, installer, app shell, and refiner code
- tests/: regression, smoke, and evaluation tooling
This repository is in a closeout / stabilization phase.
In scope:
- stable install flow
- Quick Action registration
- initialization flow
- provider selection / network config / connectivity test
- regression, token smoke, and quality evaluation assets
Not a current release gate:
- full macOS app notarization
- complete SecItem* + codesign + entitlement delivery
- stronger system-level secret persistence guarantees
- plugin productization
Common local commands:
python3 scripts/build_dist.py
python3 tests/run_voice2code_regression.py
python3 tests/run_voice2code_token_smoke.py
python3 tests/run_voice2code_quality_eval.py
The build produces versioned installer artifacts under dist/.
Voice2Code does not ship with embedded provider keys.
Current behavior:
- environment variables can explicitly provide provider API keys
- the app shell may persist configuration when the current environment supports it
- plaintext API keys are not written into repo config files
Important boundary:
- this repository does not currently claim that system-level seamless secure storage is fully solved as a release guarantee
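The environment-variable path described above could be sketched as follows. The variable names in `ENV_VAR_BY_PROVIDER` are assumptions for illustration; this README does not state which names Voice2Code actually reads. The `mask` helper is likewise hypothetical and only shows how a key might be displayed without exposing it.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names; check the project's docs for the real ones.
ENV_VAR_BY_PROVIDER = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "doubao": "DOUBAO_API_KEY",
}


def resolve_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the key for a provider from the environment, or None if it is unset."""
    env = os.environ if env is None else env
    var_name = ENV_VAR_BY_PROVIDER.get(provider.lower())
    if var_name is None:
        raise ValueError(f"unknown provider: {provider!r}")
    value = env.get(var_name, "").strip()
    return value or None


def mask(key: str) -> str:
    """Hide all but the last four characters so keys never appear in logs in full."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Accepting an explicit `env` mapping keeps the lookup testable without touching the real process environment, and nothing in this sketch writes the key to disk.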
This project is licensed under the Apache License 2.0. See LICENSE.