Pull requests: OpenHands/benchmarks

Pull requests list

Strengthen restriction against accessing installed package versions
#691 opened Apr 23, 2026 by juanmichelini Collaborator
fix(swebench): release disk during full image assembly
#690 opened Apr 23, 2026 by simonrosenberg Collaborator
eval: honor GPT-5 prompt when available
#686 opened Apr 21, 2026 by enyst Collaborator Draft
fix(llm_config): disable reasoning_effort for Opus 4.7
#670 opened Apr 17, 2026 by juanmichelini Collaborator
Filter SWE-Bench Multimodal image builds to curated subset
#644 opened Apr 6, 2026 by juanmichelini Collaborator
Add agent-serving profiling benchmark
#633 opened Apr 4, 2026 by neubig Contributor Draft
fix: reset BuildKit cache between retries for base/assembly builds
#631 opened Apr 4, 2026 by simonrosenberg Collaborator (3 tasks)
Update Claude ACP package references
#629 opened Apr 3, 2026 by simonrosenberg Collaborator
build(deps): bump the version-all group across 1 directory with 21 updates (labels: dependencies, python:uv)
#596 opened Mar 31, 2026 by dependabot Bot
Add benchmark-side Apptainer workspace support
#509 opened Mar 12, 2026 by neubig Contributor Draft
build(deps): bump the version-all group across 1 directory with 5 updates (labels: dependencies, github_actions)
#492 opened Mar 9, 2026 by dependabot Bot
NeMo Evaluator Integration
#455 opened Feb 26, 2026 by simonrosenberg Collaborator
Add security benchmark with ASTRA
#361 opened Jan 26, 2026 by XZ-X Draft
Agentic code search
#141 opened Dec 8, 2025 by adityasoni9998 Contributor