yourconscience

🎯 Focusing

Kirill Korikov yourconscience

ML Engineer

5 followers · 18 following

Pinned

llm_quest_benchmark (Public)

Benchmark for LLM agent architectures on interactive fiction quests. Can your model plan its way out of a space station?

Python · 6 stars · 1 fork