mariusargatu

Marius Argatu

Senior SDET · AI & LLM Quality Engineering

About me

Senior SDET, 10 years building test infrastructure where reliability is non-negotiable: airline payments, multi-tenant healthcare SaaS, fintech, financial market data.

For the past few years my focus has been testing AI-powered features: LLM evaluation harnesses, RAG retrieval testing, golden-dataset regression, and agentic system testing. I build the frameworks and runners, and wire evaluation into CI/CD.

→ Full background, experience and competencies on mariusargatu.com/about

What I work on

AI / LLM testing: does the model answer faithfully, and can you prove it? Faithfulness, answer relevancy, and hallucination scoring · RAG retrieval metrics (hit-rate, MRR, precision/recall@k) · agentic multi-turn and tool-call testing
Test architecture: the frameworks and runners under the tests. Model-based testing (XState) · property-based testing (Schemathesis, Hypothesis, fast-check) · Pydantic schemas
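
For readers new to the retrieval metrics named above, here is a minimal sketch of hit-rate, reciprocal rank (averaged into MRR), and precision/recall@k in plain Python. The function names and the document IDs are hypothetical, chosen only for illustration; this is not the API of DeepEval, RAGAS, or any other library, and it assumes binary relevance labels per query.

```python
from typing import Iterable, Sequence, Set, Tuple


def hit_rate_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant document is in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in retrieved[:k]) else 0.0


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """MRR: the reciprocal rank averaged over all (retrieved, relevant) queries."""
    scores = [reciprocal_rank(retrieved, relevant) for retrieved, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k slots filled by relevant documents."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


# Hypothetical query: the retriever returned four IDs, two documents are relevant.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

hit = hit_rate_at_k(retrieved, relevant, k=3)
rr = reciprocal_rank(retrieved, relevant)
p3 = precision_at_k(retrieved, relevant, k=3)
r3 = recall_at_k(retrieved, relevant, k=3)
```

In this example the first relevant document sits at rank 2, so the reciprocal rank is 0.5; one of the top three results is relevant, so precision@3 is 1/3 and recall@3 is 1/2. In a golden-dataset regression, scores like these would typically be averaged over every query and compared against a threshold in CI.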

From the blog

- [The Golden Dataset: Building the Oracle You Test Against](https://www.mariusargatu.com/blog/the-golden-dataset/)

→ Read everything on mariusargatu.com/blog

Stack

AI / LLM: DeepEval, RAGAS, Langfuse, Pydantic
Languages: Python, TypeScript
Test: Playwright, Pytest, Vitest
API and contract: GraphQL, OpenAPI
CI/CD and infra: GitHub Actions, Docker

“A test suite is a liability as much as an asset. Every test earns its place.”
