Yaxin9Luo/README.md


Homepage · Google Scholar · MetaAgentX · Email

MBZUAI ML PhD · LongCat Research Intern


Mission Control

Multimodal Agents

Models and systems that understand, generate, reason, plan, and act across modalities.

Interactive Worlds

Benchmarks and defenses for MLLM agents operating through web and device interfaces.

Editable Design

Long-horizon agentic workflows for visual design, plotting, and multimodal content creation.

Research Code Impact


Selected projects I lead or contribute to have received 450 GitHub stars and 24 forks across 4 personal and organization repositories.

Tracked repositories
| Repository | Stars | Forks | Focus |
| --- | --- | --- | --- |
| VILA-Lab/FigMirror | 315 | 19 | Automated plotting from paper figure styles. |
| MetaAgentX/OpenCaptchaWorld | 72 | 3 | Web-based benchmark and platform for evaluating multimodal LLM agents. |
| Yaxin9Luo/Gamma-MOD | 43 | 2 | Mixture-of-Depth adaptation for efficient multimodal large language models. |
| MetaAgentX/NextGen-CAPTCHAs | 20 | 0 | Scalable GUI-agent defense framework based on cognitive gaps. |

Last updated: 2026-05-27. Managed from data/research-repos.json.

Stack

Python · PyTorch · Transformers · Diffusers · DeepSpeed · Docker · Linux · Weights & Biases

Activity

Activity graph


Homepage · Publications · CV · LinkedIn · X · Yaxin.Luo@mbzuai.ac.ae

Pinned

  1. VILA-Lab/FigMirror (Public)

    An Automated AI Agent Tool for Plotting Your Data in Any Paper's Figure Style.

    Python · 330 stars · 19 forks

  2. MetaAgentX/OpenCaptchaWorld (Public)

    [NeurIPS 2025] The first web-based benchmark and platform to evaluate the visual reasoning and interaction capabilities of MLLM-powered agents through diverse and dynamic CAPTCHA puzzles.

    JavaScript · 72 stars · 3 forks

  3. Gamma-MOD (Public)

    [ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models

    Python · 43 stars · 2 forks

  4. MetaAgentX/NextGen-CAPTCHAs (Public)

    [ICML 2026] A defense framework against MLLM-based web GUI agents. This repository provides both the generative CAPTCHA system and tools for evaluating agent resistance.

    Python · 20 stars