🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Updated Sep 29, 2024 - Python
Performance Comparisons between different implementations of Dynamic Discrete Samplers
This repository replicates the NemoStation/Marlin-2B approach natively on Apple Silicon using the MLX framework, running temporal video captioning and sub-second visual grounding entirely within Apple's unified memory.