Commit e24d7c5
Beichen Huang
init
0 parents, commit e24d7c5
3,333 files changed
Lines changed: 816229 additions & 0 deletions
File tree
- .buildkite
- lm-eval-harness
- configs
- performance-benchmarks
- scripts
- tests
- scripts
- hardware_ci
- scheduled_integration_test
- tpu
- .gemini
- .github
- ISSUE_TEMPLATE
- scripts
- workflows
- matchers
- scripts
- benchmarks
- auto_tune
- cutlass_benchmarks
- disagg_benchmarks
- fused_kernels
- kernels
- deepgemm
- multi_turn
- overheads
- cmake
- external_projects
- csrc
- attention
- mla
- cutlass_sm100_mla
- device
- kernel
- core
- cpu
- sgl-kernels
- cutlass_extensions
- epilogue
- mamba/mamba_ssm
- moe
- marlin_moe_wna16
- permute_unpermute_kernels
- quantization
- awq
- cutlass_w4a8
- fp4
- fused_kernels
- gguf
- gptq_allspark
- gptq_marlin
- gptq
- hadamard/hadacore
- machete
- marlin/sparse
- common
- w8a8
- cutlass
- c3x
- moe
- fp8
- amd
- nvidia
- int8
- quickreduce
- rocm
- sparse/cutlass
- docker
- docs
- api
- vllm
- assets
- contributing
- deployment
- design
- arch_overview
- cuda_graphs
- debug_vllm_compile
- fused_moe_modular_kernel
- hybrid_kv_cache_manager
- metrics
- paged_attention
- prefix_caching
- tpu
- features
- disagg_encoder
- disagg_prefill
- logos
- cli
- bench
- sweep
- community
- configuration
- contributing
- ci
- dockerfile
- model
- deployment
- frameworks
- integrations
- design
- examples
- features
- quantization
- getting_started
- installation
- mkdocs
- hooks
- javascript
- overrides
- partials
- stylesheets
- models
- extensions
- hardware_supported_models
- serving
- integrations
- training
- usage
- examples
- offline_inference
- basic
- disaggregated-prefill-v1
- kv_load_failure_recovery
- logits_processor
- openai_batch
- pooling
- profiling_tpu
- qwen2_5_omni
- online_serving
- chart-helm
- templates
- tests
- dashboards
- grafana
- perses
- disaggregated_encoder
- disaggregated_serving_p2p_nccl_xpyd
- disaggregated_serving
- elastic_ep
- openai_embedding_long_text
- opentelemetry
- pooling
- prometheus_grafana
- structured_outputs
- others
- lmcache
- disagg_prefill_lmcache_v1
- configs
- requirements
- tests
- basic_correctness
- benchmarks
- compile
- piecewise
- config
- cuda
- detokenizer
- distributed
- engine
- entrypoints
- llm
- offline_mode
- openai
- correctness
- tool_parsers
- pooling
- correctness
- llm
- openai
- sagemaker
- evals
- gpt_oss
- gsm8k
- configs
- kernels
- attention
- core
- mamba
- moe
- modular_kernel_tools
- quantization
- kv_transfer
- lora
- model_executor
- model_loader
- fastsafetensors_loader
- runai_model_streamer
- tensorizer_loader
- models
- fixtures
- language
- generation_ppl_test
- generation
- pooling_mteb_test
- pooling
- multimodal
- generation
- vlm_utils
- pooling
- processing
- quantization
- multimodal
- assets
- plugins_tests
- plugins
- lora_resolvers
- prithvi_io_processor_plugin
- prithvi_io_processor
- vllm_add_dummy_model
- vllm_add_dummy_model
- vllm_add_dummy_platform
- vllm_add_dummy_platform
- vllm_add_dummy_stat_logger
- dummy_stat_logger
- prompts
- quantization
- reasoning
- samplers
- standalone_tests
- system_messages
- tokenization
- tool_use
- mistral
- tools
- tpu
- lora
- transformers_utils
- utils_
- v1
- attention
- core
- cudagraph
- distributed
- e2e
- ec_connector
- integration
- unit
- engine
- entrypoints
- llm
- openai
- serving_responses
- executor
- generation
- kv_connector
- nixl_integration
- unit
- kv_offload
- logits_processors
- metrics
- sample
- shutdown
- spec_decode
- structured_output
- tpu
- worker
- tracing
- worker
- vllm_test_utils
- vllm_test_utils
- weight_loading
- tools
- ep_kernels
- elastic_ep
- pre_commit
- profiler
- nsys_profile_tools
- images
- vllm-tpu
- vllm
- assets
- attention
- backends
- layers
- ops
- utils
- benchmarks
- lib
- sweep
- compilation
- config
- device_allocator
- distributed
- device_communicators
- ec_transfer
- ec_connector
- eplb
- kv_transfer
- kv_connector
- v1
- lmcache_integration
- p2p
- kv_lookup_buffer
- kv_pipe
- engine
- entrypoints
- anthropic
- cli
- benchmark
- openai
- tool_parsers
- sagemaker
- inputs
- logging_utils
- lora
- layers
- ops
- ipex_ops
- torch_ops
- triton_ops
- xla_ops
- punica_wrapper
- model_executor
- layers
- fla
- ops
- fused_moe
- configs
- mamba
- ops
- quantization
- compressed_tensors
- schemes
- transform
- schemes
- kernels
- mixed_precision
- scaled_mm
- quark
- schemes
- utils
- configs
- rotary_embedding
- model_loader
- models