Qwen3 VL Embedding 8B benchmarks

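  • TP/DP/PP - tensor/data/pipeline parallelism
  • TG t/s - text generation throughput, in tokens per second
  • PP t/s - prompt processing throughput, in tokens per second (here PP means prompt processing, not pipeline parallelism)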
| Date | Image | Model | TP/DP/PP | Prompts | Threads | Ctx toks | Gen toks | Duration | TG t/s | PP t/s | Workload | About |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20260326-194703 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 32 | 1 | 2048 | 0 | 00:01:39 | - | 661.73 | embed | recipe |
| 20260326-194822 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 64 | 4 | 2048 | 0 | 00:01:03 | - | 2,072.39 | embed | recipe |
| 20260326-195021 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 128 | 8 | 2048 | 0 | 00:01:44 | - | 2,510.16 | embed | recipe |
| 20260326-195344 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 256 | 16 | 2048 | 0 | 00:03:07 | - | 2,792.35 | embed | recipe |
| 20260326-195917 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 384 | 32 | 2048 | 0 | 00:05:13 | - | 2,509.97 | embed | recipe |
| 20260326-195956 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 64 | 1 | 256 | 0 | 00:00:24 | - | 681.49 | embed | recipe |
| 20260326-200023 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 128 | 4 | 256 | 0 | 00:00:14 | - | 2,298.45 | embed | recipe |
| 20260326-200058 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 256 | 8 | 256 | 0 | 00:00:21 | - | 3,032.53 | embed | recipe |
| 20260326-200136 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 384 | 16 | 256 | 0 | 00:00:25 | - | 3,832.70 | embed | recipe |
| 20260326-200219 | docker.io/mixa3607/vllm-gfx906:43566ec-rocm-7.2.0-aiinfos-20260324214800 | Qwen/Qwen3-VL-Embedding-8B | 1/4/1 | 512 | 32 | 256 | 0 | 00:00:30 | - | 4,241.59 | embed | recipe |
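
As a rough cross-check of the PP t/s column, the sketch below estimates throughput as prompts × context tokens ÷ duration. This formula is an assumption, not something the benchmark page states, and the helper names `duration_seconds` and `estimated_pp_tps` are hypothetical. The sample rows are copied from the table above.

```python
# Sanity check: estimate prompt-processing throughput from the table columns.
# Assumption (not stated in the source): PP t/s ~= prompts * ctx_toks / duration.


def duration_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimated_pp_tps(prompts: int, ctx_toks: int, duration: str) -> float:
    """Total prompt tokens divided by the wall-clock duration in seconds."""
    return prompts * ctx_toks / duration_seconds(duration)


# (prompts, ctx_toks, duration, reported PP t/s) copied from three table rows
rows = [
    (32, 2048, "00:01:39", 661.73),
    (384, 2048, "00:05:13", 2509.97),
    (512, 256, "00:00:30", 4241.59),
]

for prompts, ctx_toks, duration, reported in rows:
    estimate = estimated_pp_tps(prompts, ctx_toks, duration)
    print(
        f"{prompts:>4} x {ctx_toks:>4} tok in {duration}: "
        f"estimated {estimate:8.2f} t/s, reported {reported:8.2f} t/s"
    )
```

Under this assumption the estimates land within a few percent of the reported values. The remaining gap may come from the Duration column being rounded to whole seconds, or from the benchmark tool timing a slightly different interval; the source does not say which.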