VBench Leaderboard

"Which Video Generation Model is better?"
๐Ÿ† Welcome to the leaderboard of the VBench! ๐ŸŽฆ A Comprehensive Benchmark Suite for Video Generative Models (CVPR 2024 Spotlight)

  • Comprehensive Dimensions: We carefully decompose video generation quality into 16 comprehensive dimensions to reveal each model's strengths and weaknesses.
  • Human Alignment: We conducted extensive experiments and human annotations to validate the robustness of VBench.
  • Valuable Insights: VBench provides multi-perspective insights useful for the community.

Join Leaderboard: Please see the instructions for the 3 options to participate. One option is to follow the VBench Usage info and upload the generated result.json file here. After clicking the "Submit here!" button, click the "Refresh" button.

Model Information: What are the details of these Video Generation Models? See HERE

Credits: This leaderboard is updated and maintained by the team of VBench Contributors.

Evaluation Dimension

Model: Video-CCAM-v1.2

Category        mAM    mLGM
Animals         29.18  42.13
Nature          31.04  45.14
Shows           26.96  38.39
Daily Life      26.61  37.82
Sports          24.35  36.36
Entertainments  28.47  46.41
Vehicles        26.89  37.29
Indoor          24.01  31.16
Tutorial        16.68  56.62
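
The scores above are exported with flattened "Category-metric" column names (for example "Animals-mAM"). As a minimal sketch, the row for Video-CCAM-v1.2 can be regrouped by category and metric in plain Python. The helper name `regroup` and the unweighted mean at the end are assumptions made for illustration; the leaderboard itself does not state that it reports such an average.

```python
from statistics import mean

# Category and metric names as they appear in the flattened column headers.
categories = ["Animals", "Nature", "Shows", "Daily Life", "Sports",
              "Entertainments", "Vehicles", "Indoor", "Tutorial"]
metrics = ["mAM", "mLGM"]
headers = [f"{c}-{m}" for c in categories for m in metrics]

# Scores for Video-CCAM-v1.2, in the same order as the headers.
values = [29.18, 42.13, 31.04, 45.14, 26.96, 38.39, 26.61, 37.82, 24.35,
          36.36, 28.47, 46.41, 26.89, 37.29, 24.01, 31.16, 16.68, 56.62]


def regroup(headers, values):
    """Hypothetical helper: turn flat 'Category-metric' columns into a nested dict."""
    table = {}
    for header, value in zip(headers, values):
        # Split on the last hyphen so names like "Daily Life" stay intact.
        category, metric = header.rsplit("-", 1)
        table.setdefault(category, {})[metric] = value
    return table


scores = regroup(headers, values)

# Illustrative summary only: an unweighted mean over the nine categories.
macro = {m: round(mean(row[m] for row in scores.values()), 2) for m in metrics}

print(scores["Tutorial"])
print(macro)
```

Splitting on the last hyphen keeps the parsing independent of how many words a category name contains.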

VBench is a comprehensive benchmark suite for video generative models. We design a hierarchical Evaluation Dimension Suite that decomposes "video generation quality" into multiple well-defined dimensions to facilitate fine-grained and objective evaluation. For each dimension and each content category, we carefully design a Prompt Suite as test cases and sample Generated Videos from a set of video generation models. For each evaluation dimension, we design an Evaluation Method Suite, which uses a carefully crafted method or a designated pipeline for automatic objective evaluation. We also conduct Human Preference Annotation of the generated videos for each dimension and show that VBench evaluation results are well aligned with human perception. VBench can provide valuable insights from multiple perspectives.
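
The structure described above (one evaluation method per dimension, applied to videos generated from that dimension's prompts) can be sketched as a small registry of scoring functions. This is a toy illustration under stated assumptions, not the VBench implementation or its API: the dimension names `toy_stability` and `toy_brightness`, the `Sample` type, and the use of a list of per-frame brightness values as a stand-in for a video are all hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, NamedTuple


class Sample(NamedTuple):
    dimension: str      # evaluation dimension this prompt was designed for
    prompt: str         # test-case prompt from the prompt suite
    video: List[float]  # hypothetical stand-in: one brightness value per frame


def temporal_stability(video: List[float]) -> float:
    """Toy metric: 1 minus the mean absolute frame-to-frame change."""
    diffs = [abs(a - b) for a, b in zip(video, video[1:])]
    return 1.0 - mean(diffs) if diffs else 1.0


def mean_brightness(video: List[float]) -> float:
    """Toy metric: average brightness over all frames."""
    return mean(video)


# Hypothetical registry: each dimension gets its own evaluation method.
EVALUATORS: Dict[str, Callable[[List[float]], float]] = {
    "toy_stability": temporal_stability,
    "toy_brightness": mean_brightness,
}


def evaluate(samples: List[Sample]) -> Dict[str, float]:
    """Score every sample with its dimension's method, then average per dimension."""
    per_dimension = defaultdict(list)
    for sample in samples:
        score = EVALUATORS[sample.dimension](sample.video)
        per_dimension[sample.dimension].append(score)
    return {dim: mean(scores) for dim, scores in per_dimension.items()}


samples = [
    Sample("toy_stability", "a still lake", [0.5, 0.5, 0.5]),
    Sample("toy_stability", "a strobe light", [0.0, 1.0]),
    Sample("toy_brightness", "a sunny beach", [0.25, 0.75]),
]
print(evaluate(samples))
```

Keeping the methods in a per-dimension registry means each dimension is reported separately, which is what allows strengths and weaknesses to be compared dimension by dimension rather than through a single score.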