
Pulse of Motion Benchmark
Measuring temporal realism in AI video generation. Does your model produce motion that matches real-world physics? The PoM benchmark evaluates this using PhyFPS — a metric that predicts frame rate directly from visual dynamics, without reading metadata.
What We Measure
PhyFPS (Physical Frames Per Second) captures how closely AI-generated video motion matches the temporal dynamics of the real world. A model with low PhyFPS error produces videos where objects move at physically plausible speeds.
For details, refer to our paper: arXiv:2603.14375
Avg. Error
Mean absolute difference between predicted PhyFPS and container meta FPS across all clips.
Avg. Error = (1/V) Σᵥ (1/Cᵥ) Σ_c |f̂ᵥ,c − F_meta,c|
Pct. Error
Percentage error normalized by meta FPS, enabling cross-comparison across frame rate ranges.
Pct. Error = (100/V) Σᵥ (1/Cᵥ) Σ_c |f̂ᵥ,c − F_meta,c| / F_meta,c
Intra-Video CV
Coefficient of variation across sliding-window clips within each video. Measures temporal consistency.
Intra CV = (1/V) Σᵥ Std({f̂ᵥ,c}) / Mean({f̂ᵥ,c})
Text-Video Alignment
CLIP-based cosine similarity between input text prompt and generated video. A supplementary metric — not a primary evaluation dimension.
Note: Submissions below 0.16 may lack meaningful text-video alignment.
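The three PhyFPS metrics above can be sketched in a few lines. This is a minimal reference implementation, not the official scoring code: it assumes per-clip predictions f̂ᵥ,c and per-clip meta FPS values are already available as nested lists, and it uses the population standard deviation for the CV (the formula does not specify sample vs. population).

```python
import statistics

def phyfps_metrics(predictions, meta_fps):
    """Compute Avg. Error, Pct. Error, and Intra-Video CV.

    predictions: per-video lists of per-clip PhyFPS predictions f_hat[v][c].
    meta_fps:    matching per-video lists of per-clip meta FPS F_meta[v][c].
    """
    V = len(predictions)
    avg_err = pct_err = intra_cv = 0.0
    for f_hat, f_meta in zip(predictions, meta_fps):
        C = len(f_hat)
        # Mean absolute error per video, then averaged over videos.
        avg_err += sum(abs(p - m) for p, m in zip(f_hat, f_meta)) / C
        # Same error normalized by meta FPS (percentage applied at the end).
        pct_err += sum(abs(p - m) / m for p, m in zip(f_hat, f_meta)) / C
        # Coefficient of variation of predictions across a video's clips.
        # Population std is an assumption; the formula only says Std.
        intra_cv += statistics.pstdev(f_hat) / statistics.fmean(f_hat)
    return avg_err / V, 100.0 * pct_err / V, intra_cv / V
```

For example, a model predicting [24, 24] and [30, 28] against meta FPS [24, 24] and [30, 30] yields an Avg. Error of 0.5 and a Pct. Error of about 1.67%.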
Dynamic FPS detection: Our pipeline automatically reads each video's per-frame timestamps and computes per-clip meta FPS, supporting both constant and variable frame rate videos.
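A sketch of that per-clip meta FPS computation, assuming frame timestamps have already been demuxed into seconds (e.g. pts × time_base); the 16-frame window and non-overlapping stride are illustrative defaults, not the pipeline's actual settings:

```python
def per_clip_meta_fps(timestamps, clip_len=16, stride=None):
    """Estimate meta FPS for each clip from per-frame timestamps (seconds).

    Because the rate is derived from actual frame times rather than a single
    container header value, this handles both constant and variable frame
    rate videos: each clip gets its own local FPS estimate.
    """
    stride = stride or clip_len  # non-overlapping windows by default
    fps = []
    for start in range(0, len(timestamps) - clip_len + 1, stride):
        ts = timestamps[start:start + clip_len]
        duration = ts[-1] - ts[0]             # time span covered by the clip
        fps.append((len(ts) - 1) / duration)  # frame intervals per second
    return fps
```

A 24 fps CFR video produces ~24.0 for every clip; a VFR video produces a different estimate wherever its frame spacing changes.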
How It Works
Four simple steps to benchmark your video generation model.
Generate videos
Run each prompt through your model to produce one video per prompt.
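This step might look like the loop below. The `model.generate(prompt)` call is a placeholder for your model's actual inference API (assumed here to return encoded video bytes); only the one-video-per-prompt structure and deterministic file naming are the point:

```python
from pathlib import Path

def generate_all(model, prompts, out_dir="videos"):
    """Produce one video per prompt, saved as 0000.mp4, 0001.mp4, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, prompt in enumerate(prompts):
        video_bytes = model.generate(prompt)  # hypothetical inference call
        path = out / f"{i:04d}.mp4"
        path.write_bytes(video_bytes)
        paths.append(path)
    return paths
```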