ClipCannon Voice Clone Pipeline - Official Benchmark Results

0.779 SeedTTS-Eval SIM | Beats Human Ground Truth by +0.049 | 0.975 on Clean Reference

SeedTTS-Eval Official Benchmark (1,088 Samples)

Scored with the official WavLM-Large + ECAPA-TDNN speaker encoder (192-dimensional embeddings).
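For readers unfamiliar with the metric: speaker similarity (SIM) is conventionally computed as the cosine similarity between the speaker embedding of the reference audio and that of the generated audio. The sketch below shows only that final comparison step, assuming both embeddings have already been extracted. The function name `cosine_similarity` and the toy vectors are hypothetical, and this is not the benchmark's own scoring code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors.

    Hypothetical helper for illustration; the official scorer is not shown here.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 4-dim vectors standing in for real speaker embeddings.
    reference = [0.2, 0.1, 0.7, 0.4]
    generated = [0.25, 0.05, 0.65, 0.45]
    print(f"SIM = {cosine_similarity(reference, generated):.3f}")
```

A score of 1.0 means the two embeddings point in the same direction; lower values mean the encoder hears the two clips as less alike.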

Metric      Score
Mean SIM    0.779
Median SIM  0.785
Max SIM     0.896
p90 SIM     0.842
p75 SIM     0.816
p25 SIM     0.748
Min SIM     0.520
Samples     1,088
Runtime     33.3 hours on RTX 5090
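Summary statistics like those in the table can be reproduced from a list of per-sample SIM scores. The following is a minimal sketch using only the standard library. It assumes linear-interpolation percentiles, which may differ from whatever method produced the table, and the names `percentile` and `summarize` and the demo values are hypothetical.

```python
import math
import statistics


def percentile(sorted_scores, q):
    """Linear-interpolation percentile of an ascending list, q in [0, 100]."""
    pos = (len(sorted_scores) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_scores[lower] + (sorted_scores[upper] - sorted_scores[lower]) * frac


def summarize(scores):
    """Return the same summary fields as the table for a list of SIM scores."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "p90": percentile(ordered, 90),
        "p75": percentile(ordered, 75),
        "p25": percentile(ordered, 25),
        "min": ordered[0],
        "samples": len(ordered),
    }


if __name__ == "__main__":
    # Made-up scores for demonstration only; these are not benchmark data.
    demo = [0.5, 0.6, 0.7, 0.8, 0.9]
    for name, value in summarize(demo).items():
        print(f"{name}: {value}")
```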

Beats Human Ground Truth by +0.049

Human recordings of the same speakers score 0.730 on this benchmark, while my clones score 0.779, a margin of +0.049. As measured by SIM, the clones match the reference speaker's identity more consistently than real recordings of that speaker do.
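The margin can be checked directly from the two reported means. The short sketch below also expresses it relative to the human baseline, which is a derived figure rather than one reported by the benchmark. The helper name `sim_margin` is hypothetical.

```python
HUMAN_SIM = 0.730  # human ground truth, as reported above
CLONE_SIM = 0.779  # mean SIM of the clones, as reported above


def sim_margin(clone, human):
    """Return (absolute, relative) margin of the clone score over the human score."""
    absolute = clone - human
    relative = absolute / human
    return absolute, relative


absolute, relative = sim_margin(CLONE_SIM, HUMAN_SIM)
print(f"absolute margin: {absolute:+.3f}")  # +0.049
print(f"relative margin: {relative:+.1%}")  # +6.7%
```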