Recent years have seen a surge in falsified videos of high-profile speech events, thanks to the rise of deepfake technology and other advanced video editing tools. Digital techniques for authenticating videos, like watermarking or content signing, are a promising solution to this growing problem. But they succeed only if all parties – from recording audience members to downstream video editors – cooperate by adding and retaining the appropriate credentials in their videos. This work explores a complementary physical approach that ensures all authentic videos of a speech can be verified, with no assumptions of external cooperation.
VeriLight creates dynamic physical signatures at the speech site and embeds them into all video recordings via imperceptible modulated light. These physical signatures encode features unique to the event and are cryptographically secured to prevent spoofing. The signatures can be extracted from any video downstream and validated to check the content's integrity.
This work focuses on combating visual falsification of speaker identity and lip/facial motion, two particularly popular and consequential forms of manipulation. Experiments on extensive video datasets and five deepfake models show VeriLight achieves AUCs ≥ 0.99 and a true positive rate of 100% in detecting such falsifications, outperforming ten state-of-the-art passive deepfake detectors. Further, VeriLight is robust across recording conditions, video post-processing techniques, and white-box adversarial attacks.
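The abstract does not specify VeriLight's signature scheme. As a minimal sketch of the general idea it describes, the following hypothetical example uses an HMAC over event-unique features: a secret key held at the speech site produces a spoof-resistant tag, and a downstream verifier recomputes the tag from the features extracted from a video. All names and the feature encoding here are illustrative assumptions, not the paper's actual design.

```python
import hmac
import hashlib

# Hypothetical sketch only: the paper's real scheme is not given here.
# A trusted device at the speech site holds this secret key.
SECRET_KEY = b"event-site-secret"

def sign_features(features: bytes) -> bytes:
    """Produce a keyed tag over event-unique features (illustrative)."""
    return hmac.new(SECRET_KEY, features, hashlib.sha256).digest()

def verify_features(features: bytes, tag: bytes) -> bool:
    """Check a tag extracted from a downstream video recording."""
    return hmac.compare_digest(sign_features(features), tag)

# Illustrative feature string; VeriLight's actual features differ.
features = b"speaker-id|lip-motion-digest|timestamp"
tag = sign_features(features)
print(verify_features(features, tag))              # True
print(verify_features(b"tampered-features", tag))  # False
```

In the real system such a tag would be modulated onto imperceptible light at the event, then recovered from the video's pixels before verification; the key management and feature extraction steps are where the actual security argument lives.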
@inproceedings{schwartz2025combating,
  author    = {Schwartz, Hadleigh and Yan, Xiaofeng and Carver, Charles J. and Zhou, Xia},
  title     = {Combating Falsification of Speech Videos with Live Optical Signatures},
  booktitle = {Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security (CCS)},
  year      = {2025},
}