33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Türkiye, 25-28 June 2025, (Full Text Paper)
Lip synchronization is a fundamental component of speech-driven applications, ranging from virtual reality and human-machine interaction to forensic analysis and cybersecurity. Traditional signal-processing-based methods for lip synchronization struggle with accuracy, real-time performance, and language independence. This study benchmarks a statistical signal-processing-based lip synchronization approach against modern machine-learning-based tools. By leveraging high signal-to-noise ratio (SNR) audio recordings and their transcripts, we enhance the conventional algorithm with AI-driven models. Integrating signal processing with machine learning and deep learning enables more precise, natural, and language-agnostic lip synchronization. The results underscore the impact of AI on speech processing technologies and set the stage for future advances in multimodal communication.
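As a point of reference for the high-SNR criterion mentioned above, the minimal sketch below shows one common way to estimate SNR in decibels when a clean reference signal is available; the function name, the synthetic test tone, and the noise level are illustrative assumptions, not part of the paper's pipeline.

```python
import numpy as np

def snr_db(clean: np.ndarray, noisy: np.ndarray) -> float:
    """Estimate SNR in dB, treating (noisy - clean) as the noise component."""
    noise = noisy - clean
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    return 10.0 * np.log10(signal_power / noise_power)

# Illustrative example: a 1 kHz tone sampled at 16 kHz with additive white noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 1000 * t)
noisy = clean + 0.05 * rng.standard_normal(t.shape)
print(f"Estimated SNR: {snr_db(clean, noisy):.1f} dB")  # roughly 23 dB here
```

In practice the clean reference is usually unavailable, so SNR would instead be estimated from speech-presence statistics; the paired-signal form above is simply the textbook definition made concrete.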