What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection

Binh Nguyen, Shuju Shi, Ryan Ofman, Thai Le
¹Independent Researcher, ²Indiana University, ³Deep Media AI

Recent advances in text-to-speech technologies have enabled realistic voice generation, fueling audio-based deepfake attacks such as fraud and impersonation. While audio anti-spoofing systems are critical for detecting such threats, prior work has predominantly focused on acoustic-level perturbations, leaving the impact of linguistic variation largely unexplored. In this paper, we investigate the linguistic sensitivity of both open-source and commercial anti-spoofing detectors by introducing transcript-level adversarial attacks. Our extensive evaluation reveals that even minor linguistic perturbations can significantly degrade detection accuracy: attack success rates surpass 60% on several open-source detector–voice pairs, and notably, one commercial detector's accuracy drops from 100% on synthetic audio to just 32%. Through a comprehensive feature attribution analysis, we identify that both linguistic complexity and model-level audio embedding similarity contribute strongly to detector vulnerability. We further demonstrate the real-world risk via a case study replicating the Brad Pitt audio deepfake scam, using transcript adversarial attacks to completely bypass commercial detectors. These results highlight the need to move beyond purely acoustic defenses and account for linguistic variation in the design of robust anti-spoofing systems. All source code will be publicly available.

Our evaluation of open-source anti-spoofing systems (AASIST-2, CLAD, RawNet-2) reveals significant vulnerabilities to transcript-level adversarial attacks. Accuracy Under Attack (AUA) drops sharply relative to Original Accuracy (OA), indicating that subtle linguistic changes can significantly degrade detection performance. Correspondingly, the Attack Success Rate (ASR), the percentage of spoof audio misclassified as real, consistently rises, exceeding 60% for several detector–voice pairs. The Semantic Preservation Score (COS) remains high, confirming that the attacks preserve the original meaning.
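For intuition on how these four metrics relate, here is a minimal sketch of their computation; the helper names, the detector interface, and the choice of sentence encoder for COS are assumptions for illustration, not the paper's implementation:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

def evaluate_attack(detector, spoof_audios, attacked_audios,
                    orig_texts, attacked_texts):
    """Compute OA, AUA, ASR, and COS for one detector-voice pair.

    `detector` is a placeholder callable returning 1 for spoof and
    0 for real; the audio arguments stand in for whatever input
    format a given detector expects.
    """
    orig_pred = np.array([detector(a) for a in spoof_audios])
    atk_pred = np.array([detector(a) for a in attacked_audios])

    oa = orig_pred.mean()          # Original Accuracy: spoofs caught before attack
    aua = atk_pred.mean()          # Accuracy Under Attack: spoofs still caught after
    asr = (atk_pred == 0).mean()   # ASR: attacked spoofs misclassified as real

    # COS: semantic preservation between original and perturbed transcripts,
    # here via cosine similarity of sentence embeddings (assumed encoder).
    enc = SentenceTransformer("all-MiniLM-L6-v2")
    e1 = enc.encode(orig_texts, normalize_embeddings=True)
    e2 = enc.encode(attacked_texts, normalize_embeddings=True)
    cos = float((e1 * e2).sum(axis=1).mean())

    return {"OA": oa, "AUA": aua, "ASR": asr, "COS": cos}
```

A successful attack shows up as a large gap between OA and AUA (equivalently, a high ASR) while COS stays near 1, meaning the transcript still says the same thing.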


This case study on a simulated Brad Pitt deepfake scam demonstrates how minor linguistic changes in the transcript can dramatically increase the probability that fake audio is classified as real. The results highlight that even state-of-the-art commercial detectors are vulnerable to these transcript-level adversarial attacks, posing a significant real-world risk. Crucially, the attacks can succeed even when the altered wording is slightly unnatural or grammatically imperfect.


Table 3: The Brad Pitt deepfake scam case study, showing how subtle transcript modifications can flip commercial detector predictions from fake to real.
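To make the attack surface concrete, below is a minimal sketch of the kind of transcript-level perturbation described above. The synonym table, swap probability, and printed example are illustrative assumptions, not the search procedure actually used in the paper:

```python
import random

# Hypothetical synonym table; a real attack would search for perturbations
# that preserve meaning while shifting the detector's score toward "real".
SYNONYMS = {
    "money": ["funds", "cash"],
    "urgent": ["pressing", "immediate"],
    "send": ["transfer", "wire"],
}

def perturb_transcript(text, rng=None):
    """Swap a few words for near-synonyms, keeping the message intact."""
    rng = rng or random.Random(0)
    out = []
    for w in text.split():
        key = w.lower().strip(".,!?")
        if key in SYNONYMS and rng.random() < 0.5:
            out.append(rng.choice(SYNONYMS[key]))
        else:
            out.append(w)
    return " ".join(out)

# The perturbed transcript is re-synthesized with the same TTS voice and
# replayed against the detector; if the score crosses the "real" threshold,
# the attack succeeds even though the wording changed only slightly.
print(perturb_transcript("I need you to send money now, it is urgent."))
```

Because the perturbation operates purely on the text, it requires no access to the detector's acoustic features, which is what makes the vulnerability hard to patch with acoustic defenses alone.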

@article{nguyen2025linguistic,
  title={What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection},
  author={Nguyen, Binh and Shi, Shuju and Ofman, Ryan and Le, Thai},
  journal={arXiv preprint arXiv:2505.17513},
  year={2025}
}