Understanding Generative AI
We probe the high-dimensional latent space of LLMs to uncover how knowledge is encoded. Our research
uses mechanistic interpretability to decode the 'inner circuits' of reasoning, enabling precise
behavior steering and machine unlearning.
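One common form of behavior steering can be sketched as follows. This is an illustrative toy example, not the group's actual method: it builds a "steering vector" as the difference of mean activations between two behavior classes and adds it to a hidden state (all data here is synthetic).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "activations": 16 samples of an 8-dim hidden state per behavior class.
# In practice these would be residual-stream activations from an LLM.
pos_acts = rng.normal(1.0, 0.1, size=(16, 8))   # e.g. desired behavior
neg_acts = rng.normal(-1.0, 0.1, size=(16, 8))  # e.g. undesired behavior

def steering_vector(pos, neg):
    """Difference-of-means direction separating the two classes."""
    return pos.mean(axis=0) - neg.mean(axis=0)

def steer(hidden, direction, alpha=4.0):
    """Shift a hidden state along the (unit-normalized) direction."""
    return hidden + alpha * direction / np.linalg.norm(direction)

v = steering_vector(pos_acts, neg_acts)
h = rng.normal(size=8)          # a hidden state to intervene on
h_steered = steer(h, v)
```

At inference time, such a vector would be added to the model's hidden states at a chosen layer, nudging generation toward the desired behavior without retraining.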
AI Security & Privacy
This research focuses on protecting AI models from manipulation. It explores defenses against
adversarial attacks that exploit model weaknesses and against privacy leakage, ensuring AI systems
make reliable and secure decisions.
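As a minimal sketch of the kind of attack studied here, the classic fast gradient sign method (FGSM) perturbs an input along the sign of the loss gradient. The logistic-regression model and all numbers below are hypothetical, chosen only to keep the example self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against a logistic-regression classifier.

    For binary cross-entropy loss, the gradient w.r.t. the input x
    is (p - y) * w, where p is the predicted probability.
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

# Toy model and a correctly classified input (label y = 1)
w, b = np.array([2.0, -1.0]), 0.0
x, y = np.array([1.0, 0.0]), 1.0

p_clean = sigmoid(w @ x + b)          # confident "positive" prediction
x_adv = fgsm(x, y, w, b, eps=1.0)
p_adv = sigmoid(w @ x_adv + b)        # perturbation flips the decision
```

Defenses studied in this area (e.g. adversarial training) aim to make such perturbations ineffective.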
AI Authenticity & Deepfakes
We focus on improving AI’s ability to generate natural content while developing robust methods to detect
AI-written text and deepfake speech, ensuring authenticity and preventing misuse.
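One simple family of AI-text detectors scores how likely a passage is under a language model: machine-generated text tends to sit in high-probability regions. The toy unigram model below is a deliberately simplified stand-in for a real LM, used only to show the scoring idea; the corpus and threshold are made up.

```python
import math
from collections import Counter

def unigram_model(corpus):
    """Estimate word probabilities from a (toy) reference corpus."""
    counts = Counter(corpus.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def avg_log_prob(text, model, floor=1e-6):
    """Average per-word log-probability; higher = more 'model-like' text."""
    words = text.split()
    return sum(math.log(model.get(w, floor)) for w in words) / len(words)

model = unigram_model("the cat sat on the mat")
score_common = avg_log_prob("the cat", model)   # in-distribution words
score_rare = avg_log_prob("zyx qwv", model)     # out-of-vocabulary words
```

A real detector would use a neural LM's perplexity (or learned features) rather than unigram counts, but the thresholding logic is the same.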
Explainable AI (XAI)
We explore methods that make AI decisions more transparent while addressing vulnerabilities in the
explanations themselves. The goal is to ensure AI insights are trustworthy, resistant to manipulation,
and easy for end users to understand.
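A common XAI building block is feature attribution. As a minimal sketch (not a specific method of this group), the input-times-gradient rule applied to a linear score w·x assigns each feature the attribution w_i * x_i; for linear models these attributions exactly sum to the score, a sanity check ("completeness") that more complex attribution methods also aim for.

```python
import numpy as np

def input_x_gradient(x, w):
    """Attribution for a linear score w @ x: the gradient is w,
    so feature i contributes w[i] * x[i] to the prediction."""
    return w * x

# Hypothetical 3-feature model and input
w = np.array([1.5, -2.0, 0.5])
x = np.array([2.0, 1.0, 4.0])

attr = input_x_gradient(x, w)            # per-feature contributions
top_feature = int(np.argmax(np.abs(attr)))  # most influential feature
```

Research on explanation robustness asks when such attributions can themselves be manipulated, e.g. by inputs crafted to produce misleading saliency.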
NLP for Social Good
We apply the methods we develop to a range of critical NLP applications, from healthcare and
education to social media analysis.