Chaitanya Ahuja

IPA: /tʃeətənj/

Let every modality's voice be heard, sight be seen, and text be understood

I am a PhD student at the Language Technologies Institute at Carnegie Mellon University. I am advised by Dr. Louis-Philippe Morency (LP) in the Multicomp Lab, and we work on anything multimodal. Lately, my research efforts have been directed towards grounding pose forecasting in speech and language. As an undergraduate researcher at the Indian Institute of Technology (IIT), Kanpur, I worked with Dr. Rajesh Hegde on Spatial Audio and Speaker Diarization, and with Dr. Vinay Namboodiri on Video Summarization.

News

July 2020 Paper on Style Transfer for Co-Speech Gesture Animation accepted at ECCV'20
August 2019 Paper on Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations accepted at ICMI'19 [pdf][webpage]
August 2019 Honourable mention at the LTI SRS symposium for my talk on Natural Language Grounded Pose Forecasting
July 2019 Paper on Natural Language Grounded Pose Forecasting accepted at 3DV'19 [pdf][webpage]
March 2018 Excited to be working at Facebook Reality Labs in Summer'18
January 2018 Paper on Lattice Recurrent Units accepted at AAAI'18 [pdf][webpage]
October 2017 Our survey on Multimodal Machine Learning is on arXiv

Book Chapters

Challenges and applications in multimodal machine learning
T. Baltrusaitis, C. Ahuja, and L. Morency
The Handbook of Multimodal-Multisensor Interfaces 2018
[1]

Selected Publications

Google Scholar

Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional Mixture Approach
C. Ahuja, D. Lee, Y. Nakano, and L. Morency
ECCV 2020
[1] [abs] [pdf] [webpage]
To React or not to React: End-to-End Visual Pose Forecasting for Personalized Avatar during Dyadic Conversations
C. Ahuja, S. Ma, L. Morency, and Y. Sheikh
ICMI 2019
[2] [abs] [pdf] [webpage]
Language2Pose: Natural Language Grounded Pose Forecasting
C. Ahuja and L. Morency
3DV 2019
[3] [abs] [pdf] [code] [webpage]
Lattice Recurrent Unit: Improving Convergence and Statistical Efficiency for Sequence Modeling
C. Ahuja and L. Morency
AAAI 2018
[4] [abs] [pdf] [code] [webpage]
Multimodal Machine Learning: A Survey and Taxonomy
T. Baltrusaitis, C. Ahuja, and L. Morency
TPAMI 2017
[5] [abs] [pdf]
Fast modelling of pinna spectral notches from HRTFs using linear prediction residual cepstrum
C. Ahuja and R. Hegde
ICASSP 2014
[6] [abs] [pdf]
Extraction of pinna spectral notches in the median plane of a virtual spherical microphone array
A. Sohni, C. Ahuja, and R. Hegde
HSCMA 2014
[7] [abs] [pdf]