Publications
C: Conference, W: Workshop, P: Preprint / *: equal contribution, †: equal advising
[C10] CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision
RSS 2025
CoRL 2024 Workshop on Language and Robot Learning
Project Page
Paper
Code
[C9] Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following
ICRA 2025
Paper
[P1] Continual Vision-and-Language Navigation
arXiv preprint 2024
Paper
[C8] PGA: Personalizing Grasping Agents with Single Human-Robot Interaction
IROS 2024
Paper
[C7] PROGrasp: Pragmatic Human-Robot Communication for Object Grasping
[C6] The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
CVPR 2023
ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward
Project Page
Paper
Code
Slides
Video
[C5] GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation
[W2] Improving Robustness to Texture Bias via Shape-focused Augmentation
CVPR 2022 Workshop on Human-centered Intelligent Services: Safety and Trustworthy
Paper
[C4] Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
EMNLP 2021 Findings
Paper
Code
Slides
[C3] Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
[W1] C3: Contrastive Learning for Cross-domain Correspondence in Few-shot Image Generation
NeurIPS 2021 Workshop on Controllable Generative Modeling in Language and Vision
Paper
[C2] Label Propagation Adaptive Resonance Theory for Semi-Supervised Continuous Learning
ICASSP 2020
Paper
[C1] Dual Attention Networks for Visual Reference Resolution in Visual Dialog
EMNLP 2019
ICCV 2019 Workshop on Video Turing Test (Spotlight Talk)
Paper
Code
Slides