Bio
I am a fourth-year Ph.D. student in the Graduate School of AI at Seoul National University, advised by Prof. Byoung-Tak Zhang. My research spans machine learning, natural language processing, computer vision, and robotics. I focus on connecting language with perception and action, enabling machines to understand the semantics of the physical world. Specific topics include:
- Embodied AI: Robots doing real-world tasks via language interaction (ICRA’24, IROS’23)
- Multimodal AI: Vision-language models (VLMs) that can continuously communicate with humans about images (CVPR’23, EMNLP’21, EMNLP’19) and videos (ACL’21)
- Other topics such as robustness and image generation (CVPRW’22, NeurIPSW’21, ICASSP’20)
My Ph.D. research has been supported by fellowships from the Youlchon Foundation and IPAI. I was fortunate to collaborate with researchers at NAVER AI and SK T-Brain.
Before starting my Ph.D., I completed my master’s degree in Cognitive Science at Seoul National University. Studying cognitive science sparked my interest in AI and interdisciplinary research. I earned my Bachelor’s degree in Computer Science from Ajou University.
News
- [Aug 2024] I was selected as a recipient of the Youlchon AI Star Fellowship.
- [Jun 2024] PGA is accepted to IROS 2024.
- [Apr 2024] A preprint for embodied instruction following (Socratic Planner) is released.
- [Mar 2024] I wrote a research statement summarizing what I’ve been studying.
- [Mar 2024] A new preprint (Continual Vision-and-Language Navigation) is released.
- [Jan 2024] PROGrasp is accepted to ICRA 2024!
- [Dec 2023] I attended Brainlink 2023.
- [Nov 2023] I’ll give a talk at the Dept. of Energy Resources Engineering at Seoul National University (Title: “The Evolution of Language Models: From Basic NLP to ChatGPT and Beyond”).
- [Oct 2023] Two preprints (PROGrasp and PGA) are released!
- [Jun 2023] One paper is accepted to IROS 2023!
- [Mar 2023] Happy to announce that our paper is accepted to CVPR 2023!
- [Jun 2022] One paper is accepted to the ICML 2022 Pre-training Workshop.
- [May 2022] Thrilled to announce that our new preprint is released!
- [Apr 2022] One paper is accepted to the CVPR 2022 HCIS Workshop.
- [Dec 2021] I gave an invited talk at the Korea Software Congress.
- [Oct 2021] One paper is accepted to the NeurIPS 2021 CtrlGen Workshop.
- [Aug 2021] One paper is accepted to Findings of EMNLP 2021.
- [May 2021] One paper is accepted to ACL 2021.
- [Sep 2020] I’m starting my Ph.D. this fall.
- [Jun 2020] From July, I’ll join the SNU AI Institute (AIIS) as a researcher.
- [Jan 2020] Our paper has been accepted to ICASSP 2020!
- [Dec 2019] From January, I’ll be a research intern at SK T-Brain!
- [Nov 2019] I gave a spotlight talk at the Video Turing Test workshop, ICCV 2019.
- [Oct 2019] I gave an invited talk at the SK Telecom AI Center.
- [Aug 2019] Excited to announce that our paper has been accepted to EMNLP 2019.
- [Jun 2019] Our proposed method ranked 3rd in the Visual Dialog Challenge 2019!
- [Aug 2018] We have a paper accepted to the ECCV 2018 Workshop on VizWiz Grand Challenge.
Publications
Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following
Continual Vision-and-Language Navigation
PGA: Personalizing Grasping Agents with Single Human-Robot Interaction
IROS 2024
Paper
PROGrasp: Pragmatic Human-Robot Communication for Object Grasping
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
CVPR 2023
ICML 2022 Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward
Project Page
Paper
Code
Slides
Video
GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation
Improving Robustness to Texture Bias via Shape-focused Augmentation
CVPR 2022 Workshop on Human-centered Intelligent Services: Safety and Trustworthy
Paper
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
EMNLP 2021 Findings
Paper
Code
Slides
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
C3: Contrastive Learning for Cross-domain Correspondence in Few-shot Image Generation
NeurIPS 2021 Workshop on Controllable Generative Modeling in Language and Vision
Paper
Label Propagation Adaptive Resonance Theory for Semi-Supervised Continuous Learning
ICASSP 2020
Paper
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
EMNLP 2019
ICCV 2019 Workshop on Video Turing Test (Spotlight Talk)
Paper
Code
Slides
Invited Talks
PROGrasp: Pragmatic Human-Robot Communication for Object Grasping
IEEE International Conference on Robotics and Automation (ICRA, May 2024)
The Evolution of Language Models: From Basic NLP to ChatGPT and Beyond
Dept. of Energy Resources Engineering, Seoul National University (Nov. 2023)
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
IEEE RO-MAN Workshop on Learning by Asking for Intelligent Robots and Agents (Aug. 2023)
Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer
KSC 2021 - Top-tier Conference Paper Presentation Session (Dec. 2021)
Annual Conference on Human and Cognitive Language Technology (Oct. 2021)
Dual Attention Networks for Visual Reference Resolution in Visual Dialog
ICCV 2019 - Video Turing Test Workshop (Spotlight Talk) (Nov. 2019)
SK Telecom AI Center (Sep. 2019)
Services
🟥 = ML or AI / 🟩 = NLP / 🟦 = Robotics / 🟨 = Workshops