Contact


✉️ thj23 [at] mails.tsinghua.edu.cn

✉️ bernard.hengk.tan [at] gmail.com

📍 Beijing, China

Github: https://github.com/thkkk/

🎓 Google Scholar: Hengkai Tan

LinkedIn: Hengkai Tan

Hobbies


🏸 Various ball sports and fitness

🎹 Piano and music

📚 Reading

⛰️ Traveling

🎥 Movies

📈 Value investing

Languages


🇨🇳 Chinese

🇬🇧 English

🇩🇪 German (mostly forgotten)

<aside> <img src="/icons/sun_red.svg" alt="/icons/sun_red.svg" width="40px" /> I’m Hengkai Tan, a third-year PhD student advised by Professor Jun Zhu and Associate Professor Hang Su at TSAIL, Department of Computer Science and Technology, Tsinghua University.

</aside>

<aside> <img src="/icons/bullseye_blue.svg" alt="/icons/bullseye_blue.svg" width="40px" /> Vision: Build (embodied) agents that interact with real-world vision space, advancing toward AGI!

</aside>

<aside> <img src="/icons/search_orange.svg" alt="/icons/search_orange.svg" width="40px" /> Recent Research Focus: Unification of Embodied Foundation Models and Multi-modal Foundation Models, as well as Reinforcement Learning.

</aside>

Feel free to reach out if you share a belief in AGI and want to collaborate on building a general-purpose embodied agent with vision space as the core focus!

Selected Publications


Vidar: Embodied Video Diffusion Model for Generalist Bimanual Manipulation Yao Feng#, Hengkai Tan*, Xinyi Mao, Guodong Liu, Shuhe Huang, Chendong Xiang, Hang Su, Jun Zhu* Project Page, Paper, WeChat article

Embodied video foundation model: VIDAR

AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation Hengkai Tan#, Yao Feng*, Xinyi Mao*, Shuhe Huang, Guodong Liu, Zhongkai Hao, Hang Su, Jun Zhu* Project Page, Paper, WeChat article

Automated task-agnostic data collection and an inverse dynamics model (IDM) with near-100% replay success rate

H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation Hongzhe Bi, Lingxuan Wu, Tianwei Lin, Hengkai Tan, Zhizhong Su, Hang Su, Jun Zhu Project Page, PDF, CODE

ManiBox: Enhancing Embodied Spatial Generalization via Scalable Simulation Data Generation Hengkai Tan, Xuezhou Xu*, Chengyang Ying, Xinyi Mao, Songming Liu, Xingxing Zhang, Hang Su, Jun Zhu* Project Page, WeChat article

Scaling laws of spatial generalization and robust manipulation using bounding boxes

(ICLR 2025) RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation Songming Liu, Lingxuan Wu*, Bangguo Li, Hengkai Tan, Huayu Chen, Zhengyi Wang, Ke Xu, Hang Su, Jun Zhu* Project Page, PDF, CODE, WeChat article

Embodied Foundation Model with 1B parameters

(ICML 2024) FCNet: Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning Hengkai Tan, Songming Liu, Kai Ma, Chengyang Ying, Xingxing Zhang, Hang Su, Jun Zhu Project Page, PDF, CODE

Modeling embodied motion in the frequency domain

See more on my Google Scholar.

Experience


Undergraduate Student at the Department of Computer Science and Technology, Tsinghua University

2019 - 2023

PhD Student at TSAIL, Tsinghua University

2023 - Present

Advised by Professor Jun Zhu and Associate Professor Hang Su.