Chenyang Wan

Shanghai AI Lab & Zhejiang University


701 Yunjin Rd, Xuhui Dist

Shanghai, China

Hi there, I’m Chenyang (Bryce) Wan, a first-year PhD student at Zhejiang University, affiliated with InternRobotics at Shanghai AI Lab, under the joint supervision of Jiangmiao Pang and Dahua Lin. Previously, I received my Bachelor’s degree from the College of Control Science and Engineering at Zhejiang University.

My research develops vision-language navigation and exploration systems for embodied AI, aiming at spatiotemporal intelligence in autonomous agents: agents that not only navigate complex environments in real time, but also reason about temporal dynamics and spatial relationships over extended periods. The goal is to build adaptive systems capable of persistent environmental understanding and predictive decision-making in dynamic settings.

News

Dec 09, 2025 [Preprint] We released a preprint of our work DualVLN, a dual-system foundation model for Vision-Language Navigation. It integrates a slow system for robust reasoning and pixel-goal generation with a fast system for immediate trajectory planning, enabling reliable navigation in complex environments.
Jul 07, 2025 [Preprint] We released a preprint of our work StreamVLN, a streaming VLN framework that employs a hybrid slow-fast context modeling strategy to support multi-modal reasoning over interleaved vision, language, and action inputs.
Jun 30, 2025 [New Start] I graduated with honors from Zhejiang University, earning a B.Eng. degree with Outstanding Graduate distinction.
Nov 15, 2024 [Award] Awarded the First-Class Scholarship at Zhejiang University.
Jul 02, 2024 [New Start] Started an internship at Shanghai AI Lab, advancing research in Embodied Intelligence.

Publications

  1. arXiv Preprint
    Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-Language Navigation
    arXiv preprint arXiv:2512.08186, 2025
  2. Technical Report
    InternVLA-N1: An Open Dual-System Vision-Language Navigation Foundation Model with Learned Latent Plans
    Technical Report, 2025
  3. arXiv Preprint
    StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling
    arXiv preprint arXiv:2507.05240, 2025