Hi there! Welcome to my academic base!

I am a third-year undergraduate (2022-2026) at Xidian University, currently pursuing research in robot learning under the guidance of Prof. Lixin Yang and Prof. Cewu Lu at the MVIG Lab, Shanghai Jiao Tong University. Previously, I contributed to research on adversarial attacks against vision models at the Laboratory of Cooperative Intelligent Systems, under the supervision of Prof. Hao Li and Prof. Maoguo Gong.

Research Interests

My research interests, and the learning paradigm I aim to shape, focus primarily on:

  • Reasoning-based Learning: Inferring knowledge from interactions with environments to enhance an agent’s reasoning and generalization, and to promote the development of proactive learning.

  • Generative Modeling: Modeling an agent’s knowledge with generative methods so that it develops into a world model.

News

  • Unleash the potential of autoregressive models in imitation learning: Dense Policy is now available as a preprint!
  • Built MetaPalace, which takes you into a meta world of The Palace Museum.
  • Our work AdvDisplay was accepted at AAAI 2025 🔥
  • My first work on robot learning: MBA, which uses object motion as a condition for robot manipulation.
  • I am in charge of the Microsoft Club. Feel free to reach out if you’d like to join.
  • I have set up a blog site; everyone is welcome to visit!

Research Experience

Shanghai Jiao Tong University (SJTU)
July 2024 - Now
Research intern at MVIG Lab


Xidian University (XDU)
September 2023 - July 2024
Research intern at OMEGA Lab

Publications

Dense Policy: Bidirectional Autoregressive Learning of Actions
Yue Su*, Xinyu Zhan*, Hongjie Fang, Han Xue,
Haoshu Fang, Yong-Lu Li, Cewu Lu, Lixin Yang

We propose Dense Policy, a bidirectional autoregressive policy for robots that infers trajectories by gradually expanding actions from sparse keyframes and has demonstrated capabilities exceeding those of diffusion policies.
[arXiv] [website] [3D-code] [2D-code]


Motion Before Action: Diffusing Object Motion as Manipulation Condition
Yue Su*, Xinyu Zhan*, Hongjie Fang, Yong-Lu Li, Cewu Lu, Lixin Yang
We propose MBA, a novel plug-and-play module that leverages cascaded diffusion processes to generate actions guided by object motion and integrates seamlessly with manipulation policies.
[arXiv] [website] [code]


Generative Adversarial Patches for Physical Attacks on Cross-Modal Pedestrian Re-Identification
Yue Su, Hao Li†, Maoguo Gong
A generative physical adversarial attack on VI-ReID models that perturbs modality-invariant features.
[arXiv]


AdvDisplay: Adversarial Display Assembled by Thermoelectric Cooler for Fooling Thermal Infrared Detectors
Hao Li†, Fanggao Wan, Yue Su, Yue Wu, Mingyang Zhang, Maoguo Gong
Historically, infrared adversarial attacks were single-use and hard to deploy. Using thermoelectric coolers (TEC), we implemented efficient attacks that adapt to hardware scenarios. Accepted at AAAI 2025. 🔥

Projects

MetaPalace: Let you in a meta world of The Palace Museum
We've done what The Palace Museum's official website couldn't: offering 3D artifact views through single-view reconstruction and an interactive LLM-powered tour guide built with RAG technology.
[website] [front-end code] [back-end code]


U-pre: U-Net is an excellent learner for time series forecasting
Time series forecasting suits the U-Net architecture because of its consistent input-output distributions and strong mathematical alignment. Combining U-Net with a BERT encoder improved performance by incorporating both local and global attention.
[code] [report-cn]


AcoFlow: Heuristic Search for Maximum Flow Problem
The difficulty of the maximum flow problem lies in designing better heuristic information for finding augmenting paths. We take on this problem with the ant colony algorithm.
[code] [report-cn]


FGSM3D: Is the point cloud gradient perturbation attack feasible?
We extended FGSM to 3D point clouds and achieved significant success within a certain gradient range, but the way 3D models are sampled suggests that things are not that simple...
[code] [report-cn]


AgentCrossTalk: Perform a Crosstalk between two LLM agents
This project uses Google Gemini to create a simple chatbot application that simulates two crosstalk performers performing on user-provided topics.
[code] [website]

Awards

  • First Prize, Provincial Level, 2023 China Mathematical Contest in Modeling. [code]
  • First Prize, Provincial Level, 2024 China Mathematical Contest in Modeling. [code] [paper]
  • Second Prize, Northwestern, 2024 China Computer Design Contest. [code]