🧵 Finally, region-level 4D understanding is here.
Presenting 4D-RGPT: bridging 3D structure & temporal dynamics by distilling expert perceptual knowledge directly into an MLLM.
Proud to see this @NVIDIAAI work highlighted by @_akhaliq! More details ↓
#ComputerVision #VideoLLM

🧵 Highlights
Perceptual 4D Distillation (P4D): Distills 4D knowledge with ZERO added inference cost.
R4D-Bench: New benchmark for region-level 4D VQA.
SOTA: Beats baselines on standard 3D/4D benchmarks (+5.3%) and our new R4D-Bench (+4.3%).

🧵 Check out our work:
Paper: https://arxiv.org/abs/2512.17012
Project Page: https://ca-joe-yang.com/resource/projects/4D_RGPT…
@huggingface page: https://huggingface.co/papers/2512.17012…

🧵 Big shout-out to our amazing @nvidia @LifeAtPurdue team: @cajoeyang, @RHachiuma, @Sifei30488L, @subhashree_r, @RaymondYeh, Yu-Chiang Frank Wang, @CMHungSteven, @nvidianewsroom @NVIDIARobotics
Thanks @TheTuringPost @rohanpaul_ai @Montreal_AI @HuggingPapers for sharing!