- Beijing, China
- https://scholar.google.com/citations?user=9k1flhEAAAAJ
- https://orcid.org/0009-0000-1717-5286
Stars
[ICLR 2025] Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
[ICLR 2025 Spotlight] Official implementation for "DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes"
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".
PointCT: Point Central Transformer Network for Weakly-supervised Point Cloud Semantic Segmentation (WACV 2024)
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways
🔥 Urban-scale point cloud dataset (CVPR 2021 & IJCV 2022)
🔥 Synthetic and real-world 2D/3D dataset for semantic and instance segmentation (BMVC 2022 Oral)
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
[NeurIPS 2024] SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
[ICLR 2025, Oral] EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Leveraging Large Language Models for Visual Target Navigation
Official GitHub Repository for Paper "Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill", ICRA 2024
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
[ACL 24] The official implementation of MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation.
(TPAMI 2024) A Survey on Open Vocabulary Learning
[ICRA 2024] Chat with NeRF enables users to interact with a NeRF model by typing in natural language.
Official code release for ConceptGraphs
DroneDeploy Machine Learning Segmentation Benchmark
Official implementation of "g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks" (CVPR'25).
A generative world for general-purpose robotics & embodied AI learning.
[RSS 2024] NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation
[CoRL 2022] This repository contains code for generating relevancies, training, and evaluating Semantic Abstraction.
[NeurIPS'2024] Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly
Code repository for "ZeroShape: Regression-based Zero-shot Shape Reconstruction".
Official Code: 3D Scene Reconstruction from a Single Viewport
HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction