Instance-Aware Monocular 3D Semantic Scene Completion

doi:10.1109/TITS.2023.3344806

Title	Instance-Aware Monocular 3D Semantic Scene Completion
Author	Xiao, Haihong 1; Xu, Hongbin 1; Kang, Wenxiong 1; Li, Yuqiong (李玉琼) 2
Corresponding Author	Kang, Wenxiong (auwxkang@scut.edu.cn)
Source Publication	IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
Publication Date	2024-01-02
Pages	12
ISSN	1524-9050
Abstract	We study outdoor 3D scene understanding, a challenging task that requires an intelligent system to infer both geometry and semantics from a single-view image, a critical skill for autonomous vehicles navigating the real 3D world. To this end, we present an instance-aware monocular semantic scene completion framework. To the best of our knowledge, this is the first work specifically targeting instance perception in camera-based semantic scene completion. Our method consists of two stages. In stage I, we design a region-based VQ-VAE network that provides an effective solution for 3D occupancy prediction. In stage II, we first introduce an instance-aware attention module that explicitly incorporates instance-level cues captured from mask images to enhance the instance features in RGB images. We then use deformable cross-attention to aggregate the image features corresponding to each voxel query, and deformable self-attention to refine the query proposals. We combine these components and evaluate our method on two challenging datasets, SemanticKITTI and SSCBench-KITTI-360. The results show that our method outperforms the state-of-the-art VoxFormer-S: it surpasses VoxFormer-S by 0.22 IoU and 0.72 mIoU on the SemanticKITTI validation set, and by 3.04 IoU and 1.06 mIoU on the SSCBench-KITTI-360 validation set. Our approach also perceives critical instances accurately, indicating its potential for practical deployment.
Keyword	3D scene understanding ; semantic scene completion ; 3D vision
DOI	10.1109/TITS.2023.3344806
Indexed By	SCI ; EI
Language	English
WOS ID	WOS:001167317900001
WOS Research Area	Engineering ; Transportation
WOS Subject	Engineering, Civil ; Engineering, Electrical & Electronic ; Transportation Science & Technology
Funding Project	National Natural Science Foundation of China
Funding Organization	National Natural Science Foundation of China
Classification	Class I
Ranking	3+
Contributor	Kang, Wenxiong
Document Type	Journal article
Identifier	http://dspace.imech.ac.cn/handle/311007/94550
Collection	Key Laboratory for Mechanics in Fluid Solid Coupling Systems (流固耦合系统力学重点实验室)
Affiliation	1.South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 511442, Peoples R China; 2.Chinese Acad Sci, Inst Mech, Key Lab Mech Fluid Solid Coupling Syst, Beijing 100190, Peoples R China
Recommended Citation GB/T 7714	Xiao, Haihong, Xu, Hongbin, Kang, Wenxiong, et al. Instance-Aware Monocular 3D Semantic Scene Completion[J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024: 12.
APA	Xiao, Haihong, Xu, Hongbin, Kang, Wenxiong, & Li, Yuqiong. (2024). Instance-Aware Monocular 3D Semantic Scene Completion. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 12.
MLA	Xiao, Haihong, et al. "Instance-Aware Monocular 3D Semantic Scene Completion". IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS (2024): 12.
