ETRI-Knowledge Sharing Platform

Details

Journal article: Automatic Dense Annotation for Monocular 3D Scene Understanding
Authors
Md Alimoor Reza, Kai Chen, Akshay Naik, David J. Crandall, 정순흥
Publication date
2020-04
Source
IEEE Access, v.8, pp.68852-68865
ISSN
2169-3536
Publisher
IEEE
DOI
https://dx.doi.org/10.1109/ACCESS.2020.2984745
Funded project
19ZR1100, Development of Core Technologies for Hyper-Realistic Spatial Media, 서정일
Abstract
Deep neural networks have revolutionized many areas of computer vision, but they require notoriously large amounts of labeled training data. For tasks such as semantic segmentation and monocular 3D scene layout estimation, collecting high-quality training data is extremely laborious because dense, pixel-level ground truth is required and must be annotated by hand. In this paper, we present two techniques for significantly reducing the manual annotation effort involved in collecting large training datasets. The tools are designed to allow rapid annotation of entire videos collected by RGBD cameras, thus generating thousands of ground-truth frames to use for training. First, we propose a fully automatic approach to produce dense pixel-level semantic segmentation maps. The technique uses noisy evidence from pre-trained object detectors and scene layout estimators and incorporates spatial and temporal context in a conditional random field formulation. Second, we propose a semi-automatic technique for dense annotation of 3D geometry, and in particular, the 3D poses of planes in indoor scenes. This technique requires a human to quickly annotate just a handful of keyframes per video, and then uses the camera poses and geometric reasoning to propagate these labels through an entire video sequence. Experimental results indicate that the technique could be used as an alternative or complementary source of training data, allowing large-scale data to be collected with minimal human effort.
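The abstract does not spell out how plane labels are propagated between frames, so the following is only an illustrative sketch of the underlying geometry, not the authors' implementation. It assumes a plane is stored as a unit normal n and offset d (points X with n·X = d) in the keyframe's camera coordinates, and that the relative camera pose maps points as X_B = R X_A + t. The function name `propagate_plane`, the helper `rot_y`, and all numeric values are hypothetical.

```python
import math

def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def propagate_plane(normal_a, offset_a, R_ab, t_ab):
    """Re-express the plane {X : n.X = d} from frame A in frame B.

    Assumes points transform as X_B = R_ab X_A + t_ab, which gives
    n_B = R_ab n_A and d_B = d_A + n_B . t_ab.
    """
    normal_b = mat_vec(R_ab, normal_a)
    offset_b = offset_a + dot(normal_b, t_ab)
    return normal_b, offset_b

def rot_y(theta):
    """Rotation about the camera's y axis (hypothetical example pose)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

# Hypothetical keyframe annotation: a wall facing the camera at depth z = 3.
wall_normal, wall_offset = [0.0, 0.0, 1.0], 3.0

# Hypothetical relative pose from the keyframe (A) to a later frame (B).
R_ab, t_ab = rot_y(math.radians(30)), [0.2, -0.1, 0.5]

normal_b, offset_b = propagate_plane(wall_normal, wall_offset, R_ab, t_ab)

# Sanity check: a point on the wall in frame A must satisfy the
# propagated plane equation after being moved into frame B.
point_a = [1.0, 0.5, 3.0]
point_b = [p + t for p, t in zip(mat_vec(R_ab, point_a), t_ab)]
residual = dot(normal_b, point_b) - offset_b
```

In this sketch a single annotated plane is carried to another frame with one matrix-vector product and one dot product, which is why a handful of annotated keyframes can in principle label a whole sequence once camera poses are known. Producing dense per-pixel labels would additionally require projecting the plane's extent into each image, which is not shown here.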
KSP suggested keywords
3D geometry, 3D scenes, Automatic approach, Computer Vision(CV), Conditional Random Field(CRF), Deep neural network(DNN), High-quality, Indoor scenes, Large-scale data, Manual annotation, RGB-D camera
This work is available under the terms of the Creative Commons Attribution (CC BY) license.