Although many 3D human activity benchmarks have been proposed, most existing action datasets focus on action recognition in pre-segmented videos. There is a lack of standard large-scale benchmarks, especially for current popular data-hungry deep-learning-based methods. In this paper, we introduce a new large-scale benchmark (PKU-MMD) for continuous multi-modality 3D human action understanding; it covers a wide range of complex human activities with well-annotated information.
PKU-MMD contains 1076 long video sequences in 51 action
categories, performed by 66 subjects in three camera
views. It contains almost 20,000 action instances and 5.4
million frames in total. Our dataset also provides multi-modality data sources, including RGB, depth, infrared radiation (IR), and skeleton. With these modalities, we conduct extensive experiments on our dataset under two scenarios and evaluate different methods with various metrics, including a newly proposed evaluation protocol, 2D-AP. We believe this large-scale dataset will benefit future research on action detection in the community.
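For reference, detection metrics of this kind are typically built on the temporal intersection-over-union (tIoU) between a predicted action interval and a ground-truth interval, with average precision computed over a sweep of tIoU thresholds. The sketch below illustrates this standard construction in Python; the function names and thresholding scheme are illustrative assumptions, not the paper's exact definition of 2D-AP.

```python
# Minimal sketch of temporal-IoU-based detection scoring.
# NOTE: function names and the threshold sweep are illustrative
# assumptions; they do not reproduce the paper's 2D-AP definition.

def temporal_iou(pred, gt):
    """Intersection-over-union of two [start, end] intervals (in frames)."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def average_precision(detections, ground_truths, iou_thresh):
    """AP for one action class at a single tIoU threshold.

    detections: list of (score, (start, end)), in any order.
    ground_truths: list of (start, end) intervals.
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(ground_truths)
    tp, fp, precisions_at_recall = 0, 0, []
    for score, interval in detections:
        # Greedily match the detection to the best unmatched ground truth.
        best_iou, best_idx = 0.0, -1
        for i, gt in enumerate(ground_truths):
            iou = temporal_iou(interval, gt)
            if not matched[i] and iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx >= 0 and best_iou >= iou_thresh:
            matched[best_idx] = True
            tp += 1
            # Precision at each new recall point (interpolated AP variant).
            precisions_at_recall.append(tp / (tp + fp))
        else:
            fp += 1
    if not ground_truths:
        return 0.0
    return sum(precisions_at_recall) / len(ground_truths)

# Sweeping the tIoU threshold yields an AP curve; summarizing AP across
# the sweep is in the spirit of threshold-robust protocols like 2D-AP.
if __name__ == "__main__":
    dets = [(0.9, (10, 50)), (0.7, (60, 90)), (0.4, (200, 240))]
    gts = [(12, 48), (58, 95)]
    for t in (0.1, 0.3, 0.5, 0.7):
        print(f"tIoU={t:.1f}  AP={average_precision(dets, gts, t):.3f}")
```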