

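Abstract
Objective Methods that recognize human actions from motion history point clouds (MHPC) have high computational complexity during feature extraction because of the large volume of point cloud data. Methods based on depth motion maps (DMM) extract features simply, but the maps contain incomplete action information, which limits the achievable recognition rate. To address the problems of both approaches, a human action recognition algorithm based on multi-perspective depth motion maps is proposed. Method First, an MHPC is generated from the depth map sequence to represent the action, and the MHPC is then rotated by specific angles to supplement action information from more viewpoints. Next, the original and rotated MHPCs are projected onto the Cartesian coordinate planes to generate multi-perspective DMMs, histograms of oriented gradients (HOG) are extracted from them, and the features are fused by concatenation into a feature vector. Finally, the feature vector is fed into a support vector machine (SVM) for classification, and the algorithm is validated on MSR Action3D and a self-built database. Result The MSR Action3D database has two experimental settings. Under setting one, the algorithm reaches a recognition rate of 96.8%, which is 2.5% higher than the APS_PHOG (axonometric projections and PHOG feature) algorithm, 1.9% higher than DMM (depth motion maps), and 1.1% higher than the DMM_CRC (depth motion maps and collaborative representation classifier) algorithm. Under setting two, the recognition rate is 93.82%, which is 5.09% higher than the DMM algorithm and 4.93% higher than the HON4D (histogram of oriented 4D surface normals) algorithm. On the self-built database, the algorithm reaches a recognition rate of 97.98%, which is 3.98% higher than the MHPC algorithm. Conclusion The experimental results show that multi-perspective depth motion maps not only resolve the complexity of extracting features from motion history point clouds but also let depth motion maps contain action information from more viewpoints, effectively improving the accuracy of human action recognition.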
Human Action Recognition Based on Multi-perspective Depth Motion Maps

Liu Tingting, Li Yupeng, Zhang Liang (Key Laboratory of Advanced Signal and Image Processing, Civil Aviation University of China, Tianjin 300300)

Objective Because depth data are insensitive to illumination, action recognition based on depth data has attracted growing attention. There are two main approaches: one uses point clouds converted from depth maps, and the other uses depth motion maps (DMM) generated by projecting depth maps. The motion history point cloud (MHPC) was proposed to represent actions, but the large number of points in an MHPC makes feature extraction computationally expensive. A DMM is generated by stacking the motion energy of a depth map sequence projected onto three orthogonal Cartesian planes. Projecting the depth maps onto a specific plane provides additional body shape and motion information. However, although features are simple to extract from a DMM, the DMM contains inadequate motion information, which caps the accuracy of human action recognition. In other words, an action is represented by DMMs from only three views, so action information from other perspectives is lacking. To solve these problems, multi-perspective depth motion maps for human action recognition are proposed. Method First, the MHPC is generated from the depth map sequence to represent the action. Rotating the MHPC about the Y axis by certain angles supplements motion information from different perspectives. The original MHPC is then projected onto the three orthogonal Cartesian planes, and the rotated MHPC is projected onto the XOY plane. Multi-perspective DMMs are generated from these projected MHPCs. After projection, the points are distributed in a plane, where many points overlap at the same coordinates. These points may come from the same depth frame or from different frames. We use these overlapping points to generate the DMM so as to capture the spatial energy distribution of the motion.
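The rotation and projection steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the point layout `(x, y, z, frame)`, the function names, and the example angles of -45 and 45 degrees are assumptions made only for illustration, since the abstract does not state the angles used.

```python
import math

def rotate_about_y(points, angle_deg):
    """Rotate (x, y, z, frame) points about the Y axis; y and frame are unchanged."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z, frame)
            for x, y, z, frame in points]

def project(points, plane):
    """Project onto a Cartesian plane.

    Returns (u, v, depth, frame) tuples: u and v are the in-plane
    coordinates and depth is the coordinate dropped by the projection.
    """
    axes = {"XOY": (0, 1, 2), "YOZ": (2, 1, 0), "XOZ": (0, 2, 1)}
    u, v, d = axes[plane]
    return [(p[u], p[v], p[d], p[3]) for p in points]

def multi_view_projections(points, angles_deg):
    """Three orthogonal views of the original cloud plus one XOY view per rotation."""
    views = {plane: project(points, plane) for plane in ("XOY", "YOZ", "XOZ")}
    for angle in angles_deg:  # hypothetical angles, chosen by the caller
        views[f"XOY@{angle}"] = project(rotate_about_y(points, angle), "XOY")
    return views

# Toy cloud with two points from two frames.
cloud = [(1.0, 2.0, 0.0, 0), (0.0, 2.0, 1.0, 1)]
views = multi_view_projections(cloud, angles_deg=(-45, 45))
```

With two rotation angles, `views` holds five projected clouds: the front, side, and top views of the original cloud and one front view per rotation.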
For example, each pixel of the DMM generated from the MHPC projected onto the XOY plane is the sum of the absolute differences in z between adjacent overlapping points that belong to different frames. DMMs for the MHPC projected onto the YOZ and XOZ planes are generated in the same way, with z replaced by x and y, respectively. Projecting the MHPC onto the three orthogonal Cartesian planes generates DMMs from the front, side, and top views, respectively. The rotated MHPC is projected onto the XOY plane to generate DMMs from additional views. The multi-perspective DMMs, which encode the 4D information of an action into 2D maps, are used to represent the action, so action information from more perspectives is supplemented. Note that the x, y, and z values of the points in the projected MHPC are normalized to fixed ranges and used as the image coordinates of the multi-perspective DMMs, which reduces the intra-class variability caused by different performers. Based on experience, this paper normalizes the values of x and z to 511 and the values of y to 1023. The histogram of oriented gradients (HOG) is extracted from each DMM, and the histograms are concatenated into the feature vector of an action. Finally, an SVM classifier is trained to recognize the actions. Experiments were conducted on the MSR Action3D dataset and our own dataset. Result The proposed algorithm shows improved performance on the MSR Action3D database and our dataset. There are two experimental settings for MSR Action3D. In experimental setting one, the algorithm achieves a recognition rate of 96.8%, which is higher than the compared algorithms: 2.5% higher than the APS_PHOG (axonometric projections and PHOG feature) algorithm, 1.9% higher than the DMM algorithm, and 1.1% higher than the DMM_CRC (depth motion maps and collaborative representation classifier) algorithm.
In the second experimental setting, the recognition rate reaches 93.82%, which is 5.09% higher than the DMM algorithm, 4.93% higher than the HON4D algorithm, 2.18% higher than the HOPC algorithm, and 1.92% higher than DMM_LBP feature fusion. On our database, the recognition rate of the algorithm is 97.98%, which is 3.98% higher than the MHPC algorithm. Conclusion MHPC is used to represent the action, and rotating it by certain angles supplements action information from different perspectives. Multi-perspective DMMs are generated by computing the distribution of overlapping points in the projected MHPC, which captures the spatial distribution of the absolute motion energy. Coordinate normalization reduces the intra-class variability. The experimental results show that multi-perspective DMMs not only resolve the difficulty of extracting features from the MHPC but also supplement the motion information of traditional DMMs. Human action recognition based on multi-perspective DMMs outperforms some existing methods. The new approach combines the point cloud method with the depth motion map method, giving full play to the advantages of both and weakening their disadvantages.
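The DMM generation from overlapping points and the coordinate normalization described in the Method part can be sketched as follows. This is a hedged illustration, not the paper's code: the tuple layout `(u, v, depth, frame)`, the function names, the linear min-max mapping to integer ranges 0..511 and 0..1023, and the choice to order overlapping points by frame index before pairing "adjacent" ones are all assumptions.

```python
from collections import defaultdict

def normalize(values, max_index):
    """Linearly map values onto integer coordinates 0..max_index (assumed scheme)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0] * len(values)
    return [round((v - lo) / span * max_index) for v in values]

def dmm_from_projection(points, u_max=511, v_max=1023, d_max=511):
    """Build a sparse DMM from projected points given as (u, v, depth, frame).

    The defaults match the XOY view: x -> 0..511, y -> 0..1023, z -> 0..511.
    Each pixel is the sum of absolute depth differences between adjacent
    overlapping points that belong to different frames.
    """
    us = normalize([p[0] for p in points], u_max)
    vs = normalize([p[1] for p in points], v_max)
    ds = normalize([p[2] for p in points], d_max)

    # Group points that fall on the same pixel after normalization.
    cells = defaultdict(list)
    for p, u, v, d in zip(points, us, vs, ds):
        cells[(u, v)].append((p[3], d))

    dmm = {}
    for pixel, hits in cells.items():
        hits.sort()  # order overlapping points by frame index (assumption)
        energy = 0
        for (f0, d0), (f1, d1) in zip(hits, hits[1:]):
            if f0 != f1:  # pairs from the same frame carry no motion
                energy += abs(d1 - d0)
        dmm[pixel] = energy
    return dmm

# Three overlapping points from frames 0, 1, 2 and one isolated point.
pts = [(0.0, 0.0, 0.0, 0), (0.0, 0.0, 1.0, 1),
       (0.0, 0.0, 0.0, 2), (1.0, 1.0, 0.0, 0)]
dmm = dmm_from_projection(pts)
```

The result is a dictionary keyed by pixel coordinates rather than a dense image, which keeps the sketch short; a dense 512 x 1024 array would be the natural input for the HOG step.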