目的 基于马尔可夫随机场(MRF)的变分光流计算是一种较为鲁棒的光流计算方法，但是计算效率很低。置信传播算法(Brief Propagation method，BP) 是一种针对MRF较为高效的全局优化算法。本文提出了一种MRF变分光流计算模型并采用并行BP方法实现，极大提高计算效率。方法 提出的MRF变分光流计算模型中的数据项采用了Horn等根据灰度守恒假设得到光流基本约束方程，并采用非平方惩罚函数进行调整以平滑边界影响。为在CUDA平台上实现高效并行处理，本文提出了一种优化的基于置信传播的MRF并行光流计算方法。该优化方法在采用置信传播最小化MRF光流能量函数时，采用了一种4层的三维网络结构进行并行计算，每层对应MRF四邻域模型中的一个方向的信息传播，同时在每层中为每个像素分配多个线程采用并行降维法计算所要传递的信息，大大降低单线程计算负荷，大幅度提高计算效率。结果 采用旋转小球图像序列进行实验，计算效率提高314倍；采用旋转小球、Yosemite山谷和RubberWhale三种不同图像序列，与Horn算法，Weickert算法，Khalid并行Lucas算法，Grauer-Gray并行MRF算法进行对比实验，本方法得到最低的平均端点误差(AEE)，分别为(0.13，0.55和0.34)。 结论 本文提出了一种新的MRF光流计算模型，并在CUDA平台上实现了并行优化计算。实验结果表明，本文提出的并行计算方法在保持计算精度的同时极大提高了计算效率。本方法对内存需求巨大，在处理高分辨率图像时，限制了其采样点数，难以计算大位移。
Abstract: Objective Markov random field (MRF) model for variational optical flow computing is a robust optical flow computing algorithm. The MRF variational energy function includes the data term and the smooth term. The Brief Propagation method (BP method) is a high effective global method to minimize the MRF energy function by transmitting massage from one pixel to next along four directions iteratively, however the computational cost of it is still high. The computational complexity of BP method with 3-level data pyramid is O(N?L2?(10.5T+2)), where N is the number of the pixels in the image, T is the iterative times and L is the width of spatial sampling. We proposed a new MRF model to calculate optical flow, and proposed a parallel method to minimize the MRF energy function, which can dramatically reduce the computational time. Method The data term in the proposed MRF variational energy function for optical flow computing is from the optical flow constraint equation derived by the brightness constancy assumption proposed by Horn, and is smoothed by a non-square penalty function to reduce the boundary effect. To realize parallel computing on CUDA platform, we proposed an optimal parallel Brief Propagation method to minimize the MRF energy function. This optimal method used a 3D grid with four levels to perform the parallel method. In different level, the massage was transmitted along different direction in the neighborhood. In each level, we assigned L threads to one pixel, and used parallel dimension reduction method to calculate the massage to be transmitted, thus dramatically reduced the computational load of one thread and dramatically reduced the computational time. The dimension reduction method transforms the 2D energy function into two independent 1D energy functions, and uses lower parabola envelope based method to calculate the 1D energy function, whose computational complexity is O(4L2). The parallel dimension reduction method simultaneously calculates the MRF energy functions of the different rows (or columns) in the label region of optical flow, whose size is L?L, by allowing each thread to correspond to a row (or column) in the region. Thus, the computational complexity of the proposed parallel dimension reduction method is reduced to O(4L). All parallel processing steps were completed on CUDA platform for general parallel computing on GPU. The first step is calculating the data term by assigning N threads, and letting each thread correspond to a pixel in the image. The corresponding computational complexity is O(L2); The second step is using the proposed parallel dimension reduction method to calculate the massage and transmit it to up, down, left and right for T times from the first level to the third level in the data pyramid with 3 levels by assigning 4L threads to a pixel in the image. The corresponding computational complexity is O(3?4?L?T); The last step is outputting the optical flow corresponding to the minimal massage in each pixel by assigning N threads and letting each thread correspond to a pixel in the image. The corresponding computational complexity is O(L2); Thus, the entire computational complexity of this method is only O(2L2+12L?T) with 3-level data pyramid. Result In the experiment for rotational sphere image sequence, the computational time was only 0.27S, and the corresponding computational time of conventional method was 85S. The computational efficiency of proposed method was more than 314 times than that of conventional method. Our proposed method was as fast as Horn method, and much faster than Weickert method (6S) and Grauer-Gray method (0.68S), but slower than Khalid method (0.01S). In comparative experiments on errors with Horn method, Weickert method, Khalid method and Grauer-Gray method using three different image sequences (the sphere image sequence with rotational moving, the Yosemite Valley image sequence with high change of brightness and the RubberWhale image sequence with many moving objects), the proposed method achieved the lowest Average Endpoint Error (AEE) (0.13, 0.55, 0.34 respectively) on all the testing image sequences. Conclusion This paper proposed a novel parallel computing strategy of variational optical flow based on MRF model by using a 3D grid with four levels to transmit the massage in Brief Propagation method along four directions simultaneously, where multi-threads were assigned to one pixel for calculating massage to be transmitted. The experiment results indicate that our proposed model has high computational efficiency and can obtain accurate optical flow from image sequences with rotational moving, high change of brightness and many moving objects, however our proposed method needs a great amount of memory, which limits the width of spatial sampling. Therefore, our proposed method is not suitable for calculating optical flows with large displacement from image sequence with high resolution.