Research progress in human-like indoor scene interaction
Vol. 29, Issue 6, Pages: 1575-1606 (2024)
Published: 16 June 2024
DOI: 10.11834/jig.240004
Du Tao, Hu Ruizhen, Liu Libin, Yi Li, Zhao Hao. 2024. Research progress in human-like indoor scene interaction. Journal of Image and Graphics, 29(06):1575-1606
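Human intelligence evolved through interaction with the environment, so enabling agents to interact autonomously with their environment is key to advancing the evolution of intelligence. Autonomous environment interaction is a research topic spanning multiple disciplines, including computer graphics, computer vision, and robotics. It has attracted wide attention, and the academic community has carried out a series of studies on it from different perspectives and technical dimensions. This paper focuses on human-like interaction in indoor scenes and comprehensively reviews research progress on three basic elements that digital humans and robots need when learning to complete specific interaction tasks in indoor environments: simulation interaction platforms, scene interaction data, and interaction generation algorithms. On building simulated interaction environments, the paper reviews the simulation techniques involved and their research progress, and introduces representative simulation platforms for human-like interaction. On constructing scene interaction data, it describes in detail the state of research in China and abroad from three angles: scene interaction perception datasets, scene interaction motion datasets, and the efficient scaling of interaction data. On human-like interaction perception and generation, it introduces work on interaction-oriented scene affordance analysis and, taking interaction generation as the thread, reviews work on digital human-scene interaction generation and robot-scene interaction generation. Based on this review and discussion, the paper finally summarizes the challenges the field still faces in four respects (interaction simulation, interaction data, interaction perception, and interaction generation) and looks ahead to future development trends.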
Human intelligence evolves through interaction with the environment, which makes autonomous interaction between intelligent agents and the environment a key factor in advancing intelligence. Autonomous interaction with the environment is a research topic that involves multiple disciplines, such as computer graphics, computer vision, and robotics, and it has attracted significant attention in recent years. In this study, we focus on human-like interaction in indoor environments and review research progress on three fundamental components: simulation interaction platforms, scene interaction data, and interaction generation algorithms for digital humans and robots. Regarding simulation interaction platforms, we survey representative simulation methods for virtual humans, objects, and human-object interaction. Specifically, we cover critical algorithms for articulated rigid-body simulation, deformable-body and cloth simulation, fluid simulation, contact and collision, and multi-body multi-physics coupling. In addition, we introduce several popular simulation platforms that are readily available to practitioners in the graphics, robotics, and machine learning communities. We classify these platforms into two main categories: simulators that focus on single-physics systems and simulators that support multi-physics systems. We review typical platforms in both categories and discuss their advantages for human-like indoor-scene interaction. Finally, we briefly discuss several emerging trends in the physical simulation community that suggest promising future directions: developing a full-featured simulator for multi-physics multi-body systems, equipping modern simulation platforms with differentiability, and combining physics principles with insights from learning techniques.
Regarding scene interaction data, we provide an in-depth review of the latest developments and trends in datasets that support the understanding and generation of human-scene interactions. We concentrate on what agents need in order to improve simulation and motion generation: perceiving scenes from an interaction-oriented viewpoint, assimilating interaction information, and recognizing human interaction patterns. Our review spans three areas: perception datasets for human-scene interaction, datasets of interaction motion, and methods for scaling data efficiently. Perception datasets facilitate a deeper understanding of 3D scenes, covering geometry, structure, functionality, and motion. They offer resources for interaction affordances, grasping poses, interactive components, and object positioning. Motion datasets, which are essential for synthesizing interactions, support interaction movement analysis, including motion segmentation, tracking, dynamic reconstruction, action recognition, and prediction. The fidelity and breadth of these datasets are vital for creating lifelike interactions. We also discuss the challenges of scaling, including the limitations of manual annotation and specialized hardware, and explore current solutions such as cost-effective capture systems, dataset integration, and data augmentation, which enable the large-scale interaction models needed to advance human-scene interaction research. For robot-scene interaction, this study emphasizes the importance of affordance, that is, the action possibilities that objects or environments offer to users. It discusses approaches for detecting and analyzing affordance at different granularities, as well as affordance modeling techniques that combine multi-source and multimodal data.
For digital human-scene interaction, this study gives a detailed introduction to methods for simulating and generating human motion, with particular attention to recent techniques based on deep learning and generative models. Building on this foundation, the study reviews ways to represent a scene and recent approaches that achieve high-quality human-scene interaction simulation. Finally, we discuss the challenges and future development trends in this field.
Keywords: environment interaction; interaction simulation; interaction data; interaction perception; interaction generation
Ackerman M J. 1998. The visible human project. Proceedings of the IEEE, 86(3): 504-511 [DOI: 10.1109/5.662875http://dx.doi.org/10.1109/5.662875]
Ahn M, Brohan A, Brown N, Chebotar Y, Cortes O, David B, Finn C, Fu C Y, Gopalakrishnan K, Hausman K, Herzog A, Ho D, Hsu J, Ibarz J, Ichter B, Irpan A, Jang E, Ruano R J, Jeffrey K, Jesmonth S, Joshi N J, Julian R, Kalashnikov D, Kuang Y H, Lee K H, Levine S, Lu Y, Luu L, Parada C, Pastor P, Quiambao J, Rao K, Rettinghouse J, Reyes D, Sermanet P, Sievers N, Tan C, Toshev A, Vanhoucke V, Xia F, Xiao T, Xu P, Xu S C, Yan M Y and Zeng A. 2022. Do as I can, not as I say: grounding language in robotic affordances [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2204.01691.pdfhttps://arxiv.org/pdf/2204.01691.pdf
Akkaya I, Andrychowicz M, Chociej M, Litwin M, McGrew B, Petron A, Paino A, Plappert M, Powell G, Ribas R, Schneider J, Tezak N, Tworek J, Welinder P, Weng L L, Yuan Q M, Zaremba W and Zhang L. 2019. Solving rubik’s cube with a robot hand [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/1910.07113.pdfhttps://arxiv.org/pdf/1910.07113.pdf
Alexanderson S, Nagy R, Beskow J and Henter G E. 2023. Listen, denoise, action! Audio-driven motion synthesis with diffusion models. ACM Transactions on Graphics, 42(4): #44 [DOI: 10.1145/3592458http://dx.doi.org/10.1145/3592458]
Andrews S and Erleben K. 2021. Contact and friction simulation for computer graphics//ACM SIGGRAPH 2021 Courses. [s.l.]: ACM: #2 [DOI: 10.1145/3450508.3464571http://dx.doi.org/10.1145/3450508.3464571]
Andrychowicz O M, Baker B, Chociej M, Józefowicz R, McGrew B, Pachocki J, Petron A, Plappert M, Powell G, Ray A, Schneider J, Sidor S, Tobin J, Welinder P, Weng L L and Zaremba W. 2020. Learning dexterous in-hand manipulation. The International Journal of Robotics Research, 39(1): 3-20 [DOI: 10.1177/0278364919887447http://dx.doi.org/10.1177/0278364919887447]
Ao T L, Gao Q Z, Lou Y K, Chen B Q and Liu L B. 2022. Rhythmic gesticulator: rhythm-aware co-speech gesture synthesis with hierarchical neural embeddings. ACM Transactions on Graphics, 41(6): 1-19 [DOI: 10.1145/3550454.3555435http://dx.doi.org/10.1145/3550454.3555435]
Ao T L, Zhang Z Y and Liu L B. 2023. GestureDiffuCLIP: gesture diffusion model with CLIP latents. ACM Transactions on Graphics, 42(4): #42 [DOI: 10.1145/3592097http://dx.doi.org/10.1145/3592097]
Arunachalam S P, Silwal S, Evans B and Pinto L. 2023. Dexterous imitation made easy: a learning-based framework for efficient dexterous manipulation//Proceedings of 2023 IEEE International Conference on Robotics and Automation (ICRA). London, England: IEEE: 5954-5961 [DOI: 10.1109/icra48891.2023.10160275http://dx.doi.org/10.1109/icra48891.2023.10160275]
Azadi S, Shah A, Hayes T, Parikh D and Gupta S. 2023. Make-an-animation: large-scale text-conditional 3D human motion generation//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE: 14993-15002 [DOI: 10.1109/ICCV51070.2023.01381http://dx.doi.org/10.1109/ICCV51070.2023.01381]
Bargteil A W, Shinar T and Kry P G. 2020. An introduction to physics-based animation//SIGGRAPH Asia 2020 Courses. [s.l.]: ACM: #5 [DOI: 10.1145/3415263.3419147http://dx.doi.org/10.1145/3415263.3419147]
Barquero G, Escalera S and Palmero C. 2023. BelFusion: latent diffusion for behavior-driven human motion prediction//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 2317-2327 [DOI: 10.1109/ICCV51070.2023.00220http://dx.doi.org/10.1109/ICCV51070.2023.00220]
Batty C, Bertails F and Bridson R. 2007. A fast variational framework for accurate solid-fluid coupling//ACM SIGGRAPH 2007 Papers. San Diego, USA: ACM: #100 [DOI: 10.1145/1275808.1276502http://dx.doi.org/10.1145/1275808.1276502]
Becker M, Ihmsen M and Teschner M. 2009. Corotated SPH for deformable solids//Proceedings of the 5th Eurographics conference on Natural Phenomena. Munich, Germany: Eurographics Association: 27-34
Bender J, Erleben K and Trinkle J. 2014. Interactive simulation of rigid body dynamics in computer graphics. Computer Graphics Forum, 33(1): 246-270 [DOI: 10.1111/cgf.12272http://dx.doi.org/10.1111/cgf.12272]
Bender J and Koschier D. 2015. Divergence-free smoothed particle hydrodynamics//The 14th ACM SIGGRAPH/Eurographics Symposium on Computer Animation. Los Angeles, USA: ACM: 147-155 [DOI: 10.1145/2786784.2786796http://dx.doi.org/10.1145/2786784.2786796]
Bhatnagar B L, Xie X H, Petrov I A, Sminchisescu C, Theobalt C and Pons-Moll G. 2022. BEHAVE: dataset and method for tracking human object interactions//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 15914-15925 [DOI: 10.1109/cvpr52688.2022.01547http://dx.doi.org/10.1109/cvpr52688.2022.01547]
Bouaziz S, Martin S, Liu T T, Kavan L and Pauly M. 2014. Projective dynamics: fusing constraint projections for fast simulation. ACM Transactions on Graphics, 33(4): #154 [DOI: 10.1145/2601097.2601116http://dx.doi.org/10.1145/2601097.2601116]
Brahmbhatt S, Ham C, Kemp C C and Hays J. 2019. ContactDB: analyzing and predicting grasp contact via thermal imaging//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8701-8711 [DOI: 10.1109/cvpr.2019.00891http://dx.doi.org/10.1109/cvpr.2019.00891]
Brahmbhatt S, Tang C C, Twigg C D, Kemp C C and Hays J. 2020. ContactPose: a dataset of grasps with object contact and hand pose//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 361-378 [DOI: 10.1007/978-3-030-58601-0_22http://dx.doi.org/10.1007/978-3-030-58601-0_22]
Brohan A, Brown N, Carbajal J, Chebotar Y, Chen X, Choromanski K, Ding T L, Driess D, Dubey A, Finn C, Florence P, Fu C Y, Arenas M G, Gopalakrishnan K, Han K H, Hausman K, Herzog A, Hsu J, Ichter B, Irpan A, Joshi N, Julian R, Kalashnikov D, Kuang Y H, Leal I, Lee L, Lee T W E, Levine S, Lu Y, Michalewski H, Mordatch I, Pertsch K, Rao K, Reymann K, Ryoo M, Salazar G, Sanketi P, Sermanet P, Singh J, Singh A, Soricut R, Tran H, Vanhoucke V, Vuong Q, Wahid A, Welker S, Wohlhart P, Wu J L, Xia F, Xiao T, Xu P, Xu S C, Yu T H and Zitkovich B. 2023. RT-2: vision-language-action models transfer web knowledge to robotic control [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2307.15818.pdfhttps://arxiv.org/pdf/2307.15818.pdf
Büttner M. 2015. Motion matching-the road to next gen animation [EB/OL]. [2023-12-20]. https://www.youtube.com/watch?v=z_wpgHFSWss&t=658shttps://www.youtube.com/watch?v=z_wpgHFSWss&t=658s
Catto E. 2023. Box2D [EB/OL]. [2023-12-20]. https://github.com/erincatto/box2dhttps://github.com/erincatto/box2d
Chang A X, Funkhouser T, Guibas L, Hanrahan P, Huang Q X, Li Z M, Savarese S, Savva M, Song S R, Su H, Xiao J X, Yi L and Yu F. 2015. Shapenet: an information-rich 3d model repository [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/1512.03012.pdfhttps://arxiv.org/pdf/1512.03012.pdf
Chao Y W, Yang W, Xiang Y, Molchanov P, Handa A, Tremblay J, Narang Y S, Van Wyk K, Iqbal U, Birchfield S, Kautz J and Fox D. 2021. DexYCB: a benchmark for capturing hand grasping of objects//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 9040-9049 [DOI: 10.1109/cvpr46437.2021.00893http://dx.doi.org/10.1109/cvpr46437.2021.00893]
Chen J, Gao D F, Lin K Q and Shou M Z. 2023a. Affordance grounding from demonstration video to target image//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE: 6799-6808 [DOI: 10.1109/CVPR52729.2023.00657http://dx.doi.org/10.1109/CVPR52729.2023.00657]
Chen L H, Zhang J W, Li Y W, Pang Y R, Xia X B and Liu T L. 2023b. HumanMAC: masked motion completion for human motion prediction//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE: 9510-9521 [DOI: 10.1109/ICCV51070.2023.00875http://dx.doi.org/10.1109/ICCV51070.2023.00875]
Chen S R, Wu A and Liu C K. 2023c. Synthesizing dexterous nonprehensile pregrasp for ungraspable objects//Proceedings of 2023 ACM SIGGRAPH Conference. Los Angeles, USA: Association for Computing Machinery: #10 [DOI: 10.1145/3588432.3591528http://dx.doi.org/10.1145/3588432.3591528]
Chen T, Xu J and Agrawal P. 2022a. A system for general in-hand object Re-orientation//Proceedings of 2022 Conference on Robot Learning. London, UK: PMLR: 297-307
Chen X, Jiang B, Liu W, Huang Z L, Fu B, Chen T and Yu G. 2023d. Executing your commands via motion diffusion in latent space//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 18000-18010 [DOI: 10.1109/cvpr52729.2023.01726http://dx.doi.org/10.1109/cvpr52729.2023.01726]
Chen X W, Ni X Y, Zhu B, Wang B and Chen B Q. 2022b. Simulation and optimization of magnetoelastic thin shells. ACM Transactions on Graphics, 41(4): #61 [DOI: 10.1145/3528223.3530142http://dx.doi.org/10.1145/3528223.3530142]
Chen X X, Liu T Y, Zhao H, Zhou G Y and Zhang Y Q. 2022c. Cerberus Transformer: joint semantic, affordance and attribute parsing//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 19617-19626 [DOI: 10.1109/cvpr52688.2022.01903http://dx.doi.org/10.1109/cvpr52688.2022.01903]
Chen Y N, Li M C, Lan L, Su H, Yang Y and Jiang C F F. 2022d. A unified newton barrier method for multibody dynamics. ACM Transactions on Graphics, 41(4): #66 [DOI: 10.1145/3528223.3530076http://dx.doi.org/10.1145/3528223.3530076]
Chen Y P, Wu T H, Wang S J, Feng X D, Jiang J C, Lu Z Q, McAleer S, Dong H, Zhu S C and Yang Y D. 2022e. Towards human-level bimanual dexterous manipulation with reinforcement learning//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: NeurIPS: 5150-5163
Chu M Y and Thuerey N. 2017. Data-driven synthesis of smoke flows with CNN-based feature descriptors. ACM Transactions on Graphics, 36(4): #69 [DOI: 10.1145/3072959.3073643http://dx.doi.org/10.1145/3072959.3073643]
Coumans E and Bai Y. 2021. PyBullet, a python module for physic simulation for games, robotics and machine learning [EB/OL]. [2023-12-20]. http://pybullet.orghttp://pybullet.org
Dabral R, Mughal M H, Golyanik V and Theobalt C. 2023. MoFusion: a framework for denoising-diffusion-based motion synthesis//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 9760-9770 [DOI: 10.1109/cvpr52729.2023.00941http://dx.doi.org/10.1109/cvpr52729.2023.00941]
Damen D, Doughty H, Farinella G M, Fidler S, Furnari A, Kazakos E, Moltisanti D, Munro J, Perrett T, Price W and Wray M. 2018. Scaling egocentric vision: the EPIC-KITCHENS dataset//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 753-771 [DOI: 10.1007/978-3-030-01225-0_44http://dx.doi.org/10.1007/978-3-030-01225-0_44]
Damen D, Doughty H, Farinella G M, Fidler S, Furnari A, Kazakos E, Moltisanti D, Munro J, Perrett T, Price W and Wray M. 2021. The EPIC-KITCHENS dataset: collection, challenges and baselines. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(11): 4125-4141 [DOI: 10.1109/tpami.2020.2991965http://dx.doi.org/10.1109/tpami.2020.2991965]
Deng S H, Xu X, Wu C Z, Chen K and Jia K. 2021. 3D AffordanceNet: a benchmark for visual object affordance understanding//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 1778-1787 [DOI: 10.1109/cvpr46437.2021.00182http://dx.doi.org/10.1109/cvpr46437.2021.00182]
Driess D, Xia F, Sajjadi M S M, Lynch C, Chowdhery A, Ichter B, Wahid A, Tompson J, Vuong Q, Yu T H, Huang W L, Chebotar Y, Sermanet P, Duckworth D, Levine S, Vanhoucke V, Hausman K, Toussaint M, Greff K, Zeng A, Mordatch I and Florence P. 2023. PaLM-E: an embodied multimodal language model//Proceedings of the 40th International Conference on Machine Learning. Honolulu, USA: PMLR: 8469-8488
Erez T, Tassa Y and Todorov E. 2015. Simulation tools for model-based robotics: comparison of bullet, Havok, MuJoCo, ODE and PhysX//Proceedings of 2015 IEEE International Conference on Robotics and Automation (ICRA). Seattle, USA: IEEE: 4397-4404 [DOI: 10.1109/icra.2015.7139807http://dx.doi.org/10.1109/icra.2015.7139807]
Fan Z C, Taheri O, Tzionas D, Kocabas M, Kaufmann M, Black M J and Hilliges O. 2023. ARCTIC: a dataset for dexterous bimanual hand-object manipulation//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 12943-12954 [DOI: 10.1109/cvpr52729.2023.01244http://dx.doi.org/10.1109/cvpr52729.2023.01244]
Fang H J, Fang H S, Wang Y M, Ren J J, Chen J J, Zhang R, Wang W M and Lu C W. 2023a. Low-cost exoskeletons for learning whole-arm manipulation in the wild [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2309.14975.pdfhttps://arxiv.org/pdf/2309.14975.pdf
Fang H S, Fang H J, Tang Z Y, Liu J R, Wang C X, Wang J B, Zhu H Y and Lu C W. 2023b. RH20T: a comprehensive robotic dataset for learning diverse skills in one-shot//Proceedings of the 7th Conference on Robot Learning (CoRL 2023). Atlanta, USA: CoRL: #9
Faure F, Duriez C, Delingette H, Allard J, Gilles B, Marchesseau S, Talbot H, Courtecuisse H, Bousquet G, Peterlik I and Cotin S. 2012. SOFA: a multi-model framework for interactive physical simulation//Payan Y, ed. Soft Tissue Biomechanical Modeling for Computer Assisted Surgery. Berlin Heidelberg, Germany: Springer: 283-321 [DOI: 10.1007/8415_2012_125http://dx.doi.org/10.1007/8415_2012_125]
Featherstone R. 1984. Robot dynamics algorithms. Edinburgh, UK: The University of Edinburgh
Ferguson Z, Li M C, Schneider T, Gil-Ureta F, Langlois T, Jiang C F F, Zorin D, Kaufman D M and Panozzo D. 2021. Intersection-free rigid body dynamics. ACM Transactions on Graphics, 40(4): #183 [DOI: 10.1145/3450626.3459802http://dx.doi.org/10.1145/3450626.3459802]
Foster N and Fedkiw R. 2001. Practical animation of liquids//Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. Los Angeles, USA: ACM: 23-30 [DOI: 10.1145/383259.383261http://dx.doi.org/10.1145/383259.383261]
Freeman D, Frey E, Raichuk A, Girgin S, Mordatch I and Bachem O. 2021. Brax—a differentiable physics engine for large scale rigid body simulation//Proceedings of the 1st Neural Information Processing Systems Track on Datasets and Benchmarks 1. [s.l.]: NeurIPS: #404
Fu Z P, Cheng X X and Pathak D. 2023. Deep whole-body control: learning a unified policy for manipulation and locomotion//Proceedings of the 6th Conference on Robot Learning. Auckland, New Zealand: PMLR: 138-149
Garcia-Hernando G, Yuan S X, Baek S and Kim T K. 2018. First-person hand action benchmark with RGB-D videos and 3d hand pose annotations//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 409-419 [DOI: 10.1109/cvpr.2018.00050http://dx.doi.org/10.1109/cvpr.2018.00050]
Gästrin J. 2004. Physically based character simulation—rag doll behaviour in computer games. Stockholm, Sweden: Royal Institute of Technology
Geng H R, Li Z M, Geng Y R, Chen J Y, Dong H and Wang H. 2023a. PartManip: learning cross-category generalizable part manipulation policy from point cloud observations//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 2978-2988 [DOI: 10.1109/cvpr52729.2023.00291http://dx.doi.org/10.1109/cvpr52729.2023.00291]
Geng H R, Xu H L, Zhao C Y, Xu C, Yi L, Huang S Y and Wang H. 2023b. GAPartNet: cross-category domain generalizable object perception and manipulation via generalizable and actionable parts//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 7081-7091 [DOI: 10.1109/cvpr52729.2023.00684http://dx.doi.org/10.1109/cvpr52729.2023.00684]
Geng Y R, An B S, Geng H R, Chen Y P, Yang Y D and Dong H. 2023c. RLAfford: end-to-end affordance learning for robotic manipulation//Proceedings of 2023 IEEE International Conference on Robotics and Automation (ICRA). London, England: IEEE: 5880-5886 [DOI: 10.1109/icra48891.2023.10161571http://dx.doi.org/10.1109/icra48891.2023.10161571]
Google DeepMind. 2023. MuJoCo 3 [EB/OL].[2023-12-20]. https://github.com/google-deepmind/mujoco/discussions/1101https://github.com/google-deepmind/mujoco/discussions/1101
Goyal R, Ebrahimi Kahou S, Michalski V, Materzynska J, Westphal S, Kim H, Haenel V, Fruend I, Yianilos P, Mueller-Freitag M, Hoppe F, Thurau C, Bax I and Memisevic R. 2017. The “something something” video database for learning and evaluating visual common sense//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 5843-5851 [DOI: 10.1109/iccv.2017.622http://dx.doi.org/10.1109/iccv.2017.622]
Grauman K, Westbury A, Byrne E, Chavis Z, Furnari A, Girdhar R, Hamburger J, Jiang H, Liu M, Liu X Y, Martin M, Nagarajan T, Radosavovic I, Ramakrishnan S K, Ryan F, Sharma J, Wray M, Xu M M, Xu E Z, Zhao C, Bansal S, Batra D, Cartillier V, Crane S, Do T, Doulaty M, Erapalli A, Feichtenhofer C, Fu Q C, Gebreselasie A, Gonzlez C, Hillis J, Huang X H, Huang Y F, Jia W Q, Khoo W, Kolĭ J, Kottur S, Kumar A, Landini F, Li C, Li Y H, Li Z Q, Mangalam K, Modhugu R, Munro J, Murrell T, Nishiyasu T, Price W, Puentes P R, Ramazanova M, Sari L, Somasundaram K, Southerland A, Sugano Y, Tao R J, Vo M, Wang Y C, Wu X D, Yagi T, Zhao Z W, Zhu Y Y, Arbelez P, Crandall D, Damen D, Farinella G M, Fuegen C, Ghanem B, Ithapu V K, Jawahar C V, Joo H, Kitani K, Li H Z, Newcombe R, Oliva A, Park H S, Rehg J M, Sato Y, Shi J B, Shou M Z, Torralba A, Torresani L, Yan M F and Malik J. 2022. Ego4D: around the world in 3,000 hours of egocentric video//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 18973-18990 [DOI: 10.1109/CVPR52688.2022.01842http://dx.doi.org/10.1109/CVPR52688.2022.01842]
Ha H and Song S. 2022. FlingBot: the unreasonable effectiveness of dynamic manipulation for cloth unfolding//Proceedings of the 5th Conference on Robot Learning. London, UK: PMLR: 24-33
Hampali S, Rad M, Oberweger M and Lepetit V. 2020. HOnnotate: a method for 3d annotation of hand and object poses//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 3193-3203 [DOI: 10.1109/cvpr42600.2020.00326http://dx.doi.org/10.1109/cvpr42600.2020.00326]
Harvey F G, Yurick M, Nowrouzezahrai D and Pal C. 2020. Robust motion in-betweening. ACM Transactions on Graphics, 39(4): #60 [DOI: 10.1145/3386569.3392480http://dx.doi.org/10.1145/3386569.3392480]
Hassan M, Ceylan D, Villegas R, Saito J, Yang J M, Zhou Y, Black M J. 2021a. Stochastic scene-aware motion prediction//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 11354-11364 [DOI: 10.1109/iccv48922.2021.01118http://dx.doi.org/10.1109/iccv48922.2021.01118]
Hassan M, Choutas V, Tzionas D and Black M. 2019. Resolving 3D human pose ambiguities with 3D scene constraints//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2282-2292 [DOI: 10.1109/iccv.2019.00237http://dx.doi.org/10.1109/iccv.2019.00237]
Hassan M, Ghosh P, Tesch J, Tzionas D and Black M J. 2021b. Populating 3D scenes by learning human-scene Interaction//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: 14703-14713 [DOI: 10.1109/cvpr46437.2021.01447http://dx.doi.org/10.1109/cvpr46437.2021.01447]
Hassan M, Guo Y R, Wang T W, Black M, Fidler S and Peng X B. 2023. Synthesizing physical character-scene interactions//Proceedings of 2023 ACM SIGGRAPH Conference.Los Angeles, USA: Association for Computing Machinery: #63 [DOI: 10.1145/3588432.3591525http://dx.doi.org/10.1145/3588432.3591525]
Hasson Y, Varol G, Tzionas D, Kalevatykh I, Black M J, Laptev I and Schmid C. 2019. Learning joint reconstruction of hands and manipulated objects//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11799-11808 [DOI: 10.1109/cvpr.2019.01208http://dx.doi.org/10.1109/cvpr.2019.01208]
Heiden E, Macklin M, Narang Y, Fox D, Garg A and Ramos F. 2021. DiSECt: a differentiable simulation engine for autonomous robotic cutting//Proceedings of the 17th Robotics: Science and Systems. [s.l.]: Robotics: Science and Systems: #67 [DOI: 10.15607/RSS.2021.XVII.067http://dx.doi.org/10.15607/RSS.2021.XVII.067]
Henter G E, Alexanderson S and Beskow J. 2020. MoGlow: probabilistic and controllable motion synthesis using normalising flows. ACM Transactions on Graphics, 39(6): #236 [DOI: 10.1145/3414685.3417836http://dx.doi.org/10.1145/3414685.3417836]
Ho J and Salimans T. 2022. Classifier-free diffusion guidance [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2207.12598.pdfhttps://arxiv.org/pdf/2207.12598.pdf
Holden D, Komura T and Saito J. 2017. Phase-functioned neural networks for character control. ACM Transactions on Graphics, 36(4): #42 [DOI: 10.1145/3072959.3073663http://dx.doi.org/10.1145/3072959.3073663]
Holl P, Koltun V, Um K and Thuerey N. 2020. phiflow: a differentiable PDE solving framework for deep learning via physical simulations//Workshop on Differentiable Vision, Graphics, and Physics in Machine Learning at NeurIPS 2020. [s.l.]: [s.n.]
Hu R Z, Li W C, Van Kaick O, Shamir A, Zhang H and Huang H. 2017. Learning to predict part mobility from a single static snapshot. ACM Transactions on Graphics, 36(6): #227 [DOI: 10.1145/3130800.3130811http://dx.doi.org/10.1145/3130800.3130811]
Hu Y M, Anderson L, Li T M, Sun Q, Carr N, Ragan-Kelley J and Durand F. 2020. DiffTaichi: differentiable programming for physical simulation//Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR: 1-18
Huang D A, Nair S, Xu D F, Zhu Y K, Garg A, Li F F, Savarese S and Niebles J C. 2019. Neural task graphs: generalizing to unseen tasks from a single video demonstration//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 8557-8566 [DOI: 10.1109/cvpr.2019.00876http://dx.doi.org/10.1109/cvpr.2019.00876]
Huang S Y, Wang Z, Li P H, Jia B X, Liu T Y, Zhu Y X, Liang W and Zhu S C. 2023a. Diffusion-based generation, optimization, and planning in 3D scenes//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE: 16750-16761 [DOI: 10.1109/CVPR52729.2023.01607http://dx.doi.org/10.1109/CVPR52729.2023.01607]
Huang W L, Wang C, Zhang R H, Li Y Z, Wu J J and Li F F. 2023b. VoxPoser: composable 3D value maps for robotic manipulation with language models//Proceedings of the 7th Conference on Robot Learning. Atlanta, USA: PMLR: 540-562
Huang Y H, Taheri O, Black M J and Tzionas D. 2022. InterCap: joint markerless 3D tracking of humans and objects in interaction//Proceedings of the 44th DAGM German Conference on Pattern Recognition. Konstanz, Germany: Springer: 281-299 [DOI: 10.1007/978-3-031-16788-1_18http://dx.doi.org/10.1007/978-3-031-16788-1_18]
Huang Z A, Hu Y M, Du T, Zhou S Y, Su H, Tenenbaum J B and Gan C. 2021. PlasticineLab: a soft-body manipulation benchmark with differentiable physics//Proceedings of the 9th International Conference on Learning Representations. [s.l.]: ICLR: 1-18
Ihmsen M, Orthmann J, Solenthaler B, Kolb A and Teschner M. 2014. SPH fluids in computer graphics//Proceedings of the 35th Annual Conference of the European Association for Computer Graphics. Strasbourg, France: Eurographics: 21-42 [DOI: 10.2312/egst.20141034http://dx.doi.org/10.2312/egst.20141034]
Jauhri S, Peters J and Chalvatzaki G. 2022. Robot learning of mobile manipulation with reachability behavior priors. IEEE Robotics and Automation Letters, 7(3): 8399-8406 [DOI: 10.1109/lra.2022.3188109http://dx.doi.org/10.1109/lra.2022.3188109]
Jian J T, Liu X P, Li M Y, Hu R Z and Liu J. 2023. AffordPose: a large-scale dataset of hand-object interactions with affordance-driven hand pose//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 14667-14678 [DOI: 10.1109/ICCV51070.2023.01352http://dx.doi.org/10.1109/ICCV51070.2023.01352]
Jiang B, Chen X, Liu W, Yu J Y, Yu G and Chen T. 2023. MotionGPT: human motion as a foreign language//Proceedings of the 37th International Conference on Neural Information Processing Systems. New Orleans, USA: NeurIPS: #14795
Jiang C F F, Schroeder C, Teran J, Stomakhin A and Selle A. 2016. The material point method for simulating continuum materials//ACM SIGGRAPH 2016 Courses. Anaheim, USA: ACM: #24 [DOI: 10.1145/2897826.2927348http://dx.doi.org/10.1145/2897826.2927348]
Kalashnikov D, Irpan A, Pastor P, Ibarz J, Herzog A, Jang E, Quillen D, Holly E, Kalakrishnan M, Vanhoucke V and Levine S. 2018. QT-opt: scalable deep reinforcement learning for vision-based robotic manipulation [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2018.10293.pdfhttps://arxiv.org/pdf/2018.10293.pdf
Karniadakis G E, Kevrekidis I G, Lu L, Perdikaris P, Wang S F and Yang L. 2021. Physics-informed machine learning. Nature Reviews Physics, 3(6): 422-440 [DOI: 10.1038/s42254-021-00314-5http://dx.doi.org/10.1038/s42254-021-00314-5]
Karunratanakul K, Preechakul K, Suwajanakorn S and Tang S Y. 2023. Guided motion diffusion for controllable human motion synthesis//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 2151-2162 [DOI: 10.1109/ICCV51070.2023.00205http://dx.doi.org/10.1109/ICCV51070.2023.00205]
Kim J and Pollard N S. 2011. Fast simulation of skeleton-driven deformable body characters. ACM Transactions on Graphics, 30(5): #121 [DOI: 10.1145/2019627.2019640http://dx.doi.org/10.1145/2019627.2019640]
Kong H Y, Gong K H, Lian D Z, Mi M B and Wang X C. 2023. Priority-centric human motion generation in discrete latent space//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 14760-14770 [DOI: 10.1109/ICCV51070.2023.01360http://dx.doi.org/10.1109/ICCV51070.2023.01360]
Kumar S, Zamora J, Hansen N, Jangir R and Wang X L. 2023. Graph inverse reinforcement learning from diverse videos//Proceedings of the 6th Conference on Robot Learning. Auckland, New Zealand: PMLR: 55-66
Kwon T, Tekin B, Stühmer J, Bogo F and Pollefeys M. 2021. H2O: two hands manipulating objects for first person interaction recognition//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 10118-10128 [DOI: 10.1109/iccv48922.2021.00998http://dx.doi.org/10.1109/iccv48922.2021.00998]
Lan L, Kaufman D M, Li M C, Jiang C F F and Yang Y. 2022a. Affine body dynamics: fast, stable and intersection-free simulation of stiff materials. ACM Transactions on Graphics, 41(4): #67 [DOI: 10.1145/3528223.3530064http://dx.doi.org/10.1145/3528223.3530064]
Lan L, Ma G Q, Yang Y, Zheng C X, Li M C and Jiang C F F. 2022b. Penetration-free projective dynamics on the GPU. ACM Transactions on Graphics, 41(4): #29 [DOI: 10.1145/3528223.3530069]
Lanczos C. 2012. The Variational Principles of Mechanics. North Chelmsford: Courier Corporation
Lee J, Grey M X, Ha S, Kunz T, Jain S, Ye Y, Srinivasa S S, Stilman M and Liu C K. 2018. DART: dynamic animation and robotics toolkit. The Journal of Open Source Software, 3(22): #500 [DOI: 10.21105/joss.00500]
Lee S, Park M, Lee K and Lee J. 2019. Scalable muscle-actuated human simulation and control. ACM Transactions on Graphics, 38(4): #73 [DOI: 10.1145/3306346.3322972]
Li G, Jampani V, Sun D Q and Sevilla-Lara L. 2023a. LOCATE: localize and transfer object parts for weakly supervised affordance grounding//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE: 10922-10931 [DOI: 10.1109/CVPR52729.2023.01051]
Li M C, Ferguson Z, Schneider T, Langlois T, Zorin D, Panozzo D, Jiang C F F and Kaufman D M. 2020. Incremental potential contact: intersection- and inversion-free, large-deformation dynamics. ACM Transactions on Graphics, 39(4): #49 [DOI: 10.1145/3386569.3392425]
Li M C, Kaufman D M and Jiang C F F. 2021. Codimensional incremental potential contact. ACM Transactions on Graphics, 40(4): #170 [DOI: 10.1145/3450626.3459767]
Li P F, Tian B W, Shi Y L, Chen X X, Zhao H, Zhou G Y and Zhang Y Q. 2022a. TOIST: task oriented instance segmentation transformer with noun-pronoun distillation//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: NeurIPS: 17597-17611
Li P Z, Aberman K, Zhang Z H, Hanocka R and Sorkine-Hornung O. 2022b. GANimator: neural motion synthesis from a single sequence. ACM Transactions on Graphics, 41(4): #138 [DOI: 10.1145/3528223.3530157]
Li R H, Zhao J F, Zhang Y C, Su M Y, Ren Z P, Zhang H, Tang Y S and Li X. 2023b. FineDance: a fine-grained choreography dataset for 3D full body dance generation//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 10200-10209 [DOI: 10.1109/ICCV51070.2023.00939]
Li W, Ma Y H, Liu X P and Desbrun M. 2022c. Efficient kinetic simulation of two-phase flows. ACM Transactions on Graphics, 41(4): #114 [DOI: 10.1145/3528223.3530132]
Li W Y, Chen X L, Li P Z, Sorkine-Hornung O and Chen B Q. 2023c. Example-based motion synthesis via generative motion matching. ACM Transactions on Graphics, 42(4): #94 [DOI: 10.1145/3592395]
Li X Y and Chen D. 2023. The improved atrous spatial pyramid pooling and polarized self-attention based bottom-up panoptic segmentation. Journal of Image and Graphics, 28(8): 2410-2419 [DOI: 10.11834/jig.220279]
Li Z H, Xu Q Y, Ye X H, Ren B and Liu L G. 2023d. DiffFR: differentiable SPH-based fluid-rigid coupling for rigid body control. ACM Transactions on Graphics, 42(6): #179 [DOI: 10.1145/3618318]
Liang Y Z, Wang X H, Zhu L C and Yang Y. 2023. MAAL: multimodality-aware autoencoder-based affordance learning for 3D articulated objects//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 217-227 [DOI: 10.1109/ICCV51070.2023.00027]
Lin X Y, Qi C, Zhang Y C, Huang Z A, Fragkiadaki K, Li Y Z, Gan C and Held D. 2022. Planning with spatial-temporal abstraction from point clouds for deformable object manipulation//Proceedings of the 6th Conference on Robot Learning. Auckland, New Zealand: PMLR: 1640-1651
Lin X Y, Wang Y F, Olkin J and Held D. 2021. SoftGym: benchmarking deep reinforcement learning for deformable object manipulation//Proceedings of 2020 Conference on Robot Learning. Cambridge, USA: PMLR: 432-448
Ling H Y, Zinno F, Cheng G and van de Panne M. 2020. Character controllers using motion VAEs. ACM Transactions on Graphics, 39(4): #40 [DOI: 10.1145/3386569.3392422]
Liu H Y, Iwamoto N, Zhu Z H, Li Z Q, Zhou Y, Bozkurt E and Zheng B. 2022a. DisCo: disentangled implicit content and rhythm learning for diverse co-speech gestures synthesis//Proceedings of the 30th ACM International Conference on Multimedia. Lisbon, Portugal: ACM: 3764-3773 [DOI: 10.1145/3503161.3548400]
Liu L B and Hodgins J. 2018. Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning. ACM Transactions on Graphics, 37(4): #142 [DOI: 10.1145/3197517.3201315]
Liu L B, van de Panne M and Yin K K. 2016. Guided learning of control graphs for physics-based characters. ACM Transactions on Graphics, 35(3): #29 [DOI: 10.1145/2893476]
Liu L B, Yin K K, Wang B and Guo B N. 2013. Simulation and control of skeleton-driven soft body characters. ACM Transactions on Graphics, 32(6): #215 [DOI: 10.1145/2508363.2508427]
Liu M, Pan Z R, Xu K, Ganguly K and Manocha D. 2020. Deep differentiable grasp planner for high-DOF grippers//Proceedings of the 16th Robotics: Science and Systems. Corvallis, USA: Robotics: Science and Systems: #66 [DOI: 10.15607/rss.2020.xvi.066]
Liu W Y, Du Y L, Hermans T, Chernova S and Paxton C. 2023. StructDiffusion: language-guided creation of physically-valid structures using unseen objects//Proceedings of the 19th Robotics: Science and Systems. Daegu, Korea (South): Robotics: Science and Systems: #31 [DOI: 10.15607/rss.2023.xix.031]
Liu W Y, Paxton C, Hermans T and Fox D. 2022b. StructFormer: learning spatial structure for language-guided semantic rearrangement of novel objects//Proceedings of 2022 International Conference on Robotics and Automation. Philadelphia, USA: IEEE: 6322-6329 [DOI: 10.1109/icra46639.2022.9811931]
Liu Y Z, Liu Y, Jiang C, Lyu K, Wan W K, Shen H, Liang B Q, Fu Z J, Wang H and Yi L. 2022c. HOI4D: a 4D egocentric dataset for category-level human-object interaction//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 20981-20990 [DOI: 10.1109/cvpr52688.2022.02034]
Luo R, Xu W W, Shao T J, Xu H Y and Yang Y. 2019. Accelerated complex-step finite difference for expedient deformable simulation. ACM Transactions on Graphics, 38(6): #160 [DOI: 10.1145/3355089.3356493]
Lyu C Y, Bai K, Wu Y H, Desbrun M, Zheng C X and Liu X P. 2023. Building a virtual weakly-compressible wind tunnel testing facility. ACM Transactions on Graphics, 42(4): #125 [DOI: 10.1145/3592394]
Mahler J, Liang J, Niyaz S, Laskey M, Doan R, Liu X Y, Ojea J A and Goldberg K. 2017. Dex-Net 2.0: deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics//Proceedings of Robotics: Science and Systems XIII. Cambridge, USA: Robotics: Science and Systems: #58 [DOI: 10.15607/rss.2017.xiii.058]
Mahmood N, Ghorbani N, Troje N F, Pons-Moll G and Black M. 2019. AMASS: archive of motion capture as surface shapes//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 5441-5450 [DOI: 10.1109/iccv.2019.00554]
Maitin-Shepard J, Cusumano-Towner M, Lei J N and Abbeel P. 2010. Cloth grasp point detection based on multiple-view geometric cues with application to robotic towel folding//Proceedings of 2010 IEEE International Conference on Robotics and Automation. Anchorage, USA: IEEE: 2308-2315 [DOI: 10.1109/robot.2010.5509439]
Mandikal P and Grauman K. 2021. Learning dexterous grasping with object-centric visual affordances//Proceedings of 2021 IEEE International Conference on Robotics and Automation (ICRA). Xi’an, China: IEEE: 6169-6176 [DOI: 10.1109/icra48506.2021.9561802]
Mei C H and Shi J Y. 2003. Boundary element method based soft object simulation. Chinese Journal of Computers, 26(12): 1709-1716 [DOI: 10.3321/j.issn:0254-4164.2003.12.014]
Merel J, Tunyasuvunakool S, Ahuja A, Tassa Y, Hasenclever L, Pham V, Erez T, Wayne G and Heess N. 2020. Catch & carry: reusable neural controllers for vision-guided whole-body tasks. ACM Transactions on Graphics, 39(4): #39 [DOI: 10.1145/3386569.3392474]
Miller A T and Allen P K. 2004. Graspit! A versatile simulator for robotic grasping. IEEE Robotics and Automation Magazine, 11(4): 110-122 [DOI: 10.1109/MRA.2004.1371616]
Mo K C, Zhu S L, Chang A X, Yi L, Tripathi S, Guibas L J and Su H. 2019. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 909-918 [DOI: 10.1109/cvpr.2019.00100]
Müller M, Heidelberger B, Hennix M and Ratcliff J. 2007. Position based dynamics. Journal of Visual Communication and Image Representation, 18(2): 109-118 [DOI: 10.1016/j.jvcir.2007.01.005]
Müller M, Keiser R, Nealen A, Pauly M, Gross M and Alexa M. 2004. Point based animation of elastic, plastic and melting objects//Proceedings of 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. Grenoble, France: Eurographics Association: 141-151 [DOI: 10.1145/1028523.1028542]
Mur-Labadia L, Guerrero J J and Martinez-Cantin R. 2023. Multi-label affordance mapping from egocentric vision//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 5215-5226 [DOI: 10.1109/ICCV51070.2023.00483]
Nagabandi A, Konolige K, Levine S and Kumar V. 2020. Deep dynamics models for learning dexterous manipulation//Proceedings of the 3rd Annual Conference on Robot Learning. Osaka, Japan: PMLR: 1101-1112
Nagarajan T, Li Y H, Feichtenhofer C and Grauman K. 2020. Ego-topo: environment affordances from egocentric video//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 160-169 [DOI: 10.1109/cvpr42600.2020.00024]
Nau D, Cao Y, Lotem A and Munoz-Avila H. 1999. SHOP: simple hierarchical ordered planner//Proceedings of the 16th International Joint Conference on Artificial Intelligence, Volume 2. Stockholm, Sweden: Morgan Kaufmann Publishers Inc: 968-973
Nguyen T, Vu M N, Vuong A, Nguyen D, Vo T, Le N and Nguyen A. 2023. Open-vocabulary affordance detection in 3D point clouds//Proceedings of 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems. Detroit, USA: IEEE: 5692-5698 [DOI: 10.1109/IROS55552.2023.10341553]
Nocedal J and Wright S J. 1999. Numerical Optimization. New York, USA: Springer
NVIDIA Corporation. 2023a. NVIDIA Isaac Sim [EB/OL]. [2023-12-20]. https://developer.nvidia.com/isaac-sim
NVIDIA Corporation. 2023c. NVIDIA Warp [EB/OL]. [2023-12-20]. https://developer.nvidia.com/warp-python
Oprea S, Martinez-Gonzalez P, Garcia-Garcia A, Castro-Vargas J A, Orts-Escolano S and Garcia-Rodriguez J. 2019. A visually realistic grasping system for object manipulation and interaction in virtual reality environments. Computers and Graphics, 83: 77-86 [DOI: 10.1016/j.cag.2019.07.003]
Padalkar A, Pooley A, Jain A, Bewley A, Herzog A, Irpan A, Khazatsky A, Rai A, Singh A, Brohan A, Raffin A, Wahid A, Burgess-Limerick B, Kim B, Schölkopf B, Ichter B, Lu C W, Xu C, Finn C, Xu C F, Chi C, Huang C G, Chan C, Pan C, Fu C Y, Devin C, Driess D, Pathak D, Shah D, Büchler D, Kalashnikov D, Sadigh D, Johns E, Ceola F, Xia F, Stulp F, Zhou G Y, Sukhatme G S, Salhotra G, Yan G, Schiavi G, Kahn G, Su H, Fang H S, Shi H C, Amor H B, Christensen H I, Furuta H, Walke H, Fang H J, Mordatch I, Radosavovic I, Leal I, Liang J, Abou-Chakra J, Kim J, Peters J, Schneider J, Hsu J, Bohg J, Bingham J, Wu J J, Wu J L, Luo J L, Gu J Y, Tan J, Oh J, Malik J, Tompson J, Yang J, Lim J J, Silvério J, Han J, Rao K, Pertsch K, Hausman K, Go K, Gopalakrishnan K, Goldberg K, Byrne K, Oslund K, Kawaharazuka K, Zhang K, Rana K, Srinivasan K, Chen L Y, Pinto L, Tan L, Ott L, Lee L, Tomizuka M, Du M, Ahn M, Zhang M T, Ding M Y, Srirama M K, Sharma M, Kim M J, Kanazawa N, Hansen N, Heess N, Joshi N J, Suenderhauf N, Di Palo N, Shafiullah N M N, Mees O, Kroemer O, Sanketi P R, Wohlhart P, Xu P, Sermanet P, Sundaresan P, Vuong Q, Rafailov R, Tian R, Doshi R, Martín-Martín R, Mendonca R, Shah R, Hoque R, Julian R, Bustamante S, Kirmani S, Levine S, Moore S, Bahl S, Dass S, Sonawani S, Song S R, Xu S C, Haldar S, Adebola S, Guist S, Nasiriany S, Schaal S, Welker S, Tian S, Dasari S, Belkhale S, Osa T, Harada T, Matsushima T, Xiao T, Yu T H, Ding T L, Davchev T, Zhao T Z, Armstrong T, Darrell T, Jain V, Vanhoucke V, Zhan W, Zhou W X, Burgard W, Chen X, Wang X L, Zhu X H, Li X L, Lu Y, Chebotar Y, Zhou Y F, Zhu Y F, Xu Y, Wang Y X, Bisk Y, Cho Y, Lee Y, Cui Y C, Wu Y H, Tang Y J, Zhu Y K, Li Y Z, Iwasawa Y, Matsuo Y, Xu Z and Cui Z F. 2023. Open X-embodiment: robotic learning datasets and RT-X models [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2310.08864.pdf
Pari J, Shafiullah N M, Arunachalam S P and Pinto L. 2021. The surprising effectiveness of representation learning for visual imitation//18th Robotics: Science and Systems. New York City, USA: Robotics: Science and Systems: #10 [DOI: 10.15607/rss.2022.xviii.010]
Pavlakos G, Choutas V, Ghorbani N, Bolkart T, Osman A A, Tzionas D and Black M J. 2019. Expressive body capture: 3D hands, face, and body from a single image//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 10967-10977 [DOI: 10.1109/cvpr.2019.01123]
Peng X B, Abbeel P, Levine S and van de Panne M. 2018. DeepMimic: example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics, 37(4): #143 [DOI: 10.1145/3197517.3201311]
Peng X B, Guo Y R, Halper L, Levine S and Fidler S. 2022. ASE: large-scale reusable adversarial skill embeddings for physically simulated characters. ACM Transactions on Graphics, 41(4): #94 [DOI: 10.1145/3528223.3530110]
Peng X B, Ma Z, Abbeel P, Levine S and Kanazawa A. 2021. AMP: adversarial motion priors for stylized physics-based character control. ACM Transactions on Graphics, 40(4): #144 [DOI: 10.1145/3450626.3459670]
Peskin C S. 2002. The immersed boundary method. Acta Numerica, 11: 479-517 [DOI: 10.1017/S0962492902000077]
Petrovich M, Black M J and Varol G. 2021. Action-conditioned 3D human motion synthesis with Transformer VAE//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 10965-10975 [DOI: 10.1109/ICCV48922.2021.01080]
Pfaff T, Fortunato M, Sanchez-Gonzalez A and Battaglia P. 2021. Learning mesh-based simulation with graph networks//Proceedings of the 9th International Conference on Learning Representations. [s.l.]: ICLR: 1-18
Qi C, Lin X Y and Held D. 2022. Learning closed-loop dough manipulation using a differentiable reset module. IEEE Robotics and Automation Letters, 7(4): 9857-9864 [DOI: 10.1109/lra.2022.3191239]
Qin Y Z, Su H and Wang X L. 2022a. From one hand to multiple hands: imitation learning for dexterous manipulation from single-camera teleoperation. IEEE Robotics and Automation Letters, 7(4): 10873-10881 [DOI: 10.1109/lra.2022.3196104]
Qin Y Z, Wu Y H, Liu S W, Jiang H W, Yang R H, Fu Y and Wang X L. 2022b. DexMV: imitation learning for dexterous manipulation from human videos//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 570-587 [DOI: 10.1007/978-3-031-19842-7_33]
Raffel C, Shazeer N, Roberts A, Lee K, Narang S, Matena M, Zhou Y Q, Li W and Liu P J. 2020. Exploring the limits of transfer learning with a unified text-to-text Transformer. The Journal of Machine Learning Research, 21(1): #140
Raissi M, Perdikaris P and Karniadakis G E. 2019. Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378: 686-707 [DOI: 10.1016/j.jcp.2018.10.045]
Raissi M, Yazdani A and Karniadakis G E. 2020. Hidden fluid mechanics: learning velocity and pressure fields from flow visualizations. Science, 367(6481): 1026-1030 [DOI: 10.1126/science.aaw4741]
Rajeswaran A, Kumar V, Gupta A, Vezzani G, Schulman J, Todorov E and Levine S. 2018. Learning complex dexterous manipulation with deep reinforcement learning and demonstrations//Proceedings of the 14th Robotics: Science and Systems. Pittsburgh, USA: Robotics: Science and Systems: #49 [DOI: 10.15607/rss.2018.xiv.049]
Ren J W, Dai J F and Lin H W. 2024. Simulation of cloth with thickness based on isogeometric continuum elastic model. Journal of Image and Graphics, 29(1): 243-255 [DOI: 10.11834/jig.221199]
Robinson-Mosher A, Shinar T, Gretarsson J, Su J and Fedkiw R. 2008. Two-way coupling of fluids to rigid and deformable solids and shells. ACM Transactions on Graphics, 27(3): 1-9 [DOI: 10.1145/1360612.1360645]
Rong Y, Shiratori T and Joo H. 2021. FrankMocap: a monocular 3D whole-body pose estimation system via regression and integration//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 1749-1759 [DOI: 10.1109/iccvw54120.2021.00201]
Ruan L W, Liu J Y, Zhu B, Sueda S, Wang B and Chen B Q. 2021. Solid-fluid interaction with surface-tension-dominant contact. ACM Transactions on Graphics, 40(4): #120 [DOI: 10.1145/3450626.3459862]
Sadeghi F, Toshev A, Jang E and Levine S. 2018. Sim2Real viewpoint invariant visual servoing by recurrent control//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4691-4699 [DOI: 10.1109/cvpr.2018.00493]
Savva M, Chang A X, Hanrahan P, Fisher M and Nießner M. 2016. PiGraphs: learning interaction snapshots from observations. ACM Transactions on Graphics, 35(4): #139 [DOI: 10.1145/2897824.2925867]
Schneider T, Dumas J, Gao X F, Botsch M, Panozzo D and Zorin D. 2019. Poly-spline finite-element method. ACM Transactions on Graphics, 38(3): #19 [DOI: 10.1145/3313797]
She Q J, Hu R Z, Xu J Z, Liu M, Xu K and Huang H. 2022. Learning high-DOF reaching-and-grasping via dynamic representation of gripper-object interaction. ACM Transactions on Graphics, 41(4): #97 [DOI: 10.1145/3528223.3530091]
Shen S Y, Yang Y, Shao T J, Wang H, Jiang C F F, Lan L and Zhou K. 2021. High-order differentiable autoencoder for nonlinear model reduction. ACM Transactions on Graphics, 40(4): #68 [DOI: 10.1145/3450626.3459754]
Shi H C, Xu H Z, Clarke S, Li Y Z and Wu J J. 2023. RoboCook: long-horizon elasto-plastic object manipulation with diverse tools//Proceedings of the 7th Conference on Robot Learning. Atlanta, USA: PMLR: 642-660
Shi H C, Xu H Z, Huang Z A, Li Y Z and Wu J J. 2022. RoboCraft: learning to see, simulate, and shape elasto-plastic objects with graph networks//18th Robotics: Science and Systems. New York City, USA: Robotics: Science and Systems: #8 [DOI: 10.15607/rss.2022.xviii.008]
Shinar T, Schroeder C and Fedkiw R. 2008. Two-way coupling of rigid and deformable bodies//Proceedings of 2008 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. Dublin, Ireland: Eurographics Association: 95-103
Sifakis E and Barbič J. 2012. FEM simulation of 3D deformable solids: a practitioner’s guide to theory, discretization and model reduction//ACM SIGGRAPH 2012 Courses. Los Angeles, USA: ACM: #20 [DOI: 10.1145/2343483.2343501]
Sifakis E, Shinar T, Irving G and Fedkiw R. 2007. Hybrid simulation of deformable solids//Proceedings of 2007 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. San Diego, USA: Eurographics Association: 81-90
Sin F S, Schroeder D and Barbič J. 2013. Vega: non-linear FEM deformable object simulator. Computer Graphics Forum, 32(1): 36-48 [DOI: 10.1111/j.1467-8659.2012.03230.x]
Siyao L, Yu W J, Gu T P, Lin C Z, Wang Q, Qian C, Loy C C and Liu Z W. 2022. Bailando: 3D dance generation by actor-critic GPT with choreographic memory//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 11040-11049 [DOI: 10.1109/cvpr52688.2022.01077]
Stam J. 2023. Stable fluids. Seminal Graphics Papers: Pushing the Boundaries, 2: #81
Starke S, Mason I and Komura T. 2022. DeepPhase: periodic autoencoders for learning motion phase manifolds. ACM Transactions on Graphics, 41(4): #136 [DOI: 10.1145/3528223.3530178]
Starke S, Zhang H, Komura T and Saito J. 2019. Neural state machine for character-scene interactions. ACM Transactions on Graphics, 38(6): #209 [DOI: 10.1145/3355089.3356505]
Starke S, Zhao Y W, Komura T and Zaman K. 2020. Local motion phases for learning multi-contact character movements. ACM Transactions on Graphics, 39(4): #54 [DOI: 10.1145/3386569.3392450]
Starke S, Zhao Y W, Zinno F and Komura T. 2021. Neural animation layering for synthesizing martial arts movements. ACM Transactions on Graphics, 40(4): #92 [DOI: 10.1145/3450626.3459881]
Sun L J, Zeng T F, Fan J X and Wang W J. 2024. Double-view feature fusion network for LiDAR semantic segmentation. Journal of Image and Graphics, 29(1): 205-217 [DOI: 10.11834/jig.220943]
Sundaresan P, Antonova R and Bohg J. 2022. DiffCloud: real-to-sim from point clouds with differentiable simulation and rendering of deformable objects//Proceedings of 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Kyoto, Japan: IEEE: 10828-10835 [DOI: 10.1109/iros47612.2022.9981101]
Taheri O, Ghorbani N, Black M J and Tzionas D. 2020. GRAB: a dataset of whole-body human grasping of objects//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 581-600 [DOI: 10.1007/978-3-030-58548-8_34]
Takahashi T and Batty C. 2022. ElastoMonolith: a monolithic optimization-based liquid solver for contact-aware elastic-solid coupling. ACM Transactions on Graphics, 41(6): 1-19 [DOI: 10.1145/3550454.3555474]
Tan J, Zhang T N, Coumans E, Iscen A, Bai Y F, Hafner D, Bohez S and Vanhoucke V. 2018. Sim-to-real: learning agile locomotion for quadruped robots//Proceedings of the 14th Robotics: Science and Systems. Pittsburgh, USA: Robotics: Science and Systems: #10 [DOI: 10.15607/RSS.2018.XIV.010]
Tang J J, Zheng G, Yu J Y and Yang S B. 2023a. CoTDet: affordance knowledge prompting for task driven object detection//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 3045-3055 [DOI: 10.1109/ICCV51070.2023.00285]
Tang X J, Wu L J, Wang H, Hu B, Gong X, Liao Y C, Li S N, Kou Q L and Jin X G. 2023b. RSMT: real-time stylized motion transition for characters//ACM SIGGRAPH 2023 Conference Proceedings. Los Angeles, USA: ACM: #38 [DOI: 10.1145/3588432.3591514]
Teran J, Sifakis E, Blemker S S, Ng-Thow-Hing V, Lau C and Fedkiw R. 2005. Creating and simulating skeletal muscle from the visible human data set. IEEE Transactions on Visualization and Computer Graphics, 11(3): 317-328 [DOI: 10.1109/tvcg.2005.42]
Tevet G, Raab S, Gordon B, Shafir Y, Cohen-Or D and Bermano A H. 2023. Human motion diffusion model//Proceedings of the 11th International Conference on Learning Representations. Kigali, Rwanda: ICLR: #11970
Todorov E, Erez T and Tassa Y. 2012. MuJoCo: a physics engine for model-based control//Proceedings of 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Vilamoura-Algarve, Portugal: IEEE: 5026-5033 [DOI: 10.1109/iros.2012.6386109]
Toussaint M. 2015. Logic-geometric programming: an optimization-based approach to combined task and motion planning//Proceedings of the 24th International Joint Conference on Artificial Intelligence. Buenos Aires, Argentina: IJCAI: 1930-1936
Tseng J, Castellon R and Liu C K. 2023. EDGE: editable dance generation from music//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 448-458 [DOI: 10.1109/cvpr52729.2023.00051]
van den Oord A, Vinyals O and Kavukcuoglu K. 2017. Neural discrete representation learning//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc.: 6309-6318
Wan W K, Geng H R, Liu Y, Shan Z K, Yang Y D, Yi L and Wang H. 2023. UniDexGrasp++: improving dexterous grasping policy learning via geometry-aware curriculum and iterative generalist-specialist learning//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 3868-3879 [DOI: 10.1109/ICCV51070.2023.00360]
Wang B H, Matcuk G and Barbič J. 2020. Hand MRI dataset [EB/OL]. [2023-12-20]. http://www.jernejbarbic.com/hand-mri-dataset
Wang H, Sridhar S, Huang J W, Valentin J, Song S R and Guibas L J. 2019a. Normalized object coordinate space for category-level 6D object pose and size estimation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 2637-2646 [DOI: 10.1109/cvpr.2019.00275]
Wang H M. 2021. GPU-based simulation of cloth wrinkles at submillimeter levels. ACM Transactions on Graphics, 40(4): #169 [DOI: 10.1145/3450626.3459787]
Wang J B, Rong Y, Liu J Y, Yan S J, Lin D H and Dai B. 2022a. Towards diverse and natural scene-aware 3D human motion synthesis//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 20428-20437 [DOI: 10.1109/cvpr52688.2022.01981]
Wang J S, Xu H Z, Xu J W, Liu S F and Wang X L. 2021. Synthesizing long-term 3D human motion and interaction in 3D scenes//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 9396-9406 [DOI: 10.1109/cvpr46437.2021.00928]
Wang R C, Zhang J L, Chen J Y, Xu Y Z, Li P H, Liu T Y and Wang H. 2023. DexGraspNet: a large-scale robotic dexterous grasp dataset for general objects based on simulation//Proceedings of 2023 IEEE International Conference on Robotics and Automation (ICRA). London, England: IEEE: 11359-11366 [DOI: 10.1109/icra48891.2023.10160982]
Wang Y, Weidner N J, Baxter M A, Hwang Y, Kaufman D M and Sueda S. 2019b. REDMAX: efficient & flexible approach for articulated dynamics. ACM Transactions on Graphics, 38(4): #104 [DOI: 10.1145/3306346.3322952]
Wang Y A, Wu R H, Mo K C, Ke J Q, Fan Q N, Guibas L J and Dong H. 2022b. AdaAfford: learning to adapt manipulation affordance for 3D articulated objects via few-shot interactions//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 90-107 [DOI: 10.1007/978-3-031-19818-2_6]
Wang Z, Chen Y X, Liu T Y, Zhu Y X, Liang W and Huang S Y. 2022c. HUMANISE: language-conditioned human motion generation in 3D scenes//Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: NeurIPS: 14959-14971
Weinstein R, Teran J and Fedkiw R. 2006. Dynamic simulation of articulated rigid bodies with contact and collision. IEEE Transactions on Visualization and Computer Graphics, 12(3): 365-374 [DOI: 10.1109/tvcg.2006.48]
Werling K, Omens D, Lee J, Exarchos I and Liu C K. 2021. Fast and feature-complete differentiable physics for articulated rigid bodies with contact//17th Robotics: Science and Systems. [s.l.]: Robotics: Science and Systems
Won J, Gopinath D and Hodgins J. 2022. Physics-based character controllers using conditional VAEs. ACM Transactions on Graphics, 41(4): #96 [DOI: 10.1145/3528223.3530067]
Wong J, Tung A, Kurenkov A, Mandlekar A, Li F F, Savarese S and Martín-Martín R. 2022. Error-aware imitation learning from teleoperation data for mobile manipulation//Proceedings of the 5th Conference on Robot Learning. London, UK: PMLR: 1367-1378
Wu B T, Wang Z D and Wang H M. 2022. A GPU-based multilevel additive Schwarz preconditioner for cloth and deformable body simulation. ACM Transactions on Graphics, 41(4): #63 [DOI: 10.1145/3528223.3530085]
Wu R H, Ning C R and Dong H. 2023. Learning foresightful dense visual affordance for deformable object manipulation//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 10913-10922 [DOI: 10.1109/ICCV51070.2023.01005]
Xian Z, Zhu B, Xu Z J, Tung H Y, Torralba A, Fragkiadaki K and Gan C. 2022. FluidLab: a differentiable environment for benchmarking complex fluid manipulation//Proceedings of the 11th International Conference on Learning Representations. Kigali, Rwanda: ICLR: 1-19
Xiang F B, Qin Y Z, Mo K C, Xia Y K, Zhu H, Liu F C, Liu M H, Jiang H X, Yuan Y, Wang H, Yi L, Chang A X, Guibas L J and Su H. 2020. SAPIEN: a simulated part-based interactive environment//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11094-11104 [DOI: 10.1109/cvpr42600.2020.01111]
Xiang Y, Schmidt T, Narayanan V and Fox D. 2018. PoseCNN: a convolutional neural network for 6D object pose estimation in cluttered scenes//14th Robotics: Science and Systems. Pittsburgh, USA: Robotics: Science and Systems: #19
Xie X H, Bhatnagar B L and Pons-Moll G. 2023. Visibility aware human-object interaction tracking from single RGB camera//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 4757-4768 [DOI: 10.1109/cvpr52729.2023.00461]
Xu D F, Nair S, Zhu Y K, Gao J L, Garg A, Li F F and Savarese S. 2018. Neural task programming: learning to generalize across hierarchical tasks//Proceedings of 2018 IEEE International Conference on Robotics and Automation (ICRA). Brisbane, Australia: IEEE: 3795-3802 [DOI: 10.1109/ICRA.2018.8460689]
Xu Y Z, Wan W K, Zhang J L, Liu H R, Shan Z K, Shen H, Wang R C, Geng H R, Weng Y J, Chen J Y, Liu T Y, Yi L and Wang H. 2023. UniDexGrasp: universal robotic dexterous grasping via learning diverse proposal generation and goal-conditioned policy//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 4737-4746 [DOI: 10.1109/cvpr52729.2023.00459]
Xu Z J, He Z P and Song S R. 2022. Universal manipulation policy network for articulated objects. IEEE Robotics and Automation Letters, 7(2): 2447-2454 [DOI: 10.1109/lra.2022.3142397]
Yan X C, Hsu J, Khansari M, Bai Y F, Pathak A, Gupta A, Davidson J and Lee H. 2018. Learning 6-DOF grasping interaction via deep geometry-aware 3D representations//Proceedings of 2018 IEEE International Conference on Robotics and Automation (ICRA). Brisbane, Australia: IEEE: 3766-3773 [DOI: 10.1109/ICRA.2018.8460609]
Yang L X, Li K L, Zhan X Y, Wu F, Xu A R, Liu L and Lu C W. 2022a. OakInk: a large-scale knowledge repository for understanding hand-object interaction//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 20921-20930 [DOI: 10.1109/cvpr52688.2022.02028]
Yang T, Chang J, Ren B, Lin M C, Zhang J J and Hu S M. 2015. Fast multiple-fluid simulation using Helmholtz free energy. ACM Transactions on Graphics, 34(6): #201 [DOI: 10.1145/2816795.2818117]
Yang T Y, Arnaud S, Shah K, Yokoyama N, Clegg A W, Truong J, Undersander E, Maksymets O, Ha S, Kalakrishnan M, Mottaghi R, Batra D and Rai A. 2023a. LSC: language-guided skill coordination for open-vocabulary mobile pick-and-place [EB/OL]. [2023-12-20]. https://languageguidedskillcoordination.github.io/
Yang Y H, Zhai W, Luo H C, Cao Y, Luo J B and Zha Z J. 2023b. Grounding 3D object affordance from 2D interactions in images//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 10871-10881 [DOI: 10.1109/ICCV51070.2023.01001]
Yang Z S, Yin K K and Liu L B. 2022b. Learning to use chopsticks in diverse gripping styles. ACM Transactions on Graphics, 41(4): #95 [DOI: 10.1145/3528223.3530057]
Yao H Y, Song Z H, Chen B Q and Liu L B. 2022. ControlVAE: model-based learning of generative controllers for physics-based characters. ACM Transactions on Graphics, 41(6): #183 [DOI: 10.1145/3550454.3555434]
Yenamandra S, Ramachandran A, Yadav K, Wang A S, Khanna M, Gervet T, Yang T Y, Jain V, Clegg A, Turner J M, Kira Z, Savva M, Chang A X, Chaplot S D, Batra D, Mottaghi R, Bisk Y and Paxton C. 2023. HomeRobot: open-vocabulary mobile manipulation//Proceedings of the 7th Conference on Robot Learning. Atlanta, USA: PMLR: 1975-2011
Yin H, Varava A and Kragic D. 2021. Modeling, learning, perception, and control methods for deformable object manipulation. Science Robotics, 6(54): #8803 [DOI: 10.1126/scirobotics.abd8803]
Yin Z H, Huang B H, Qin Y Z, Chen Q F and Wang X L. 2023. Rotating without seeing: towards in-hand dexterity through touch//19th Robotics: Science and Systems. Daegu, Korea (South): Robotics: Science and Systems: #36 [DOI: 10.15607/rss.2023.xix.036]
Yokoyama N, Clegg A W, Undersander E, Ha S, Batra D and Rai A. 2023. Adaptive skill coordination for robotic mobile manipulation [EB/OL]. [2023-12-20]. https://arxiv.org/abs/2304.00410v1
Zakka K, Zeng A, Florence P, Tompson J, Bohg J and Dwibedi D. 2022. XIRL: cross-embodiment inverse reinforcement learning//Proceedings of the 5th Conference on Robot Learning. London, UK: PMLR: 537-546
Zeng F F, Liang B L, Liu Z and Wang J H. 2000. An interactive environment design based on digital glove. Journal of Image and Graphics, 5(2): 153-157
曾芬芳, 梁柏林, 刘镇, 王建华. 2000. 基于数据手套的人机交互环境设计. 中国图象图形学报, 5(2): 153-157 [DOI: 10.11834/jig.20000214]
Zhai W, Luo H C, Zhang J, Cao Y and Tao D C. 2021. One-shot object affordance detection in the wild. International Journal of Computer Vision, 130(1): 2472-2500 [DOI: 10.1007/s11263-022-01642-4]
Zhang H, Starke S, Komura T and Saito J. 2018. Mode-adaptive neural networks for quadruped motion control. ACM Transactions on Graphics, 37(4): #145 [DOI: 10.1145/3197517.3201366]
Zhang H, Ye Y T, Shiratori T and Komura T. 2021. ManipNet: neural manipulation synthesis with a hand-object spatial representation. ACM Transactions on Graphics, 40(4): #121 [DOI: 10.1145/3450626.3459830]
Zhang H T, Yuan Y, Makoviychuk V, Guo Y R, Fidler S, Peng X B and Fatahalian K. 2023a. Learning physically simulated tennis skills from broadcast videos. ACM Transactions on Graphics, 42(4): #95 [DOI: 10.1145/3592408]
Zhang J Z, Gireesh N, Wang J L, Fang X M, Xu C Y, Chen W G, Dai L and Wang H. 2023b. GAMMA: graspability-aware mobile manipulation policy learning based on online grasping pose fusion [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2309.15459.pdf
Zhang S W, Zhang Y, Ma Q L, Black M J and Tang S Y. 2020a. PLACE: proximity learning of articulation and contact in 3D environments//Proceedings of 2020 International Conference on 3D Vision (3DV). Fukuoka, Japan: IEEE: 642-651 [DOI: 10.1109/3dv50981.2020.00074]
Zhang Y, Hassan M, Neumann H, Black M J and Tang S Y. 2020b. Generating 3D people in scenes without people//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 6193-6203 [DOI: 10.1109/cvpr42600.2020.00623]
Zhao K F, Wang S F, Zhang Y, Beeler T and Tang S Y. 2022. Compositional human-scene interaction synthesis with semantic control//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 311-327 [DOI: 10.1007/978-3-031-20068-7_18]
Zhao K F, Zhang Y, Wang S F, Beeler T and Tang S Y. 2023a. Synthesizing diverse human motions in 3D indoor scenes//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 14692-14703 [DOI: 10.1109/ICCV51070.2023.01354]
Zhao T Z, Kumar V, Levine S and Finn C. 2023b. Learning fine-grained bimanual manipulation with low-cost hardware//19th Robotics: Science and Systems. Daegu, Korea (South): Robotics: Science and Systems: #16
Zhao Y, Wu R H, Chen Z H, Zhang Y R, Fan Q N, Mo K C and Dong H. 2023c. DualAfford: learning collaborative visual affordance for dual-gripper manipulation//Proceedings of the 11th International Conference on Learning Representations. Kigali, Rwanda: ICLR: #1971
Zheng J T, Zheng Q Y, Fang L X, Liu Y and Yi L. 2023. CAMS: CAnonicalized manipulation spaces for category-level functional hand-object manipulation synthesis//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 585-594 [DOI: 10.1109/cvpr52729.2023.00064]
Zheng M L, Wang B H, Huang J T and Barbič J. 2022. Simulation of hand anatomy using medical imaging. ACM Transactions on Graphics, 41(6): #273 [DOI: 10.1145/3550454.3555486]
Zhi Y H, Cun X D, Chen X L, Shen X, Guo W, Huang S L and Gao S H. 2023. LivelySpeaker: towards semantic-aware co-speech gesture generation//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 20750-20760 [DOI: 10.1109/ICCV51070.2023.01902]
Zhong C L, Zheng Y H, Zheng Y P, Zhao H, Yi L, Mu X D, Wang L, Li P F, Zhou G Y, Yang C, Zhang X L and Zhao J. 2023. 3D implicit transporter for temporally consistent keypoint discovery//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 3846-3857 [DOI: 10.1109/ICCV51070.2023.00358]
Zhu Z H, Wang J S, Qin Y Z, Sun D Q, Jampani V and Wang X L. 2023. ContactArt: learning 3D interaction priors for category-level articulated object and hand poses estimation [EB/OL]. [2023-12-20]. https://arxiv.org/pdf/2305.01618.pdf