Incorporating physics into data-driven computer vision

A Kadambi, C de Melo, CJ Hsieh… - Nature Machine …, 2023 - nature.com
Many computer vision techniques infer properties of our physical world from images.
Although images are formed through the physics of light and mechanics, computer vision …

Kubric: A scalable dataset generator

K Greff, F Belletti, L Beyer, C Doersch… - Proceedings of the …, 2022 - openaccess.thecvf.com
Data is the driving force of machine learning, with the amount and quality of training data
often being more important for the performance of a system than architecture and training …

BlenderProc2: A procedural pipeline for photorealistic rendering

M Denninger, D Winkelbauer, M Sundermeyer… - Journal of Open Source …, 2023 - elib.dlr.de
BlenderProc2 is a procedural pipeline that can render realistic images for the training of
neural networks. Our pipeline can be employed in various use cases, including …

Robot bionic vision technologies: A review

H Zhang, S Lee - Applied Sciences, 2022 - mdpi.com
The visual organ is important for animals to obtain information and understand the outside
world; however, robots cannot do so without a visual system. At present, the vision …

BlenderProc: Reducing the reality gap with photorealistic rendering

M Denninger, M Sundermeyer, D Winkelbauer… - 16th Robotics: Science …, 2020 - elib.dlr.de
BlenderProc is an open-source and modular pipeline for rendering photorealistic images of
procedurally generated 3D scenes which can be used for training data-hungry deep …

Adaptive procedural task generation for hard-exploration problems

K Fang, Y Zhu, S Savarese, L Fei-Fei - arXiv preprint arXiv:2007.00350, 2020 - arxiv.org
We introduce Adaptive Procedural Task Generation (APT-Gen), an approach to
progressively generate a sequence of tasks as curricula to facilitate reinforcement learning …

External camera-based mobile robot pose estimation for collaborative perception with smart edge sensors

S Bultmann, R Memmesheimer… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
We present an approach for estimating a mobile robot's pose with respect to the allocentric coordinates
of a network of static cameras using multi-view RGB images. The images are processed …

Cross-modality time-variant relation learning for generating dynamic scene graphs

J Wang, J Huang, C Zhang… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Dynamic scene graphs generated from video clips could help enhance the semantic visual
understanding in a wide range of challenging tasks such as environmental perception …

Informed pre-training on prior knowledge

L von Rueden, S Houben, K Cvejoski… - arXiv preprint arXiv …, 2022 - arxiv.org
When training data is scarce, the incorporation of additional prior knowledge can assist the
learning process. While it is common to initialize neural networks with weights that have …

Increasing the Robustness of Deep Learning Models for Object Segmentation: A Framework for Blending Automatically Annotated Real and Synthetic Data

AI Károly, S Tirczka, H Gao, IJ Rudas… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Recent problems in robotics can sometimes only be tackled using machine learning
technologies, particularly those that utilize deep learning (DL) with transfer learning …