Zero-shot robotic manipulation with pretrained image-editing diffusion models

K Black, M Nakamoto, P Atreya, H Walke… - arXiv preprint arXiv …, 2023 - arxiv.org
If generalist robots are to operate in truly unstructured environments, they need to be able to
recognize and reason about novel objects and scenarios. Such objects and scenarios might …

ZM-Net: Real-time zero-shot image manipulation network

H Wang, X Liang, H Zhang, DY Yeung… - arXiv preprint arXiv …, 2017 - arxiv.org
Many problems in image processing and computer vision (e.g., colorization, style transfer) can
be posed as 'manipulating' an input image into a corresponding output image given a user …
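
As a concrete illustration of the framing in this snippet, image manipulation can be viewed as a conditional image-to-image mapping f(image, guidance) → image. The sketch below uses a generic FiLM-style conditioning layer; the module, names, and dimensions are hypothetical placeholders for illustration, not ZM-Net's actual architecture.

```python
import torch
import torch.nn as nn

class ConditionalImageMapper(nn.Module):
    """Hypothetical conditional image-to-image network: a guidance vector
    (e.g. a style or instruction embedding) modulates intermediate features."""

    def __init__(self, channels=3, guidance_dim=64, hidden=32):
        super().__init__()
        self.encode = nn.Conv2d(channels, hidden, 3, padding=1)
        self.film = nn.Linear(guidance_dim, 2 * hidden)  # per-channel scale and shift
        self.decode = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, image, guidance):
        h = torch.relu(self.encode(image))
        scale, shift = self.film(guidance).chunk(2, dim=-1)
        h = h * scale[..., None, None] + shift[..., None, None]  # FiLM conditioning
        return self.decode(h)

net = ConditionalImageMapper()
edited = net(torch.randn(1, 3, 64, 64), torch.randn(1, 64))  # image + guidance -> image
```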

Open-world object manipulation using pre-trained vision-language models

A Stone, T Xiao, Y Lu, K Gopalakrishnan… - arXiv preprint arXiv …, 2023 - arxiv.org
For robots to follow instructions from people, they must be able to connect the rich semantic
information in human vocabulary, e.g., "can you get me the pink stuffed whale?", to their …

GenAug: Retargeting behaviors to unseen situations via generative augmentation

Z Chen, S Kiami, A Gupta, V Kumar - arXiv preprint arXiv:2302.06671, 2023 - arxiv.org
Robot learning methods have the potential for widespread generalization across tasks,
environments, and objects. However, these methods require large diverse datasets that are …

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

H Wu, Y Jing, C Cheang, G Chen, J Xu, X Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative pre-trained models have demonstrated remarkable effectiveness in language
and vision domains by learning useful representations. In this paper, we extend the scope of …

Zero-shot robot manipulation from passive human videos

H Bharadhwaj, A Gupta, S Tulsiani, V Kumar - arXiv preprint arXiv …, 2023 - arxiv.org
Can we learn robot manipulation for everyday tasks, only by watching videos of humans
doing arbitrary tasks in different unstructured settings? Unlike widely adopted strategies of …

Learning to see before learning to act: Visual pre-training for manipulation

L Yen-Chen, A Zeng, S Song, P Isola… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Does having visual priors (e.g., the ability to detect objects) facilitate learning to perform
vision-based manipulation (e.g., picking up objects)? We study this problem under the framework of …

A latent space of stochastic diffusion models for zero-shot image editing and guidance

CH Wu, F De la Torre - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Diffusion models generate images by iterative denoising. Recent work has shown that by
making the denoising process deterministic, one can encode real images into latent codes …
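
The deterministic encoding this snippet alludes to is DDIM-style sampling with the stochastic term set to zero: the same update rule, stepped forward in time, inverts a real image into a latent code that the reverse pass reconstructs. Below is a minimal sketch of that mechanism; the toy noise schedule and the placeholder `eps_model` are assumptions for illustration, not the paper's actual model.

```python
import torch

def ddim_step(x_t, t, t_next, alpha_bar, eps_model):
    """One deterministic DDIM update (eta = 0) from timestep t to t_next."""
    a_t, a_next = alpha_bar[t], alpha_bar[t_next]
    eps = eps_model(x_t, t)                                # predicted noise
    x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()  # predicted clean image
    return a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps

T = 50
betas = torch.linspace(1e-4, 0.02, T)           # toy linear schedule
alpha_bar = torch.cumprod(1 - betas, dim=0)
eps_model = lambda x, t: torch.zeros_like(x)    # stand-in for a trained network

x = torch.randn(1, 3, 8, 8)
for t in range(T - 1):                          # inversion: encode image -> latent
    x = ddim_step(x, t, t + 1, alpha_bar, eps_model)
latent = x
for t in range(T - 1, 0, -1):                   # generation: latent -> image
    latent = ddim_step(latent, t, t - 1, alpha_bar, eps_model)
```

Because each step is deterministic, the decoding loop retraces the encoding loop exactly, which is what makes editing and guidance in the resulting latent space possible.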

DeltaEdit: Exploring text-free training for text-driven image manipulation

Y Lyu, T Lin, F Li, D He, J Dong… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Text-driven image manipulation remains challenging in terms of both training and inference flexibility.
Conditional generative models depend heavily on expensive annotated training data …

SECANT: Self-expert cloning for zero-shot generalization of visual policies

L Fan, G Wang, DA Huang, Z Yu, L Fei-Fei… - arXiv preprint arXiv …, 2021 - arxiv.org
Generalization has been a long-standing challenge for reinforcement learning (RL). Visual
RL, in particular, can be easily distracted by irrelevant factors in high-dimensional …