TL;DR: Fine-tuning image and video generation models to 'draw actions' as visual patterns could lead us to powerful agents for robotics and GUI apps.
Diffusion Agents