Recently, Google DeepMind announced a foundation model called "Genie: Generative Interactive Environments." It is said to possess functionalities not found in existing AI, so let's explore this development here. Let's get started!

1. It can execute actions

According to Google DeepMind's blog, "We introduce Genie, a foundation world model trained from Internet videos that can generate an endless variety of playable (action-controllable) worlds from synthetic images, photographs, and even sketches." This suggests that it's not just about generating images but creating environments where you can execute actions and play. In simple terms, you could create a game from a single image. Sounds incredible!

2. It can learn actions through unsupervised learning

Creating an interactive environment from a single image, one in which you can actually perform actions, is an ambitious idea. Deciding which actions should be available seems especially challenging. However, Genie was trained on 30,000 hours of unlabeled video through unsupervised learning, so it learns actions without anyone labeling them. There is a vast amount of unlabeled video on the internet, which provides plenty of training material. Genie currently focuses on 2D games and robotics, but it seems applicable to many other fields in the future. Amazing!
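To get a feel for how actions could be learned from video that has no labels at all, here is a toy sketch in Python. It is not Genie's actual method or code. It assumes a made-up "video" in which each frame is just a character's (x, y) position, and it uses plain k-means clustering on the change between consecutive frames as a stand-in for learning a small set of discrete actions. All function names and the three hidden moves are hypothetical and exist only for illustration.

```python
import random


def make_unlabeled_video(num_frames, seed=0):
    """Simulate a toy 'video': a list of (x, y) positions with no action labels."""
    rng = random.Random(seed)
    hidden_moves = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # assumption: left, right, jump
    x, y = 0.0, 0.0
    frames = [(x, y)]
    for _ in range(num_frames - 1):
        dx, dy = rng.choice(hidden_moves)  # the chosen move is never recorded
        x += dx + rng.uniform(-0.1, 0.1)   # small noise, as in real footage
        y += dy + rng.uniform(-0.1, 0.1)
        frames.append((x, y))
    return frames


def frame_changes(frames):
    """Return the change between each pair of consecutive frames."""
    return [(b[0] - a[0], b[1] - a[1]) for a, b in zip(frames, frames[1:])]


def dist2(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda i: dist2(point, centers[i]))


def learn_latent_actions(changes, k, iterations=20):
    """Cluster frame changes into k groups with k-means; each center is one 'latent action'."""
    # Deterministic farthest-point initialization.
    centers = [changes[0]]
    while len(centers) < k:
        centers.append(max(changes, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in changes:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def step(frame, action_id, centers):
    """'Play' the toy world: apply a learned latent action to a frame."""
    dx, dy = centers[action_id]
    return (frame[0] + dx, frame[1] + dy)


if __name__ == "__main__":
    video = make_unlabeled_video(300)
    actions = learn_latent_actions(frame_changes(video), k=3)
    print("learned latent actions:", actions)
    print("next frame after action 0:", step((5.0, 5.0), 0, actions))
```

The point of the sketch is only that the learner never sees which move was made, yet the recovered cluster centers end up close to the hidden moves, and each one can then be used like a button to control the character. A real system works on pixels with neural networks rather than on positions with k-means.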

3. You can create games from your drawings

Genie appears to accept a wide range of images as input: not only real photos and artwork but also simple hand-drawn illustrations. Looking at the images below, a game has indeed been created.

4. New "AI agents" will emerge from here

Although Genie is about games for now, creating environments and executing actions within them can make us feel as if we are actually inside those environments. Talking about actions brings "AI agents" to mind: they choose from a set of options and execute actions on our behalf. If these environments can be built from our own illustrations, with the available options set automatically, then creating "AI agents" should become significantly simpler. Google owns YouTube, which should provide ample material for learning about the world. Genie is likely evolving even as we speak.

Finally, I'd like to conclude with a message from Google DeepMind: "Genie introduces the era of being able to generate entire interactive worlds from images or text. We also believe it will be a catalyst for training the generalist AI agents of the future." While Genie has not yet been released to the public, future developments are truly exciting!

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.