The field of generative AI is advancing rapidly, as demonstrated by the latest innovation from Google’s DeepMind team. They have developed a new model called Genie AI, which can create entire 2D platformer games from just a single image input.
Genie AI is a remarkable feat of machine learning, as it does not require any action labels or domain-specific knowledge to generate playable games. It can use any image as a prompt, whether it is a photo, a sketch, a painting, or an AI-generated image. It then transforms the image into a playable 2D platformer, with obstacles, enemies, and rewards.
How does Genie AI work?
Genie AI is based on a large-scale neural network with 11 billion parameters, which the team calls a foundation world model. It consists of three main components:
- A spatiotemporal video tokenizer, which converts the input image into a sequence of tokens that represent the visual features of the game world.
- An autoregressive dynamics model, which predicts the next frame of the game based on the previous frames and the user’s actions.
- A simple and scalable latent action model, which learns a compact representation of the possible actions that the user can take in the game.
By combining these components, Genie can generate and render the game environment on the fly, and allow the user to interact with it on a frame-by-frame basis. The user can control the game character using a keyboard or a joystick, and explore the generated world.
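To make the frame-by-frame loop concrete, here is a minimal toy sketch of how a tokenizer and a dynamics model could fit together. It is not DeepMind's implementation: the functions `tokenize`, `predict_next`, and `play`, the three action ids, and the scrolling rule are all hypothetical stand-ins for what the real model learns with neural networks.

```python
from typing import List, Sequence, Tuple

Tokens = Tuple[int, ...]

# Hypothetical action ids for this sketch; Genie learns its own latent actions.
STAY, LEFT, RIGHT = 0, 1, 2


def tokenize(frame: Sequence[Sequence[int]], levels: int = 4) -> Tokens:
    """Toy tokenizer: bucket each 0-255 pixel into one of `levels` token ids."""
    return tuple(pixel * levels // 256 for row in frame for pixel in row)


def predict_next(history: List[Tokens], action: int, width: int) -> Tokens:
    """Toy dynamics model: scroll every row of the latest frame by one cell.

    A real dynamics model would be a learned network conditioned on the whole
    history; this stand-in only looks at the most recent frame.
    """
    latest = history[-1]
    rows = [latest[i:i + width] for i in range(0, len(latest), width)]
    if action == RIGHT:
        rows = [row[-1:] + row[:-1] for row in rows]
    elif action == LEFT:
        rows = [row[1:] + row[:1] for row in rows]
    return tuple(token for row in rows for token in row)


def play(prompt: Sequence[Sequence[int]], actions: Sequence[int]) -> List[Tokens]:
    """Tokenize the prompt image, then generate one new frame per user action."""
    width = len(prompt[0])
    history = [tokenize(prompt)]
    for action in actions:
        history.append(predict_next(history, action, width))
    return history


if __name__ == "__main__":
    # A 1x4 grayscale "image" as the prompt, followed by two RIGHT presses.
    for tokens in play([[0, 64, 128, 255]], [RIGHT, RIGHT]):
        print(tokens)
```

The point of the sketch is the shape of the loop: the prompt is tokenized once, and every later frame is produced from the history plus one action, which is what lets the user steer the world one frame at a time.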
How is Genie AI different from other generative models?
Genie is not the first model to generate games from images, but it is the first to do so in a fully unsupervised and scalable way. Previous models relied on human annotations, such as action labels, to train the dynamics model, or used predefined templates, such as sprites and tiles, to construct the game world. These methods limit the diversity and quality of the generated games and require a lot of manual effort and domain expertise.
Genie, on the other hand, learns everything from scratch, by watching hundreds of thousands of hours of Internet videos of people playing 2D platformers. It does not need any action labels, as it infers the latent actions from the video frames. It also does not use any templates, as it generates the game world directly from the image tokens. This allows Genie to create a wide variety of games, with different themes, styles, and mechanics, from any image input.
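One way to picture inferring actions without labels is as a nearest-match problem: look at how consecutive frames differ and assign each change to the closest entry in a small set of candidate actions. The sketch below illustrates that idea only. It assumes each frame is reduced to a character position `(x, y)` and that the codebook is written by hand, whereas the real model works on video tokens and learns its action codes during training. The names `nearest_code` and `infer_latent_actions` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(delta: Vector, codebook: List[Vector]) -> int:
    """Return the index of the codebook entry closest to `delta`."""
    def squared_distance(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: squared_distance(delta, codebook[i]))


def infer_latent_actions(frames: List[Vector], codebook: List[Vector]) -> List[int]:
    """Assign one discrete action id to each pair of consecutive frames."""
    actions = []
    for previous, current in zip(frames, frames[1:]):
        delta = [b - a for a, b in zip(previous, current)]
        actions.append(nearest_code(delta, codebook))
    return actions


if __name__ == "__main__":
    # Hand-written codebook (an assumption): stay, right, left, up.
    codebook = [(0, 0), (1, 0), (-1, 0), (0, 1)]
    # Character positions observed in an unlabeled clip.
    frames = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 1)]
    print(infer_latent_actions(frames, codebook))
```

No one told the function which button was pressed; the action ids fall out of the frame-to-frame changes alone, which is the intuition behind training on unlabeled gameplay videos.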
What are the benefits of Genie AI?
Genie AI has many potential benefits for both users and developers of games. For users, Genie can offer a novel and personalized gaming experience, as they can create their own games from any image they like. They can also share their games with others, and discover new games generated by other users or by Genie itself.
For developers, Genie can reduce the cost and time of game development, as they can use Genie to generate game assets, levels, and mechanics from their sketches or concepts. They can also use Genie to test and improve their games, by generating different scenarios and challenges.
What are the applications of Genie AI?
Genie AI is not only a fun and creative tool for generating games but also a powerful platform for training future AI agents. The team notes that Genie can create a never-ending curriculum of new and diverse worlds, which can challenge and improve the skills of AI agents.
The team also shows that the latent actions learned by Genie AI can transfer to real human-designed games, such as Super Mario Bros. This suggests that Genie can help develop generalist AI agents that adapt to different environments and tasks.
Genie AI is a breakthrough in generative AI and a testament to the potential of machine learning. By creating games from images, Genie opens up new possibilities for entertainment, education, and research.