Midjourney, a popular image creation tool that runs inside a Discord server, is expanding its AI capabilities. The company announced on Tuesday that it is working on a “text to video” model that could be ready in the next few months.
The Midjourney video model will build on the existing image model, which was improved in the recent V6 update. The update brings better prompt following and more realistic images, as well as tweaks for V6 Niji, the manga/anime generator model.
CEO David Holz revealed the plan during an “Office Hour” Discord session, where he also discussed the upcoming official release of Midjourney V6. He did not share any further details about the Midjourney video model, such as its features, limitations, or examples.
Midjourney’s approach to AI generation
Midjourney is known for prioritizing quality and user experience over speed, even if it means lagging behind its competitors. The company introduced enhancements like inpainting and outpainting, which allow users to fill in or extend parts of an image, months after they had become standard on other platforms like Stable Diffusion.
The company has also ventured into rendering text inside images, albeit in a rudimentary way, after other models like Dall-E 3, SDXL, Ideogram, and IF had already established themselves in the field.
Midjourney’s competition in the video domain
Midjourney’s entry into the video domain will face a tough challenge from other AI video generators that have already launched or showcased their products. Stability AI, for instance, has recently announced Stable Video Diffusion, a model that can generate realistic and diverse videos from text prompts.
Meta, the social media giant, has also demonstrated its Emu Video generator, which can create short videos from natural language prompts. Moreover, existing tools like Pika and Runway ML have carved out their niches in the market, offering various features and applications for video generation.
Other image generators, such as Leonardo AI, have also added video generation capabilities to their platforms, increasing the competition and variety in the field.
Midjourney’s potential impact on the creative and media industries
The development of the Midjourney video model, along with other AI video generators, has far-reaching implications for the creative and media industries. The ability to generate, manipulate, and interact with video content through AI opens up new possibilities and challenges for entertainers, advertisers, educators, journalists, and more.
AI video generation could also have a profound impact on how we perceive and consume reality, as it blurs the line between fact and fiction. As AI video generators become more accessible and advanced, the need for ethical and responsible use of the technology becomes more urgent.
What is the difference between Midjourney’s image model and its video model?
The image model is the older and more mature of the two, and it has been updated several times since its initial release in 2022. The most recent version, V6, generates more realistic and detailed images than previous versions. The image model also supports various enhancements, such as inpainting, outpainting, tiling, and stylizing, and it can create images in a variety of domains, including manga, anime, landscapes, portraits, and others.
The Midjourney video model is newer and experimental. It is still in development and is expected to be released in the coming months. It will be based on the image model and is expected to use a “text to video” approach, producing short videos that correspond to the user’s text prompts. It is also expected to draw on the image model’s existing strengths, such as coherency, prompt following, and aesthetics.
Both the image and video models are intended to provide users with a high-quality and user-friendly AI generation experience. However, they have different strengths and challenges, depending on the type and complexity of content that users want to create. The video model promises to be more innovative and immersive, while the image model is more stable and versatile.