This year has seen explosive growth in AI-generated art (also called AI painting). Its steep upward curve shows up in Google search trends, and it is spreading from academia to the general public at an unprecedented, even alarming, rate. Arguably, 2022 is the year of AI-generated art.
As design software developers, UI designers, graphic designers, and amateur painting enthusiasts, we would like to discuss AI-generated art and the controversies around it in this first year of AI-generated art. We hope that combining these perspectives will give a more complete understanding of AI-generated art and help predict the changes it may bring.
The Technical Reason Behind
The main technical reason behind the explosion of AI-generated art is that, over the past two years, the Diffusion Model has broken through a technical bottleneck that had persisted for many years. Combined with mature text language models such as GPT-3, it has produced highly usable tools for generating images from text.
Bottlenecks in GAN (Generative Adversarial Network)
The previous generation of AI-generated art was based on the GAN (Generative Adversarial Network, see the brief history of GAN). Put simply, a GAN uses two neural networks, a generator and a discriminator. The generator produces images, the discriminator judges whether each one passes as real, and the two are trained against each other.
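The tug-of-war between the two networks can be made concrete by looking at the loss each one tries to minimize. The following is a minimal sketch in Python, not any particular library's implementation: the function names (`bce`, `discriminator_loss`, `generator_loss`) and the sample scores are hypothetical, and the generator uses the common “non-saturating” form of the loss, which is an assumption rather than something stated in this article.

```python
import math

def bce(p, label):
    """Binary cross-entropy of one probability p against a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real images scored 1 and generated images scored 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants its images to be scored as real (label 1)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator scores (probability that an image is real).
real_scores = [0.9, 0.95]
fakes_caught = [0.1, 0.05]   # discriminator spots the generated images
fakes_passed = [0.8, 0.9]    # generated images fool the discriminator

print(discriminator_loss(real_scores, fakes_caught), generator_loss(fakes_caught))
print(discriminator_loss(real_scores, fakes_passed), generator_loss(fakes_passed))
```

When the fakes are caught, the discriminator's loss is low and the generator's is high; when they pass, the situation flips. In an actual GAN both networks would be updated in alternation by gradient descent on these losses, which is part of what makes training hard to balance.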
GANs have achieved good results, but some problems have remained difficult to overcome: a lack of diversity in the generated results, mode collapse (the generator stops improving once it finds a few patterns that reliably fool the discriminator), and difficult training. These problems made it hard to turn AI-generated art into practical products.
Diffusion Model Breakthrough
After years of GAN bottlenecks, researchers turned to a remarkable alternative way of training models: the Diffusion Model.
Noise is added to the original image step by step along a Markov chain until it becomes a purely random noise image. A neural network is then trained to reverse this process, gradually recovering the original image from random noise, so that it learns to generate images from nothing. For text-to-image generation, the text description is encoded and fed to the network as a condition during denoising, so that the generated image matches the text.
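The noising half of this process is simple enough to sketch in a few lines. The example below is an illustration under stated assumptions, not the code of any specific tool: it uses a linear noise schedule with 1000 steps and noise levels from 1e-4 to 0.02 (values assumed here for illustration), the function names `make_alpha_bars` and `add_noise` are hypothetical, and the “image” is just a short list of pixel values scaled to [-1, 1]. It relies on the fact that, for Gaussian noise, many small steps of the Markov chain can be collapsed into a single jump to step t.

```python
import math
import random

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Jump to step t: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    # During training, the network sees (xt, t) and learns to predict `noise`.
    return xt, noise

rng = random.Random(0)
alpha_bars = make_alpha_bars()
pixels = [0.2, -0.5, 0.9, 0.0]  # toy "image"

slightly_noisy, _ = add_noise(pixels, 10, alpha_bars, rng)
almost_pure_noise, _ = add_noise(pixels, 999, alpha_bars, rng)
print(slightly_noisy)
print(almost_pure_noise)
```

At an early step the result stays close to the original pixels, while at the last step almost none of the original signal remains. Generation runs the learned reverse direction: start from pure noise and repeatedly remove the noise the network predicts.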
With the Diffusion Model, training is easier and needs little more than a large number of images. The generated images can be of very high quality and very diverse, which is why this new generation of AI seems to have such an unbelievable “imagination”.
The Diffusion Model brought AI-generated art to a usable level within about two years of being applied to image generation.
The 2020 paper Denoising Diffusion Probabilistic Models applied the diffusion model, first proposed in 2015, to high-quality image generation.
In January 2021, OpenAI announced DALL-E. In May 2021, its paper Diffusion Models Beat GANs on Image Synthesis pointed the way for the engineering community.
In October 2021, Disco Diffusion, an open-source tool for generating images from text, was released, and a number of products based on it have emerged since then.
In August 2022, stability.ai released Stable Diffusion as open source. It is the most usable open-source model to date, and many commercial products, such as NovelAI, are based on it.
The impact of AI-generated art on painting
As AI-generated art tools mature, painting enthusiasts and graphic designers alike have begun asking whether AI generation will affect the industry and whether AI will replace artists. Yet while people debate, AI-generated art has already started to take over some of the work in existing scenarios. As the saying goes, “When most carriage drivers are still arguing about the usefulness of the car, the smart ones are already taking the driving test”.
The Atlantic, for example, has already used an AI-generated image as the header of an article, in place of the human-made image it would previously have taken from a stock image “gallery” site.
AI-generated work is probably already being used in many unnoticed corners without people knowing it, which points to one important fact: the general public can no longer tell AI-generated work from purely handmade work. While high-quality artwork is still hard to replace with AI, AI generation is good enough and efficient enough for most everyday consumer-grade artwork (illustrations, covers, posters), which makes AI-generated material very attractive. For art creators, using AI generation tools will gradually become a must-have skill.
Resistance to AI
Because AI is difficult for most people to truly understand, people look at it from a variety of perspectives, not the least of which is resistance.
Copyright and Monopoly
Unlike the many artists who regard AI as the enemy and copyright as a weapon to protect themselves, I believe that overly strict copyright protection will lead to big companies monopolizing AI and to excessive exploitation of creators.
The best environment for large companies is one where every artwork is strictly protected from being used to train AI models. Large companies can then use their financial advantage to buy artwork copyrights, use the artwork as datasets to build the best AI art generation tools, and so monopolize those tools. And who pays the cost of the monopoly? Creators, who will either have to buy AI tools from big companies at high prices or be pushed out by other creators who have bought them.
This has already happened in other fields. Anyone can now get free, open-source, and highly usable face recognition AI tools from the web, because face photos are a very easy source of data. AI tools for pharmaceutical R&D, on the other hand, are monopolized by a few large companies, because no one can easily get access to expensive pharmaceutical R&D data.
Of course, I’m not saying that works should go unrestricted and unprotected. My point is that this is a very complex issue and that simply enforcing strict copyright protection is not the most beneficial option for creators. The productivity of AI-generated art is so great that the relations of production themselves may need to change.
In fact, an AI generation tool does not, as some people think, store a large amount of image data and then “assemble” it according to certain rules. The new works such a model generates cannot be explained as a simple “collage”.
It is also almost impossible to determine whether a model trained on a huge dataset used a particular image, and humans find it hard to tell whether an image was created by AI or with AI assistance. This means that an AI-generated work can only be treated like any ordinary work, judged by whether the final result infringes copyright.
AI Art Generation Tools Replacing Galleries
Another copyright-related issue is that AI art generation tools can replace stock image galleries such as Shutterstock and Getty. These companies actually have more influence than painters on how the law treats AI and copyright. AI generation tools have a very good chance of replacing them, with copyright as the only obstacle, and since they own the rights to a great many images, they will tend to prevent AI from using images to train models.
Another controversy around AI-generated art is the crisis of image authenticity. Fake pictures could already be made with Photoshop, and even with darkroom techniques in the film era, but there was always a significant technical threshold. AI art generation tools have lowered that threshold, and AI-generated pictures are likely to be more expressive than real photos and to spread more easily. Many images in the news today are already generated with AI.
Can AI be used in UI design?
Some people think that AI-generated art looks “imaginative” but not “accurate” or “stable”, and is not suitable for UI creation. But I think this is actually an “engineering” problem rather than a limitation of AI’s ability.
At present, image-generating AI is not suited to UI design beyond producing illustrations, because UI design is highly structured and it is genuinely difficult for neural networks to “understand the rules”. In practice, however, neural networks can be combined with rule-based algorithms, so in theory AI can generate “accurate” design drawings.
There is also a type of model better suited to UI, similar to GitHub Copilot: it treats a UI design as a structured grammar and completes it according to the context, so that once part of a design is done, the rest can be generated automatically. I think it is only a matter of time before AI enters the field of UI design. The reason there is no such product yet is probably that structured UI design datasets are relatively scarce.
People always overestimate the development of a new technology in 3 years and underestimate its impact in 10 years.
– Amara’s Law
AI-generated art really started in 2014, and it took almost 10 years to reach a technological breakthrough and exponential growth in impact, which was not easy. Because the technical threshold is now low (there are high-quality open-source implementations), many AI-generated art products will appear in the coming year. The quality of generation, however, may not improve greatly in the short term, which means it will not reach the point of easily replacing manual work. After all, the explosion of AI-generated art comes from the Diffusion Model, which solved the problem of diversity in AI generation, but many problems still await the next technical breakthrough, such as understanding the logic of the content and the controllability of model training. Even so, the practicality of AI-generated art tools will likely improve greatly in a short time, and we are really looking forward to the next 10 years.
AI generation technology has an impossible triangle: quality, speed, and diversity. The current Diffusion Model focuses on quality and diversity at the expense of speed, so current AI art generation tools are slow, taking tens of seconds or even a few minutes to produce a picture. That is still much faster than drawing by hand, but given how unpredictable the generated results are, the wait remains a real drawback. As AI art generation tools develop, speed will surely improve, and they will truly change the art creation process once multiple results can be previewed within one second of entering the content.
Another thing that affects the experience of AI generation tools is the prompt, the text used to steer what the AI draws. The prompts that AI-generated art relies on are still very primitive and hard to control. Writing them even has its own name, Prompt Engineering, and it takes a lot of experience to get the desired content. To help people write better prompts, there are marketplaces such as promptbase that sell prompts, as well as many prompt-generation tools.
Beyond making prompts themselves more usable, features such as secondary editing, drawing from sketches, and editing details are all needed. These are engineering problems that can be solved with time.
It is conceivable that an AI like GitHub Copilot will emerge to help you write the prompts that steer another AI to generate pictures.
Drawing support tools
While most AI generation tools today are aimed at the general public, there is a lot of room for tools aimed at creators: tools that integrate into the creative workflow by completing the rest of a piece based on what has already been done, producing different variations of existing works, and suggesting possible next steps. One example is the Ando plugin for Figma.
AI video generation
Since AI can generate images, video generation is naturally the next territory to open up, and there are already some early attempts, such as phenaki.video and Meta’s Make-A-Video.
AI 3D model generation
With video covered, 3D models cannot be far behind, and there are already some early attempts, such as dreamfusion3d.
Some painting enthusiasts believe that what matters in painting is the experience of the process: AI can generate excellent “works”, but it cannot replace the experience and fun of creating art, and the creator’s experience of these is the beauty of art. But if we consider painting a means of depicting the inner world and expressing oneself, AI generation is also such a means. AI art creators will find their own “flow experience” in the process of using AI, and experience the fun and beauty of creating art with it.
AI-generated art will make more people think about the meaning of “art” and the relationship between people and art. Painting is not static; it has been inseparable from technology since its birth. The chemical industry brought rich colors to painting, making realism possible and giving painting the role of recording history. Printing made it possible for the public to learn to paint. Photography in turn took away painting’s value as a realistic record, letting it refocus on inner description and self-expression. AI-generated art may change even more.