Ballad of Three Elvises: How AI Will Change the Field of Generative Content by 2035
Hello, I am Andrey Vecherney; my nickname on tekkix is @Andvecher. I am a well-known author on vc.ru and a videographer.
Recently I read an article on tekkix about how AI will change our lives by 2035. It has already changed my work: ChatGPT has practically replaced my search engine and proofreader, Midjourney has become both a sketchbook and a personal Pinterest, and so on. In this article, I want to share more of my thoughts about the future and show how I already use AI in my work.
2035 could open a new era for artificial intelligence: will AI become an independent creator by then, capable of producing films, music, and texts at the level of Tarantino or Elvis Presley? Let's figure out how neural networks will change the creative industries, from cinema to journalism, and which tasks will remain within the power of humans alone.
Five stages of Altman
But before making specific predictions, let's turn to the best-known trendsetter in AI: Sam Altman.
In early autumn, at T-Mobile's Capital Markets Day, he described five stages of artificial intelligence development, from the current level to full-fledged AGI that equals or surpasses human intelligence: the path from a chatbot to an independent mind capable of managing large companies. Here are those stages:
1. Chatbots — the first stage, with full-fledged dialogues and understanding of their context.
2. Reasoners — the next stage, when AI performs basic reasoning and problem-solving at the level of a highly qualified human, say a PhD. We have only just crossed into this stage: OpenAI released the o1-preview model, which not only generates text but reasons through a given request, showing each step of its reasoning. In my work, this is especially useful for analyzing and reviewing technical or legal documents.
3. Agents — at this level, AI becomes advanced enough to perform actions and make decisions, completing tasks without constant human involvement.
In July 2024, the HR company Lattice announced that it was going to implement AI agents as employees. This news caused a strong reaction from HR professionals and ordinary people. Many were concerned that this could devalue human labor and lead to ethical issues.
Lattice quickly realized that this was a bad idea and withdrew the initiative. They acknowledged that the idea had many unresolved issues.
4. Innovators — here, AI is capable of participating in the process of creating new technologies and innovations, developing viable ideas and prototypes.
5. Organizations — the final stage, where AI can perform all the tasks inherent in an entire organization, allowing it to manage and optimize processes at the company level, replacing traditional structures and employees.
According to Altman himself, by 2035 progress should be somewhere between the 3rd and 4th stages, where neural networks will be able to develop their own projects and make most decisions independently (within given constraints, of course).
Let's build on this idea and continue.
AI will replace HBO
Videography covers scriptwriting, visual content, and work with sound, images, and animation. A single video brings together a wide range of content-production disciplines, so several generative directions come into play at once.
The most obvious of them is the creation of full-fledged videos by neural networks.
The globally renowned Sora from OpenAI has shown that AI can generate watchable and high-quality videos based on a single text description. However, for now, they are short, up to one minute.
[Video: Sora]
When AI first gained popularity, enthusiasts took real videos, broke them down into frames, and generated a separate image for each. It turned out to be something like a gif, where the environment and the appearance of people changed smoothly but significantly throughout the video — often with noticeable distortions.
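For the curious, here is a minimal sketch of that frame-by-frame trick, assuming the diffusers library with a Stable Diffusion img2img pipeline; the checkpoint name, prompt, and strength are placeholders of mine, not a recipe from any particular project. Every frame is stylized independently, which is exactly why those early clips flickered:

```python
# A rough sketch of the early frame-by-frame approach: split a clip into
# frames, stylize each one independently with img2img, and reassemble.
# The checkpoint, prompt, and strength below are illustrative assumptions.
import cv2
import numpy as np
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"  # assumed checkpoint
).to("cuda")

cap = cv2.VideoCapture("input.mp4")
stylized = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # OpenCV returns BGR arrays; the pipeline expects an RGB PIL image
    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    # Each frame is processed on its own, with no link to its neighbors:
    # that independence is what produced the characteristic flicker
    result = pipe(prompt="oil painting style", image=image, strength=0.4).images[0]
    stylized.append(cv2.cvtColor(np.array(result), cv2.COLOR_RGB2BGR))
cap.release()

height, width = stylized[0].shape[:2]
out = cv2.VideoWriter("stylized.mp4", cv2.VideoWriter_fourcc(*"mp4v"), 24.0, (width, height))
for f in stylized:
    out.write(f)
out.release()
```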
Sora and its analogs (e.g., Runway Gen-2 or Meta's Make-A-Video) have moved to the next level, where the effect of change has become much less or disappeared altogether. It feels like we are looking at either a real video or a human-crafted animation.
By 2035, we should be getting high-quality videos, with thought-out frame composition and lighting, comparable to the work of professional studios and even individual filmmakers.
We will get more independent authors with projects like "How It Should Have Ended...". Only now, instead of an animation drawn by a team of professionals, we will see full-fledged alternative versions of TV series and movies.
Didn't like the 8th season of "Game of Thrones"? Here is a parallel fan version where the White Walkers reached King's Landing, and Jon Snow became the new Night King.
There will be projects where literally one person can implement completely new ideas, and creating them will not require huge budgets and teams of experienced specialists.
Animation will be even easier: where video requires a deep understanding of physics, lighting, and complex object interactions, in animation the neural network controls the frame far more easily. In animation, characters, shadows, background, and details sit on separate layers, like in Photoshop; in video, everything is baked into a single frame, and it is much harder for a neural network to handle each aspect separately.
Therefore, by 2035, we also expect new animation projects, mainly developed with the help of AI. I am sure that Pixar and Disney will also start actively using generative technologies for their cartoons.
For example, the blogger BadComedian recently began using neural networks in his reviews and sketches, extending movie scenes or bringing Russian classics to life.
I use neural networks to draw additional details in videos and quickly retouch photos — this sometimes saves not just hours, but entire working days. I used a neural network in the same way for the image in the header of this article. I quickly implemented a creative concept with it — a concert that is only possible with generative models.
However, not everything is so rosy; there are a few caveats. On the one hand, the cost of the AI hardware that generates video will be far higher than the cost of hiring film specialists, since video requires tens of times more tokens than text or images. On the other hand, the limited availability of material for training the neural networks themselves will complicate the use of AI in content creation.
Even if the second problem resolves itself, despite large companies' attempts to restrict the training of neural networks on their material, the first will remain an obstacle for enthusiasts because of the sheer cost. Not everyone will be able to afford to make a movie, although it will still be cheaper than what Mosfilm or HBO spends on a series or a film.
As for texts, I will leave the question to your judgment, but in my opinion it is becoming increasingly difficult to distinguish a human hand from the work of an algorithm. The next step for neural networks is to learn to copy and imitate an author's style, so that the resulting text looks unique and more human.
By 2035, developers simply cannot fail to solve this, since by then there will be enough data for AI to study and produce texts in an author's style. If you publish articles, stories, or even posts on social networks and have a paid ChatGPT, try "training" it on your texts: the AI will analyze and highlight the characteristic features of your writing and start trying to write the way you do. The more publications you "feed" it and the more often you ask it to analyze your articles, the more its texts will resemble yours.
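If you prefer the API to the chat window, here is a minimal sketch of the same idea using the official openai Python package; the file names and model ID are my own placeholders. The first call asks the model to describe your style from a few samples, the second asks it to write in that style:

```python
# A rough sketch of "teaching" a model your writing style via few-shot
# prompting: describe the style from samples, then generate in that style.
# File names and the model ID are assumptions, not recommendations.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# A handful of your own articles or posts used as style samples
samples = [open(path, encoding="utf-8").read()
           for path in ("post1.txt", "post2.txt", "post3.txt")]

# Step 1: ask the model to extract the characteristic features of the style
style_notes = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "You analyze writing style: tone, sentence length, favorite constructions, humor."},
        {"role": "user",
         "content": "Describe the author's style based on these texts:\n\n" + "\n\n---\n\n".join(samples)},
    ],
).choices[0].message.content

# Step 2: ask it to write a new text strictly in that style
draft = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Write new texts strictly in the style described below.\n" + style_notes},
        {"role": "user", "content": "Write a 300-word post about adaptive AI concerts."},
    ],
).choices[0].message.content

print(draft)
```

The more samples you pass in, the closer the result gets to your own voice.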
But again, there is a caveat: a neural network will never be able to breathe life into a text, because it cannot understand the meaning, or even the main point, of what is written. A language model remains a logical sequence of words glued into sentences. Yes, when we speak or write, our brain also assembles phrases from sequences of words, but with a key difference: we imagine and inhabit our narrative, we can deploy drama or comedy. Neither by 2035 nor by 2135 will any AI comprehend the text it has written; that remains an exclusively human ability.
With other tasks, AI is already coping: it writes material of sufficient length without losing context, adapts it to the desired format, and translates far better than Yandex Translate or Google Translate. It feels as if language models are already near their peak, and further development will come not from text quality but, following Altman's stages, from applying them to practical tasks at work.
Stars will become immortal
Yandex has had the "Neural Music" option for a couple of years now, where AI generates tracks on the fly according to a given mood. Spotify is actively working on a similar feature, and some foreign music services (for example, AIVA) have also long been able to create tracks at the request of users.
By 2035, audio generation will most likely take the most commercial route: neural networks will create adaptive concerts and shows right during the performance, taking into account the mood, requests, and behavior of the audience. This applies both to offline shows and to the online shows that gained popularity during the COVID era.
In the future, there will be no need to limit yourself to a single Elvis. Why settle for one when you can put three holographic neuro-Kings of different ages on stage at once? The most improbable ideas for supergroups and concerts will be able to come true.
In this area, everything will significantly depend on automated device management and AI-powered software: cameras for real-time analysis, holograms, light shows, and the music itself.
However, real-time show generation risks producing plenty of blunders and failures along the way. Sudden algorithmic errors could make the music change tempo or genre at the wrong moment, ruining a concert's atmosphere, which is not so easy to revive; a hologram could freeze or vanish altogether.
If the AI misreads the mood of the crowd, it may trigger an inappropriate track or lighting effect, raising plenty of technical questions for sound engineers. But for small installations, chamber concerts, and venues, such technologies will be a godsend, as they will be much simpler than the solutions needed for large-scale spectacles.
And this is essentially the fourth stage of Altman — AI makes all the decisions and shapes the course of the entire event.
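To make the failure mode concrete, here is a toy sketch of such a control loop; every function in it is a made-up placeholder rather than any real concert software. The idea is simple: only let the generator adapt the set when the mood estimate is confident, otherwise fall back to a pre-approved, human-curated playlist:

```python
# A toy sketch of an adaptive show loop with a safety fallback. All
# functions are hypothetical placeholders; no real stage/audio API is used.
import time

CONFIDENCE_THRESHOLD = 0.8  # below this, the mood estimate is not trusted


def read_crowd_mood():
    """Placeholder for camera/microphone analysis; returns (mood, confidence)."""
    return "energetic", 0.65


def queue_track(track):
    """Placeholder for handing a track to the playback/generation engine."""
    print(f"queueing: {track}")


def run_show(fallback_playlist):
    fallback_index = 0
    while True:
        mood, confidence = read_crowd_mood()
        if confidence >= CONFIDENCE_THRESHOLD:
            # Confident estimate: let the generator adapt tempo and genre to the room
            queue_track(f"generated_{mood}_track")
        else:
            # Uncertain estimate: play the next safe, human-curated track instead
            queue_track(fallback_playlist[fallback_index % len(fallback_playlist)])
            fallback_index += 1
        time.sleep(30)  # re-evaluate the room every 30 seconds


if __name__ == "__main__":
    run_show(["opener.wav", "mid_set.wav", "encore.wav"])
```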
In video editing, I often have to select audio tracks; sometimes it takes several hours to find the right one. Neural networks readily suggest composition options if you describe the scene to them in detail. They rarely hit the exact track, but they often point in a right and unexpected direction, from which I can quickly pick the music I need.
Conclusions
AI has rapidly turned into a profitable business and a sought-after tool for SMM specialists, but it will take more time before humanity sees full-fledged artificial intelligence in the true sense of the term.
It took OpenAI a full six years to reach the second stage (the first GPT model appeared in 2018), and each subsequent stage requires even more time.
But, on the other hand, ten years ago we could not have imagined where global technological development would be: back then, we thought that smartphones would become even more individualized in design, personal computers would be tens of times more powerful, and bitcoin would be worth at least a million dollars.
So, AI always has time to disappoint or surprise us. What do you think? Write in the comments how AI is changing your work now and what you expect from it in the future.