Programming for a non-programmer, or how the dream of handing everything over to AI was shattered
Once, while traveling through remote Vietnamese villages, I asked a girl to squeeze me a glass of sugarcane juice. To explain what I wanted, I used Google Translate. From the look in her eyes, I understood that if my request was about sugarcane at all, it was about the wrong kind. I had to quickly simplify the message and translate it in two steps: first from Russian to English, and then from English to Vietnamese. In essence, I was manually performing the same algorithm that Google is said to use when neither of the two languages is English.
That was when the idea of a translator with a "result verification translation" function was born: the application would translate not only, say, from Russian to Vietnamese, but would also immediately translate the Vietnamese result back into Russian. This lets you clarify the request before a "crooked" translation causes misunderstanding and shock in the person you are talking to.
In this article, I will share the problems and solutions that arose when creating a translator using AI, and show how to approach solving such tasks, especially if you are not a programmer.
It should be noted right away that with the voice capabilities of ChatGPT-4o, the need for such a translator has somewhat decreased: the model follows the thread of a conversation between several people and copes quite well with idiomatic expressions. The remaining shortcomings are discussed a couple of paragraphs below.
Compare the results of the "konfetki baranochki" query:
— according to Google, it is "Lamb sweets," which in reverse translation is "Lamb candies,"
— according to ChatGPT-4o, it is "Sweet treats and bagels," which in reverse translation is "Sweet treats and bagels."
Even though the translation from GPT is significantly better, a two-way translator (a "result verification translator") can still draw your attention to the fact that such an expression is not translated literally, so that you can rephrase the request.
So a two-way translator will remain valuable for some time yet. Let's start programming it.
Attention, dear commenters. The main goals of this exercise are to understand:
1. whether a non-programmer can create their own working IT product using AI,
2. what is specific about creating an IT product with AI as a non-programmer.
Some revelations may seem trivial to you, but remember that the article is for "non-programmers".
We start = We continue
Last time I stopped at creating the front end and the need to connect the API (Application Programming Interface, a software interface that describes how one computer program interacts with others) of some powerful translator. AI clearly copes with creating a front end perfectly well as long as you phrase the request correctly, so we move on to the most difficult part.
I knew several good online translators:
1. DeepL - was the first to use neural networks to create a sensible translation that follows the context,
2. Google Translate - omnivorous and, most likely, very simple to connect to,
3. Yandex Translate - if it doesn't work out with the global ones, we'll go to our compatriots.
What worked / didn't work:
1. DeepL provides a free API for users from almost all countries, but for some reason Russia is not among them. To connect to the free API you need to link a bank card, which is difficult to do from Russia. Through a freelance exchange I managed to buy a Swiss account, but by the time I got around to setting up the connection, that door had closed.
2. Google, as expected, readily provides access to the API, but not to users from Russia, and again it is not clear why. The attempt to pass for a foreign account failed: you need to show a lease agreement from a foreign jurisdiction or otherwise prove a strong connection with the declared country. I did not want to resort to shady workarounds again, only to have something break unexpectedly one fine day.
3. Yandex, well hello, dear! The connection instructions are relatively easy to find. We ask the AI to take into account the requirements specified on that page and to create a file, index.php, which does not yet have an interface with buttons, but contains the login and password for the API and a word to translate, and outputs the result to the screen using the print command. In other words, if the API works, we see the translation of the given word; if not, we see an error.
I dealt with the errors relatively quickly: I fed the AI what I saw on the screen and received something like "I understand what the error is, here is the new code, try it. If it doesn't work, come back." Somewhere around the fifth attempt, the translation appeared.
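The "print the translation or print the error" check can be sketched as follows. This is an illustration in Python rather than the PHP of the real project. The function `smoke_test` and the two fake APIs are hypothetical, and no real Yandex call is made.

```python
from typing import Callable

Translator = Callable[[str, str, str], str]


def smoke_test(translate: Translator, word: str, src: str, dst: str) -> str:
    """Return the translation, or a readable error message to paste back to the AI."""
    try:
        return f"OK: {translate(word, src, dst)}"
    except Exception as exc:  # broad on purpose: every failure should be visible
        return f"ERROR: {type(exc).__name__}: {exc}"


def broken_api(text: str, src: str, dst: str) -> str:
    """Hypothetical API call that fails, for example because of a wrong password."""
    raise ConnectionError("401 Unauthorized")


def working_api(text: str, src: str, dst: str) -> str:
    """Hypothetical API call that succeeds (it just upper-cases the text)."""
    return text.upper()


print(smoke_test(broken_api, "hello", "en", "ru"))
print(smoke_test(working_api, "hello", "en", "ru"))
```

The error line is exactly what you copy into the dialogue with the AI on each attempt.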
To learn how to write requests for any generative pre-trained transformer, for example, for ChatGPT, check out our guide with AI here.
After that everything is simple: we move towards our goal in small steps (iteratively) and add functionality piece by piece. Do not forget to save versions of the working code in any text document, with a comment on what works in that version and what does not (professional programmers do this too, but with a version control system such as Git, often hosted on GitHub).
Iterations:
— Add translation result check — surprisingly, the AI quickly understood that it was necessary to take the translation result and immediately translate it into the request language, as if someone had already done this. Then it's strange why there is no service with two-way translation anywhere?
— Separate login and password for the API into a separate file, so that hooligans (hopefully not you) could not steal them outright or DDoS the system.
— Create an input-output interface — one of the simplest tasks — clearly describe what we want to see on the screen, which field and button are responsible for what. The AI itself gives them conditional names, use them in further adjustments, for example, the AI called the translation field "zapros", then to add a phrase to this field, write "add to the zapros field the text 'Enter text and press Enter', the text should disappear when entering text in this field" (by the way, such a text hint is called a placeholder).
— Execution of the translation after pressing the enter key on the keyboard — initially there was a "Translate" button, but since everyone is used to Google and Yandex translating on the fly, it had to be optimized.
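The step of moving the login and password into a separate file might look like the sketch below. It is written in Python for illustration only (the project itself keeps them in a PHP file). The file name `secrets.json`, the field names, and the function `load_credentials` are assumptions, not details from the project.

```python
import json
import tempfile
from pathlib import Path


def load_credentials(path: Path) -> dict:
    """Read API credentials from a separate file that is not part of the page code."""
    with path.open(encoding="utf-8") as fh:
        creds = json.load(fh)
    missing = {"login", "password"} - set(creds)
    if missing:
        raise KeyError(f"missing credential fields: {sorted(missing)}")
    return creds


# Demo with a throwaway file; a real secrets file should not be publicly readable.
with tempfile.TemporaryDirectory() as tmp:
    secrets_path = Path(tmp) / "secrets.json"
    secrets_path.write_text(
        json.dumps({"login": "demo", "password": "not-a-real-secret"}),
        encoding="utf-8",
    )
    creds = load_credentials(secrets_path)

print("Loaded credentials for:", creds["login"])
```

The main code then only refers to the file, so it can be shown to the AI or to other people without exposing the password.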
First revelation — knowledge of special terminology significantly speeds up "development", fewer letters need to be written in the prompt to get the desired result. You can ask the AI itself in the same dialogue "What is the name of the text that should disappear when entering text in the field?".
There will be nine revelations in total throughout the text, which will help structure the text and highlight the main points that you need to pay attention to.
Somewhere in the middle of all this, when I was already rubbing my hands in anticipation of a quick victory, Yandex delighted me with the message "The free limit for requests is over. Switch to the paid version". By that time I had already got the "taste of blood", and such a trifle as the failure of the system's main mechanism did not stop me.
Second revelation — it can break in the most unexpected place and no AI will help you with this.
Ingenuity will help: if there is something big and paid somewhere, then there must be something small and free somewhere else. After a targeted search I found RapidAPI.com (described as the world's largest API marketplace, bringing together more than 8,000 APIs and more than 500,000 developers). It solved my problem by providing a free API to Google Translate, limited to 10,000 requests per month.
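Since a free plan with a monthly cap can run out just as the Yandex limit did, it may be worth counting requests yourself. The sketch below is an assumption about how one could do that, in Python for illustration. The class `MonthlyQuota` is hypothetical and is not part of RapidAPI or of the project.

```python
from dataclasses import dataclass, field
from typing import Dict


class QuotaExceeded(Exception):
    """Raised when the monthly request limit has been used up."""


@dataclass
class MonthlyQuota:
    limit: int = 10_000  # the free monthly limit mentioned in the article
    used: Dict[str, int] = field(default_factory=dict)

    def consume(self, month: str) -> int:
        """Register one request for a month ("YYYY-MM"); return how many are left."""
        count = self.used.get(month, 0)
        if count >= self.limit:
            raise QuotaExceeded(f"limit of {self.limit} requests reached for {month}")
        self.used[month] = count + 1
        return self.limit - self.used[month]


# Small limit so the behaviour is easy to see.
quota = MonthlyQuota(limit=3)
for _ in range(3):
    print("Requests left:", quota.consume("2024-06"))

try:
    quota.consume("2024-06")
except QuotaExceeded as exc:
    print("Stopped:", exc)

# A new month starts with a fresh counter.
print("Requests left in the new month:", quota.consume("2024-07"))
```

With a counter like this, the application can show a polite message before the provider starts returning errors.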
Somewhere around here I also had to quickly come up with a name for the project and create a logo. That took some correspondence with the AI too, so address all questions on this matter to it. And I still don't want to redo the logo.
Third revelation — most likely, everything will have to be redone from scratch and, possibly, more than once...
I asked the AI to redo the index.html file, replacing everything related to the Yandex API with the Google API. Fortunately, over the past year ChatGPT has lifted the restrictions on the length of a request and on the number of requests in a dialogue.
Revelation Four — creating your own brainchild can get you so carried away that you forget about sleep and food for 12–14–16 hours. I recommend stocking up on at least sandwiches )
By the end of the first day, it seemed that the MVP (minimum viable product) was ready: it translated in both directions simultaneously and did not crash.
But it was impossible to show this disgrace to the most honorable public. The "default design" could only loosely be called a design.
We use the magic command "redesign, make it more visual and modern".
The AI added rounding and shadows to the buttons, made the indents larger, worked with the headers.
But it was still not right:
— on the smartphone, the text overlaps the elements,
— when the phone is rotated, the buttons fly off the screen.
I had to "talk through" each of these issues with the AI separately and make corrections to the code.
Somewhere at this stage, a CSS (Cascading Style Sheets) section appeared in the index.html file. As far as I know, CSS is the language that describes the appearance of a document, that is, its styles (font size and color, line thickness, indents, etc.). And since there are already two files, there is no harm in moving the styles into a separate third file, stili.css.
Revelation Five — you will not be able to implement everything in one language and in one file. Even now, this seemingly simplest translator consists of 4 (four) files using different languages:
— HTML — for front-end visualization,
— CSS — for style management,
— PHP — for storing the access password (one file) and working with the API (the second file).
PHP (originally "Personal Home Page Tools", now a recursive acronym for "PHP: Hypertext Preprocessor") is a scripting language used for web application development.
If the project is small and still unclear to you, it is convenient to keep everything in one file (yes, at first you can keep HTML, CSS, and even PHP together in a single file, although for the PHP to run that file normally needs the .php extension). But when the code grows and the edits become minor, you will want to break it into parts yourself and edit the styles separately from the mechanics of the application.
Improvements, like repairs, cannot be completed, they can only be paused, so you need to clearly decide for yourself what functionality should be implemented in the current iteration.
Revelation six — yes, even in such small pet projects it is impossible to do without planning. Creativity should bring joy, not a collapsed social life and gastritis with ulcers.
Probably, most of the time, about 7 hours of continuous communication with AI, was spent on:
— implementation of adaptive design (adaptive design allows you to use the application on the phone as conveniently as on a computer or tablet) and
— polishing (to make it concise and smooth).
The essence of the problem is this: designers, and the AI supports them in this, use one scale system for text and another for the other elements.
For text, one size is set for the main font, and the rest of the text is sized as a proportion of the font size of the parent or root element (em and rem, respectively).
Other elements either do not change in size (for example, the border lines of an input field or a button) or change depending on the size of the browser window (1vw is 1% of the viewport width, that is, the browser window width; 1vh is 1% of the viewport height).
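A little arithmetic makes the difference between these units easier to see. The sketch below is in Python purely for illustration; in the project the units are of course written in CSS. The function `to_pixels` is hypothetical, and the 16 px defaults are only the font size most browsers commonly start from, used here as an assumption.

```python
def to_pixels(
    value: float,
    unit: str,
    *,
    parent_font_px: float = 16.0,  # assumed font size of the parent element
    root_font_px: float = 16.0,    # assumed font size of the root element
    viewport_w: float = 0.0,       # visible width of the browser window, in px
    viewport_h: float = 0.0,       # visible height of the browser window, in px
) -> float:
    """Convert a CSS-style length to pixels (for font sizes, em refers to the parent)."""
    if unit == "px":
        return value
    if unit == "em":
        return value * parent_font_px
    if unit == "rem":
        return value * root_font_px
    if unit == "vw":
        return value * viewport_w / 100
    if unit == "vh":
        return value * viewport_h / 100
    raise ValueError(f"unsupported unit: {unit}")


# Text sized in em or rem does not notice the screen at all.
print(to_pixels(1.5, "em", parent_font_px=20))  # 30.0
print(to_pixels(1.5, "rem"))                    # 24.0

# Anything sized in vh shrinks and grows with the window.
print(to_pixels(2, "vh", viewport_h=800))       # 16.0
print(to_pixels(2, "vh", viewport_h=400))       # 8.0
```

This is why text and the other elements drift apart on unusual screens: one group scales with the window and the other does not.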
As a solution to the problem of different resolutions, an interface is usually designed for three ranges of screen width:
— for desktops (computers and laptops),
— for tablets,
— for smartphones.
And since there are a great many screen sizes and resolutions, on less popular models something always fails to fit and one element "crawls" onto another.
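The three-range approach boils down to picking a layout by window width. Here is a minimal sketch in Python; in CSS this is done with media queries. The function `device_class` and the threshold values 768 and 1024 are assumptions chosen for illustration, not values from the project.

```python
def device_class(
    viewport_width_px: int,
    tablet_min: int = 768,    # assumed lower bound for the tablet layout
    desktop_min: int = 1024,  # assumed lower bound for the desktop layout
) -> str:
    """Pick one of three layouts from the width of the browser window."""
    if viewport_width_px >= desktop_min:
        return "desktop"
    if viewport_width_px >= tablet_min:
        return "tablet"
    return "smartphone"


for width in (1920, 820, 390, 1023):
    print(width, "->", device_class(width))
```

A window just one pixel below a threshold gets the smaller layout, which is exactly where elements on unusual devices tend to collide.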
A lot of time was spent on:
— realizing this problem, through understanding that at this stage I do not want to make three different designs at all — one would work adequately,
— testing the hypothesis that the scale of all elements, including text, can be set relative to the visible height and width of the screen,
— instructing AI to do it the way I need, not the way it is accepted in its digital head,
— and on "licking" the proportions.
Revelation Seven — fine-tuning even the most insignificant, at first glance, element (the proportion of the field border relative to the visible height/width of the screen) can take much more time than initially anticipated.
It's time to present the project to friends, family, and colleagues.
I recommend starting the story with an exciting event that served as the beginning/impetus for the creation of the project, such as my situation with sugarcane juice in Vietnam.
And be ready for the eighth revelation.
Revelation Eight — 97% of the audience will not care about your product, your time, your red eyes, and your gastritis. To reach the target audience, you either need to conduct CusDev correctly from the start or realize that the target audience consists of only you and not try to impose on others what they do not need.
The good news is that there is a non-zero probability that during a random pitch you will hear approving exclamations and words of support. These are the 3% who can become your early adopters and bring in the first money, if your product is about commerce at all. Record the contacts of those who gave positive feedback, and find out what they liked about the product and how they solved this problem before. But that's another story.
Revelation Nine (Final) — you can't offload all the problems onto AI, but after n iterations your joint work with it may well yield results. Edison is said to have found the recipe for the light bulb on the 10,000th attempt; Elon Musk launched his rocket on the 4th attempt and almost went bankrupt. Are you any worse? Strive, go towards your goal, listen to the market, and may product-market fit be with you.
You can play with the translator and install it on your phone as an application via the link https://translator.posovetuy.com