Unidraw — a two-year journey

23:51
17.10.2024
DrWells

Hello! I am Georgiy, a developer from the team that created Unidraw. I will tell you the story of how we were looking for a tool for joint sessions on a virtual board. At first, we deployed an open-source solution, but then our load grew so much that we had to write our own. The article is about how the product started, what it is now, and what we want it to be in the future. There will be technical data, beautiful templates, and the story of our main mistake.

Finding suitable solutions

We use virtual whiteboard services all the time and had settled on one of them, but since 2022 we have been looking for an alternative. We needed to be able to:

Conduct retrospectives
Make quick mockup sketches that do not need design elaboration
Draw architectural diagrams for discussion
Hold online meetings with artifact visualization
Train colleagues

We considered different options, including BoardOs, but when we contacted them, they replied that they did not plan to develop for Russia and localized their application only for China.

The solutions we found either limited how many people could access a board or solved only narrowly specialized tasks.

Around the same time, our architects came to the rescue and showed us an open-source project with P2P encryption of transmitted data. We decided to build our own equivalent service on top of it, with the ability to share links, save results to a personal account on the server, and add participants to a board.

The open-source solution offered:

one board maximum;
support for collective sessions;
board storage in the user's browser.

Technical details of the solution:

socket.io for synchronizing changes between users;
no single source of truth: all clients store the board state themselves;
the client has a Reconciliation mechanism that can merge changes from different users;
board updates are tied to the React lifecycle.
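
The reconciliation step can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the `BoardElement` shape, its `version` counter, and the `reconcile` function are assumptions made for illustration. Each client merges the elements received from a peer into its local copy and keeps, for every element id, whichever copy has the higher version.

```typescript
// Hypothetical element shape: every edit is assumed to bump `version`.
interface BoardElement {
  id: string;
  version: number;
  data: string;
}

// Merge local and remote element lists. For each id, keep the copy with
// the higher version. Ties keep the local copy (an arbitrary choice here).
function reconcile(local: BoardElement[], remote: BoardElement[]): BoardElement[] {
  const merged = new Map<string, BoardElement>();
  for (const el of local) {
    merged.set(el.id, el);
  }
  for (const el of remote) {
    const current = merged.get(el.id);
    if (current === undefined || el.version > current.version) {
      merged.set(el.id, el);
    }
  }
  return Array.from(merged.values());
}

const localElements: BoardElement[] = [
  { id: "a", version: 2, data: "local edit" },
  { id: "b", version: 1, data: "unchanged" },
];
const remoteElements: BoardElement[] = [
  { id: "a", version: 1, data: "stale copy" },
  { id: "c", version: 1, data: "new from peer" },
];
// Keeps the local "a", the untouched "b", and adds "c" from the peer.
const mergedElements = reconcile(localElements, remoteElements);
```

Because there is no single source of truth, every client has to run the same merge and arrive at the same result, which is why a deterministic rule such as "highest version wins" is used in this sketch.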

First steps of the new product

Based on an open-source project, we created a dashboard where the user can see all the boards and projects they have created, and distribute these boards across projects.

In the dashboard, you can share boards, give shared access to the entire project with boards, and distribute read or edit rights — not just for one board, but for all boards in the project at once.

The number one task was to set up persistent board storage. The technical solution was chosen for implementation speed to test the MVP and understand how much demand there might be for the application, so:

The board was saved entirely (JSON object) to the database every 20 seconds.
There were no locks during collective sessions: each client saved the board independently of the others. The small load from 1,500 users allowed us to do this.
In collective sessions, a newly joined user received board data not only from the database but also from other users via socket. The data in the database was therefore at most 20 seconds out of date.
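
A minimal sketch of that MVP persistence scheme, assuming a hypothetical key-value `BoardStore` interface with an in-memory stand-in for the database (none of these names come from the real service). Each client serializes the entire board to JSON and overwrites the stored copy on a timer, with no locking, so the last writer wins.

```typescript
const SAVE_INTERVAL_MS = 20_000; // the 20-second interval mentioned above

// Hypothetical storage interface; the real service wrote to a database.
interface BoardStore {
  put(boardId: string, json: string): void;
  get(boardId: string): string | undefined;
}

class InMemoryBoardStore implements BoardStore {
  private readonly boards = new Map<string, string>();

  put(boardId: string, json: string): void {
    this.boards.set(boardId, json); // plain overwrite: last writer wins
  }

  get(boardId: string): string | undefined {
    return this.boards.get(boardId);
  }
}

// Serialize the whole board and overwrite the stored snapshot.
function saveSnapshot(store: BoardStore, boardId: string, board: object): void {
  store.put(boardId, JSON.stringify(board));
}

// Save every 20 seconds; returns a function that stops the timer.
function startAutosave(store: BoardStore, boardId: string, getBoard: () => object): () => void {
  const timer = setInterval(() => saveSnapshot(store, boardId, getBoard()), SAVE_INTERVAL_MS);
  return () => clearInterval(timer);
}
```

The approach is quick to build, but every save rewrites the full board regardless of how little changed, so the write volume grows with board size rather than with the amount of editing.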

We launched the service at the end of September 2022 and did not do any internal marketing, except for the "tell a friend" method. In the first month, we received 250 users who consistently used the service.

Technical project details

Frontend figures: the current number of cyclic dependencies in the project is ~940.
We are gradually transitioning to FSD — Feature-Sliced Design to build a scalable system and move away from the spaghetti code that we inherited from open source.

It may seem that we are criticizing the code we took. In fact, we just took cool developments and are making them even better.

Reasons for switching to FSD:

there was no idea behind the project structure;
there was no uniformity;
there was no composition, which leads to huge files and a large number of cyclic dependencies;
it is difficult to work with the code: hard to read and understand connections, complicated onboarding of newcomers;
no scalability at the structure level — developing the established structure would have required increasing already large modules.

FSD is the most successful implementation of onion architecture on the frontend:

modules are clearly divided by area of responsibility;
cyclic dependencies are easier to resolve;
modules stay small thanks to good decomposition;
uniform structure: the project is decomposed according to a clear principle, so files are easy to find and add;
it is easier to understand the project layer by layer, and the cognitive load of developing new modules and maintaining old ones drops significantly;
simple onboarding of new employees.

Downsides: any approach that solves problems of this scale requires a certain level of skill. You have to cultivate an understanding of good architecture by passing on knowledge and experience, and then also make sure the principles are followed. This takes time and can also meet resistance.

Rendering graphics. We used to use the Canvas API, which had several advantages:

Easy to learn: the API does not require deep knowledge of rendering. Many developers are familiar with it at least at a basic level.
Anti-aliasing that works without additional tools.
Browser support: it works in all browsers, even old ones.

The main drawback of the Canvas API compared with WebGL is performance, because the Canvas API renders on the CPU while WebGL renders on the GPU. This is especially noticeable on boards with a large number of elements.

We also considered switching to KonvaJS. This is a complex and lengthy process that, despite its advantages, could lead to a decrease in performance. We decided not to rewrite the application for the new library, but to refine the solution:

Use the techniques applied in KonvaJS in the already finished application. For example, layering, text element rendering and editing functions, and so on.
Rework the existing codebase to meet our requirements.

To speed up rendering, we decided to switch to WebGL rendering. We analyzed popular libraries and frameworks for browser rendering, among which we chose PixiJS. Other options were not suitable for the following reasons:

ThreeJS is primarily intended for working with 3D graphics. Most of its features would go unused and would only add extra calculations. The vast majority of official examples are in 3D.
FabricJS. No WebGL support.
PhaserJS, BabylonJS, PlayCanvas are primarily intended for games - they are focused on working with sprites. They have weak capabilities for working with graphic elements.

PixiJS is a 2D graphics rendering engine. With it, you can animate and create interactive graphics, draw applications, and it has a good API. It is also easy to adapt to your coding style.

Advantages of switching to PixiJS:

Significant performance boost. We built a whiteboard demo with basic, unoptimized PixiJS rendering, and it showed an 8-fold speed increase.
Created and optimized for working with 2D graphics.
Time-tested solution with a large community (Miro also uses PixiJS).
There is a devtools extension for debugging.
There are additional libraries for more advanced rendering, smoothing, and animations.
It is possible to offload hit-detection to the PixiJS event model.
Easier to write code without the need to maintain a "virtual" Canvas.
Batching of graphic elements rendering is used (combining several small calls into one large one).
There is a built-in caching mechanism that further speeds up rendering.

Backend features. Old architecture:

supported only 1,500 users (X DAU);
generated > 1 TB of data in one year (with 1,500 users);
up to 10 people on one board (with delays appearing);
system unavailability with 30+ people on a medium-sized board (> 1 MB).

With the new architecture we have:

Unlimited number of users (horizontally scalable).
400 GB of data in two years at T-Bank: we optimized the data model and started storing objects separately instead of entire boards.
Large collective sessions on one board do not affect the system as a whole.
We do not send optimistic changes to other clients until they have been persisted. This is a very cheap approach, and it works well under relatively low load.
Stateless application with fast recovery time.
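
The "persist first, then broadcast" rule from the list above could look roughly like this. It is a sketch under assumptions: `ElementChange`, `ElementStore`, `Broadcast`, and `handleChange` are hypothetical names, and the store is synchronous only to keep the example short (a real one would be asynchronous).

```typescript
// Hypothetical message and storage shapes, used only for illustration.
interface ElementChange {
  boardId: string;
  elementId: string;
  payload: string;
}

interface ElementStore {
  // Persist a single element rather than the whole board; throws on failure.
  saveElement(change: ElementChange): void;
}

type Broadcast = (change: ElementChange) => void;

// Persist first, broadcast second. If saving throws, nothing is sent,
// so other clients never see a change that storage does not have.
function handleChange(store: ElementStore, broadcast: Broadcast, change: ElementChange): boolean {
  try {
    store.saveElement(change);
  } catch {
    return false; // not persisted, therefore not broadcast
  }
  broadcast(change);
  return true;
}
```

Since the handler keeps no state between calls, any instance can process any change, which fits the stateless, horizontally scalable setup described above.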

We use VPA (Vertical Pod Autoscaling) because we try to be deliberate about resource usage. VPA automatically determines how much RAM, CPU, and other resources to allocate to the application.

Cursor movements used to be sent every 33 ms, which theoretically allows achieving 30 fps. In practice, smoothness without additional processing left much to be desired; with network delays or rapid participant movements across the board, cursor movement looked jerky.

We increased the interval to 200 ms and started animating the cursors on the client: we take two consecutive points with their timestamps and calculate the speed at which the cursor should move between them. Although we now send data about six times less often (200 ms instead of 33 ms), the smoothness perceived by the user has improved.
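
One simple way to move a cursor between two received points is linear interpolation. The sketch below assumes that approach; the `CursorSample` shape and the `interpolateCursor` function are illustrative names, not the actual Unidraw code.

```typescript
interface CursorSample {
  x: number;
  y: number;
  t: number; // timestamp in ms at which the sample was taken
}

// Linearly interpolate the cursor position between two received samples.
// `now` is the render time, expressed on the same clock as the samples.
function interpolateCursor(
  prev: CursorSample,
  next: CursorSample,
  now: number,
): { x: number; y: number } {
  const span = next.t - prev.t;
  if (span <= 0) {
    return { x: next.x, y: next.y };
  }
  // Clamp progress to [0, 1] so the cursor never overshoots the last known point.
  const progress = Math.min(1, Math.max(0, (now - prev.t) / span));
  return {
    x: prev.x + (next.x - prev.x) * progress,
    y: prev.y + (next.y - prev.y) * progress,
  };
}

// Two samples 200 ms apart; halfway through, the cursor is halfway along the path.
const halfway = interpolateCursor({ x: 0, y: 0, t: 0 }, { x: 100, y: 50, t: 200 }, 100);
```

In a setup like this, the renderer would call the function on every animation frame, drawing remote cursors one sending interval behind real time so that there are always two samples to interpolate between.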

One cursor movement event weighs 200 B. The weight of the event can change, for example, if the user selects or moves an object. If we select 16 objects and move them, the message size is already 430 B. RPS at peak is 80, and the number of operations on the board is 250/s.
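
As a rough illustration of what the longer interval means for traffic, the calculation below estimates the outgoing cursor data per user. It assumes a user who moves the cursor continuously and a constant 200 B event; `cursorBytesPerSecond` is a helper introduced only for this example.

```typescript
// Bytes per second sent by one user who moves the cursor continuously.
function cursorBytesPerSecond(eventBytes: number, intervalMs: number): number {
  return eventBytes * (1000 / intervalMs);
}

const rateAt33ms = cursorBytesPerSecond(200, 33);   // roughly 6,060 B/s
const rateAt200ms = cursorBytesPerSecond(200, 200); // 1,000 B/s
const trafficRatio = rateAt33ms / rateAt200ms;      // roughly 6
```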

There is still room to improve rendering speed. We plan to decouple the board state from the React lifecycle, rework the state management that currently goes through componentDidUpdate, and create our own separate flow for model changes. We also want to optimize very large diagrams so that the display does not degrade when zooming.

[Figure: a board with 5.5 thousand elements at a small scale (7%)]

Our main mistake along the way

We were not ready for a large number of users. While we focused on the visual side, we planned to optimize the server that exchanges messages about model element changes a little later.

The data transfer model inherited from the open-source solution was unsatisfactory: the board model with all its elements was transferred over the network in full between the participants working on the board. This caused artifacts when more than five people were on the board and put a high load on the data transfer server.

We knew about this problem and wanted to completely rewrite the service after adding new tools to the client. But people started using the application to present ideas at meetings, and once it was used at an internal Demo Day event, where the application itself attracted attention.

After that, the number of users doubled almost immediately, to a little more than 2,500 WAU (5,000 MAU by the end of the month). That many users put a significant load on the server.

This affected the backlog, and we had to urgently solve the problem of network interaction.

We had to reduce the amount of data for each element, change the model synchronization function, and revise the message transmission method.

We reworked message transmission so that changing one element within one chunk updates only that element on the server and for the other participants, instead of transmitting the entire model along with the changed element.
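
To illustrate the idea of sending only what changed, here is a small sketch that compares two board states and returns just the modified elements. The `VersionedElement` shape, the version-based comparison, and the `changedElements` function are assumptions for the example; the real protocol also deals with chunks and encryption, which are left out here.

```typescript
interface VersionedElement {
  id: string;
  version: number;
  data: string;
}

// Return only the elements that are new or have a higher version than in `previous`.
function changedElements(
  previous: VersionedElement[],
  current: VersionedElement[],
): VersionedElement[] {
  const knownVersions = new Map<string, number>();
  for (const el of previous) {
    knownVersions.set(el.id, el.version);
  }
  return current.filter((el) => {
    const known = knownVersions.get(el.id);
    return known === undefined || el.version > known;
  });
}

// A board of 1,000 elements in which a single element is edited.
const boardBefore: VersionedElement[] = Array.from({ length: 1000 }, (_, i) => ({
  id: `el-${i}`,
  version: 1,
  data: "rect",
}));
const boardAfter: VersionedElement[] = boardBefore.map((el) =>
  el.id === "el-42" ? { ...el, version: 2, data: "moved rect" } : el,
);

// The delta contains one element instead of the whole board.
const delta = changedElements(boardBefore, boardAfter);
```

Serializing `delta` instead of `boardAfter` produces a message whose size depends on the edit, not on the size of the board.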

Previously, the entire board model was saved as a whole, and every user on the board performed this operation. If a user changed one element on a board of 1,000 elements, the useful payload was only 0.1%.

We needed to maximize the useful payload and improve performance.

The task was complicated by the fact that board data was end-to-end encrypted, and ideally we wanted to make all the changes without downtime.

In the end, we stopped updating the entire board and started saving each element change separately.

Unidraw Use Cases

Retrospective: a regular event in IT teams to analyze the previous work week or project iteration to identify successes, problems, and ways to improve future work.

There are many types of online retrospectives. Here are some examples:

Idea carousel. Participants write ideas on cards and discuss them in a circle.
Decision tree. Problems are recorded as branches of a tree, and solutions as leaves.
4-Quadrant Model. Participants divide problems into four quadrants: important and urgent, important but not urgent, not important but urgent, not important and not urgent.

Unidraw is often used for strategy building because it simplifies communication, stimulates creative thinking, and increases engagement.

Active participation in filling out Unidraw makes the strategic planning process more interactive and interesting for participants. Here are some ways it is used.

Visualization of goals and mission:

Mind Mapping. Brainstorming to visualize the company's goal, its values, and key strategic directions.
SWOT analysis. Recording strengths, weaknesses, opportunities, and threats in separate sectors of Unidraw, which helps to clearly see the company's position in the market.

Brainstorming and idea generation:

Brainwriting. Participants take turns writing their ideas on sticky notes, which are placed on Unidraw for group discussion.
Scenario planning. Developing possible scenarios for the company's development and their impact on the strategy.

Structuring processes and planning:

Kanban board. Visual representation of tasks, development stages, and progress for each strategic direction.
Visualization of project execution process. Used to monitor how projects are being executed and whether they have crossed the time threshold relative to the completion percentage.
Gantt Chart. Creating a schedule for completing key tasks and stages of strategy implementation over a certain period.

Teamwork and Joint Decision-Making:

Sticky Notes. Using colored stickers to indicate priorities, categories, or opinions on various aspects of the strategy.
Round Table. Unidraw serves as a central tool for discussing ideas, commenting on proposals, and making joint decisions.

Visualization of Results:

Summary Table. Key findings and decisions made during the strategy session are structured on Unidraw using a table sorting of stickers, and a link to this board can be shared.
Visual Charts. Presentation of data on key metrics, strategy implementation progress, or its application results.

Unidraw is often used in mentoring sessions to visualize and explain different situations that a protégé may encounter.

Planning and Goal Setting:

Goal Definition. You can write down general mentoring goals or specific tasks that the protégé wants to achieve over a certain period.
Creating Roadmaps. Visualize an action plan with development stages and key events to track progress and adjust the path as needed.

Problem Discussion and Solution Development:

Mind Mapping. Using it, the mentor and mentee can analyze the problem together, break it down into components, and generate solution options.
Sticky Notes. Each solution option can be assigned a colored sticker with a brief description of the pros and cons. This helps make the comparison and selection of the best solution more interactive.

Sharing Knowledge and Experience:

Sharing Knowledge. Key phrases from discussions, links to useful resources, or conclusions from the mentor's experience can be recorded. This will help the mentee remember the information and apply it in future work.

Tracking Progress:

Kanban board. Helps visualize the current state of work on tasks and progress in achieving mentoring goals.
To Do, In Progress, Done — stages for each task.

Collecting Feedback:

A section can be created to collect feedback from the mentee about the mentoring process, which will help the mentor improve their work and make it more effective.

Unidraw can be used for many other tasks. Here is another list:

Building a Road Map for tasks. Sharing the link with all participants is quite simple, and during collective discussions, weak points of the chosen plan can be identified.

Training and Workshops. For online training, you can prepare the visual part of the content in advance and reveal it to students step by step.

Tutoring. Thanks to the ability to draw with a pencil, save templates, insert pictures, and draw primitives, a tutor can explain subjects to a student online while keeping all materials in their personal account and returning to them when necessary.

Business processes. Create visualizations of business process diagrams by drawing not only BPMN diagrams but also simpler visualizations to explain the essence of the tasks and the architecture and design of the processes.

Mockup. Unidraw is used as a simple version of Figma, in which you can explain the design concept on diagrams. In many cases, if teams already have their own design framework, the layout of the necessary blocks is sufficient, and Unidraw is a good, simple tool with a low entry threshold for your business.

CJM, research. Building a customer journey map is a good analytical tool for finding weaknesses in a business process. By placing sticky notes along the journey with observations made while studying customer actions, you can easily explain problem areas to your teammates.

Description of structure, personnel management. It is easy to visualize organizational design in Unidraw, which increases transparency in the company for colleagues, especially due to the possibility of collective discussion of this scheme. This approach mitigates the risks of conflicts during the restructuring of the organization.

OKR. Visualizing common goals and discussing them with the team on a single board lets you keep not only the goals themselves but also the artifacts of the discussion and the problems related to achieving the goals.

Software Architecture, Domain-Driven Design. Unidraw is used to visualize software architecture, conduct DDD sessions. Due to the infinite space, architecture and DDD diagrams can be drawn on a single board, which, using links on the board and search, allows easy navigation through the entire diagram.

I'll leave a link to the open kanban metrics board so you can see how it looks in action.

Voting for new features

As the internal user base grew, change requests from users gained a lot of weight. Based on ratings, wishes, and questions, we changed our Road Map to address them. At the same time, the task backlog was clearly growing faster than the team could handle, so we decided to expand the team and to choose new features with the existing functionality in mind.

Among the requests were new arrow mechanics, a simpler interface, adding many images at once, bold and italic text, and links that can be inserted directly into the text.

But visual controls for creating objects were moved to the backlog because users could already create the necessary elements quickly with hotkeys. We do not postpone such conveniences forever, only for a while, in favor of the most requested features.

Note that our company never prohibited the use of other applications, so high retention (above 65%) together with the growth of MAU gives the team strong motivation to keep developing the product.

Our plans

Our goal is to make working together as comfortable as possible.

The presentation module will be completed. We will add it because there is an internal need.

We will keep developing the mobile version because many of our users come from mobile devices. We also plan some basic things: reactions, complex widgets, access by link without registration, and drawing by clicking on a point of an object.

We try to listen to our users and improve the product together with them, so we launched a Telegram channel. Use it and leave feedback on improvements!
