Sword Art Online is closer than it seems. Four layers of full-dive VR

The first season of Sword Art Online was released in 2012. Since then, NerveGear, a full-dive virtual world headset, has become the benchmark for an "unattainable future". The standard answer to the question "when?" is "in thirty to fifty years".

In 2012, the first season of Sword Art Online was released: an anime in which ten thousand gamers end up trapped inside a fully immersive MMO through the NerveGear headset. The headset reads thoughts, projects a virtual image directly into the visual cortex, blocks the body's motor signals, and simulates touch and pain. The player cannot see the boundary between reality and the game. If you die in the game, you die in real life.

Since 2012, this has been one of the most popular benchmarks of an "unattainable future" in technological folklore. "When will we finally get NerveGear?" is a common tweet posted under VR headset announcements. The standard answer is "thirty to fifty years, not earlier."

This answer is outdated. Not by five years, but by fifteen.

In this article, I will explain why. The main thesis: full immersion is not a single technology that will be invented at some point. It is four independent layers, three of which are already commercially available, and the fourth — the most complex — passed a proof of concept in February 2026. Further on, it is a matter of engineering integration, regulation, and which company will be the first to put all of this together into a single product.

Four layers of full immersion

For NerveGear to work, four independent engineering tasks need to be solved. Each is a separate industry with its own ecosystem. Most publications about SAO confuse them or lump them together, which is why the conclusion is usually "none of this is solved". In reality, each layer is developing along its own curve, with its own leaders and its own readiness horizon.

Layer 1. Environment. What the user exists in: a virtual world that reacts to actions in real time.

Layer 2. Intention reading. The ability to control a virtual avatar without physical controllers, by picking up from the user "I want to move forward", "I want to swing my sword", "I want to say this exact phrase".

Layer 3. Perception writing. A way to make the brain "see" virtual environments, "hear" virtual sounds, "feel" virtual touches. Without a screen, without headphones, directly into the cortex.

Layer 4. Body. A suit, gloves, shoes: these simulate tactile feedback, muscle movement, temperature, and balance. Without this, the brain experiences sensory mismatch and develops cybersickness within minutes.

Next, we go through each layer: where we are now, what the latest artifact is, and what needs to happen for the layer to reach SAO-level quality.

Layer 1. Environment — already exists, 18 months left to reach final form

I recently dedicated a separate article to this layer about world models and Google DeepMind's Project Genie 3, so I won't repeat the details here. Short summary:

Google Genie 3 (August 2025, commercial access starting January 2026) generates playable 3D environments in real time from a text prompt: 720p at 24 frames per second, memory of up to one minute, and prompt-controlled events ("a snowstorm started"). Access costs $200 per month on the Google AI Ultra plan.

In May 2026, Street View integration was added: you can generate any real neighborhood in any weather and at any time of day. At the same time, World Labs Marble is working on an alternative based on Gaussian Splatting with guaranteed 3D geometry. Decart Oasis is optimized for low latency for robotics. Microsoft Muse is for game environments.

What needs to be completed to reach SAO-level:

  • Extend memory from one minute to hours. This is a matter of context window scale.

  • Consistency for multiple agents: two players in the same generated world must see the same world.

  • Export to Unity/Unreal standards — so the environment can be used alongside traditional gameplay.

I bet that Genie 4 or an equivalent from a competitor will solve the first two tasks by the end of 2027. The third is a matter of industry standards, not technology.
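To get a feel for the scale of the first task, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the cost of memory grows with the number of frames the model has to keep consistent. The function name and that assumption are mine, not a description of how Genie 3 works internally.

```python
FPS = 24  # frame rate quoted above for Genie 3


def frames_to_remember(seconds: float, fps: int = FPS) -> int:
    """Number of generated frames that fall inside a memory window."""
    return int(seconds * fps)


one_minute = frames_to_remember(60)
one_hour = frames_to_remember(60 * 60)

print(f"1 minute of memory: {one_minute:,} frames")
print(f"1 hour of memory:   {one_hour:,} frames")
print(f"scale-up needed:    {one_hour // one_minute}x")
```

Under that assumption, going from one minute to one hour is a 60x jump in what the model has to hold, and a multi-hour session multiplies it again.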

Layer readiness: ~70%. Will be completed in 18-24 months.

Layer 2. Reading — two paths lead to the same goal

This layer has been luckier than the others: there are two mature branches here, and both are already commercially available.

Invasive path: a neural interface with electrodes in the brain

By May 2026, there are three serious players in the market:

Neuralink: second-generation implant (N2), with FDA approval received in early 2026. 12 patients with paralysis in the PRIME cohort, 15,000+ hours of device operation. The first widely publicized problem, thread retraction (electrode threads moving away from their optimal positions in the cortex weeks after implantation), was solved in N2 thanks to revised electrode geometry. There are no peer-reviewed publications yet.

Synchron: a fundamentally different approach. The Stentrode is delivered through blood vessels, without a craniotomy. By May 2026, more than 50 patients had been implanted. The COMMAND study showed no serious adverse events over 12 months. In August 2025, Synchron publicly demonstrated control of an iPad directly from the brain. A key moment came earlier, in May 2025, when Apple added a BCI HID protocol to iOS: a standard by which a neural interface connects to Apple devices like any other input device. That is, the brain is now a typed input class in iOS, like a keyboard or mouse.

Paradromics: a high-bandwidth neural interface for patients with locked-in syndrome, with communication speeds approaching natural speech. Precision Neuroscience and Blackrock Neurotech are working in parallel.

All these systems currently only read signals from the brain. Writing is a separate layer, which we'll get to.

Non-invasive path: electromyography and muscle movement

Alongside this runs a path without surgery. At CES 2026, Meta presented the Meta Neural Band: a wristband with electromyography that captures electrical signals from the forearm muscles and translates them into commands through machine learning. It has shipped since late 2025 as a controller for Meta Ray-Ban Display and is now expanding through partnerships: Garmin (in-car control) and the University of Utah (smart home control for patients with amyotrophic lateral sclerosis).

The model is trained on data from approximately 200,000 people, which is statistically sufficient for recognizing subtle gestures, including "I just thought about moving my finger, but didn't."

This is the key advantage of the myographic approach: the brain prepares a muscle movement roughly 100 milliseconds before the muscle actually contracts. The motor command is already traveling along the nerve, and electromyography intercepts it. From the user's perspective, "I just thought about it, and the character in the virtual world moved".
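The idea of catching the electrical signal before visible movement can be illustrated with a classic threshold-based onset detector. This is a minimal sketch on synthetic data, not Meta's method (which, as noted above, relies on machine learning). The function name, the window sizes, and the threshold factor `k` are all illustrative assumptions.

```python
import random
from statistics import mean, stdev


def detect_onset_ms(samples, rate_hz, baseline_ms=200, window_ms=10, k=5.0):
    """Return the time in ms at which muscle activity is first detected, or None.

    The signal is rectified, smoothed with a trailing window, and compared
    against a threshold of baseline mean + k * baseline standard deviation.
    """
    rectified = [abs(s) for s in samples]
    n_base = int(rate_hz * baseline_ms / 1000)  # samples assumed to be at rest
    n_win = max(1, int(rate_hz * window_ms / 1000))
    baseline = rectified[:n_base]
    threshold = mean(baseline) + k * stdev(baseline)
    for i in range(n_base, len(rectified)):
        window = rectified[i - n_win + 1 : i + 1]  # trailing window: only past samples
        if mean(window) > threshold:
            return 1000 * i / rate_hz
    return None


# Synthetic recording at 1 kHz: 400 ms of rest noise, then a much stronger burst.
rng = random.Random(0)
RATE_HZ = 1000
signal = [rng.gauss(0, 1) for _ in range(400)] + [rng.gauss(0, 20) for _ in range(200)]

onset = detect_onset_ms(signal, RATE_HZ)
print(f"activity detected at {onset:.0f} ms (burst starts at 400 ms)")
```

The detector fires within a few milliseconds of the burst starting. If the electrical burst really does lead the contraction by about 100 ms, nearly all of that lead is left over for turning the gesture into an avatar command.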

What needs to happen to reach SAO-level

Invasive neural interfaces currently only work for patients with paralysis or ALS — the FDA does not allow implanting them in healthy people for gaming purposes. This is a legal barrier, not a technical one. Technically, the Neuralink N2 can already handle the task of "reading motor intentions to control a full avatar" right now.

Non-invasive interfaces need to be expanded from the wrist to cover the full body. Meta currently recognizes hand and finger gestures; to capture legs, torso, and facial expressions, myographic sensors need to be placed all over the body. This is being solved via a suit format, and Teslasuit is already working on something similar, though so far it only captures movement, not intentions.

Layer readiness: ~80%. It will be completed in 24-36 months on the invasive side (if regulations are relaxed for healthy users), 18-24 months on the non-invasive side.

Layer 3. Writing — the February 2026 breakthrough that journalists missed

This is the most complex and most interesting layer. Here we are not talking about "reading a signal from the brain", but "writing a signal to the brain", so that the user can physically see, hear, or feel something that does not exist in reality.

Until recently, I would have put the readiness of this layer at 5%. Transcranial magnetic stimulation has existed since the 1980s, but its spatial resolution is on the order of a centimeter, which is catastrophically insufficient for "seeing as if in reality". To evoke even recognizable shapes in the visual cortex, thousands of sites need to be targeted at a resolution of hundreds of microns.

Everything changed in February 2026.

The journal Neuron published the article "A mesoscale optogenetics system for precise and robust stimulation of the primate cortex". For the first time, the team demonstrated million-pixel stimulation of the primate visual cortex over a centimeter-sized area, with a stable result for over a year.

What happened from a technical standpoint. The team used a combination of two technologies:

  1. Optogenetics — a method from 2005: genes encoding light-sensitive proteins (opsins) are introduced into cortical neurons. After injection, specific neurons become controllable with light — direct a flash of blue light at them, and the neurons fire.

  2. MicroLED matrix — an array of microLEDs the size of a 1-centimeter chip, which is placed on the cortex and precisely illuminates targeted neural groups. The resolution is one million pixels. Each "pixel" is a group of neurons that can be activated independently.
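A quick calculation shows what "one million pixels on a 1-centimeter chip" implies for spacing. The sketch assumes a square chip with pixels on a regular square grid. That layout is my simplification, not a detail taken from the paper.

```python
import math


def pixel_pitch_um(side_cm: float, pixels: int) -> float:
    """Center-to-center spacing in micrometers for pixels on a square grid."""
    per_side = math.sqrt(pixels)
    return side_cm * 10_000 / per_side  # 1 cm = 10,000 micrometers


pitch = pixel_pitch_um(side_cm=1.0, pixels=1_000_000)
print(f"pitch: {pitch:.0f} micrometers")
```

Under these assumptions the pitch comes out at about 10 micrometers, well below the "hundreds of microns" mentioned earlier as the requirement for recognizable shapes. How finely the cortex itself can be addressed also depends on how light spreads in tissue, which this arithmetic ignores.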

In the paper, they used a new opsin called ChRger, which provides "reliable behavioral modulation of primates at low irradiation intensity". That means less light is required, with less tissue heating and less tissue damage. Phosphenes (artificial "flashes of light" in the animal's visual perception) caused by focal cortical stimulation could be induced at the same coordinates for over a year, with retinotopic accuracy. This means the brain stably "sees" a point at a specific position in the visual field when a specific cortical site is stimulated.

Quote from the translated abstract: "Proof of principle for the next generation of optical neural interfaces and visual prosthesis".

Pause and consider what that means. If a million pixels fit into one square centimeter of the visual cortex, and the visual cortex of primates and humans is roughly comparable in area, then you could theoretically fit tens of millions of stimulation points across the entire V1 (primary visual cortex). This is no longer "phosphenes"; this is potentially full-fledged artificial vision, generated by the brain without any input from the eyes.
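The "tens of millions" figure can be checked with a few lines of arithmetic. The density comes from the paper as summarized above. The V1 surface area is my own rough assumption for humans (about 20 to 30 square centimeters per hemisphere), not a number from the article.

```python
DENSITY_PER_CM2 = 1_000_000            # stimulation points per square centimeter
V1_AREA_CM2_PER_HEMISPHERE = (20, 30)  # ASSUMPTION: rough human range, low and high


def stimulation_points(area_cm2: float, density: int = DENSITY_PER_CM2) -> int:
    """Stimulation points that fit on a given cortical area at a given density."""
    return int(area_cm2 * density)


low, high = (stimulation_points(a) for a in V1_AREA_CM2_PER_HEMISPHERE)
print(f"per hemisphere:   {low:,} to {high:,}")
print(f"both hemispheres: {2 * low:,} to {2 * high:,}")
```

This is an upper bound rather than a forecast: much of human V1 is folded inside the calcarine sulcus, where a flat chip resting on the surface cannot reach.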

In parallel — the path through the retina

There is also a second, less radical path. This is optogenetics targeting the retina, not the cortex. The idea is to modify retinal cells (ganglion cells) so they become light-sensitive, then route the signal from a VR device not to a display, but directly into the eye using a laser or micro-OLED projector.

In 2021, a study on primates was published that demonstrated "the longest transgene expression to date — over 20 months". The signal generated in the retina via optogenetics reaches the visual cortex and is processed there as regular vision. For patients with retinitis pigmentosa who are blind, this is already in clinical trials.

In the context of SAO, the retinal approach is interesting because it does not require a brain implant — only an injection into the eye. It is an order of magnitude less invasive in terms of regulation and risk.

Layer 4. The body — commercially available, needs quality improvements

The final layer is body haptics. This is the most commercially mature layer, as it has been developing since the 2010s in the VR and motion capture industry.

Where we are now

Teslasuit 4 is a full-body suit that uses electrical muscle stimulation (EMS) and transcutaneous electrical nerve stimulation (TENS) instead of standard vibration motors. It has 14 inertial motion capture sensors (gyroscope + accelerometer + magnetometer in each). EMS acts on the muscles directly, meaning the suit can force your muscle to contract, not just vibrate. This is no longer a notification; it is physical impact.

bHaptics TactSuit Pro is a mass-market product. 32 points of tactile feedback. $698 for the full set. 250+ compatible VR games. It uses simple vibration motors, but they are densely spaced. It is enough to feel impacts, gunshots, and touches.

Somnium Space DK1 + Teslasuit — a collaboration on 68 points of haptic feedback + full motion capture. Premium segment.

Apple Vision Pro M5 (October 2025) — 16-core neural engine, on-device processing. Apple is patenting tactile devices attached to fingers: rings that transmit the feeling of pressing virtual keys.

Cybersickness: why it is critically important to solve separately

Without body-level feedback, sensory mismatch occurs: the eyes report movement, the body does not feel it, and within minutes the brain responds with nausea, dizziness, and disorientation. This is the main obstacle to long VR sessions, and without solving it SAO is impossible (in the anime, players stay immersed for 200+ days).

There are recent studies on galvanic vestibular stimulation: a weak current is applied to the mastoid processes behind the ears, which stimulates the vestibular system and forces the inner ear to "agree" with what the eyes report about movement. At IEEE VR 2025, omnidirectional galvanic stimulation was presented, which significantly reduces discomfort in users prone to cybersickness.

Another path is transcranial alternating current stimulation at 10 Hz over the vestibular cortex. In 2023, it was shown that this method significantly reduces nausea during VR use, and that the effect is frequency-dependent and not a placebo. That is, you can also "fake" balance by stimulating the relevant area of the brain so that it does not notice that the body is not moving.

This is half of SAO's paralyzed-body problem: making the brain believe in physical movement while the body is lying on a bed.

How the layers come together

This is the most interesting part. Each layer develops in parallel, but not in a vacuum. The same three companies work on each of them:

  • Apple is developing Vision Pro (hardware and, partially, the environment), the BCI HID protocol for iOS (input via invasive neural interface), and, coming soon, finger-mounted tactile feedback (hardware).

  • Meta is developing Neural Band (non-invasive input), Ray-Ban Display (lightweight hardware), Reality Labs is working on its own Genie equivalent (environment).

  • Google/Alphabet is developing Genie 3 (environment) and Veo + Imagen + Nano Banana + Lyria (full ecosystem for the environment), while Verily is investing in neural interfaces (input).

What this means is that all three companies hold pieces across most of the stack, development is proceeding in parallel, and each is incentivized to launch an integrated product before its competitors.

What is already integrated:

  • Apple BCI HID + Synchron + Vision Pro M5 — if you have a Synchron implant, you can directly control Vision Pro with your mind. This is already functional as of August 2025 for iPad control. Expanding support to Vision Pro is a matter of configuration.

  • Meta Neural Band + Meta Ray-Ban Display — sold as a bundle, unified ecosystem.

  • Google Genie 3 + Google AI Ultra — environment + characters (via Veo) + music (via Lyria) in a single $200/month subscription.

What is not yet integrated:

  • Optogenetics projects (Chinese research group, academic laboratories) are not yet integrated with major ecosystems. This will happen when the first leading company acquires a specialized startup. I would bet on Apple — they have Vision Pro as a ready-made platform and the BCI HID protocol as an established standard.

Timeline — not fantasy, but extrapolation of existing curves

Year-by-year forecast. Every data point is justified by the current trajectory.

2026–2027. Massive growth in the hybrid device market. Apple Vision Pro 2 + Meta Quest 5 + Meta Ray-Ban Display hit the mass market. Neural Band becomes the third mandatory accessory after the headset and controller. Genie 4 expands memory capacity to 10+ minutes.

2027–2028. The first commercial non-invasive game with a neural interface. Most likely Meta or a Chinese competitor (ByteDance/PICO). Avatar control via Neural Band + myographic suit, 95%+ accuracy, no controllers. Synchron receives FDA approval for broad medical indications — this blurs regulatory boundaries.

2028–2030. The first "semi-immersive" combination: Vision Pro 3/4 + myographic suit + galvanic stimulation headphones + tactile gloves. Long 4-6 hour sessions without motion sickness. Not SAO-level, but already not today's VR.

2030–2032. The first optogenetic visual prosthesis receives initial FDA approval for blind patients. In parallel — early human trials for cortical stimulation in sighted people to improve vision (controversial, but regulators are starting to discuss it).

2032–2035. The first full visual neural interface for sighted people in "enhancement" format — for example, for military aviation pilots, then for high-budget gamers. Price is on par with today's Tesla, around $100-200K per procedure.

2035–2040. SAO-level for early adopters. Fully immersive experience with long sessions. Access price is tens of thousands of dollars per year. Regulation in the US and EU is put together hastily, in China — in advance.

2045+. Mass market. Access price is on par with today's flagship smartphone.

This is 15-20 years. Not 50.

What will not be solved technically — and why exactly these do not block SAO

Skeptics usually say: "the brain's bandwidth is enormous, we will never be able to transmit it". This is false. The bandwidth of visual perception is limited by the number of retinal ganglion cells (about 1 million per eye) and the bandwidth of the optic nerve (equivalent to approximately 10 Mbps). This already fits within standard USB 3.0 bandwidth.
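The comparison is easy to verify. The sketch below reads the 10 Mbps estimate as a per-eye figure, which is my assumption, and uses the raw USB 3.0 signaling rate of 5 Gbit/s. Usable throughput is lower once encoding and protocol overhead are subtracted.

```python
OPTIC_NERVE_MBPS_PER_EYE = 10  # estimate quoted above, read here as per eye (assumption)
USB3_SIGNALING_MBPS = 5_000    # USB 3.0 raw signaling rate: 5 Gbit/s


def headroom(link_mbps: float, eyes: int = 2,
             per_eye_mbps: float = OPTIC_NERVE_MBPS_PER_EYE) -> float:
    """How many times over the link could carry the full visual stream."""
    return link_mbps / (eyes * per_eye_mbps)


print(f"both eyes:        {2 * OPTIC_NERVE_MBPS_PER_EYE} Mbps")
print(f"USB 3.0 headroom: {headroom(USB3_SIGNALING_MBPS):.0f}x")
```

Even with generous overhead, a margin of this size means that transport bandwidth is not the bottleneck. The hard part is the interface to the neurons, not the cable.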

Skeptics say: "you cannot write to the brain with the resolution required for high-quality vision". This stopped being true on February 3, 2026, when an article from Shanghai was published.

Skeptics say: "motion sickness is incurable". This is not the case — galvanic and transcranial stimulation are already reducing it in clinical trials.

What actually remains technically unsolved:

  • Long-term safety of cortical implants. Neuralink has already encountered thread retraction. Each generation fixes previous issues, but 20–30-year implantation has never been tested by anyone, anywhere, ever.

  • Sensory mismatch in extreme cases. What happens when galvanic stimulation says "you are turning" while the visual cortex is stimulated as if "you are standing"? Epileptiform reactions are possible in people prone to them. This is resolved via calibration, but requires individual adjustment.

  • Reverse engineering another person's perception. To transmit a complex sensation (taste, smell, emotion), you need to know which neurons, firing in which sequence, are responsible for the "taste of an apple" in that specific person. This is a personalized task, unlike the visual cortex, whose topography is relatively standardized.

These tasks are not blockers, but issues to be resolved after the first minimum viable version. That means NerveGear will launch incomplete. First only vision and hearing. Then tactile. Then everything else.
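The sensory-mismatch item is the one that lends itself most directly to a software guard. Below is a hypothetical sketch of a per-user consistency check between what the visual channel presents and what vestibular stimulation signals. All names and the tolerance value are illustrative assumptions, not part of any existing device API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionCue:
    visual_yaw_dps: float      # rotation presented to vision, degrees per second
    vestibular_yaw_dps: float  # rotation signaled by vestibular stimulation


def yaw_mismatch_dps(cue: MotionCue) -> float:
    """Absolute disagreement between the two channels."""
    return abs(cue.visual_yaw_dps - cue.vestibular_yaw_dps)


def within_tolerance(cue: MotionCue, tolerance_dps: float) -> bool:
    """True if the channels agree closely enough for this user."""
    return yaw_mismatch_dps(cue) <= tolerance_dps


USER_TOLERANCE_DPS = 5.0  # hypothetical value found during individual calibration

agree = MotionCue(visual_yaw_dps=30.0, vestibular_yaw_dps=28.0)
conflict = MotionCue(visual_yaw_dps=0.0, vestibular_yaw_dps=30.0)

print(within_tolerance(agree, USER_TOLERANCE_DPS))     # True
print(within_tolerance(conflict, USER_TOLERANCE_DPS))  # False
```

The individual adjustment mentioned above would amount to finding that tolerance for each user and refusing to render frames that exceed it.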

Open questions — ethics and safety

If writing to the brain becomes possible, mind reading is just around the corner. The same optogenetic methods that allow precise activation of neurons, when used in reverse, allow precise reading of them via fluorescent feedback.

This creates several unpleasant classes of vulnerabilities:

Brain hacking as a class of attacks. If your neural interface is connected to the network (and it will be, otherwise there is no way to update its firmware), then a firmware compromise means an attacker can write directly to the visual cortex. Pacemakers set the precedent: in 2017 the FDA announced a firmware update for roughly 465,000 implanted Abbott (formerly St. Jude Medical) pacemakers to close vulnerabilities that could let an attacker alter the device's behavior. The same is possible with a neural interface, only the consequences are far worse. I wrote a detailed article about this for Mythos, and I recommend rereading it in this context.

Privacy of thought. When a myographic bracelet reads "you just thought to move your finger", it's already close to reading intentions. When a neurointerface reads the motor cortex, it's a complete reading of intentions. When optogenetics reads the visual cortex, it's reading what you see even with your eyes closed in your imagination.

Addiction and maladaptation. SAO accurately described the main risk: players who do not want to log out. This is a documented psychological phenomenon in VR research, called "VR addiction", and it grows with the level of immersion. The deeper the immersion, the stronger the addiction.

Regulation of "brain as a service". If your perception physically lives in the Apple/Meta/Google cloud - who does it belong to? What happens when an account is banned? When the terms of service change? When there's a vulnerability in the cloud infrastructure?

All of these are real regulatory and engineering problems, not science fiction. They need to be solved now, not when NerveGear goes on sale.

Conclusion

Sword Art Online as a technological scenario didn't seem realistic until I broke it down into four independent layers and looked at where each one is.

  • Environment - exists, $200/month, will be finished in 18 months

  • Reading - exists, invasive and non-invasive paths are developing in parallel

  • Writing - was the weakest link, but on February 3, 2026, got proof of principle

  • Body - commercially available, needs polishing

The curves of three out of four layers are steadily going up. The curve of the remaining layer (writing) has just left the labs and is starting exponential growth, because economic interest has suddenly formed around it.

Next, they need to be glued together. And it won't be an academic lab that does it, but one of the three companies with a ready stack - Apple, Meta, or Google. I'm betting on Apple: they already have Vision Pro as a hardware platform, BCI HID protocol as a standard, and a financial cushion to buy out any optogenetic startup entirely. By 2030, they'll roll out the first generation - not full SAO, but something that will make everyone else play catch-up.

Sword Art Online is not "sometime in the future". It is 2035-2040. And at that point it will turn out that the main problem is not technological. The main problem is what we are going to do with it.

This is no longer an engineering question. It is a question for legislators, philosophers and psychologists. And they are not yet ready to answer it, because they think they still have 50 years left.

Don't count on that.

P.S. It would be useful to collect forecasts in the comments: what year do you personally put the first commercial SAO-level? Which of the four curves, in your opinion, will hit the wall first? I am especially looking forward to arguments from people with a medical and neurophysiological background — where I overestimated, where I underestimated.
