A metaverse client for a high-detail virtual world has most of the problems of an MMO client plus many of the problems of a web browser. First, much of what you're doing is time-sensitive. You have a stream of high-priority events in each direction that have to be dealt with quickly but don't have a high data volume. Then you have a lot of stuff that's less time critical.
The event stream is usually over UDP in the game world, which means you have to cope with lost packets. Most games have "unreliable" packets, which, if lost, are simply superseded by later packets. ("Where is avatar now" is a typical use.) You'd like to have that stream on a higher quality of service than the others, if only ISPs and routers actually honored QoS markings.
Then you have the less-critical stuff, which needs reliability. ("Object X enters world" is a typical use.) I'd use TCP for that, but Second Life has its own not-very-good UDP-based protocol with a fixed retransmit timer. Reliable delivery, in-order delivery, no head-of-line blocking: pick two. TCP chooses the first two; SL's protocol chooses the first and third. Out-of-order delivery after a retransmit can cause avatars to lose clothing items, because the child item arrived before the parent item.
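The "superseded by later packets" behavior is simple to sketch. Here's a minimal illustration (my own, not Second Life's actual wire format): each update carries a sequence number, and a late or retransmitted packet is dropped if a newer one has already been applied.

```python
# Sketch of "unreliable" update handling: keep only the newest sequence
# number per object; anything older has already been superseded.

def apply_unreliable(state, last_seq, obj_id, seq, position):
    """Apply a position update only if it is newer than the last one seen."""
    if seq <= last_seq.get(obj_id, -1):
        return False                     # stale: a later packet already won
    last_seq[obj_id] = seq
    state[obj_id] = position
    return True

state, last_seq = {}, {}
apply_unreliable(state, last_seq, "avatar", 2, (10.0, 0.0))
late = apply_unreliable(state, last_seq, "avatar", 1, (5.0, 0.0))  # arrives late
```

No retransmits, no head-of-line blocking: a lost "where is avatar now" packet simply doesn't matter once the next one lands.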
Then you have asset fetching. In Second Life/Open Simulator this is straight HTTP/1. But there are some unusual tricks. Textures are stored in progressive JPEG 2000. It's possible to open a connection and just read a few hundred bytes to get a low-rez version. Then, the client can stop reading for a while, put the low-rez version on screen, and wait to see if there's a need to keep reading, or just close the connection because a higher-rez version is not needed. The poor server has to tolerate a large number of stalled connections. Worse, the actual asset servers on AWS are front-ended by Akamai, which is optimized for browser-type behavior. Requesting an asset from an Akamai cache results in fetching the entire asset from AWS, even if only part of it is needed. There's a suspicion that large numbers of partial reads and stalled reads from clients sometimes causes Akamai's anti-DDOS detection to trip and throttle the data flow.
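The progressive-fetch trick is worth spelling out. A minimal sketch (function and constant names are my own; `response` is any file-like object, such as the raw stream from an HTTP client): read only the first few hundred bytes for a low-rez preview, then decide whether to keep going.

```python
# Sketch of progressive asset fetching: a progressive JPEG 2000 stream gives
# a usable low-rez image from its first few hundred bytes, so the client can
# stop reading, display the preview, and only resume if higher rez is needed.

LOW_REZ_BYTES = 600  # assumed prefix size for a rough preview

def progressive_fetch(response, want_high_rez):
    """Read a low-rez prefix; continue to full resolution only if needed."""
    data = response.read(LOW_REZ_BYTES)   # low-rez version, display it now
    if want_high_rez():
        data += response.read()           # resume reading the same stream
    else:
        response.close()                  # free the server's stalled socket
    return data
```

With the `requests` library this corresponds to `requests.get(url, stream=True)` and reading from `r.raw`; it's exactly this pattern of partial and stalled reads that browser-optimized CDN front ends like Akamai handle poorly.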
So those are just some of the issues "the HTTP of VR" must handle. Most are known to MMO designers. The big difference in virtual worlds is there's far more dynamic asset loading. How well that's managed has a strong influence on how consistent the world looks. It has to be constantly re-prioritized as the viewpoint moves.
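The constant re-prioritization can be sketched in a few lines (my own simplification): every time the camera moves, the pending asset loads are re-sorted so the nearest not-yet-loaded assets are fetched first.

```python
# Sketch of viewpoint-driven load prioritization: nearest assets first.
import math

def reprioritize(pending, camera_pos):
    """Return pending asset loads sorted nearest-first from the camera."""
    return sorted(pending, key=lambda a: math.dist(a["pos"], camera_pos))

queue = [
    {"id": "far_texture",  "pos": (100.0, 0.0, 0.0)},
    {"id": "near_texture", "pos": (2.0, 0.0, 0.0)},
]
queue = reprioritize(queue, camera_pos=(0.0, 0.0, 0.0))
```

A real client would fold in screen-space size, whether the asset is in the view frustum, and predicted camera motion, but distance-sorting on every significant viewpoint change is the core of it.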
(Demo, from my own work: https://vimeo.com/user28693218 This shows the client frantically trying to load the textures from the network before the camera gets close. Not all the tricks to make that look good are in this demo.)
It's not an overwhelmingly hard problem, but botch it and you will be laughed off Steam.
There's not a lot I liked about Unity when I was working with it full-time a few years ago. But the one thing I could acknowledge that it has that was generally missing from open source web development was the asset pipeline. But dynamic, user-uploaded assets won't be able to use the asset pipeline. So one of the biggest drivers for using Unity goes right out the window.
Unreal Engine 4 doesn't solve this either. UE5 has "asset streaming" and "open worlds", but the content is mostly static and loaded from a local SSD on a PlayStation 5. That part works nicely.
Asset management from the network is the real difference with seamless, modifiable virtual world systems. Otherwise, it's a minute of "...LOADING..." when you move to the next area. You need clients, servers, file formats, and protocols designed for it. It's a moderately hard engineering problem, and, as yet, there are no good off the shelf solutions.
There's a "check out, check in" approach. Decentraland uses that. You check out your parcel into a local Unity environment, edit, and check in the whole parcel to make it visible to others.
The SpatialOS people, Improbable, did some of this, but their solution cost so much to operate server-side that all four of the games that used it went broke. So Improbable is trying to pivot to military simulation.
Probably by UE6 this will all be standard. It's one of those things that has to be done to move the metaverse from hype to usefulness.
I've dreamed of the metaverse since Snow Crash and maybe before (Tron?), but when it comes to actually making it, let's assume unlimited CPU/GPU power and unlimited memory.
Ideally, I want the Metaverse to allow people to run their own code. Whether it's VR or AR, it's a shared 3D space. So I want my Nintendo "Nintendogs" to be able to run around my "Ikea furniture" with my "Google/Apple/OSM maps" showing me navigation directions and my "FB Messenger/Discord/iOS Messenger" letting me connect to people inside. In a webpage, each of these things runs in an IFRAME isolated from the others, and browsers go to great lengths to disallow one spying on another.
But in this 3D space my Nintendogs can't run through the space unless they can "sense the space". They need to know where the fire hydrants are, where the sidewalk is, what things they're allowed to climb/chew, etc. But doing that effectively means they need enough info to spy on me.
Same for all the other apps. I can use messaging apps on my phone with GPS off and full network access off so that the app can't know my location, but in order for different apps in the Metaverse to do something similar, they'll need to know at least the virtual location of themselves and the stuff around them, which is enough to track/fingerprint me.
You can maybe get around some of this with a massive walled garden but that arguably is not the metaverse.
The messages could be delivered as a simple XML feed. Your virtual home, or HUD knows where to place them. Through hyperlinks they know where to subscribe, or refresh, or get details. The messages don't need to know anything about placement and usage.
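To make that concrete, here's a hypothetical feed entry (element and `rel` names are my own invention) and a minimal parse with the Python standard library. The point is that the message carries links but no placement info; the client's virtual home or HUD decides where it goes.

```python
# Parse a hypothetical placement-free message feed entry; the hyperlinks
# tell the client where to subscribe or get details.
import xml.etree.ElementTree as ET

FEED_ENTRY = """
<message>
  <from>alice@example.com</from>
  <body>Meet at the plaza?</body>
  <link rel="details" href="https://example.com/msg/42"/>
  <link rel="subscribe" href="https://example.com/feed"/>
</message>
"""

def parse_entry(xml_text):
    root = ET.fromstring(xml_text)
    links = {l.get("rel"): l.get("href") for l in root.findall("link")}
    return {"from": root.findtext("from"),
            "body": root.findtext("body"),
            "links": links}

entry = parse_entry(FEED_ENTRY)
```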
It would be more productive to define a layer on top of HTTP/2 so we can leverage a lot of code that already works, rather than having to spend 10-15 years creating a new spec and codebases that need maturing.
And if you're not happy with websockets for low latency bidirectional communication: it would make more sense to improve websockets rather than reinvent the wheel.
Even if you're not building a video game today and want to do something VR-esque instead, I guarantee that you will inevitably end up recreating something the video game industry has done in the last 22-23 years.
Everything real-time the author of this article suggests is in the realm of something you would want for VR.
It makes sense to just take what both industries of web development and game development currently understand and build on that.
So let's take that to its logical conclusion. Let's say you wanted to navigate virtual worlds. You're going to end up having some sort of "navigator" or "explorer", or end up going on some sort of virtual "safari".
You'll do that probably initially with HTTPS or talk to some HTTPS-based server. WebSockets are not sufficient for real-time VR work, so you'll probably end up with some sort of, let's say, WebUDP, or WebSockets with UDP functionality.
Everyone will end up wanting to build their own layers of abstraction over and over again so entry-level web and game developers have something to do, so it'll look like some incarnation that builds on or supersedes Three.js. Why?
Because everyone will have a different interpretation of what they want their camera, or user entity or actor to be able to do.
So, you'll need HTTPS, web-based UDP, some sort of localplayer series of libraries or framework, then you'll need levels or maps, because to do any sort of VR, you need a world, or worlds you can navigate.
Huh, weird. All of this just sounds like someone porting Quake to the web with VR. How boring.
Of course, if someone says "OK, yep, let's do it then," it won't be anything like that, or it will, but only superficially, because that's what happens when you live long enough to see people take technologies the broader population already knows about and cram them together.
The alternative is that Zuckerberg is an old person and no one was asking for a metaverse, just video games that don't suck eggs, and that Meta is just Mark's way of graduating Facebook to an Alphabet-type conglomerate in order to keep growth moving forward.
I seriously feel bad for kids today. You have what, Fortnite, Minecraft, and Roblox to play, and that's it? Too many microtransactions and low-quality games.
No 30-something is asking to put on a headset and go to the VR equivalent of Something Awful, which would probably look like a back alley with players farting on each other Jaykin' Bacon-style. You want a metaverse? That's what it would look like. There's always going to be some Something Awful/4chan/Facepunch equivalent.
No one is like "oh yeah, I wanna go to work and sit in a virtual cubicle with my Meta® Quest 3," and yet there are some people really disconnected from reality who think people are actually asking for that instead of, like, affordable housing or something. Weird.
Yes, those are the only 3 video games.
"Everyone" wanting to build their own layer on top of a transport that is mature isn't an argument for building yet another transport. It is just as likely that "everyone" will want to build their own transport. And do it badly.
HTTP/3 already runs over UDP (via QUIC).
So what's wrong with HTTP, exactly?
Or that Facebook-the-company was receiving lots of bad press, and this is a way to dodge some of it.
In modern fighting games, the industry seems to be trending toward predictive lockstep networking. This is a type of networking where, if the client doesn't receive the inputs of other clients from the server, it will "predict" those inputs (usually by replaying the last received input) to give the illusion of zero-latency gameplay. The drawback is that you need to implement rollback for the case where the predicted input doesn't match the real received input. When poorly executed, this can look like jittery player movement, with entities rubber-banding, teleporting, and other artifacts, but when done properly it's mostly unnoticeable.
If you're interested in this domain, I recommend checking out https://www.ggpo.net/ which is the library used in many modern fighting games (notably Skullgirls). It also comes with an in-depth explanation of how to implement predictive networking with rollback on your own: https://drive.google.com/file/d/1cV0fY8e_SC1hIFF5E1rT8XRVRzP...
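The predict-then-rollback loop fits in a short toy sketch (my own illustration, not GGPO's actual API; `tick`'s arithmetic is a stand-in for a deterministic game simulation step): missing remote inputs are predicted by repeating the last one received, and when the real input arrives and disagrees, the state is restored from a snapshot and the later frames are re-simulated.

```python
# Toy rollback netcode: predict missing remote inputs, snapshot every frame,
# and re-simulate from the snapshot when a real input contradicts a prediction.

class RollbackSession:
    def __init__(self):
        self.state = 0
        self.snapshots = {0: 0}      # frame -> state at the start of that frame
        self.local = {}              # frame -> local input
        self.remote = {}             # frame -> confirmed remote input
        self.last_remote = 0
        self.frame = 0

    def tick(self, local_input):
        """Advance one frame, predicting the remote input if it's missing."""
        guess = self.remote.get(self.frame, self.last_remote)
        self.local[self.frame] = local_input
        self.state += local_input + guess          # stand-in for a game tick
        self.frame += 1
        self.snapshots[self.frame] = self.state

    def receive_remote(self, frame, inp):
        """A real remote input arrived; roll back and replay if it differs."""
        predicted = self.remote.get(frame, self.last_remote)
        self.remote[frame] = inp
        self.last_remote = inp
        if inp == predicted:
            return                                 # prediction was right
        self.state = self.snapshots[frame]         # rollback to the snapshot
        for f in range(frame, self.frame):         # replay later frames
            guess = self.remote.get(f, self.last_remote)
            self.state += self.local[f] + guess
            self.snapshots[f + 1] = self.state
```

A real implementation replays actual game-state snapshots and caps how far back it will roll, but the structure (predict, snapshot, confirm, replay) is the same.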
Currently it's a relatively simple bidirectional protocol over TLS. It's not fully documented yet but you can get an idea of it by looking at an example bot client in python: https://github.com/glaretechnologies/substrata-example-bot-p...
> A real-time, dynamic, stateful two-way client-server protocol.

As such, it will be, if not fully RTP, then close to it.
Why didn't we always have this, if all we needed to do was ask? So, realizing we still have the internet of today, what we actually need to rethink is HTML and the concept of the web as documents alone.
I would be interested to see some work on hyper-objects. As in, hypertext beyond text. The article should be "HTML for VR" and we should be musing about how to find, load, interact and link web based virtual objects.
This is a valid point, but I believe there's still enormous potential to innovate on top of WebXR. Since browser engines are open source, it's possible for upstart XR browser apps to add additional features to Gecko or Chromium that push WebXR forward.
http://fuse.rupy.se/about.html
You also need a P2P protocol (probably some binary UDP thing) for tick-based data like limb positions if you want body language.
But really VR is much less important for immersion than action MMO = Mario/Zelda with 1000+ players.
An alternative view would be that HTTP(S) would be "the HTTP of VR". With WebXR and standard JS APIs for HTTPS, async fetching, WebRTC, etc, all the items listed in "Imagine an application-layer protocol for VR with the following characteristics..." are satisfied. And the stack can use battle-tested web technologies so that it can leverage standard CDNs, cloud servers, etc.
VR has some extra constraints over 2D webpages due to tighter frames per second and latency tolerances, but most of the web protocols can get you 90% of the way there.
Something that is unique is the idea that a website is a single document, whereas a virtual website might take the form of an interactive object and/or an interactive space.
I would say it's an open question how we want these web based virtual objects to interact with each other. Would we want to physically pull a video object off the Google Drive shelf and drop it into the YouTube workstation? How would such an interaction be possible? Even if, as today, they just never speak directly, could those objects live in the same space or would each website fully immerse the user?
People will not switch over in droves to do their text/image/video editing in VR all of a sudden, because other than a few special design applications, there is no point in doing so...it's slower, clumsier and the input devices are much less precise than mouse&keyboard.
Another supposed target demographic, people in IT, won't switch either. I see no point in virtually grabbing a glowing code-ball and throwing it into the "deploy-tube", or navigating a codebase using haptic gestures with the huge meat-styluses at the end of my arms, when I can simply type `git push` or `/myAwesomeStruct`.
I also have a hard time imagining management sitting in meetings while wearing a 400g headset for 3h. Or companies being willing to cough up $350+ for every employee just so they can join meetings, when Zoom is basically free.
So, what else is there? Gaming and maybe some "recreational apps" (a.k.a. also gaming, only less interactive). And since not all games will take place in the same unified, MMORPG-ish permanent universe (yes, people want to play in sessions, people want to play single-player, and people want to play while not connected to the internet), this will not be a paradigm shift, but rather a new toy in an already large collection of other toys.
I did though and I feel like I can't be the only one who finds it really frustrating to the point of making me furious.
1. I started with my eyes at floor level.
2. It moans about a guardian; this by far is the most soul-sapping thing of all time. The thing is, I have dev mode, but I do find the guardian useful (I punched some walls previously). It's just so annoying though.
3. It asks me to set up guardian every fucking time.
4. Followed by when I try the Oculus Link... do I want to trust this computer.
5. I start steam VR but it doesn't work as it cannot find my headset but at this point, I am strapped in and 2 meters from my desk (ala stationary guardian) so I take it off to restart steam VR and the Oculus app.
6. Sometimes the Oculus app simply doesn't work and I have to reinstall it.
7. For some reason my Oculus link cable is loose unlike other USB-C cables/ports so it disconnects from the movement of my standing desk intermittently enough to not be a problem but also highly annoying.
8. Sometimes things don't start in VR but in Flat mode, this means removing the headset to sort it out (see point 5). I feel like jumping in and out of the experience makes it almost unusable.
It really doesn't take me long to just give up. On the news the other day there was a guy from Microsoft and Facebook talking about how, like, "WOAH, AVATARS ARE THE FUTURE". Like it is something new. I actually stopped playing consoles (PS3/360?) because of all the avatar setup shit with profiles. It's just that, but in a fake office or room, looking at bad 3D avatars, and somehow this changes everything...
There is a long way to go. The best thing I have ever seen on my Oculus was when my girlfriend sent me a porn film for a laugh and it was actually pretty good as far as experiences go.
In saying all of this though: Beat Saber and Superhot are genuinely good experiences, but they are as old as time itself, so I feel that very, very few things work well in VR. They are either completely shit (Skyrim VR etc.), or very good, with no in between.
Your complaints # 1-3 are a software issue that, while mildly annoying, will ultimately be resolved in an update. Creating a new guardian is literally a 15 second process.
Numbers 4-8 are because you choose to use a wireless headset from Oculus as a wired headset through Steam. Of course that's your choice, but the optimized workflow - that the vast majority of buyers use and is the primary product design - is to use on-headset apps without a PC or cable entirely.
It's also highly unfair to say "very very few things work well in VR" based on your experience. The vast majority of consumers don't care about Facebook IDs, or PCVR, or Steam libraries, or future compatibility concerns, or the other reasons you probably choose the setup you do.
If you want to judge the current state of VR for the mass market, go to the on-headset Quest store and try Walking Dead, Resident Evil 4, Walkabout Mini Golf, Contractors, Eleven Table Tennis, I Expect You to Die, Thrill of the Fight, In Death: Unchained, Fisherman's Tale, RealVR Fishing, Golf+, Moss, Tetris Effect, Pistol Whip, Red Matter, Shadow Point, or any of the other highly rated games that work out of the box.
Definitely agree for the vanilla game. Patching it up with a few mods (functioning hands that collide with the environment, ability to smash containers, attacks impacting enemies, HL:A-like gravity gloves, changing weapons/spells without navigating menus) makes it far more playable, though that shouldn't be required for a full-price game.
It's disappointing how much more could have been done, but it's still nice to have a full open-world RPG for VR, and one that came prior to most notable VR titles like Beat Saber/Boneworks/HL:A/etc.
Just think of everything in terms of observing / monitoring / tracking and you can see why some of it will start getting pushed really hard.
There are some neat things. If I could have a virtual workspace that rivaled 4k monitors and brought my real keyboard / mouse into the VR world, I can't say I'd be opposed to setting up in a virtual office with an amazing view instead of the 10'x10' box I currently live in.
> I also have a hard time imagining management sitting in meetings while wearing a 400g headset for 3h.
I have a Quest 2 and after about 1h I need to take it off and have a break. That's not an issue for gaming, but it has a long way to go before being a productivity tool.
There are also going to be huge commercial benefits for anyone who can convince the public to adopt VR environments instead of real environments. Imagine a generation of moviegoers where friends gather in a VR theater to watch the newest movie. They still pay admission, but you have no costs beyond licensing IP. There are apps on the VR stores that are already laying the groundwork for that type of setup.
IF that happens, and IF the input devices are not weighty headsets, and IF they offer the same level of haptic feedback, count me in.
Imagine the same improvements are made to VR that were made to phones. The VR headsets are expensive bricks right now but they'll be in glasses form factor (or better) with extraordinary usability in the relatively near future. An overlay on the real world that brings remote and nearby contacts into the same room seamlessly.
Besides, the hardware doesn't get much smaller than it is. The chips are not the problem, the problem is the power supply.
We have already reached a limit for phones, and that only because advertising somehow managed to convince people that it's okay for one of their most important personal electronic devices to go flat in less than a day (quick reminder that mobile phones used to last 4-5 days without recharging ;-) )
So, what do we do? Put super small Li-Ion batteries into our "metaverse" devices? Not much of an immersive experience if the thing goes down after 20 minutes. So, big heavy battery it is then, and that's that about slim, cool, SciFi VR glasses.
And what about input? Displaying information is not enough; the whole thing is supposed to be interactive. Voice control only gets you so far, and is unsuitable for most interesting things we want to do (virtual keyboards, games, movement, etc.), not to mention it's not even possible in most scenarios without being permanently online to contact the ASR service (oh, and did I mention that the WiFi and LTE/5G modules also gobble up power like nobody's business?).
So it's not just the headset, I also need an input device, or rather 2.
> Imagine the same improvements are made to VR that were made to phones.
You're missing some important details here. The DynaTAC was the whole telephone. All the electronics and battery were in the unit. The better VR headsets need a giant PC attached to them. Even with the giant PC on mains power and a brick of a headset, top-of-the-line VR experiences are pretty lackluster.
What you're talking about isn't going from the DynaTAC to the iPhone. You're talking about a giant PC on mains power with a brick of a VR headset and shrinking it to just a headset (or glasses) powered by a battery. Even if you set your VR baseline to the fully detached headsets they're not at a fully usable by normal people state.
While it's not impossible to go from the giant PC on mains power, it's unlikely to be happening in the near term. The DynaTAC was battery powered so it was a continuum of development from it to an iPhone. The DynaTAC was a user device for an existing and well developed telephone system (infrastructure and services). VR still doesn't even have that everyday use case let alone the technology to make it really workable.
This is all the more challenging because today's technology is pushing up against hard physical limits. Today's GPUs, on mains power and at the cutting edge of semiconductor manufacturing, are hard-pressed to render 4K resolution at consistently high framerates. Mobile GPUs aren't even close. So there are still a lot of question marks between today and "realistically usable VR", and a whole lot more between that and VR sunglasses.
"The internet has been described as people and screens. I've been arguing that the metaverse is just more people and more screens. Trillions of dollars of investment have made 2D screens very effective tools for delivering information. If the metaverse can deliver that information anywhere, at scale, shared with geographically disparate people in the same virtual location, then it will have real value."
I've thought about this and come to agree. Ignore the Snow Crash/Ready Player One "everyone lives in VR" hype, and for now ignore the "all games are interconnected in a shared world" fantasy and just think people and screens.
Already in VR, you can create multiple 2D screens for working in any number or configuration. You can place that in any kind of environment you find comfortable working in. You can create a "window" into the real world to see your real keyboard and mouse. You can see avatars of people in the same room that really feel like they're next to you. All of this is a bit clunky and the resolution is lower than hoped, but it works.
Now scale this up, advance technology, and add time. Higher resolution and lighter headsets are inevitable.
Will people want to work in VR? If it means 4 screens on the balcony of a Tuscan villa instead of a tiny desk in a depressing space? Maybe. They can still type "git push" on a 2d screen with a real keyboard in a virtual space.
Will companies buy a $300 device for every employee? If that replaces the $300 monitors they already buy, it could actually save money.
Will execs wear a 400g headset for 3h? What about if it's a 100g headset and lets them feel present with a globally distributed team that can sit around the same virtual conference table with spatial audio and see body language and facial expressions[1]? Maybe.
The benefits for shared movies, group gatherings, co-working, and social gaming are very compelling. If you stop projecting your preconceived ideas of what a "metaverse" is, and instead ask "what are the opportunities afforded by immersive shared and networked spaces being available to the masses through pervasive cheap technology?", you can come up with pretty compelling use cases that, taken together, ultimately form a far more likely "metaverse" in the near-term (<5 years).
[1] Search for "Codec Avatars" to see progress here. Here's an article with some videos: https://www.theverge.com/2021/10/28/22751177/facebook-meta-c...
There's a lot more to physical environments than high resolution graphics. You've got more senses than just sight. Your body is still experiencing the depressing desk even if your eyes are trying to convince you that you're at a Tuscan villa. That seems more depressing.
You're essentially just describing a high resolution 3D desktop wallpaper.
That's exactly what I did with my 4 high resolution monitors. And my current setup even lets me go to the kitchen to make a really tasty espresso or a snack without having to disentangle myself from a headset first. And I can even continue to listen to the music from my speakers while sipping aforementioned coffee.
>If it means 4 screens on the balcony of a Tuscan villa instead of a tiny desk in a depressing space? Maybe.
But they still feel the tiny desk in front of them, and still see it through the mentioned "keyboard window", still hear the janitor vacuuming his merry way down the hallway.
So essentially, this would be a desktop background which costs $400 and requires recharging a headset every 2h.
> What about if it's a 100g headset
Then they'd better start making meetings REALLY short, unless they want the battery going flat halfway through the second slide.
> The benefits for shared movies, group gatherings, co-working, and social gaming are very compelling.
Shared movies: big screen + streaming + comfy couch
Group Gatherings: until such time as VirtualReality figures out how to get me an ActualReality beverage, I pass.
Co-Working: Teams/Zoom/etc.
Social Gaming: Already possible, metaverse not required.
But as long as this isn't faster, easier or more reliable than typing it on a keyboard, I won't.
People don't work a certain way because it's possible. People work a certain way because it saves time, money, sanity, or simply because it's convenient.
If wearing a VR headset while coding isn't providing substantial benefits over what my current system provides, why would I do it?