Introduction and Background
We take a deep dive into the Lighthouse tracking system used by the HTC Vive VR headset.
Introduction
VR is rapidly gaining steam with the recent launch of several capable platforms. I’ve briefly sampled the various iterations of development kits and pre-release units coming through our office, and understanding how they tracked the headset position was relatively easy. Then we got to play with an HTC Vive, and things got a bit more interesting. The Vive is a ‘whole room’ VR experience. You’re not sitting at a desk with a game controller. Instead, you are holding a pair of controllers that behave more like extensions of yourself (once you get used to them, that is). Making all of this work took some extra pieces included with the kit, and the electronics technician in me was dying to know just what made this thing tick. I’d imagine other readers of this site might feel the same, so I thought it appropriate to do some digging and report my findings here.
Before diving straight into the HTC Vive, a brief history lesson of game system positional tracking is in order.
I'll start with the Wii Remote controllers, which had a front-mounted IR camera that ‘saw’ a pair of IR LED banks mounted in the ‘Sensor Bar’ – ironic naming, as the ‘sensors’ were actually in the Remotes. This setup let you point a Wii Remote at the television and use it as a mouse. Due to the limited number of points in use, the system could not determine the Wii Remote’s location within the room. Instead, it could only get a vector relative to the Sensor Bar itself. Wii Remotes also contained accelerometers, but those were typically not used to improve pointing accuracy (they were, however, used to determine whether the remote was inverted, since the Sensor Bar had only two light sources).
The Oculus Rift essentially reverses the approach of the old Nintendo Wii Remotes. The headset position and orientation are determined by a desk-mounted IR camera which ‘looks’ at IR LEDs mounted on the headset. The system, dubbed ‘Constellation’, can decode the pattern (seen faintly in the above photo) and determine the headset’s position and orientation in space.
Even the sides and rear of the headset have a specific LED pattern to help the camera lock on to someone looking away from it. If the IR camera sees the triangular pattern on the headset strap, it can conclude that the viewer is facing away from it.
The HTC Vive takes a different approach here, since it was launching with a headset and two controllers that would all need to be tracked in space simultaneously. The Wii Remote-style idea would only work with a much larger grid of sensor bars (or QR codes) peppered all over the room, so that idea was out. The Rift’s Constellation system might have a hard time identifying unique light patterns on multiple devices that could be far away and possibly occluding each other. So if having cameras on the headset and controllers is out, and having a camera on the desk is out, what’s left?
Wow, that's a lot of accessories and a lot of plugs.
It looks simpler once everything is set up, but yeah, there are a few parts.
I think HTC's tech is a good idea, as each sweep establishes a reference datum for the sensors it can see, making it quite accurate. But what do you think: might there be issues with syncing both lighthouses, since any error in the speed of those motors would throw it all out of sync?
Also, of the two main technologies, which one do you think would have a larger CPU overhead or be more prone to errors?
The motors are synchronous, meaning they spin at exactly the speed they are directed to by the controller. The controller sends a three-phase AC signal which the permanent-magnet rotor must follow. The only way they could fall out of sync would be if there were some sort of failure.
Position tracking is done in hardware on the Vive, so no CPU overhead necessary. It may be done in software on the Rift, but it doesn't appear to have any noticeable impact (position tracking math is very simple compared to other game tasks performed by a CPU).
Does the Rift camera require a high-speed USB port? It seems like it would need to essentially stream video to the system for processing, but that processing would not take much power. It would be at 60 fps or more, but all it is doing is tracking LEDs, which doesn’t take much. It doesn’t seem like a very good solution, since expanding to multiple cameras and multiple tracked devices will increase overhead. There is no overhead for multiple Lighthouse units, and there is minimal overhead for tracking more devices. I would expect a second-generation Rift to switch to Lighthouse-style tracking.
The Rift requires USB 3.0 for position tracking. Using only USB 2.0 will make the position tracking on the Oculus quite unstable if you have other USB devices connected at the same time, so it's quite demanding actually. A proper USB 3.0 connection to at least the Oculus tracking camera is a minimum requirement in my eyes if you want rock-steady tracking.
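For a rough sense of why a streamed tracking camera is demanding, here is a back-of-the-envelope sketch; the resolution, bit depth, and frame rate are assumptions for illustration, not published Rift camera specs.

```python
# Back-of-the-envelope: can an uncompressed tracking-camera stream fit in USB 2.0?
# The resolution, bit depth, and frame rate below are assumptions for
# illustration, not published Rift camera specs.

width, height = 1280, 960      # assumed monochrome sensor resolution
bits_per_pixel = 8             # assumed raw bit depth
fps = 60                       # assumed capture rate

raw_mbps = width * height * bits_per_pixel * fps / 1e6   # megabits per second
print(f"Raw stream: {raw_mbps:.0f} Mbit/s")              # ~590 Mbit/s

usb2_practical_mbps = 320      # rough practical USB 2.0 throughput
usb3_practical_mbps = 3200     # rough practical USB 3.0 throughput
print("Fits USB 2.0?", raw_mbps < usb2_practical_mbps)   # False at these assumptions
print("Fits USB 3.0?", raw_mbps < usb3_practical_mbps)   # True
```

At those assumed numbers an uncompressed feed already exceeds what USB 2.0 can realistically sustain, which would fit with the reports of instability on shared USB 2.0 controllers.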
>Position tracking is done in hardware on the Vive, so no CPU overhead necessary.
This is incorrect.
Both the Vive and Rift perform the model-fit algorithm and sensor fusion (the actually CPU-intensive part, though not all that intensive) on the host CPU. The only difference between the two systems is how they populate the array of XY camera-relative (or base station-relative) coordinates for each tracked object: Lighthouse does so asynchronously, with the sensor readouts providing sweep-relative timings; Constellation does it synchronously by processing the camera image (thresholding, then centroid-finding, logging the blink codes frame-to-frame for marker ID). While the Constellation cameras do use the host CPU for the image processing, it is a very basic process: in the Wii Remote this processing was done by the ISP on the PixArt camera, to give an example of how low-power the task is.
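To make the Lighthouse half of that concrete, here is a rough sketch of turning a sync-flash timestamp and a sweep-hit timestamp into an angle, assuming a constant 60 Hz rotor; the function name and values are hypothetical.

```python
import math

ROTOR_HZ = 60.0                      # assumed rotor speed: 60 revolutions per second
PERIOD_S = 1.0 / ROTOR_HZ            # one full rotation, in seconds

def sweep_angle(t_sync, t_hit):
    """Angle (radians) of a photodiode from the base station's reference
    direction, given the sync-flash time and the time the sweeping laser
    plane hit that diode. Assumes a constant ROTOR_HZ rotation rate."""
    dt = t_hit - t_sync              # how long the rotor took to reach the sensor
    return 2.0 * math.pi * (dt / PERIOD_S)

# A diode hit ~4.17 ms after the sync flash sits ~90 degrees into the sweep.
print(math.degrees(sweep_angle(0.0, 0.00417)))   # ~90.1
```

Each tracked object collects one such angle per sensor per sweep, and those angles play the same role as the per-LED image coordinates on the Constellation side.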
The Lighthouse inventor has stated that pose / positional calculations are done in ASIC within the controllers and HMD. Is this different than what you are talking about?
From Alan Yates (reddit username vr2zay): https://www.reddit.com/r/oculus/comments/3a879f/how_the_vive_tracks_positions/csaffaa
“Presently the pose computation is done on the host PC, the MCU is just managing the IMU and FPGA data streams and sending them over radio or USB.
A stand-alone embeddable solver is a medium term priority and if Lighthouse is adopted will likely become the standard configuration. There are currently some advantages to doing the solve on the PC, in particular the renderer can ask the Kalman filter directly for predictions instead of having another layer of prediction. It also means the complete system can use global information available to all objects the PC application cares about, for example the solver for a particular tracked object can know about Lighthouses it hasn’t seen yet, but another device has.”
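To illustrate what "the renderer can ask the Kalman filter directly for predictions" means in practice, here is a minimal constant-velocity sketch for a single axis. This is not Valve's actual filter, just the general shape of the idea.

```python
import numpy as np

class ConstantVelocityFilter:
    """Toy 1-D Kalman filter: state is [position, velocity]."""

    def __init__(self, pos=0.0, vel=0.0):
        self.x = np.array([pos, vel])          # state estimate
        self.P = np.eye(2)                     # state covariance

    def predict_at(self, dt):
        """Point prediction of position dt seconds ahead (e.g. photon time)."""
        F = np.array([[1.0, dt],
                      [0.0, 1.0]])             # constant-velocity transition
        return (F @ self.x)[0]

    def update(self, measured_pos, dt, meas_var=1e-4, proc_var=1e-2):
        """Standard predict + update with a position-only measurement."""
        F = np.array([[1.0, dt], [0.0, 1.0]])
        Q = proc_var * np.array([[dt**3 / 3, dt**2 / 2],
                                 [dt**2 / 2, dt]])   # white-noise acceleration model
        self.x = F @ self.x                    # predict
        self.P = F @ self.P @ F.T + Q
        H = np.array([[1.0, 0.0]])             # we only observe position
        S = H @ self.P @ H.T + meas_var
        K = self.P @ H.T / S                   # Kalman gain
        self.x = self.x + (K * (measured_pos - H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P

f = ConstantVelocityFilter()
f.update(measured_pos=0.010, dt=1 / 60)        # feed one sweep-derived sample
print(f.predict_at(0.011))                     # renderer asks for pose ~11 ms ahead
```

The point of doing this on the host, as the quote notes, is that the renderer can query the filter for a prediction at exactly the time it needs, instead of layering a second predictor on top.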
Interesting; in his interview video on Tested he talked as if everything was being done in the ASIC, but that was back when everything was a prototype, so that must have changed in production.
I think the Lighthouse technique would have almost no CPU overhead. It just reads the IR sensors on the headset and determines the location based on the different times of arrival, then reports the location data to the system. This is basically how GPS works, too. GPS satellites are moving, though, so they broadcast the time and their position. The receiver then calculates position based on the time difference of arrival of the signals.
With the Rift, it actually has to take a picture with the camera at (I would guess) 90 to 120 Hz. It then has to process each image to determine the location of the IR LEDs on the headset. It is probably a low-resolution image though. If the camera attachment had an integrated processor, then the CPU usage would be about the same, but I doubt that it has one. The Xbox One actually used a portion of its processing power to handle input from the Kinect camera, although it was doing more than just tracking IR lights. If the Rift camera requires a high-speed USB port, then it is probably streaming the images to the computer for processing, since just sending coordinates would be low bandwidth. Making the camera separate seems like it was a bad idea. Even the CastAR device integrated the camera on the headset and used stationary LEDs for tracking.
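To give a sense of how light that per-frame work is, here is a toy sketch of thresholding a frame and taking a blob centroid. It treats all bright pixels as one blob for brevity; a real tracker would label each connected component and decode LED blink codes across frames, so this is illustrative only.

```python
import numpy as np

def led_centroids(frame, threshold=200):
    """Return (row, col) centroids of bright regions in a 2D uint8 frame.
    For brevity, all bright pixels are treated as a single blob; a real
    tracker would label each connected component separately."""
    bright = frame >= threshold          # threshold the image
    ys, xs = np.nonzero(bright)          # coordinates of bright pixels
    if len(xs) == 0:
        return []
    return [(ys.mean(), xs.mean())]      # centroid of the bright pixels

frame = np.zeros((480, 640), dtype=np.uint8)
frame[100:103, 200:203] = 255            # one fake LED spot
print(led_centroids(frame))              # ~[(101.0, 201.0)]
```

That is roughly the per-frame workload being argued about: a threshold pass and a handful of centroids, which is cheap on a modern CPU even at 60+ fps.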
The Lighthouse solution seems much better. It allows you to completely rotate around in a larger area. To do that with the Rift's setup, you would need multiple, probably wired, cameras. The tracking is done by the camera, not the headset, so wireless cameras would probably add too much latency. With Lighthouse, the boxes are essentially passive. They have some communication ability to sync with other Lighthouse units, but that is it.
People seem impressed with the Rift's screen and lenses, but that is about it. You will be limited to sitting there with an old-style game controller. It cannot easily do things like the “Hover Junkers” game, which uses 3D-mapped controllers to represent guns: you just point at what you want to shoot and pull the trigger. You can't do this with the Rift. In fact, there are a lot of games which have been demonstrated on the Vive Pre which cannot easily be implemented with the Rift hardware, while any game for the Rift could easily be played on the Vive. I think Facebook needs to make a new revision of the Rift using Lighthouse-type tracking and with 3D-mapped controllers. That would obsolete this initial version of the Rift though.
My guess would be that the Vive has lower position reporting latency and better accuracy. The Rift is the cheaper option to build due to the lower number of electronic parts though.
I’m glad PCPer is pushing full throttle into VR technology, reminds me of the nascent SSD days.
The Vive is supposed to be $100 more, I think. I don't know if that represents that much of a difference in the cost to make the device though. I suspect that the Rift is being sold closer to cost than the Vive. I kind of doubt that the two Lighthouse units cost $100 to make; they mostly look like off-the-shelf components. You get more with the Vive though, so that extra money is probably worth it. The Rift will be a very limited experience, just sitting with an old-style game controller. I don't know what they will do to support 3D-mapped controllers with the Rift. The current implementation doesn't seem like it can support it easily.
You're right about the Rift controller being limited. Also, with just one camera, as soon as you turn away it's screwed; I don't think you can get past that, so the controllers will need another camera as well.
I’m not sure how cost is factored into the price of either of these devices (or overhead). It seems likely that both could sell at a loss to get people using their platform (like consoles).
The Rift is designed to track position even when looking to the side / away. There are LEDs all the way around. Agreed on the controller tracking though.
(Question) Hey Ryan, have you guys tested the USB compatibility of the CV1 yet? I have heard a LOT of people have been getting incompatibility ratings in the Oculus Tool, and I don't know whether these parts will actually cause trouble, whether they just aren't on Oculus' white list (the Vive doesn't have this problem at all), or whether Oculus is just being overprotective. For example, I have a 3930K @ 4.8 GHz and it says my CPU isn't good enough, but even per core it is stronger than a 4590 (I got an 11 on the Steam VR tool), so I know they are wrong. It also says my USB is no good; that could be because the X79 platform is not on their radar, and it also turns out that if you have even one incapable USB port alongside four usable ones, the tool will still fail you unless you disable the offending ports. You guys could test an older but viable platform like Z77 or X79 from before Intel had integrated USB 3.0.
From what we've gathered so far, the Rift camera is picky and prefers native USB 3.0 implementations.
Allyn, thanks for the neat breakdown and analysis. We've known for some time that the Lighthouse system's base stations are mostly passive and that the actual sensing and location calculation happen on the headset and controllers. Despite that knowledge, so many outlets and commentators online continue to refer to the lighthouses as “sensors” or “cameras”. Thanks for providing a resource to point to so that can be corrected.
I'd also like to point out that the Lighthouse system has an analog in GPS. The base stations are similar to GPS satellites in that they provide a timing signal via their blinks, and identification from the order of the sweeps. The headset and controllers have the sensors and logic to pick up the signals and then do the trigonometry to calculate their position. It's not a perfect analogy, since Lighthouse uses IR instead of RF, and the receivers have more sensors to determine orientation as well as position. Also, the IMUs play an important role for responsiveness, power consumption, and orientation accuracy.
It actually is quite similar to GPS, but it is not, strictly speaking, triangulation. A lot of people refer to such systems as triangulation when they are really multilateration. They do not measure angles to something, as with triangulation. With triangulation, you measure the angles that the signals are arriving from, and it is very simple to determine position based on trigonometric identities. Actual triangulation also only requires two receivers; it is basically how your eyes determine the position of something: they turn inward while focusing on something close, providing an angle measurement to your brain.
GPS and Lighthouse use the time difference of arrival of signals to determine location. With GPS, the satellites broadcast the time and their position, since they are moving. All of the satellites have extremely accurate atomic clocks and all are synced precisely. The receiving device can determine its location from the differences in the arrival times of the signals. Since the time is embedded in the signals, the receiver doesn't need to have a super accurate clock. The distances are great enough that the speed-of-light signals spend long enough in transit to produce a measurable difference in arrival time. The solution is calculated, as far as I know, by solving a system of equations, not by applying trigonometric functions.
The Lighthouse system doesn't use the time difference of arrival based on the speed of light; over that small a distance it would be difficult to measure. It uses the timing of when the laser sweeps over the sensors instead. If the lighthouse is sweeping right to left (from your frame of reference), the beam will strike the sensors on your right first and sweep across to the sensors on your left. The time at which each sensor is triggered can then be plugged into a system of equations to solve for your position relative to the lighthouse. I don't know exactly what that system of equations looks like, but it is not the simple trigonometry used in triangulation. Computers can solve such systems of equations very easily though.
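As a rough sketch of the kind of math involved (not the actual solver): each base station's two sweeps give a horizontal and a vertical angle to a photodiode, which defines a ray from that station, and with two stations at known poses the diode sits near the closest point between the two rays. The station positions and the angle convention below are made up for illustration.

```python
import numpy as np

def ray_direction(h_angle, v_angle):
    """Unit direction from a base station given its two sweep angles (radians).
    Convention (assumed): h_angle rotates about the station's up axis,
    v_angle about its right axis, with +Z pointing out of the station."""
    d = np.array([np.tan(h_angle), np.tan(v_angle), 1.0])
    return d / np.linalg.norm(d)

def closest_point_between_rays(p1, d1, p2, d2):
    """Midpoint of the shortest segment between lines p1 + t*d1 and p2 + s*d2."""
    w0 = p1 - p2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b                      # > 0 unless the rays are parallel
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    return 0.5 * ((p1 + t * d1) + (p2 + s * d2))

# Hypothetical setup: two stations 4 m apart, both angled toward the play area.
# The angles would come from sweep timings as discussed above.
p1, p2 = np.array([-2.0, 2.0, 0.0]), np.array([2.0, 2.0, 0.0])
d1 = ray_direction(np.radians(25), np.radians(-15))
d2 = ray_direction(np.radians(-25), np.radians(-15))
print(closest_point_between_rays(p1, d1, p2, d2))   # ~[0.0, 0.85, 4.29]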
This whole crew thing is a fad,
Hey,
Very nice article. I have a question though.
In the podcast you said you only get a position at 15 Hz. But even with two base stations, and one of them occluded, you should get 30 positions per second, because the sweeps from one station should be sufficient to determine your position. The second lighthouse is just there for better occlusion coverage.
Or am I missing something?
You're right, I divided by too many twos on the podcast. It's 30 Hz when one station is occluded, 60 with both visible (assuming a controller can track under both simultaneously, which we are not sure of yet).
I would say they can track both simultaneously. They can see the blinks as well, so they know the second set of sweeps is from a different base.
Hi,
Can you talk about expandability? Valve has on a couple of occasions mentioned that this system is very expandable by adding base stations. However, the question recently came up on Reddit /r/Vive, and someone used your article to argue that if you keep increasing the number of lighthouse units, you run into issues. Cut & paste: “The more you add, the more infrequent each base's sweeps are sent out. Only one sweep can fill a volume at a time; otherwise the sensors will get confused about which sweep they are seeing. So, more base stations means each one has to wait longer before it can sweep again while the others get their turn. Other than that, adding additional base stations is as easy as firmware updates.
As it is, each base station does its sweeps at 30 Hz. That would be down to 20 Hz with three and 15 Hz with four. If the other three were occluded, then you'd be down to only 15 Hz tracking with the one station that can see you, with ~67 ms between each sweep set.
Unless they can change the speeds of the motors at will, this all goes out the window. However, if they speed them up, the sweeps cross the volume faster, and that means less time for the sensors to do their math. Not sure if they're anywhere close to the limits of the ASICs they're using. Also, higher speed means more noise and more lighthouse motor wear.”
It seems to me Valve must have thought of this? However, getting hold of anyone there is a crap-shoot, and you seem to know a lot on the subject!
I think your math is correct regarding scalability vs. report rate. I think even three stations would potentially create a scenario where you may occlude two and only get 20 Hz data from a single station. As it is now, 30 Hz from a single station isn't enough to maintain solid tracking, so dropping to 20 Hz would just make that worse. If, however, you have a setup where no more than one station is ever occluded, you would only drop to a 40 Hz rate, with two different viewpoints. That situation is much better than a two-station setup with one station occluded.
In order to really scale up, there needs to be some frequency-domain separation of the lighthouses. That way they could all sweep at the same time, and the receiver should be able to differentiate between the stations in the frequency domain. This may or may not be possible with firmware updates. I doubt it, because I would think this is the route they would have taken initially if the hardware were capable of it.
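For reference, the update-rate arithmetic above in a few lines, assuming the sweep schedule is time-multiplexed and delivers roughly 60 sweep sets per second in total (a figure inferred from the two-station numbers quoted above, not from a spec sheet):

```python
# Per-station update rate when the sweep schedule is shared round-robin.
# The 60/s total is an assumption inferred from the two-station case.

TOTAL_SWEEP_SETS_PER_SEC = 60

for n_stations in range(2, 5):
    per_station_hz = TOTAL_SWEEP_SETS_PER_SEC / n_stations
    worst_case_ms = 1000.0 / per_station_hz    # interval if only one station is visible
    print(f"{n_stations} stations: {per_station_hz:.0f} Hz each, "
          f"~{worst_case_ms:.0f} ms between updates when only one is visible")
```

That reproduces the 30/20/15 Hz figures quoted in the question, and shows why occlusion hurts more as stations are added unless the stations can somehow sweep concurrently.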
How does the controller of the Vive know which base station hit it, so that it can calculate the right relative position?
The system is identical to how VOR navigation for aircraft works.
At 0 degrees, an omnidirectional radio transmission occurs. Then, as a directional radio signal sweeps across your aircraft, the time difference between the two gives you the angle from the station.
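In numbers, that reference-pulse-plus-sweep trick looks something like the following; the rotation period here is an assumption (Lighthouse rotors are commonly described as spinning at 60 Hz, and a real VOR scans at a different rate), and the function is illustrative only.

```python
ROTATION_PERIOD_S = 1.0 / 60.0        # assumed: one sweep revolution per 1/60 s

def bearing_degrees(t_reference, t_sweep_hit):
    """Bearing from the transmitter, given when the omnidirectional reference
    pulse fired and when the rotating directional beam passed over you."""
    delay = t_sweep_hit - t_reference
    return (360.0 * delay / ROTATION_PERIOD_S) % 360.0

print(bearing_degrees(0.0, 0.00417))  # ~90 degrees around from the reference direction
```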
I've spent the last few weeks integrating both the Vive and the Rift into Aces High. The Rift's software prediction for lost frames is very impressive compared to OpenVR's.
The Vive controllers force a rethink of how to do all player interactions. A simple thing like turning the controller into a hand makes it easy to just press virtual buttons with your finger in GUIs, or to reach over and grab the virtual joystick or throttle in an aircraft.
HiTech
How does the Lighthouse tracking system work at a low level, in terms of the mathematics needed to process the inputs? What are the inputs and outputs?