Spatial mapping (Part 2)

After a few weeks of travel, I am back on track and back to working on the spatial mapping. Last time, I left off with the ability to view the debug information, but things were visibly wrong. We were able to see the walls and part of the ceiling, but there was no floor or any part of the couches. It's time to figure out why and do some serious debugging!

Before we jump to the debugger, it's good to come up with a few theories as to why we are experiencing these issues. This way we have a method to our process instead of blindly running through the code until we stumble on the problem. This becomes more difficult as the project gets bigger, but it is still a good mental exercise. If we look at the previous rendering results, things were starting to look good, but a lot of information was still missing. While developing the initial code base I primarily focused on getting the data out to the screen as fast as possible, which is always guaranteed to produce a number of bugs. A project this large really needs short cycles so that I keep making constant progress, but it also needs a lot of testing. Programmers in general don't do enough testing, and in this instance it's very easy to see that there's a problem and that more testing is indeed necessary.


Previous post mesh data


The first question that comes to mind: does the data conversion work correctly? We're able to see the vertex points, but they only overlap the walls and the ceiling. I wasn't sure what the data was doing, or whether it was being skewed, flattened or corrupted. Let's compare the vertex position values from the spatial mapping and the debug points. Time to pull out the graphics debugger and analyze the input buffers.


Visual studio graphics debugging


Since we are dealing with vertical data issues, Y position values are a good start. With a frame snapshot, I extracted the raw values from the couch arm which we can use as a base for comparison. Now we can go back to the debug meshes to find the same area and compare.  Is it possible that they are being flattened and combined into one plane? I don't know and it's too early to tell.


Buffer data example


The raw values put the position of the couch arm in the area of (.00415, .0149, .0112). The data extracted from the buffer as normalized SHORT4 was roughly (138, 504, 392, 32767). W holds the maximum signed 2-byte value, so it carries no position information and is simply not used. If we denormalize the data by dividing by 32767, we get float values of approximately (.0042, .0154, .0120), which is really close. So we know that our conversion is good, and that theory is wrong.
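The arithmetic above can be checked with a few lines of C++. This is only a sketch: `Short4`, `Float3`, `Denormalize` and `ToFloat3` are hypothetical stand-ins written for this example, not the types or helpers used in the project.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the packed vertex layout: four signed 16-bit values.
struct Short4 { std::int16_t x, y, z, w; };

struct Float3 { float x, y, z; };

// Map a signed-normalized 16-bit component to a float: 32767 becomes 1.0.
inline float Denormalize(std::int16_t v)
{
    return static_cast<float>(v) / 32767.0f;
}

// Convert one packed vertex to a float position. W is ignored.
inline Float3 ToFloat3(const Short4& s)
{
    return { Denormalize(s.x), Denormalize(s.y), Denormalize(s.z) };
}
```

Feeding it `Short4{138, 504, 392, 32767}` gives a position close to the raw values read from the graphics debugger.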

The spatial mesh is provided by the SpatialObserver, which supplies multiple meshes that together represent a complete room. The meshes are streamed in one by one, asynchronously, as requested. Next theory: are certain meshes from the spatial observer failing to complete processing, or never getting processed at all? To verify this, I set up an output log each time update is called on the Spatial Observer. This should help me see the function execution and get the counts of meshes as each process completes. Sometimes it's hard to track multiple threads, so an output log is a good way to see what is happening in a real-time application. Here are the results:


debug output


Looks like the total count of spatial meshes is equal to the total count of game objects. For each mesh created by the spatial observer, a game object is generated with a matching id. Well, that looks like it's working, but we still don't know what is happening to the actual data. Time to go back to the visualization tools. Maybe the graphics analyzer can help us see a potential problem and possibly develop theories from any new clues.
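Matching counts are a good sign, but comparing the ids directly is a slightly stronger check. Here is a minimal sketch of that idea, assuming the ids have been collected into two sets. `FindUnmatchedMeshes` is a hypothetical helper, and the ids are plain strings purely for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Return every mesh id that has no game object with the same id.
// An empty result means each observed mesh produced a game object.
std::vector<std::string> FindUnmatchedMeshes(const std::set<std::string>& meshIds,
                                             const std::set<std::string>& gameObjectIds)
{
    std::vector<std::string> missing;
    for (const auto& id : meshIds)
    {
        if (gameObjectIds.count(id) == 0)
        {
            missing.push_back(id);
        }
    }
    return missing;
}
```

Logging the returned ids would point straight at any mesh that never finished processing.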


Raw data

Comparing meshes between the spatial observer and the debug data, something is odd: the debug meshes look a little anemic. Considering how much data is processed, the debug mesh just seems empty. Maybe not all of the vertices are being processed. Going back to the mesh processing, I started looking directly at the vertex conversion code. It took mere seconds to see the problem. The vertex counts didn't match at all. Finally, a "doh" moment here! We're not processing all the vertices!

So what happened? Well, when I first wrote the buffer conversion, I started with a (SHORT*) buffer cast so that I could check individual values. To simplify the loop I changed the cast to (SHORT4*) without removing the stride division. As a result, only 1/8 of the actual vertices were being processed! So what does our debug data look like now?
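To make the 1/8 figure concrete, here is a small sketch of the two loop bounds. It assumes one packed vertex of four 16-bit values (8 bytes) per entry. `Short4`, `VertexCount` and `BuggyLoopBound` are hypothetical names for this example, not the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the packed vertex: 4 x 2 bytes = 8 bytes.
struct Short4 { std::int16_t x, y, z, w; };

// Number of vertices in a buffer of byteLength bytes, one Short4 per vertex.
constexpr std::size_t VertexCount(std::size_t byteLength)
{
    return byteLength / sizeof(Short4);
}

// The loop bound with the leftover stride division applied to a value that
// is already a vertex count, so the loop visits only 1/8 of the vertices.
constexpr std::size_t BuggyLoopBound(std::size_t vertexCount)
{
    return vertexCount / sizeof(Short4);
}
```

For an 8,000-byte buffer, `VertexCount` gives 1,000 vertices while `BuggyLoopBound(1000)` gives 125, which is why the debug mesh looked so sparse.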


Look a couch!


We have a couch! Each dot is a debug point that we will use for our terrain.  We have a floor and sides of our couch! Nice! I added keyboard toggles so that I can turn the solid mesh on and off. Here is some more spatial mapping. 


Dots Dots Dots Dots And More Dots Dots Overlayed


The dots alone can be hard to visualize without overlaying the spatial mesh, but we can get an idea of how the room is laid out. This will help guide us during terrain generation and help make sure it aligns properly to the room. 



So to recap, we have spatial mapping, we have debug data and everything lines up. Next comes the hard part: building the actual terrain. I feel pretty confident that the debug data will help us stay on track and aligned. Now we need to examine the data and start stripping things out, like the ceiling, the walls and any surfaces that are too high. In the end this is a game and we need to look at terrain, not the ceiling. We could use the ceiling for placing clouds and other sky stuff, but that may come at a later time.

Stay tuned as we continue down this road and hopefully make something of a rough terrain in the next post. I'm thinking we might implement hardware tessellation of a plane as our next step.
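As a first idea for stripping out surfaces that are too high, a simple height cutoff on the debug points might be enough to start with. This is only a sketch: `StripHighPoints` and the `maxHeight` parameter are hypothetical, and it assumes Y is the vertical axis, as in the debugging above.

```cpp
#include <cassert>
#include <vector>

struct Float3 { float x, y, z; };

// Keep only the points whose height (Y) is at or below maxHeight.
// maxHeight is a hypothetical tuning value, in the same units as the points.
std::vector<Float3> StripHighPoints(const std::vector<Float3>& points, float maxHeight)
{
    std::vector<Float3> kept;
    for (const auto& p : points)
    {
        if (p.y <= maxHeight)
        {
            kept.push_back(p);
        }
    }
    return kept;
}
```

A cutoff like this removes the ceiling but not the lower part of the walls, so those would still need a separate test.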

I got you mapped (spatially mapped) Part 1:

Ahh, strategy games. All great RTS games need a terrain, and we need a great terrain to get started. We need hills, mountains, valleys, water and more. It's no secret that I'm not an artist, but I've pushed a few pixels around before and I have an idea where to get started. I decided to try out the next gen of gaming, or at least a more up close and personal version. Let's try to make it for the HoloLens first and port to other VR systems later. While the device is expensive, the HoloLens has some pretty cool tech. The built-in spatial mapping, especially, will be useful. I think this is a great starting point for the game and this adventure.

So, how do we make the living room into a battlefield? The HoloLens's built-in HPU is designed to process sensor data, and the result is provided to the calling application as vertex, normal and index buffers. Data can be polled or collected with event-based methods. Cool! After we ask for the buffers we have to do some reverse engineering to decompose the vertex buffers and the spatial coordinate matrices, and rebuild them into a format we can use. How we use it is totally up to us. The API is DirectX native, which makes things a whole lot simpler.

What does the buffer look like? Well, visually like this:

 Mapped Couch    Couch other side

It's rough. A lot of holes and missing information. Unfortunately that coffee table is not a hover board. As you can see, there is a lot of work to be done.

The Plan:

  1. Strip the buffer down to the basic model space vertex positions
  2. Use the normal buffer to help determine shape directions
  3. Produce a visual output that represents each of the points in the room
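For step 2, one possible reading of "shape directions" is to sort surfaces by which way their normals point. Here is a minimal sketch of that idea, assuming unit-length normals and Y as the up axis. `Facing`, `ClassifyNormal` and the threshold value are hypothetical and not taken from the project.

```cpp
#include <cassert>

struct Float3 { float x, y, z; };

enum class Facing { Up, Down, Sideways };

// Classify a unit normal by its vertical component.
// Up-facing surfaces are floor-like (floor, table tops, couch seats),
// down-facing ones are ceiling-like, and the rest are wall-like.
// The threshold is a hypothetical tuning value.
Facing ClassifyNormal(const Float3& n, float threshold = 0.7f)
{
    if (n.y >= threshold)
    {
        return Facing::Up;
    }
    if (n.y <= -threshold)
    {
        return Facing::Down;
    }
    return Facing::Sideways;
}
```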

We need to be able to test whether the data extraction is working and verify that it's accurate. This will also provide a great way to test our mesh alignments as we progress. I've heard this many times before, but any amount of time spent on visualizing data is never a waste!

The spatial observer produces an IMapView of all the meshes that were created, via the SpatialSurfaceInfo class. Spatial surface info stores the mesh id along with the necessary buffers. Unfortunately the buffers are already set up to be processed by the GPU, which means we have to extract the data back into a vector format. Each mesh needs to be translated and scaled to make it fit the original room size. We're not interested in the scaling component at the moment, but we do need the translation matrix. Without the translation matrix, each mesh will be situated in a different world space, causing things like the couch to float outside of the room.

So how do we extract data from the buffer? The buffer needs to be cast back to a SHORT4* array. The buffer data format is R16G16B16A16IntNormalized, meaning each component is a 2-byte signed integer that is normalized so that 32767 maps to 1.0.

Traversing the short4 array to create the vertex positions in Vector3 format:

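// rawVertexData is the vertex buffer cast to SHORT4* (x, y, z, w as 16-bit values).
// Note: the "/ stride" in the loop bound is the bug that Part 2 tracks down.
for (int index = 0; index < vertexCount / stride; ++index)
{
      auto ss = &rawVertexData[index];
      DirectX::XMFLOAT3 byteVec = DirectX::XMFLOAT3(
            ss->x / 32767.0f, ss->y / 32767.0f, ss->z / 32767.0f);
      vertexPositions.push_back(byteVec);   // vertexPositions: assumed name for the output vector
}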


As we process the data, we end up with a vector full of vertex position data. This data still needs to be translated by the mesh coordinate system, but that is done in our update loop. Because the camera space can and does change as the lens moves, we have to reapply the world translations on each update, so a one-time translation won't work in this case. To visualize the data, I created a diamond (a very simplified sphere) and placed one at each vertex position. The shader helps break up the mass of diamonds by applying a color scale based on the Y value, which makes the numerous dots easier to tell apart.
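The per-update translation can be sketched without any DirectX types. This is an illustration only: `Matrix4`, `TransformPoint` and `ToWorld` are hypothetical names, and the matrix uses the row-vector convention with the translation in the fourth row.

```cpp
#include <cassert>
#include <vector>

struct Float3 { float x, y, z; };

// Row-major 4x4 matrix, row-vector convention: translation lives in row 3.
struct Matrix4 { float m[4][4]; };

// Transform a point (w = 1) by mat: result = [x y z 1] * mat.
Float3 TransformPoint(const Float3& p, const Matrix4& mat)
{
    return {
        p.x * mat.m[0][0] + p.y * mat.m[1][0] + p.z * mat.m[2][0] + mat.m[3][0],
        p.x * mat.m[0][1] + p.y * mat.m[1][1] + p.z * mat.m[2][1] + mat.m[3][1],
        p.x * mat.m[0][2] + p.y * mat.m[1][2] + p.z * mat.m[2][2] + mat.m[3][2]
    };
}

// Called on each update: the stored mesh-space positions stay untouched and
// the world positions are rebuilt from the latest mesh-to-world matrix.
std::vector<Float3> ToWorld(const std::vector<Float3>& meshSpace, const Matrix4& meshToWorld)
{
    std::vector<Float3> world;
    world.reserve(meshSpace.size());
    for (const auto& p : meshSpace)
    {
        world.push_back(TransformPoint(p, meshToWorld));
    }
    return world;
}
```

Keeping the mesh-space positions unchanged means each update starts from clean data rather than stacking one translation on top of the last.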


First spatial mapping   Second spatial mapping

Hey, we have walls and a ceiling! But where is the couch? Where is the floor? Good question. I see other artifacts as well, such as meshes outside of the room. Time to debug, but we'll save that for Part 2!

So far, a great start. We extracted the data and visualized the points. Hopefully soon we'll have correct alignment and we can begin creating a custom mesh based on the mapped-out room. Stay tuned for the next section. Also, notice something? Tons of dots (7,000 to be exact) and I'm not complaining about the performance? Early optimization is bad, but in this instance, instancing is necessary. The HoloLens just can't handle too many draw calls. Instancing helps keep performance up with one draw call and an instance buffer. I'll cover that a bit more later on, along with stereo rendering and a few other shader topics.