Kinect Tutorial – Hacking 101

Microsoft’s Kinect has been out for a few months now and has become a fairly popular accessory for the Xbox 360. Let’s face it, though: using the Kinect for what it was intended didn’t end up being the most exciting part of this new toy. What has become far more interesting is seeing the various hacks that make the device so much more than simply an input mechanism for games. Now it’s your turn to do something amazing, and this tutorial will get you started. Today I’m going to get your Kinect up and running and demonstrate how to get the camera and depth information into your very own C# application.

Example Kinect Output

Above is some example output that our app will produce. In this case it’s the corner of my office with some bookshelves and a guitar. The RGB data is on the left and the depth information is on the right. The darker things are, the closer they are to the camera.

1. Setup libfreenect

openkinect.org is going to be your best friend for this portion of the project. We’re going to depend on libfreenect for our drivers and for the library used to communicate with the Kinect. They’ve got pretty good instructions for all platforms, but since we’re doing C#, you’ll want to follow the Windows installation guide. I found the instructions fairly straightforward, and they worked without too much hassle. After you’ve followed all of them, return here to continue the tutorial. This portion of the project will probably take a while to complete, so be patient.

2. Setup the C# Application

Since our plan with this tutorial is just to display output, we can get away with a basic WPF application, which actually performs surprisingly well. If your app is going to do some really incredible things, you may want to consider something like DirectX or OpenGL.

New WPF Application

Bundled as part of the libfreenect source are a set of wrappers for various languages. We’re going to want the C# one (\libfreenect\wrappers\csharp). Go ahead and add the wrapper project to your new solution.

Solution Explorer

You should now be able to build the solution without any errors. Of course, since we haven’t written any code, nothing will happen when you run the app. Also, our application now depends on the freenect.dll file that was created as part of step 1. Depending on how you installed it, you may have to copy this file somewhere your app can load it (somewhere in the path, or the project’s output directory).
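If you’d rather not copy the DLL by hand, one option is a Visual Studio post-build event along these lines. The source path here is purely an assumption; point it at wherever your build of freenect.dll actually landed:

```
xcopy /Y "C:\libfreenect\build\lib\freenect.dll" "$(TargetDir)"
```

`$(TargetDir)` is a Visual Studio build macro that expands to the project’s output directory, so the copy happens automatically on every build.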

3. Writing some Code

Now we’re at the meat of this tutorial, writing some code to retrieve the Kinect’s output. The first thing we’re going to do is setup the Kinect and tell it to start recording data. I did all of this in my MainWindow’s constructor.

using System;
using System.Net;
using System.Runtime.InteropServices;
using System.Threading;
using System.Windows;
using System.Windows.Media;
using System.Windows.Media.Imaging;
using freenect;

namespace Kinect101
{
  /// <summary>
  /// Interaction logic for MainWindow.xaml
  /// </summary>
  public partial class MainWindow : Window
  {
    // The kinect object.
    Kinect _kinect;

    // Whether or not the Window has been closed.
    bool _closed;

    // Prevents handlers from being called while
    // a previous one is still working.
    bool _processingRGB;
    bool _processingDepth;

    public MainWindow()
    {
      InitializeComponent();

      // See if any Kinects are connected.
      int kinectCount = Kinect.DeviceCount;

      if (kinectCount > 0)
      {
        // Get the first connected Kinect – I guess you could have more
        // than one connected.
        _kinect = new Kinect(0);

        // Open a connection to the Kinect.
        _kinect.Open();

        // Setting these to IntPtr.Zero notifies the wrapper
        // library to manage the memory for us.  More advanced apps
        // will probably provide a pointer for their own buffers.
        _kinect.VideoCamera.DataBuffer = IntPtr.Zero;
        _kinect.DepthCamera.DataBuffer = IntPtr.Zero;

        // Hook the events that are raised when data has been received.
        _kinect.VideoCamera.DataReceived += VideoCamera_DataReceived;
        _kinect.DepthCamera.DataReceived += DepthCamera_DataReceived;

        // Start the cameras.
        _kinect.VideoCamera.Start();
        _kinect.DepthCamera.Start();

        // Create a thread to continually instruct the Kinect
        // to process pending events.
        ThreadPool.QueueUserWorkItem(
          delegate
          {
            while (!_closed)
            {
              _kinect.UpdateStatus();
              Kinect.ProcessEvents();

              Thread.Sleep(30);
            }
          });
      }
    }

As you read through the code, it should be fairly self-explanatory. Basically we’re just connecting to a Kinect and telling it to start recording video and depth information. In order for events to be processed and raised by the Kinect library, we have to periodically call Kinect.ProcessEvents. Since we can’t block our main thread doing that, I just created a simple worker thread using the ThreadPool.
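As a small variation on the ThreadPool approach, you could use a dedicated thread marked as a background thread, which won’t keep the process alive if the window closes before the loop notices _closed. This is only a sketch, reusing the _kinect and _closed fields from the constructor above:

```csharp
// A dedicated pump thread instead of a ThreadPool work item.
// IsBackground = true means this thread won't prevent process exit.
var pump = new Thread(() =>
{
    while (!_closed)
    {
        _kinect.UpdateStatus();   // Keep the device status current.
        Kinect.ProcessEvents();   // Let the library raise pending events.
        Thread.Sleep(30);         // Roughly 30 polls per second.
    }
});
pump.IsBackground = true;
pump.Start();
```

Either approach works for this tutorial; the background-thread flag is just extra insurance against the pump loop outliving the window.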

Now that we’ve got the Kinect set up, let’s take a look at the event handler, VideoCamera_DataReceived. This is where we’re going to receive the raw RGB data and convert it to something that can be displayed on the screen.

void VideoCamera_DataReceived(object sender, VideoCamera.DataReceivedEventArgs e)
{
  // Prevent re-entrancy so events don’t stack up.
  if (_processingRGB)
    return;

  _processingRGB = true;

  this.Dispatcher.Invoke(
    new Action(
      delegate()
      {
        // Convert the byte[] returned by the Kinect library
        // to a BitmapSource and set it as the source of our
        // RGB Image control.
        _colorImage.Source = BitmapSource.Create(
          e.Image.Width,
          e.Image.Height,
          96,
          96,
          PixelFormats.Rgb24,
          null,
          e.Image.Data,
          e.Image.Width * 3);
      }));

  _processingRGB = false;
}

Fortunately for us, this part of the project is a breeze. The Kinect library returns RGB data in a form that can be fed directly into a BitmapSource object. I added some very basic protection against this event handler being called before the previous one has completed. I noticed that as the app ran longer and longer, it got slower and slower, and this simple fix seemed to address that bug. All we have to do now is feed the raw data into BitmapSource.Create and give the result to our Image control, which was added to our MainWindow in the XAML. The most complicated part of the Create call is the last parameter, stride. This argument is the number of bytes in a single row of the image. Since our image contains 3 bytes per pixel, the number of bytes in a row will be the width multiplied by 3 (for the Kinect’s 640×480 RGB frame, that’s 640 × 3 = 1920 bytes).

Now on to depth. This one is slightly more complicated, but still not too bad.

void DepthCamera_DataReceived(object sender, DepthCamera.DataReceivedEventArgs e)
{
  if (_processingDepth)
    return;

  _processingDepth = true;

  // Create an array to hold translated image data.
  short[] image = new short[e.DepthMap.Width * e.DepthMap.Height];
  int idx = 0;

  for (int i = 0; i < e.DepthMap.Width * e.DepthMap.Height * 2; i += 2)
  {
    // Read a pixel from the buffer.
    short pixel = Marshal.ReadInt16(e.DepthMap.DataPointer, i);

    // Swap the bytes: the Kinect sends big endian,
    // but Windows expects little endian.
    pixel = IPAddress.HostToNetworkOrder(pixel);
    image[idx++] = pixel;
  }

  this.Dispatcher.Invoke(
    new Action(
      delegate()
      {
        // Create the image.
        _depthImage.Source = BitmapSource.Create(
          e.DepthMap.Width,
          e.DepthMap.Height,
          96,
          96, PixelFormats.Gray16, null, image, e.DepthMap.Width * 2);
      }));

  _processingDepth = false;
}

Depth data comes from the Kinect as 11 bits per pixel. The Kinect library packs that into a friendlier 16 bits per pixel before passing it up to the C# wrapper and then on to us. What makes this difficult is that the byte order is big endian, whereas our Windows box needs little endian. In order to fix this, we need to read each and every pixel from the buffer and swap its bytes. Fortunately, .NET has a helper function designed for networking that comes in handy for this task: IPAddress.HostToNetworkOrder. After we’ve got all the pixels out and converted, we simply do the same thing as before to create our image. Instead of Rgb24, this time we need to use Gray16, since each pixel is now represented by 2 bytes (16 bits).
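For the curious, here is roughly what IPAddress.HostToNetworkOrder boils down to on a little-endian machine: a plain two-byte swap. This is just an illustrative sketch, not something the tutorial code needs:

```csharp
static class DepthEndian
{
    // Swap the two bytes of a 16-bit value by hand. On a little-endian
    // host this is the same operation IPAddress.HostToNetworkOrder
    // performs, turning the Kinect's big-endian samples into values
    // that Gray16 can display.
    public static short SwapBytes(short value)
    {
        return (short)(((value & 0xFF) << 8) | ((value >> 8) & 0xFF));
    }
}
```

For example, SwapBytes(0x0102) returns 0x0201.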

And that’s it for retrieving Kinect data. If you combine this code with our XAML:

<Window x:Class="Kinect101.MainWindow"
       xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
       xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
       Closing="Window_Closing"
       Title="MainWindow" Height="350" Width="525">
  <Grid>
    <Grid.ColumnDefinitions>
      <ColumnDefinition Width="*" />
      <ColumnDefinition Width="*" />
    </Grid.ColumnDefinitions>
    <Image x:Name="_colorImage" />
    <Image x:Name="_depthImage" Grid.Column="1" />
  </Grid>
</Window>

and a little cleanup code:

private void Window_Closing(object sender, System.ComponentModel.CancelEventArgs e)
{
  _closed = true;

  // All of these seem to lock up the app.
  //_kinect.VideoCamera.Stop();
  //_kinect.DepthCamera.Stop();
  //_kinect.Close();
  //Kinect.Shutdown();
}

you should now have a working app that displays output very similar to the image below.

Example Kinect Output

All of the libraries and wrappers used for this tutorial are in constant flux. Please refer to the most up-to-date documentation before starting to make sure nothing has changed. The libraries are also pretty buggy – so be prepared for things to not work correctly right out of the box.

Hopefully this tutorial will help save you some time bootstrapping your awesome Kinect hack. As we get more time to work with the libraries, we’ll be creating some more compelling demos. If you happen to make something neat, please drop us a line so we can check it out. If you have questions or comments, feel free to leave them below.
