Natural user interfaces are a hot topic these days: Kinect, Leap, HoloLens… Recently, Intel joined the other players with its RealSense 3D technology. For now, it is represented only by the F200 camera (part of the RealSense Developer Kit), but it should soon appear in notebooks and all-in-one devices.
The F200 consists of several components: a color camera, a depth camera with an IR light source, and a microphone array. It supports face tracking and recognition, hand and object tracking, and voice recognition.
Setting up environment
The F200 camera must be connected to the PC through a USB 3 port – the OS does not detect the camera when it is connected via USB 2. Other requirements include 64-bit Windows 8.1 and an Intel processor with SSE4.2 instruction set support. A 4th generation Core processor is recommended, but at least some 2nd and 3rd generation processors are able to support the F200 in combination with version 4 of the RealSense SDK.
The RealSense SDK is available at the Intel RealSense portal and supports development in C++, C#/Unity and Java. The SDK includes redistributable modules, documentation and samples.
In most cases the provided samples just work, but from time to time (at least for me) camera initialization fails. In this case, killing the Intel RealSense Depth Camera Manager Service may solve the problem.
Developing for F200
The SDK provides access to raw sensor data as well as several algorithms for predefined data processing, for example face tracking or gesture recognition.
A simple RealSense application includes the following steps:
- Initialize RealSense subsystem
- Configure and begin data acquisition
- Process acquired data
- Dispose SDK objects
The next paragraphs demonstrate how to use the SDK to detect two gestures – “thumb up” and “thumb down”.
Access RealSense subsystem
PXCMSession is the entry point that provides access to all of the SDK’s I/O and algorithm modules. For predefined usages, such as gesture recognition, another interface provides simplified access – PXCMSenseManager:
_senseManager = PXCMSenseManager.CreateInstance();
Configure and begin data acquisition
The F200 color and depth sensors can acquire images at different resolutions (for example 320x240, 640x480, 1920x1080, etc.), and at the same time different algorithms are designed to work with different image sizes. When multiple algorithms are required to work together, resolution negotiation can be performed manually or automatically, using the EnableXXX functions of PXCMSenseManager:
_senseManager.EnableHand();
_senseManager.EnableEmotion();
Enabled modules require configuration. The following example demonstrates configuration of the Hand module to detect the “thumb up” and “thumb down” gestures:
var handManager = _senseManager.QueryHand();
_handConfig = handManager.CreateActiveConfiguration();
_handConfig.EnableGesture("thumb_up");
_handConfig.EnableGesture("thumb_down");
_handConfig.EnableAllAlerts();
_handConfig.ApplyChanges();
The next step is to begin acquisition:
_senseManager.Init();
Process acquired data
Depending on the application’s architecture, data can be processed in a for-loop, using a callback function, or in a task/thread. The following example demonstrates the use of a Task:
private void ProcessInput(CancellationToken token)
{
    // Wait for available data; stop on cancellation or on an SDK error
    while (!token.IsCancellationRequested &&
           _senseManager.AcquireFrame(true) >= pxcmStatus.PXCM_STATUS_NO_ERROR)
    {
        try
        {
            var handQuery = _senseManager.QueryHand();
            if (handQuery != null)
            {
                var handData = handQuery.CreateOutput(); // Get processing results
                handData.Update();

                PXCMHandData.GestureData gestureData;
                if (handData.IsGestureFired("thumb_down", out gestureData))
                {
                    // Marshal the UI update to the UI thread
                    Dispatcher.Invoke(ThumbDown);
                }
                else if (handData.IsGestureFired("thumb_up", out gestureData))
                {
                    Dispatcher.Invoke(ThumbUp);
                }

                handData.Dispose();
            }
        }
        finally
        {
            // Always release the frame so the next one can be acquired
            _senseManager.ReleaseFrame();
        }
    }
}
As a result, the UI is updated when one of the enabled gestures is detected.
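The processing method takes a CancellationToken, so something has to start it as a task and cancel it later. Here is a minimal sketch of one way to do that, using only standard .NET types. The class ProcessingLoopDemo, the method RunFor and the sleeping ProcessInput stand-in are hypothetical names for illustration. They replace the real SDK loop so the sketch runs without a camera.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ProcessingLoopDemo
{
    // Hypothetical stand-in for ProcessInput: the real loop would call
    // AcquireFrame/ReleaseFrame instead of sleeping.
    static int ProcessInput(CancellationToken token)
    {
        int frames = 0;
        while (!token.IsCancellationRequested)
        {
            Thread.Sleep(10); // simulate waiting for the next frame
            frames++;
        }
        return frames;
    }

    // Runs the loop on a dedicated thread for the given time, then stops it.
    public static int RunFor(int milliseconds)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Task<int> worker = Task.Factory.StartNew<int>(
                () => ProcessInput(token), TaskCreationOptions.LongRunning);

            Thread.Sleep(milliseconds); // the application does its work here
            cts.Cancel();               // ask the loop to stop
            worker.Wait();              // let it finish before disposing SDK objects
            return worker.Result;
        }
    }

    static void Main()
    {
        Console.WriteLine("Loop stopped after {0} frames", RunFor(200));
    }
}
```

The token is only checked between frames, so cancellation takes effect after the current frame has been processed. In a WPF application, blocking the UI thread in Wait() while the loop calls Dispatcher.Invoke could deadlock, so it is probably safer to wait from another thread or to await the task.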
Dispose SDK objects
The C# SDK is a managed wrapper around native C++ code, so all SDK objects must be disposed using the Dispose() method:
_handConfig.Dispose();
_senseManager.Dispose();
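Since the SDK objects wrap native resources, it may be worth making sure they are released even when initialization or processing fails halfway. The sketch below shows one possible pattern: a try/finally block that disposes objects in reverse order of creation. NativeWrapper is a hypothetical stand-in that only records the order of Dispose() calls. It is not part of the RealSense SDK.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an SDK wrapper object; not part of the RealSense SDK.
sealed class NativeWrapper : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public NativeWrapper(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    // Records the call so the disposal order can be inspected.
    public void Dispose()
    {
        _log.Add(_name);
    }
}

static class DisposeOrderDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        NativeWrapper senseManager = null;
        NativeWrapper handConfig = null;
        try
        {
            senseManager = new NativeWrapper("senseManager", log);
            handConfig = new NativeWrapper("handConfig", log);
            // ... acquisition and processing would happen here ...
        }
        finally
        {
            // Release in reverse order of creation; the null checks
            // cover the case where initialization stopped halfway.
            if (handConfig != null) handConfig.Dispose();
            if (senseManager != null) senseManager.Dispose();
        }
        return log;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "handConfig, senseManager"
    }
}
```

The order matches the snippet above: the hand configuration goes first, then the sense manager that produced it.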
Links
Full source code for this simple application is available on GitHub, and the following posts will describe more complicated use cases.