How To Start Building Spatially Aware Apps With Google’s Project Tango

How To Start Building Spatially Aware Apps With Google’s Project Tango

“Tango can enable a whole new range of applications that simply weren’t possible before.”

On a fundamental basis, a smartphone’s camera is not really a camera at all. Sure, it can take pictures and videos. But on a more practical basis, the camera is really a cluster of sensors, lens and hardware that are capable of so much more.

Google is putting those sensors to the test with what it calls Project Tango, a platform that uses computer vision and machine learning to comprehend the world around it.

“Fundamentally, Tango is about enabling mobile devices to understand the world around them,” said Wic Meeussen, Google’s engineer in charge of Area Learning for Project Tango at Google I/O 2016. “Tango can enable a whole new range of applications that simply weren’t possible before.”

The use cases for Project Tango have the potential to be immense. Developers will be able to use it to create significant augmented reality experiences by mapping objects in the real world. Virtual reality multi-player gamers will benefit by knowing where other people and objects are in the physical area near them. Project Tango will ultimately be the tool that solves the problem of indoor object mapping for the era of augmented reality.

The Essential Guide Guide to Android 6.0 MarshmallowWhat you need to know about all the features of Android 6.0GET IT NOW

As far as smartphone cameras have come, what we have in 2016 are going to look like ignorant and unintelligent relics compared to what is to come. With that in mind, let’s take a look at everything developers will need to know to get started with Project Tango.

The History Of Project Tango

Google’s Project Tango is an excellent example of the type of innovation happening at the conjunction of hardware, software, sensors and the cloud happening at the smartphone level. Tango, introduced in June 2014, is a platform that brings computer vision, image processing and advanced vision sensors to smartphones. Tango can perform motion tracking, depth perception and area learning to help understand the world around it.

Project Tango is a truly open project with dozens of companies contributing to its development including the NASA Jet Propulsion Laboratory, Lenovo, Autodesk, Qualcomm, Nvidia and the Open Source Robotics Foundation. A developer kit, a 7-inch tablet called Yellowstone, has been available since 2014 and currently costs $512 through Google’s Project Tango website.

The Project Tango developer kit tablet includes a four mega-pixel, two millimeter lens back RGB-IR (infrared) pixel sensor with a large cluster of sensors, many of which you’d find in a normal smartphone (accelerometer, barometer, GPS, gyroscope) but also additional sensors for motion tracking and 3D depth sensing.

Another Project Tango device called Peanut was a smartphone tested at the MARS lab at the University of Minnesota. Two Peanut smartphones were sent to the International Space Station to be tested by NASA in SPHERES robots. Google stopped supporting the Peanut in September 2014.

Three Core Concepts For Project Tango

Project Tango is organized around three core concepts: motion tracking, area learning and depth perception.

Motion Tracking: Allows a device to understand its motion at it moves through an area.

Before jumping into how Project Tango implements Motion Tracking, it is important to understand a simple concept of physics. In school, students are taught to make plots on a two-degree (X,Y) Cartesian coordinate plane in horizontal or vertical points. A third degree (Z) to represents depth (imagine turning a square into a cube).

A free floating object is not bound to just three points but rather on a notion called degrees of freedomthat govern attitude and trajectory. Imagine a satellite in space. Its attitude is governed by its movement along the X,Y,Z axes. Its movement along those axes (pitching, yawing, rolling) are three additional points to monitor. Thus the movement of a rigid body (a satellite … or a smartphone) in a three-dimensional space can have up to six degrees of freedom.

The combination of position and orientation of an object is called its “pose.”

Pose data acts as the starting point in all in Project Tango Motion Tracking sessions and plots the movement of the device through space along its course.

Google’s developer page states:

The APIs support two ways to get pose data: callbacks to get the most recent pose updates, and functions to get a pose estimate at a specific time. The data is returned with two main parts: a vector in meters for translation and a quaternion for rotation. Poses are specified within specific reference frame pairs, and you must specify a target frame with respect to a base frame of reference when asking for a pose.

Motion Tracking in Project Tango only provides relative location and distance with its sensors, not GPS coordinates (use the Google Location API for that). Motion Tracking does not give a device the ability to know or learn an area nor does it “remember” previous sessions. Over long periods of time and distance, small errors will creep up and cause errors in measurement, leading to “drift” (where the perceived and absolute positions of the device are no longer aligned).

Area Learning: If Motion Tracking only measures movement from point to point along a trajectory, area learning corrects for drift by learning and remembering where it has been.

Area Learning is a function of computer vision. The camera is walked through an area and can remember it by saving an Area Description File (ADF). Area Learning will remember the turns and curves, structures and organization of where it has been. Combined with motion tracking, you can see how a trajectory can be plotted and then its features memorized to know exactly where it is. These capabilities are important when it comes to the notion of creating interactive augmented reality environments or games in which a person moves through a physical space (like a retail store, for instance).

“This is what really gives your Tango device a memory,” Meeussen said.

To create an ADF, developers can either use the Project Tango Explorer or the Project Tango APIs to handle learning, saving and loading in the app. To create consistent experiences within an app, developers must perform localization, which consists of loading a previously saved ADF and move the device into the area that was saved with that ADF.

“This is what really gives your Tango device a memory,” said Meeussen. “When you bring Tango device into a new space it will use its camera to create a mathematical description in its memory.”

Google notes that Area Learning works best in areas with visually diverse aspects. A blank room with white walls will be hard for a computer to distinguish. At the same time, areas should look nearly the same when they learned (and saved to the ADF) as when they are booted up again. To handle different angles or changes in areas, developers should create multiple ADFs from different angles and orientations from a particular starting point.

Depth Perception: The ability to see the distance of objects in the real world is something that humans take for granted. I know that my laptop is about two feet in front of me and the wall another five feet from that. Computers have a harder time in sensing depth.

To sense depth, Project Tango allows manufacturers to choose among three depth perception technologies: Structured Light, Time of Flight (both use infrared) and Stereo (which does not use infrared).

Depth Perception in Project Tango is designed to work best indoors between 0.5 and 4 meters. Areas with high degrees of IR light (sunlight or incandescent bulbs) cannot be scanned well.

The Project Tango API uses point clouds to scan depth data of all objects within its purview to give X,Y,Z coordinates of specific objects.

Project Tango Development Tools

Project Tango functions on Android device platforms that run in C, Java and Unity. Google has APIs and code samples available for each programming language (see here for Java, Unity and C).

Project Tango has four primary tools for building apps:

Explorer: The Project Tango Explorer on a device allows developers to switch between Area Learning capabilities, Point Cloud (for depth perception) and diagnostics.
Permissions: Project Tango—per Android’s granular permission scheme—must ask for the use of a device’s camera and other functions. The Permissions Manager allows developers and users to set up and revoke permissions for apps in Project Tango.
Constructor: The Constructor allows developers access to 3D models of environments and then they save and export the 3D mesh. The Constructor is an app specific to Project Tango devices.
ADF Inspector: Once a developer creates an ADF, the Inspector allows them to “see inside” the file to be able to improve it. The ADF Inspector is an app specific to Project Tango devices.

Google and Lenovo will release a consumer-grade Project Tango device later in 2016.