From clicking to touching, and now pointing and grabbing, computer interaction is undergoing a major overhaul. A gesture-based operating environment, or Spatial Operating Environment (SOE), is now a reality and will transform the way we work and design, making the most of our hands. We can use a spatial interface to control applications and data across many displays, and anyone on a network can share content and applications interactively with any coworker anywhere in the world.

The Real World of Spatial Operating Environment

Len Calderone

Oblong Industries, an MIT Media Lab spin-off, has spent several years building a new computing platform, called g-speak, that forms the basis of a spatial operating environment. G-speak uses a mounted camera to track the user's gloves as they move through space. It is very precise and can handle quick motions and multiple users. Never touch a computer again.

The film, Minority Report, featured a gesture-driven computer that closely matches the real-world project that g-speak has become.

Courtesy DreamWorks

With g-speak, using gestures to input commands and manipulate information lets a user work faster than with the mouse and keyboard of contemporary computers. Gestural movements make for more effective routing, sorting, and selecting of data. Anything on-screen can be manipulated precisely with a wand or special gloves, such as those used in the movie. Every object in a g-speak environment has a real-world spatial identity and position, allowing display objects to be manipulated by literally pointing, touching, or grabbing.

Interpreting a gesture relies on key points represented in a 3D coordinate system. Based on the relative motion of those points, a movement can be detected with a high degree of accuracy, depending on the quality of the input and the algorithm's approach.
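To make the idea concrete, here is a minimal sketch of relative-motion gesture detection: given a short window of tracked 3D palm positions, it classifies a horizontal swipe by the net displacement of the key point. The function name, threshold, and dominance test are illustrative assumptions, not part of g-speak's actual recognizer.

```python
import numpy as np

def classify_swipe(frames, threshold=0.15):
    """Classify a horizontal swipe from a sequence of 3D palm positions.

    frames: (N, 3) array-like of palm-center coordinates in metres,
    sampled over a short time window. Returns 'left', 'right', or None.
    (Illustrative sketch only; threshold values are assumptions.)
    """
    pts = np.asarray(frames, dtype=float)
    dx, dy, dz = pts[-1] - pts[0]            # net motion over the window
    # Call it a swipe when horizontal motion exceeds the threshold and
    # clearly dominates the vertical and depth components.
    if abs(dx) > threshold and abs(dx) > 2 * max(abs(dy), abs(dz)):
        return 'right' if dx > 0 else 'left'
    return None
```

A real system would also weigh hand pose and velocity profile, but even this toy version shows why input quality matters: noisy key points shrink the margin between a deliberate swipe and incidental motion.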

Imagine pausing a movie clip of two race cars speeding around a NASCAR track and, using only gestures, extracting one of the cars and moving it onto another screen.

Gesture recognition is not new to the market. Toshiba introduced a laptop with gesture recognition used specifically to control multimedia playback: for example, you could raise your arm in front of the webcam to tell the DVD to stop playing. Toshiba is working on this technology for televisions but is not quite there yet; cost is the main obstacle.

SOE helps managers tackle crucial questions about large data sets. It also supports teamwork, letting multiple users mine data and work on real-world problems through a 3D interface. Other beneficiaries of SOE include financial services, the military, automotive, medical imaging, high-end retail, operations centers, and the movie industry. Gamers are enthusiastic about SOE as well, with first-person shooter games that map gestural movements into 3D and allow multiple players. Can you imagine this technology in a courtroom? G-speak would be useful in accident reconstruction and would let expert witnesses interact with the judge or jury from a remote location. No juror could sleep through such a presentation. With SOE, designing structures in CAD would become easy for engineers and architects.

The g-speak platform is designed around free-hand, three-space gestural input and is composed of gestural I/O, recombinant networking, and real-world pixels. Applications are controlled by hand poses, movement, and pointing. Finger and hand motion are tracked to 0.1 mm at 100 Hz, pointing is pixel-accurate, and two-handed and multi-user input is fully supported.
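Raw tracking samples at 100 Hz inevitably carry some sensor jitter, so a tracked cursor is typically steadied with a low-pass filter before it reaches the screen. The sketch below shows one common, simple approach (exponential smoothing); the function and its alpha value are illustrative assumptions, not g-speak internals.

```python
def smooth(samples, alpha=0.3):
    """Exponentially smooth a stream of noisy 3D tracking samples.

    samples: list of (x, y, z) positions at a fixed rate (e.g. 100 Hz).
    alpha trades responsiveness (high) for smoothness (low); the value
    here is a hypothetical choice, not taken from g-speak.
    """
    est = list(samples[0])                   # start at the first sample
    out = []
    for x, y, z in samples:
        # Blend each new reading with the running estimate.
        est = [alpha * new + (1 - alpha) * old
               for new, old in zip((x, y, z), est)]
        out.append(tuple(est))
    return out
```

At sub-millimetre precision the filter mostly guards against the occasional outlier frame rather than gross noise, which is why a small, cheap filter like this suffices.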

Gestural input is demonstrably more effective at complex routing, sorting, and choosing, and the platform makes it simple for application programmers to take advantage of it. Gestural I/O removes the imbalance between the high-definition graphical output of modern computers and the narrow input channel of the mouse and keyboard.

SOE applications process large data sets and support multi-person workflows through a set of core library components that let applications scale transparently and dynamically across groups of machines. This transparent scalability offers three main benefits: effective use of CPU power across a LAN; built-in support for applications that share work across the network; and the ability to extend an application's functionality with additional machines, screens, and people.

In a g-speak environment, every graphical and input object has a real-world spatial identity and location. Anything on the screen can be manipulated individually, such as the race car mentioned above, and "pointing" is literal. These spatial semantics give application programmers a ready-made solution to the problems of supporting multiple screens and multiple users: large screens can work alongside desktop monitors, tablets, and hand-held devices such as iPhones, with g-speak moving data to the appropriate displays simultaneously.
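"Literal pointing" means the system knows where each display sits in room coordinates and intersects the user's pointing ray with it. The sketch below shows the underlying geometry: a ray-plane intersection converted to a pixel coordinate. The function and its parameters are hypothetical, a minimal illustration of the technique rather than Oblong's implementation.

```python
import numpy as np

def point_to_pixel(origin, direction, screen_origin, screen_x, screen_y,
                   res_x, res_y):
    """Intersect a pointing ray with a flat screen; return the pixel hit.

    origin, direction: the pointing ray in room coordinates (metres).
    screen_origin: room position of the screen's top-left corner.
    screen_x, screen_y: edge vectors spanning the screen's width/height.
    res_x, res_y: resolution in pixels. Returns (px, py) or None.
    """
    origin = np.asarray(origin, float)
    direction = np.asarray(direction, float)
    screen_origin = np.asarray(screen_origin, float)
    screen_x = np.asarray(screen_x, float)
    screen_y = np.asarray(screen_y, float)

    normal = np.cross(screen_x, screen_y)
    denom = direction.dot(normal)
    if abs(denom) < 1e-9:                    # ray parallel to the screen
        return None
    t = (screen_origin - origin).dot(normal) / denom
    if t < 0:                                # screen is behind the user
        return None
    rel = origin + t * direction - screen_origin
    u = rel.dot(screen_x) / screen_x.dot(screen_x)
    v = rel.dot(screen_y) / screen_y.dot(screen_y)
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return None                          # hit lands off-screen
    return int(u * (res_x - 1)), int(v * (res_y - 1))
```

Because every display is registered in the same room coordinate system, the same calculation works whether the ray lands on a wall screen, a desktop monitor, or a tablet, which is what lets pointing cross device boundaries seamlessly.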

Oblong also has a product, Mezzanine, designed for collaborative spaces such as meeting rooms, where multiple users can work in the spatial operating environment with any colleague anywhere in the world. Mezzanine is a turnkey appliance that combines presentation design and delivery, application sharing, whiteboard capture, and video conferencing, all within a framework of multi-participant control.

Wands serve as the primary controls, enabling users to manipulate interface elements across several screens. The system can also connect to notebook computers, smartphones, tablets, and other devices on the network.

Each side of the three-sided control wand provides a distinct set of controls for functions such as moving content around and between screens or zooming in and out. A user can crop and manipulate data or zoom out for a full view. The wand feels natural and acts as an extension of the body, so that when gestures are performed their motion can be conveniently captured by software.

This technology is truly revolutionary. Gesture control opens an exciting range of possibilities for consumers, such as new ways to control TVs and interactive displays in shop windows and information kiosks. Think of the future of computing as multi-user, multi-screen, and multi-device. Seeing is believing: watch the videos below to see the wonders of g-speak and Mezzanine.

About Len

Len started in the audio visual industry in 1975 and has contributed articles to several publications. He also writes opinion editorials for a local newspaper. He is now retired.

This article contains statements of personal opinion and comments made in good faith in the interest of the public. You should confirm all statements with the manufacturer to verify the correctness of the statements.

