A group of four people walk into a room and the leader says, “Watson, bring me the last working session.” The computer recognizes and greets the group, then retrieves the materials used in the last meeting and displays them on three large screens. Settling down to work, the leader approaches one screen, and swipes his hands apart to zoom into the information on display. The participants interact with the room through computers that can understand their speech, and sensors that detect their position, record their roles and observe their attention. When the topic of discussion shifts from one screen to another, but one participant remains focused on the previous point, the computer asks a question: “What are you thinking?”
It's a simple scene that illustrates a milestone in the development of environments allowing humans to interact naturally with machines. In a collaboration between Rensselaer Polytechnic Institute and IBM Research, the Cognitive and Immersive Systems Laboratory (CISL) has reached that milestone, and is poised to advance cognitive and immersive environments for collaborative problem-solving in situations like board rooms, classrooms, diagnosis rooms, and design studios.
“This new prototype is a launching point – a functioning space where humans can begin to interact naturally with computers,” said Hui Su, director of CISL. “At its core is a multi-agent architecture for a cognitive environment created by IBM Watson Research Center to link human experience with technology. In CISL, we created this architecture to integrate technologies that register different kinds of human behavior captured by sensors as individual events and forward them to the cognitive agents behind the scene for interpretation. Enhancing this architecture will allow us to link new sensing technologies and computer vision technologies into the system, and to enable collaborative decision making tools on top of these technologies.”
The current capabilities of the space are rudimentary in comparison with human understanding. The room can understand and register speech, three specific gestures, the position of occupants of the room, their roles, and the spatial orientation of those occupants, triggering the correct cognitive computing agents to take action and bring data and information relevant to the discussion into the room in real time. But the promise is clear.
“From this point, we can build the capability for better interpreting what happens in the room,” said Su. “Our architecture provides a framework for incorporating new technologies such as more cognitive computing capabilities that interpret human behavior. That allows us to really dig in to what people mean during a discussion, triggering the cognitive computing agents to bring valuable analysis and insights to the discussion. In terms of interpreting behavior, we are at the very beginning, but from here the terrain gets very interesting.”
CISL is developing its prototype “situations room” using Studio 2 in the Curtis R. Priem Experimental and Performing Arts Center (EMPAC) at Rensselaer. Studio 2 was designed as an “exceptionally versatile space for the integration of digital technology with human expression and perception,” and easily incorporates the technology CISL is creating. The prototype relies on several cognitive technologies developed by Rensselaer and IBM, as well as sensors – such as microphones, cameras, and Kinect motion sensors – linked by the CISL architecture.
Within Studio 2, sensors detect human activity, such as a change in the position of an occupant of the room, speech, gesture, and head movement. Absent the CISL architecture, each of the cognitive technologies acts in isolation, responding to a specific activity detected by a single type of sensor and provided to the computer for interpretation. A sensor provides an input, and the computer provides an output. The interaction between human and machine is based on a single action with a finite duration.
The CISL architecture makes it possible for the computer to register and track activities from multiple sensors for interpretation by multiple cognitive technologies through a message queue. The sensors and cognitive technologies work in concert, to register and interpret “multimodal” human behavior through multiple activities over an extended duration. When a person enters the environment, sensors capture different kinds of activity, and – through the CISL architecture – the computer records each activity as a specific event, and forwards it to cognitive technologies for interpretation and response.
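The event flow described above can be pictured with a minimal Python sketch. It is not CISL's implementation: the `Event` and `EventBus` names, the modality labels, and the two toy agents are all hypothetical, and an in-process `queue.Queue` stands in for whatever message queue the lab actually uses. The sketch only shows the general pattern of sensors publishing activities as events and interested agents receiving them in arrival order.

```python
import queue
import time
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """One registered activity (hypothetical structure, for illustration only)."""
    modality: str   # e.g. "speech", "gesture", "position"
    payload: dict
    timestamp: float = field(default_factory=time.time)


class EventBus:
    """Toy in-process stand-in for the message queue that links sensors to agents."""

    def __init__(self):
        self._queue = queue.Queue()
        self._subscribers = defaultdict(list)

    def subscribe(self, modality, agent):
        # An agent is any callable that accepts an Event and returns a response.
        self._subscribers[modality].append(agent)

    def publish(self, event):
        # Sensors call this to record an activity as an event.
        self._queue.put(event)

    def dispatch_pending(self):
        # Forward queued events, in arrival order, to every subscribed agent.
        responses = []
        while not self._queue.empty():
            event = self._queue.get()
            for agent in self._subscribers.get(event.modality, []):
                responses.append(agent(event))
        return responses


# Hypothetical agents standing in for the cognitive technologies.
def speech_agent(event):
    return f"speech agent heard: {event.payload['text']}"


def gesture_agent(event):
    return f"gesture agent saw: {event.payload['name']}"


bus = EventBus()
bus.subscribe("speech", speech_agent)
bus.subscribe("gesture", gesture_agent)

bus.publish(Event("speech", {"text": "bring me the last working session"}))
bus.publish(Event("gesture", {"name": "zoom_in"}))

responses = bus.dispatch_pending()
for line in responses:
    print(line)
```

Because sensors and agents share only the queue, a new sensor type or a new interpreting agent can be added without changing the others, which is the property Su describes when he talks about linking new sensing technologies into the system.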
“Humans don’t stop to distinguish between the modalities they use to communicate. You point to something on the screen, move your hands and you talk about it, and I understand which parts are significant and interpret them,” Su said. “The first step to bridging that barrier is to make it possible for the machine to absorb that behavior in the correct order and understand which part is significant. They have to absorb and interpret multiple modalities simultaneously.”
The new CISL prototype draws on several technologies from the IBM Bluemix cloud platform that interpret text – first translating speech to text, then using natural language processing through Watson to interpret the text – and trigger the appropriate cognitive computing agents to take the correct actions. Cognitive technologies developed at Rensselaer can interpret three gestures (hands swiping apart or together to zoom in or out of a window on screen, or swiping in one direction to close a window), track and interpret the position of occupants in the room, and track and interpret the orientations of those occupants. The system also tracks and registers the information displayed on the screens installed in the space, so that machines can interpret it and support long-term human activities such as a mergers and acquisitions discussion.
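As a rough illustration of the last step in that chain, the Python sketch below maps an already-interpreted input to an action. Everything in it is an assumption made for illustration: the gesture labels, the action names, and the keyword check are hypothetical, and the keyword check is only a placeholder for the speech-to-text and natural language processing services, which are not called here. The pairing of a swipe-apart gesture with zooming in follows the opening scene; the pairing of a swipe-together gesture with zooming out is inferred.

```python
from typing import Optional

# Hypothetical labels for the three recognized gestures and the actions they trigger.
GESTURE_ACTIONS = {
    "swipe_apart": "zoom_in",
    "swipe_together": "zoom_out",        # assumed counterpart of swipe_apart
    "swipe_one_direction": "close_window",
}


def interpret_gesture(gesture: str) -> Optional[str]:
    """Return the action for a recognized gesture, or None if it is unknown."""
    return GESTURE_ACTIONS.get(gesture)


def interpret_transcript(transcript: str) -> Optional[str]:
    """Placeholder for language understanding: a simple keyword match on text
    that a speech-to-text service is assumed to have produced already."""
    text = transcript.lower()
    if "last working session" in text:
        return "retrieve_last_session"
    return None


print(interpret_gesture("swipe_apart"))
print(interpret_transcript("Watson, bring me the last working session."))
```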
The interaction is fluid and continuous, and the future is within grasp.
“This work is important because now we can start to do more interpretation,” Su said. “Now we can add modalities – more than just basic movement and speech, and richer understanding and interpretation. We can begin to talk about the subtleties of human behavior like bias and emotion. With this step, we have opened up a broad horizon. This helps us build a symbiotic relationship between humans and machines.”
The prototype drew upon numerous experts from Rensselaer and IBM Research, including: researchers from the lab of Rensselaer professor of electrical, computer, and systems engineering Qiang Ji; researchers from the lab of Rensselaer professor of cognitive science Selmer Bringsjord; researchers from the lab of Rensselaer professor of electrical, computer, and systems engineering Rich Radke; Gordon Clement from CISL; and IBM Researchers Jeff Kephart, Yunfeng Zhang, and Yedendra Shrinivasan.
CISL at Rensselaer is enabled by the vision of The New Polytechnic, an emerging paradigm for higher education which recognizes that global challenges and opportunities are so great they cannot be adequately addressed by even the most talented person working alone. Rensselaer serves as a crossroads for collaboration — working with partners across disciplines, sectors, and geographic regions — to address complex global challenges, using the most advanced tools and technologies, many of which are developed at Rensselaer. Research at Rensselaer addresses some of the world’s most pressing technological challenges — from energy security and sustainable development to biotechnology and human health. The New Polytechnic is transformative in the global impact of research, in its innovative pedagogy, and in the lives of students at Rensselaer.
About Rensselaer Polytechnic Institute
Rensselaer Polytechnic Institute, founded in 1824, is America’s first technological research university. For nearly 200 years, Rensselaer has been defining the scientific and technological advances of our world. Rensselaer faculty and alumni represent 84 members of the National Academy of Engineering, 17 members of the National Academy of Sciences, 25 members of the American Academy of Arts and Sciences, 8 members of the Institute of Medicine, 7 members of the National Academy of Inventors, and 4 members of the National Inventors Hall of Fame, as well as a Nobel Prize winner in Physics. With 7,000 students and nearly 100,000 living alumni, Rensselaer is addressing the global challenges facing the 21st century, to change lives, to advance society, and to change the world. To learn more, go to www.rpi.edu.