Yang Cai, Ph.D., Senior Systems Scientist at CyLab, is the founder of the Ambient Intelligence Lab at Carnegie Mellon University. Dr. Cai's research interests include image understanding and ambient intelligence. He was a NASA Faculty Fellow in 2002 and 2003 and has organized international workshops on ambient intelligence and digital human modeling. His publications include LNCS/LNAI Volume 3345, "Ambient Intelligence for Scientific Discovery," published by Springer-Verlag in February 2005, and LNCS/LNAI Volume 3864, "Ambient Intelligence for Everyday Life," published by Springer-Verlag in August 2006. Dr. Cai's Human Algorithms is to be published by Springer in the Lecture Notes in Computer Science (LNCS) series in 2008.
posted by Richard Power
CyLab Chronicles: Tell us about your research into Visual Thinking Agents. What aspects of your research would you like to highlight?
CAI: Visual Thinking Agents (VTAs) are embeddable software tools for visual intelligence. The broadband Internet enables users to access more and more images and videos, and there is increasing concern about the security and privacy of the visual content coming in and going out of their phones, computers, and network servers. We have so much visual information but not enough eyes. Image and video collections grow at a rate that exceeds the capacities of human attention and networks combined. In real-time video network systems, over a terabyte per hour is transmitted for only a small number of platforms and sensors. VTAs aim to reduce this overflow of visual information for both humans and networks.
CyLab Chronicles: What technology are you working on?
CAI: The Instinctive Computing Lab has developed several proof-of-concept prototypes of Visual Thinking Agents, covering Pattern Detection, Visual Privacy, and Visual Digest Networks.
Pattern Detection makes the invisible visible. It aims to detect interesting patterns or anomalous events. We have developed a series of demonstration models that can reveal hidden visual patterns, including anomalous events in network traffic, the mobility of wireless user positions, harmful algal blooms in satellite images, and words on eroded stone surfaces. Our technology has been transferred to NASA and NOAA and reported by BBC News, Radio Austria, and Radio Germany.
Visual Privacy makes the visible invisible. Visual privacy is a sensitive case because it literally deals with human private parts. It is a bold challenge for the field of Computer Science.
The NSF Cyber Trust Program sponsored our project, "Privacy Algorithms for Human Imaging." We are building a virtual human model for designing and evaluating visual privacy technologies before a security system is built. This forward-thinking approach is intended to transform the development of visual privacy technologies from being device-specific and proprietary to being device-independent and open-source-oriented. It will also transform privacy research into a systematic design process, enabling multidisciplinary innovations in digital human modeling, computer vision, information visualization, and computational aesthetics. The results of this project will greatly impact privacy-aware imaging systems in airports and medical settings. They can also benefit consumer products, such as custom-fit technologies designed from personal 3D scanning data.
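As a rough illustration of one common visual-privacy primitive (not the lab's actual algorithm), sensitive regions of an image can be irreversibly redacted before the image leaves the device. The region coordinates would normally come from a detector such as a body or face model; here they are supplied by hand, and a plain Python grid stands in for pixels.

```python
# Hypothetical sketch: redact a sensitive rectangle of an image.
# The coordinates and the pure-Python "image" are illustrative only.

def redact_region(image, top, left, height, width, fill=0):
    """Return a copy of `image` with the given rectangle overwritten.

    `image` is a list of rows of pixel values. Irreversible masking
    (unlike blurring) guarantees the original pixels cannot be recovered.
    """
    out = [row[:] for row in image]
    for r in range(top, min(top + height, len(out))):
        for c in range(left, min(left + width, len(out[r]))):
            out[r][c] = fill
    return out

frame = [[255] * 6 for _ in range(4)]          # a 4x6 all-white frame
masked = redact_region(frame, top=1, left=2, height=2, width=3)
print(masked[1])   # → [255, 255, 0, 0, 0, 255]
print(masked[0])   # → [255, 255, 255, 255, 255, 255]  (untouched row)
```

A device-independent design, as described above, would keep this filtering step separate from any particular scanner or camera hardware.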
The Visual Digest Network sends visual information on demand. It aims to reduce network bandwidth requirements by orders of magnitude. We have investigated the conceptual design of Visual Digest Networks at the gaze and object levels. Our goal is to minimize the media footprint during visual communication while sustaining essential semantic data. The Attentive Video Network is designed to detect the operator's gaze and adjust the video resolution on the sensor side across the network. The results show significant savings in network bandwidth.
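The gaze-driven idea can be sketched roughly as follows. This is not the lab's implementation; the tile grid, fovea radius, and downsampling factors are all hypothetical, chosen only to show how streaming full resolution near the gaze point and coarse tiles elsewhere cuts the transmitted data.

```python
# Illustrative sketch of gaze-driven resolution selection, assuming a
# tiled frame where only tiles near the operator's gaze are streamed
# at full resolution. All parameters below are hypothetical.

def tile_resolution_map(gaze_x, gaze_y, cols=8, rows=6, fovea_radius=1.5):
    """Return a rows x cols grid of downsampling factors.

    1 = full resolution near the gaze point; larger factors mean
    coarser tiles farther away, so less data crosses the network.
    """
    levels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Distance from this tile to the gazed-at tile, in tile units.
            d = ((c - gaze_x) ** 2 + (r - gaze_y) ** 2) ** 0.5
            if d <= fovea_radius:
                row.append(1)      # stream at full resolution
            elif d <= 2 * fovea_radius:
                row.append(4)      # 4x downsampled transition ring
            else:
                row.append(16)     # 16x downsampled periphery
        levels.append(row)
    return levels

def relative_bandwidth(levels):
    """Fraction of full-frame pixel data actually transmitted."""
    tiles = [1.0 / f for row in levels for f in row]
    return sum(tiles) / len(tiles)

grid = tile_resolution_map(gaze_x=3, gaze_y=2)
print(relative_bandwidth(grid))   # well under half the full-frame data
```

In a real attentive network, the gaze estimate would be fed back across the link so the sensor side re-encodes each tile before transmission.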
The Object Video Network is designed for mobile video and vehicle surveillance applications, where faces and cars are detected. Multi-resolution profiles are configured for the media according to the network footprint. The video is sent across the network at multiple resolutions along with metadata, controlled by a bandwidth regulator. The results show that video can still be transmitted under much worse network conditions. We have filed two US utility patents through Carnegie Mellon.
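A bandwidth regulator of this kind might be sketched as below. The profile tiers, bit rates, and the metadata-only fallback are assumptions for illustration, not the patented design; the point is that detected objects keep the richest quality the measured link allows, degrading gracefully to bounding boxes and labels alone.

```python
# Hypothetical sketch of a bandwidth regulator for an object-level video
# network: detected objects (e.g., faces, cars) keep the highest quality
# the measured bandwidth allows. Profile numbers are illustrative only.

# (kbps per detected object, kbps for the background) for each tier.
PROFILES = {
    "high":   (800, 400),
    "medium": (300, 150),
    "low":    (100, 40),
}

def select_profile(available_kbps, num_objects):
    """Pick the richest profile whose total rate fits the link.

    Falls back to metadata-only (bounding boxes and labels, no pixels)
    when even the lowest profile exceeds the available bandwidth.
    """
    for name in ("high", "medium", "low"):
        obj_kbps, bg_kbps = PROFILES[name]
        if num_objects * obj_kbps + bg_kbps <= available_kbps:
            return name
    return "metadata-only"

print(select_profile(2000, num_objects=2))   # → high
print(select_profile(500, num_objects=2))    # → low
print(select_profile(100, num_objects=2))    # → metadata-only
```

Because the metadata travels with every tier, a receiver can still reconstruct what was seen and where even when the link degrades to the fallback.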
CyLab Chronicles: What are the unique attributes of what you are working on?
CAI: 1) Rapid Prototyping. My team often works on extreme research projects with tangible deliverables to the field, such as light-rail test tracks, the Baite Fles Saline in the Alps, and the Old St. Luke Cemetery;
2) Innovation. The Instinctive Computing Lab is a rapidly evolving research group in CyLab. I recruit innovative students. In the Fall 2008 semester, we will grow to five Ph.D. students, one research staff member, and several work-study students. A couple of alumni have become serial entrepreneurs; for example, Ophir Tanz has started up his fourth company, GumGum, in LA.
CyLab Chronicles: What problem(s) does your work address?
CAI: I have focused on innovative concepts for the next generation of computers and networks, including Visual Thinking, Ambient Intelligence, and Human Algorithms. Some of these concepts have been published in three books by Springer in the Lecture Notes series: the State-of-the-Art Survey Ambient Intelligence for Scientific Discovery (LNAI 3345), Ambient Intelligence in Everyday Life (LNAI 3864), and Digital Human Modeling (LNCS 4650, in press). Three journal special issues have been published in English and Spanish. Three international workshops have been organized in Austria, Spain, and the UK, in conjunction with ACM SIGCHI and ICCS.
CyLab Chronicles: What are the commercial implications of your work?
CAI: Our technologies have been field-proven at the test beds of Boeing in St. Louis; GM in Detroit; and Bombardier Transportation in Pittsburgh. I have also received inquiries about commercializing our technologies from the UK, Australia, Canada, and the United States.