Distributed attention

  • Authors:
  • Maurice Chu, Patrick Cheung, James Reich

  • Affiliations:
  • Palo Alto Research Center, Palo Alto, CA (all authors)

  • Venue:
  • SenSys '04: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems
  • Year:
  • 2004

Abstract

As sensing technology becomes cheaper and more effective, large networks of intelligent sensors with rich sensing modalities, such as video, will extend into busy large-scale environments in applications such as traffic safety, urban surveillance, and situational awareness. In these unstructured and highly active environments, we consider the problem of detecting, identifying, and monitoring anomalous or suspicious activities among a large number of "normal" behaviors of people or vehicles going about their everyday business. Monitoring the activities of all objects in these environments is overly intrusive as well as impractical due to physical limitations of sensing devices and computing power. Therefore, a system deployed in such environments must be able to simultaneously (1) ignore objects behaving normally, (2) search for newly emerging anomalous behaviors, and (3) track and monitor objects behaving abnormally. The large areas covered and the huge amount of data generated require implementing such a system on a distributed platform with limited sensing, processing, and communication resources.

Our solution is motivated by the visual attention system in humans, in which the eye is attracted to interesting stimuli that "pop out" from the environment, and the brain interprets the most important stimuli in greater detail by focusing the eyes and its mental faculties on them. Our distributed attention architecture consists of a layered information processing module that interprets data from multiple perspectives and a resource allocation module that allocates limited sensing resources based on the system's current knowledge. Distributed attention thus implements a complete loop: data observation, information extraction, and finally sensor control for the next set of observations.

The information processing tasks are layered and range from evaluating simple predicates to verifying hypotheses about complex behaviors. A "low-level" observation, such as detecting a cluster of pixels moving at an anomalous speed, may spawn a vehicle localization and tracking task. This may, in turn, spawn "high-level" hypotheses about the object's behavior or intent, or evaluate its behavior in relation to other nearby vehicles. Reasoning about these hypotheses may reassure the system that the behavior is normal, so that the lower-level anomaly detection can be adjusted to ignore similar behavior in the future, or it may lead to a further focusing of resources to investigate. By processing data incrementally, the system can use intermediate information to decide what to ignore and what to focus on.

Executing a task requires data from some local region of the world. Since multiple tasks have different, possibly conflicting data requirements, we have developed a resource allocation module that optimizes each task's access to limited sensing resources so as to maximize the global usefulness of the sensor configuration. In our current implementation, all requests are tagged with a utility value based on the importance of the task they accomplish, and sensors are assigned to observe the regions that maximize the sum of the utilities of all the tasks served.

Our testbed for demonstrating this distributed attention system consists of a network of 6 or more OpenBrick x86 Linux boxes, outfitted with steerable pan-tilt cameras which observe a rectangular board (approx. 6' x 8') on the floor where the objects of interest move.
The field of view (FOV) of each camera is too small to see the entire board at once. The cameras pan and tilt to discrete directions, so any one camera can sense only a small portion of the board. Some regions may be observed by multiple cameras, while others have only single coverage. This work is not primarily focused on image understanding, so vehicles and people are represented here as moving, colored dots performing various behaviors under automatic or manual control. Information processing tasks can jump from node to node, issuing requests to the resource allocator, which points the cameras appropriately.

We will demonstrate a scenario with many moving dots flowing normally around the board. One or more objects then exhibit anomalous behavior, such as pushing other objects out of the way. The layered information processing module detects the anomalous behavior by looking for local regions with statistically unlikely deviations from the learned "normal" velocity field (sketched below), hypothesizing candidate objects that caused the anomalous behavior, tracking them to gather more information, and finally identifying the behavior based on existing high-level models of anomalous behavior. From the intermediate processing results, the tasks generate data requests for viewing a subset of FOVs, and the resource allocation module then maximizes the utility of the competing tasks by formulating the set of utility-tagged requests over possible camera pan-tilt directions as a factor graph and applying the max-sum algorithm, a variant of the sum-product algorithm (also sketched below). The cameras are then moved, new images are captured, and the cycle repeats.
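To make the low-level anomaly trigger concrete, here is a minimal sketch of the kind of test the abstract describes: per-cell running statistics of observed object speeds, with an observation flagged when it deviates from the learned mean by an unlikely margin. The class name, the grid-cell keying, and the exact statistic (a z-score on scalar speed; the paper's learned velocity field is richer) are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict

class VelocityField:
    """Per-cell running mean/variance of observed speeds (Welford's
    algorithm); a simplified stand-in for the learned "normal" velocity
    field. An observation is flagged as anomalous when its speed deviates
    from the cell's learned mean by more than z_thresh standard deviations.
    """

    def __init__(self, z_thresh=3.0):
        self.z_thresh = z_thresh
        self.stats = defaultdict(lambda: [0, 0.0, 0.0])  # count, mean, M2

    def update(self, cell, speed):
        # Incorporate one observation into the cell's running statistics.
        n, mean, m2 = self.stats[cell]
        n += 1
        delta = speed - mean
        mean += delta / n
        m2 += delta * (speed - mean)
        self.stats[cell] = [n, mean, m2]

    def is_anomalous(self, cell, speed, min_samples=10):
        n, mean, m2 = self.stats[cell]
        if n < min_samples:
            return False  # not enough evidence of "normal" yet
        std = math.sqrt(m2 / (n - 1))
        if std == 0.0:
            return speed != mean
        return abs(speed - mean) / std > self.z_thresh

field = VelocityField()
for t in range(100):                     # learn "normal" traffic in one cell
    field.update((3, 4), 1.0 + 0.05 * (t % 3))
print(field.is_anomalous((3, 4), 5.0))   # True: statistically unlikely speed
```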
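As an illustration of the allocation step, the following sketch assigns each camera one discrete pan-tilt direction so as to maximize the summed utility of the satisfied, utility-tagged requests. The paper formulates this optimization as a factor graph solved with max-sum; for clarity this sketch brute-forces the joint assignment, which is feasible only for a handful of cameras. The coverage model, region names, and utilities are hypothetical.

```python
from itertools import product

# Hypothetical coverage model: each camera has discrete pan-tilt
# directions, and each direction covers a fixed set of board regions.
coverage = {
    "cam0": {"pt0": {"A", "B"}, "pt1": {"B", "C"}},
    "cam1": {"pt0": {"C", "D"}, "pt1": {"A", "D"}},
}

# Utility-tagged requests: each task asks to observe one region and tags
# the request with the importance of the task it serves.
requests = [("A", 5.0), ("C", 3.0), ("D", 1.0)]

def summed_utility(assignment):
    """Total utility of requests whose region some camera observes."""
    observed = set()
    for cam, direction in assignment.items():
        observed |= coverage[cam][direction]
    return sum(u for region, u in requests if region in observed)

def allocate():
    """Exhaustive search over joint pan-tilt assignments.

    The paper instead runs max-sum on a factor graph encoding the same
    objective; on small instances brute force finds the same optimum.
    """
    cams = sorted(coverage)
    best, best_score = None, float("-inf")
    for directions in product(*(coverage[c] for c in cams)):
        assignment = dict(zip(cams, directions))
        score = summed_utility(assignment)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

assignment, score = allocate()
print(assignment, score)  # {'cam0': 'pt0', 'cam1': 'pt0'} 9.0
```

In the full system this optimization runs once per sensing cycle: tasks post their utility-tagged requests, the allocator computes the camera assignment, the cameras move and capture, and the results feed the next round of requests.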