In many real-world applications of multi-agent systems, agent reasoning suffers from bounded rationality caused by both limited resources and limited knowledge. When the sensing an agent performs to overcome its knowledge limitations also consumes resources, the agent cannot always sense when, and as accurately as, it needs to; its knowledge refinement suffers, leading in turn to poorer decision making. In this paper, we consider what happens when sensing actions require the use of stateful resources, which we define as resources whose behavior changes over time based on usage. The current literature on agent sensing with limited resources primarily investigates stateless resources, for example, avoiding the use of too much time or energy during sensing. However, sensing itself can change the state of a resource, and thus its behavior, which affects both the information gathered and the resulting knowledge refinement. This produces a phenomenon where a sensing action can, and will, distort its own outcome (and potentially future outcomes), which we term the Observer Effect (OE) after the similar phenomenon in the physical sciences. Under this effect, when deliberating about when and how to sense with stateful resources, an agent faces a strategic tradeoff between (1) refining its knowledge to support its reasoning and (2) avoiding knowledge corruption due to distorted sensing outcomes. To address this tradeoff, we model sensing action selection as a partially observable Markov decision process (POMDP) in which an agent optimizes knowledge refinement while accounting for the (possibly hidden) state of the resources used during sensing. In this model, the agent uses reinforcement learning to learn a controller for action selection, as well as to predict the expected knowledge refinement based on resource use during sensing.
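The tradeoff described above can be illustrated with a minimal sketch (our own toy construction, not the paper's implementation): an agent repeatedly chooses to sense or wait, each sensing action degrades a stateful resource whose wear distorts later sensing outcomes, and tabular Q-learning stands in for the learned controller. The state space, reward shape, and constants here are all illustrative assumptions.

```python
import random

SENSE, WAIT = 0, 1

def step(wear, action):
    """Return (reward, new_wear). Sensing accuracy drops with resource wear."""
    if action == SENSE:
        accuracy = max(0.0, 1.0 - 0.2 * wear)  # OE: wear distorts the outcome
        reward = accuracy                      # knowledge refinement gained
        wear = min(5, wear + 1)                # sensing changes resource state
    else:
        reward = 0.0
        wear = max(0, wear - 1)                # resource recovers while idle
    return reward, wear

def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Learn a sense/wait controller over wear levels 0..5 via Q-learning."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(6)]
    for _ in range(episodes):
        wear = 0
        for _ in range(horizon):
            s = wear
            if rng.random() < eps:
                a = rng.randrange(2)           # explore
            else:
                a = SENSE if Q[s][SENSE] >= Q[s][WAIT] else WAIT
            r, wear = step(wear, a)
            Q[s][a] += alpha * (r + gamma * max(Q[wear]) - Q[s][a])
    return Q

Q = train()
# The learned controller should sense at low wear (refinement is accurate)
# and wait at high wear (letting the resource recover avoids corruption).
print(Q[0][SENSE] > Q[0][WAIT], Q[5][WAIT] > Q[5][SENSE])
```

In this sketch the resource state is directly observable for simplicity; the paper's setting makes it possibly hidden, which is what motivates the POMDP formulation rather than a plain MDP.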
Our approach differs from other bounded rationality and sensing research in that we consider how to make sensing decisions with stateful resources that produce side effects such as the OE, rather than with stateless resources that have no such side effects. We evaluate our approach in both fully and partially observable versions of an agent mining simulation. The results demonstrate that accounting for resource state and the OE during sensing action selection (1) yielded better knowledge refinement, (2) appropriately balanced current and future refinement to avoid knowledge corruption, and (3) exploited the high, positive correlation between sensing and task performance to boost task performance through improved sensing. Further, our methodology achieved good knowledge refinement even when the OE was not present, indicating that it can improve sensing performance in a wide variety of environments. Finally, our results provide insights into the types and configurations of learning algorithms useful within our methodology.