An important issue for telecom carriers is that the time required to identify the root cause of a failure has increased, because the number and types of alarms caused by network or service failures have grown in fixed-mobile convergence network environments. To address this issue, this paper proposes a root cause analysis (RCA) mechanism that classifies alarms by failure type, such as resource, performance, and service failures, and then promptly identifies the root cause using a hierarchical alarm information model. The proposed mechanism was implemented in a prototype system and successfully demonstrated in a testbed. Its effectiveness was validated: the RCA mechanism handled 65,000 alarms within 550 seconds in a practical network consisting of 100,000 pieces of equipment. The results also show that the algorithm minimizes the overhead of RCA itself so that it can be applied to large-scale environments, and thus the total RCA performance is limited only by DB access.
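The two steps named in the abstract, classifying alarms by failure type and then locating the root cause through a hierarchical model, can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Alarm` record, the `PARENT` dependency map, the element names, and the "highest alarmed ancestor" rule are all assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    alarm_id: str
    source: str        # element that raised the alarm (hypothetical names)
    failure_type: str  # "resource", "performance", or "service"


# Hypothetical hierarchy: each element maps to the element it depends on.
PARENT = {
    "voice-service": "router-1",
    "router-1": "link-7",
    "link-7": None,
}


def classify(alarms):
    """Group alarms by their failure type."""
    by_type = defaultdict(list)
    for alarm in alarms:
        by_type[alarm.failure_type].append(alarm)
    return dict(by_type)


def root_causes(alarms):
    """Return, for each alarm, the highest alarmed element it depends on.

    Assumption: an alarm is treated as a symptom when an element above it
    in the hierarchy is also alarmed.
    """
    alarmed = {alarm.source for alarm in alarms}
    roots = set()
    for alarm in alarms:
        top = alarm.source
        parent = PARENT.get(alarm.source)
        while parent is not None:
            if parent in alarmed:
                top = parent
            parent = PARENT.get(parent)
        roots.add(top)
    return roots


alarms = [
    Alarm("a1", "voice-service", "service"),
    Alarm("a2", "router-1", "performance"),
    Alarm("a3", "link-7", "resource"),
]
groups = classify(alarms)
roots = root_causes(alarms)
```

In this toy input, the three alarms fall into three failure-type groups and collapse to a single candidate root cause, the alarmed element at the top of the dependency chain. In a sketch like this the hierarchy is held in memory, whereas a deployed system would typically load it from a database.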