Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents from over 1700 RSS feeds, classifies them for topical relevance, and plots them on a Google map using geocoded information. The background knowledge that bridges the gap between laymen's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens, as well as geographical locations with their latitudes and longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection, and event recognition. Higher-order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users via email alerts. The system is evaluated for topic recognition and entity identification on a gold-standard corpus of annotated news articles.
Availability: The BioCaster map and ontology are freely available via a web portal at http://www.biocaster.org.
Contact: collier@nii.ac.jp
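To make the four-stage structure named in the summary concrete, the following is a minimal Python sketch of how such a pipeline might be chained. It is an illustration under stated assumptions, not BioCaster's implementation: the lexicons `DISEASE_TERMS` and `LOCATIONS`, the `Event` type, and all function names are hypothetical, the coordinates are approximate, and simple dictionary lookups stand in for the trained classifier, the NER component, and the ontology.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical toy lexicons standing in for the ontology (not its real content).
# Lay term -> canonical disease name.
DISEASE_TERMS = {
    "bird flu": "avian influenza",
    "avian influenza": "avian influenza",
    "cholera": "cholera",
}
# Place name -> approximate (latitude, longitude), for illustration only.
LOCATIONS = {
    "hanoi": (21.03, 105.85),
    "jakarta": (-6.21, 106.85),
}


@dataclass
class Event:
    disease: str
    location: str
    lat: float
    lon: float


def is_relevant(text: str) -> bool:
    """Stage 1, topic classification: a keyword test stands in for a classifier."""
    lowered = text.lower()
    return any(term in lowered for term in DISEASE_TERMS)


def recognize_entities(text: str) -> Tuple[List[str], List[str]]:
    """Stage 2, NER: dictionary lookup stands in for a statistical tagger."""
    lowered = text.lower()
    diseases = [term for term in DISEASE_TERMS if term in lowered]
    places = [place for place in LOCATIONS if place in lowered]
    return diseases, places


def detect_disease_location(diseases: List[str], places: List[str]):
    """Stage 3: normalize the first disease mention and geocode the first place."""
    if not diseases or not places:
        return None
    place = places[0]
    return DISEASE_TERMS[diseases[0]], place, LOCATIONS[place]


def recognize_event(text: str) -> Optional[Event]:
    """Stage 4, event recognition: emit an event only if every stage succeeds."""
    if not is_relevant(text):
        return None
    pair = detect_disease_location(*recognize_entities(text))
    if pair is None:
        return None
    disease, place, (lat, lon) = pair
    return Event(disease, place, lat, lon)
```

For example, `recognize_event("Officials confirm bird flu cases in Hanoi")` yields an `Event` with the lay term normalized to "avian influenza" and the location geocoded, while a headline that mentions no disease term is dropped at the first stage. Filtering early keeps the later, more expensive stages from running on irrelevant documents, which matters when the input is a continuous stream of feeds.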