Web-assisted annotation, semantic indexing and search of television and radio news

Authors:
Mike Dowman; Valentin Tablan; Hamish Cunningham; Borislav Popov
Affiliations:
University of Sheffield, Sheffield, UK; University of Sheffield, Sheffield, UK; University of Sheffield, Sheffield, UK; Sirma AI EAD, Sofia, Bulgaria
Venue:
WWW '05 Proceedings of the 14th international conference on World Wide Web
Year:
2005

Abstract

This paper describes the Rich News system, which automatically annotates radio and television news with the aid of resources retrieved from the World Wide Web. Automatic speech recognition (ASR) gives a temporally precise but conceptually inaccurate annotation model. Information extraction from related web news sites gives the opposite: conceptual accuracy but no temporal data. Our approach combines the two to produce temporally accurate conceptual semantic annotation of broadcast news. First, low-quality transcripts of the broadcasts are produced using speech recognition, and these are then automatically divided into sections corresponding to individual news stories. A key-phrase extraction component finds key phrases for each story and uses them to search for web pages reporting the same event. The text and metadata of the web pages are then used to create index documents for the stories in the original broadcasts, which are semantically annotated using the KIM knowledge management platform. A web interface then allows conceptual search and browsing of news stories, and playback of the parts of the media files corresponding to each news story. Using material from the World Wide Web allows much higher-quality textual descriptions and semantic annotations to be produced than would have been possible using the ASR transcript directly. The semantic annotations can form part of the Semantic Web, and an evaluation shows that the system operates with high precision and a moderate level of recall.
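
The step from segmented transcript to web query to index document can be illustrated with a minimal sketch. It is not the Rich News implementation: the abstract does not say how key phrases are scored, so a plain TF-IDF ranking over single words stands in for the key phrase extractor. The names `Story`, `key_terms`, `build_query` and `index_document`, and the small stop-word list, are all hypothetical. The web search itself is omitted; the sketch assumes a matching page's title and text have already been retrieved.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "is", "was", "for",
    "has", "have", "at", "by", "with", "that", "this", "are", "be", "it",
}


@dataclass
class Story:
    """One news story found by segmenting the ASR transcript (hypothetical type)."""
    start: float      # offset into the broadcast, in seconds
    end: float
    transcript: str   # noisy ASR text for this story


def tokenize(text):
    """Lower-case the text, split it into words and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def key_terms(stories, top_n=3):
    """Rank each story's words by TF-IDF across the stories of one broadcast.

    TF-IDF is a stand-in here, not the scoring used by Rich News.
    """
    token_lists = [tokenize(s.transcript) for s in stories]
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))
    n = len(stories)
    ranked = []
    for tokens in token_lists:
        tf = Counter(tokens)
        scores = {t: c * (math.log(n / doc_freq[t]) + 1.0) for t, c in tf.items()}
        # Sort by descending score, then alphabetically so ties are deterministic.
        ordered = sorted(scores, key=lambda t: (-scores[t], t))
        ranked.append(ordered[:top_n])
    return ranked


def build_query(terms):
    """Join key terms into a query string for a web search engine."""
    return " ".join(terms)


def index_document(story, page_title, page_text):
    """Pair the story's time offsets (from ASR) with text from a matching web page."""
    return {
        "start": story.start,
        "end": story.end,
        "title": page_title,
        "text": page_text,
    }
```

The design point the sketch tries to capture is the division of labour described in the abstract: the timing fields come from the speech recogniser's segmentation, while the title and text come from the web page, so the index document does not depend on the quality of the ASR transcript.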