This paper presents a case study in which we use a large semi-structured data set, consisting of official transcripts of meetings of the Dutch parliament, for focused retrieval and result aggregation. Transcripts of meetings are a document genre characterized by a complex narrative structure: the essence is not only what is said, but also by whom and to whom. We have notes covering more than 40 years of Dutch parliamentary debates, and we exploit this structure to make semantic annotations automatically. These annotations yield numerous new ways of searching, browsing, mining and summarizing these documents. For result aggregation, we summarize and visualize the structure of meetings as tables of contents and interruption graphs. The contents of meetings, or of parts of meetings, are condensed into word clouds that are created using a parsimonious language model. Furthermore, we have developed a search engine that exploits the structure and annotations of our data, making it possible to provide entry points, to group search results, and to use faceted search techniques for data exploration. Evaluation shows that our content and structure summarization tools provide a good first impression of a debate. Users reported that, compared to a standard document retrieval system, our search engine gives a better overview of the data. Search tasks were performed faster, and users felt more certain of their answers.
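To make the word-cloud step concrete, the following is a minimal sketch of how a parsimonious language model can select the terms for a cloud. It uses the usual EM formulation, in which a document model is repeatedly re-estimated against a background corpus model so that probability mass shifts away from terms the background already explains well. The function names, the mixing weight `lam`, the pruning `threshold`, the iteration count, and the toy data are all assumptions chosen for illustration; they are not taken from the system described above.

```python
from collections import Counter


def parsimonious_model(doc_tokens, corpus_tokens, lam=0.1, iterations=50, threshold=1e-4):
    """Estimate P(term | document) with EM against a background corpus model.

    Assumes both token lists are non-empty. `lam` is the weight of the
    document model in the mixture (an illustrative default).
    """
    doc_tf = Counter(doc_tokens)
    corpus_tf = Counter(corpus_tokens)
    corpus_total = sum(corpus_tf.values())
    doc_total = sum(doc_tf.values())

    # Start from the maximum-likelihood document model.
    p_doc = {term: count / doc_total for term, count in doc_tf.items()}

    for _ in range(iterations):
        # E-step: expected number of occurrences attributed to the document model.
        expected = {}
        for term, p in p_doc.items():
            p_bg = corpus_tf[term] / corpus_total
            expected[term] = doc_tf[term] * lam * p / ((1 - lam) * p_bg + lam * p)

        # M-step: renormalize, dropping terms that fall below the threshold.
        total = sum(expected.values())
        kept = {t: e / total for t, e in expected.items() if e / total >= threshold}
        if not kept:
            break  # pruning would remove every term; keep the previous estimate
        norm = sum(kept.values())
        p_doc = {t: p / norm for t, p in kept.items()}

    return p_doc


def top_terms(model, k=10):
    """Return the k most probable terms, e.g. to size the words in a cloud."""
    return sorted(model.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy data: a short "debate" and a background corpus dominated by function words.
    debate = ["the"] * 4 + ["of"] * 2 + ["pension"] * 3 + ["debate"]
    background = ["the"] * 400 + ["of"] * 200 + ["pension"] * 3 + ["debate"] + ["other"] * 396
    for term, prob in top_terms(parsimonious_model(debate, background), k=3):
        print(term, round(prob, 3))
```

In this toy setting the function words are frequent in the debate but even more characteristic of the background, so the re-estimation favours content terms such as "pension". In a real deployment the background would be the full collection of transcripts, and the resulting probabilities could drive the font sizes in the cloud.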