We introduce OCELOT, a prototype system for automatically generating the “gist” of a web page by summarizing it. Although most text summarization research to date has focused on news articles, web pages differ from them in both structure and content. Instead of coherent text with a well-defined discourse structure, a web page is more often a chaotic jumble of phrases, links, graphics and formatting commands. Such text provides little foothold for extractive summarization techniques, which attempt to generate a summary of a document by excerpting a contiguous, coherent span of text from it. This paper builds upon recent work in non-extractive summarization, producing the gist of a web page by “translating” it into a more concise representation rather than attempting to extract a text span verbatim. OCELOT uses probabilistic models to guide it in selecting and ordering words into a gist. This paper describes a technique for learning these models automatically from a collection of human-summarized web pages.
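To make the idea of selecting and ordering words concrete, the following is a minimal sketch and not OCELOT's actual models, which the abstract does not specify. It assumes a toy training set of (page tokens, gist tokens) pairs, estimates how likely a page word is to appear in a gist, and orders the chosen words greedily with bigram counts taken from the training gists. The names `train`, `select_prob` and `make_gist`, the add-one smoothing, and the greedy search are all illustrative assumptions.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical start-of-gist marker used by the ordering model


def train(pairs):
    """Collect counts from (page_tokens, gist_tokens) pairs.

    Returns (in_page, in_both, bigrams):
      in_page[w]  = number of training pages containing w
      in_both[w]  = number of pairs where w is in the page and in its gist
      bigrams[a][b] = number of times b follows a in a training gist
    """
    in_page, in_both = Counter(), Counter()
    bigrams = defaultdict(Counter)
    for page, gist in pairs:
        gist_words = set(gist)
        for w in set(page):
            in_page[w] += 1
            if w in gist_words:
                in_both[w] += 1
        prev = START
        for w in gist:
            bigrams[prev][w] += 1
            prev = w
    return in_page, in_both, bigrams


def select_prob(w, in_page, in_both):
    """Add-one smoothed estimate of P(w appears in gist | w appears in page)."""
    return (in_both[w] + 1) / (in_page[w] + 2)


def make_gist(page, model, length=3):
    """Pick the `length` most gist-worthy page words, then order them greedily."""
    in_page, in_both, bigrams = model
    # Selection: rank distinct page words by selection estimate (ties broken alphabetically).
    remaining = sorted(
        set(page), key=lambda w: (-select_prob(w, in_page, in_both), w)
    )[:length]
    # Ordering: repeatedly append the word with the best smoothed bigram score.
    gist, prev = [], START
    while remaining:
        total = sum(bigrams[prev].values())
        # Add-one smoothed score; a ranking heuristic, not a normalized probability.
        best = max(
            remaining,
            key=lambda w: (bigrams[prev][w] + 1) / (total + len(remaining)),
        )
        gist.append(best)
        remaining.remove(best)
        prev = best
    return gist
```

Because the gist is assembled word by word from learned statistics rather than copied as a span, the output need not appear contiguously anywhere in the page, which is the sense in which this kind of approach is non-extractive. A real system would use far larger training data and a proper search over word sequences instead of this greedy loop.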