Effective phrase prediction

Authors: Arnab Nandi; H. V. Jagadish
Affiliations: University of Michigan, Ann Arbor; University of Michigan, Ann Arbor
Venue: VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year: 2007



Abstract

Autocompletion is a widely deployed facility in systems that require user input. Having the system complete a partially typed "word" can save user time and effort. In this paper, we study the problem of autocompletion not just at the level of a single "word", but at the level of a multi-word "phrase". There are two main challenges: one is that the number of phrases (both the number possible and the number actually observed in a corpus) is combinatorially larger than the number of words; the second is that a "phrase", unlike a "word", does not have a well-defined boundary, so that the autocompletion system has to decide not just what to predict, but also how far. We introduce a FussyTree structure to address the first challenge and the concept of a significant phrase to address the second. We develop a probabilistically driven multiple completion choice model, and exploit features such as frequency distributions to improve the quality of our suffix completions. We experimentally demonstrate the practicability and value of our technique for an email composition application and show that we can save approximately a fifth of the keystrokes typed.
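The two challenges named in the abstract (storing many observed phrases, and deciding how far to predict) can be illustrated with a minimal sketch. This is not the paper's FussyTree or its definition of a significant phrase, neither of which is specified here. It is a plain word-level prefix tree with a simple frequency cutoff, and the names `PhraseTrie`, `max_len`, `min_count`, and `min_ratio` are hypothetical choices made for illustration only.

```python
class Node:
    """One word in the prefix tree, with how often the path to it was seen."""
    def __init__(self):
        self.count = 0
        self.children = {}


class PhraseTrie:
    """Word-level prefix tree over phrases of at most max_len words (an assumed cap)."""
    def __init__(self, max_len=4):
        self.root = Node()
        self.max_len = max_len

    def add_sentence(self, sentence):
        # Index every window of up to max_len consecutive words.
        words = sentence.lower().split()
        for start in range(len(words)):
            node = self.root
            for word in words[start:start + self.max_len]:
                node = node.children.setdefault(word, Node())
                node.count += 1

    def complete(self, prefix, min_count=2, min_ratio=0.5):
        """Return the predicted next words for an already-typed word prefix.

        The prediction is extended one word at a time and stops as soon as the
        best continuation is rare (count < min_count) or no longer dominant
        (count < min_ratio * parent count). Both thresholds are assumptions.
        """
        node = self.root
        for word in prefix.lower().split():
            node = node.children.get(word)
            if node is None:
                return []  # prefix never observed
        completion = []
        while node.children:
            # Most frequent child; ties are broken by word for determinism.
            word, child = max(node.children.items(),
                              key=lambda kv: (kv[1].count, kv[0]))
            if child.count < min_count or child.count < min_ratio * node.count:
                break
            completion.append(word)
            node = child
        return completion


trie = PhraseTrie(max_len=4)
for line in [
    "please let me know if you have any questions",
    "please let me know when you arrive",
    "let me know if that works",
]:
    trie.add_sentence(line)

print(trie.complete("please let"))  # ['me', 'know']
print(trie.complete("let me"))      # ['know', 'if']
```

The stopping rule in `complete` is what decides "how far" to predict: a continuation is offered only while it remains both frequent and dominant among the alternatives. Capping phrase length with `max_len` is one crude way to bound the number of stored phrases.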