As online textual information explodes through the widespread use of digital libraries, office automation systems, document databases, and the Web, the need arises for an effective information retrieval (IR) system. The Web alone comprises approximately 800 million static pages containing 6 trillion bytes of plain text, enough to store the text of a million books. Today's IR systems face the challenge of providing rapid access to this textual mass.

Recent methods have demonstrated that directly searching compressed text is faster than searching the original text and that flexible word searching improves the amount of compression obtained.

Text compression focuses on finding ways to represent text in less space. This process involves replacing text symbols with equivalent symbols that use fewer bits or bytes. Text compression is attractive because it is cost-efficient, requires less storage space, speeds up data transmission, and reduces search time.

The authors discuss recent techniques that allow fast and direct searching of compressed text, and they explain how these techniques can improve the overall efficiency of IR systems.
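
To make the idea of searching compressed text directly more concrete, the following is a minimal sketch under stated assumptions, not the authors' method. It assumes a word-based scheme in which each token (a word or a separator run) is replaced by a byte-aligned codeword, with shorter codewords for more frequent tokens, and in which the last byte of every codeword has its high bit set. The function names (`encode_rank`, `build_codebook`, `compress`, `decompress`, `find_word`) and the specific byte layout are illustrative choices and do not come from the text above.

```python
import re
from collections import Counter

# Assumed tokenization: alternating runs of word and non-word characters.
TOKEN_RE = re.compile(r"\w+|\W+")


def encode_rank(rank: int) -> bytes:
    """Encode a frequency rank as base-128 digits, most significant first.

    Only the final byte has its high bit set, so codeword boundaries can be
    recognized anywhere in the compressed stream.
    """
    out = [0x80 | (rank & 0x7F)]
    rank >>= 7
    while rank:
        out.append(rank & 0x7F)
        rank >>= 7
    return bytes(reversed(out))


def build_codebook(text: str) -> dict:
    """Map each token to a codeword; frequent tokens get shorter codewords."""
    freq = Counter(TOKEN_RE.findall(text))
    return {tok: encode_rank(i) for i, (tok, _) in enumerate(freq.most_common())}


def compress(text: str, codebook: dict) -> bytes:
    """Replace every token with its codeword."""
    return b"".join(codebook[tok] for tok in TOKEN_RE.findall(text))


def decompress(data: bytes, codebook: dict) -> str:
    """Invert compress() by splitting the stream at high-bit (end) bytes."""
    inverse = {code: tok for tok, code in codebook.items()}
    out, start = [], 0
    for i, byte in enumerate(data):
        if byte & 0x80:  # end of a codeword
            out.append(inverse[data[start:i + 1]])
            start = i + 1
    return "".join(out)


def find_word(data: bytes, codebook: dict, word: str) -> list:
    """Return byte offsets of `word` in the compressed data, without decompressing.

    The pattern is compressed once, then located with an ordinary byte search.
    A hit counts only if it starts on a codeword boundary, that is, at offset 0
    or right after a byte whose high bit is set.
    """
    code = codebook.get(word)
    if code is None:  # word never occurs in the collection
        return []
    hits = []
    pos = data.find(code)
    while pos != -1:
        if pos == 0 or data[pos - 1] & 0x80:
            hits.append(pos)
        pos = data.find(code, pos + 1)
    return hits


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat"
    book = build_codebook(sample)
    packed = compress(sample, book)
    print(len(sample.encode("utf-8")), "bytes before,", len(packed), "bytes after")
    print("offsets of 'the':", find_word(packed, book, "the"))
```

In this sketch the search scans fewer bytes than a scan of the original text would, which is one plausible reason a search over compressed text can be faster. The codebook must be stored alongside the compressed data, so the savings only pay off on collections much larger than this toy sample.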