GLIMPSE, which stands for GLobal IMPlicit SEarch, provides indexing and query schemes for file systems. The novelty of glimpse is that it uses a very small index (in most cases 2-4% of the size of the text) and still allows very flexible full-text retrieval, including Boolean queries, approximate matching (i.e., allowing misspellings), and even searching for regular expressions. In a sense, glimpse extends agrep to entire file systems while preserving most of its functionality and simplicity. Query times are typically slower than with inverted indexes, but they are still fast enough for many applications. For example, it took 5 seconds of CPU time to find all 19 occurrences of "Usenix AND Winter" in a file system containing 69MB of text spanning 4300 files. Glimpse is designed particularly for personal information, such as one's own file system. The main characteristic of personal information is that it is non-uniform and includes many types of documents. An information retrieval system for personal information should support many types of queries, flexible interaction, low overhead, and customization. All of these are important features of glimpse.
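To make the trade-off between a small index and slower queries concrete, the following is a minimal Python sketch of a two-level search. It is an illustration under stated assumptions, not glimpse's actual implementation. The abstract does not spell out the index layout, so the sketch assumes a coarse index that maps each word to a block of files rather than to exact positions, followed by a sequential scan of only the candidate blocks. The names `build_block_index` and `search_and`, the round-robin block assignment, and the rule that an AND query matches when all terms occur on the same line are all hypothetical choices made for the example. Approximate matching and regular expressions are left out.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return WORD.findall(text.lower())


def build_block_index(files, num_blocks):
    """Build a coarse index: word -> set of block ids.

    `files` maps a file name to its text. Files are assigned to blocks
    round-robin (an assumption for this sketch). The index stores only
    block ids, not positions, which is what keeps it small.
    """
    names = sorted(files)
    blocks = [names[i::num_blocks] for i in range(num_blocks)]
    index = defaultdict(set)
    for block_id, members in enumerate(blocks):
        for name in members:
            for word in tokenize(files[name]):
                index[word].add(block_id)
    return index, blocks


def search_and(files, index, blocks, terms):
    """Answer a Boolean AND query in two steps.

    Step 1: intersect the block sets of all terms using the index.
    Step 2: scan only the files in the surviving blocks line by line.
    Returns a list of (file name, line number) pairs.
    """
    terms = [t.lower() for t in terms]
    candidate = None
    for term in terms:
        ids = index.get(term, set())
        candidate = ids if candidate is None else candidate & ids
    hits = []
    for block_id in sorted(candidate or ()):
        for name in blocks[block_id]:
            for lineno, line in enumerate(files[name].splitlines(), 1):
                words = set(tokenize(line))
                if all(t in words for t in terms):
                    hits.append((name, lineno))
    return hits
```

Because the index records only which block a word appears in, a candidate block can still contain files with no matching line, so the scan in step 2 is what produces the final answer. Fewer, larger blocks shrink the index but leave more text to scan per query, which is consistent with the abstract's remark that query times are typically slower than with inverted indexes.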