Web users increasingly rely on small-scale search engines for specific domains and languages, yet most existing search engine development tools do not support languages other than English, cannot be integrated with other applications, or rely on proprietary software. A tool that supports search engine creation in multiple languages is therefore highly desirable. To study the research issues involved, we review the related literature and propose criteria for an ideal search tool. We then present SpidersRUs, a toolkit developed for multilingual search engine creation. We discuss in detail the design and implementation of the tool, which consists of a Spider module, an Indexer module, an Index Structure, a Search module, and a Graphical User Interface module. We also present a sample user session and a case study on using the tool to develop a medical search engine in Chinese, followed by a discussion of the technical issues involved and the lessons learned in the project. The study demonstrates that the proposed architecture makes it feasible to develop search engines easily in different languages such as Chinese, Spanish, Japanese, and Arabic.