Patent documents contain sophisticated technical information and are valuable for developing new technologies and products. They can be written in almost any language, which creates language barriers during retrieval. Traditionally, cross-language information retrieval and cross-language document matching have relied on text-translation-based or index-set-mapping methods. These traditional methods face several challenges, however, such as the difficulty of natural language translation, the complications of bilingual or multilingual translation (between two languages or among more than two), and the unavailability of a parallel dual-language document set. This study offers a new and robust solution to cross-language patent document matching: the International Patent Classification (IPC) based concept bridge approach. The proposed method applies Latent Semantic Indexing to extract concepts from each set of patent documents and uses the IPC codes to construct a cross-language mediator that expresses patent documents in different languages. Experiments were carried out to measure the performance of the proposed method. As training documents, 3000 English patents were gathered from the United States Patent and Trademark Office and 3000 Chinese patents from the Taiwan Intellectual Property Office. Another 30 English patents and another 30 Chinese patents were collected as query patents. Finally, evaluations using an objective measure and subjective judgement were conducted to assess the feasibility and effectiveness of our method. The results show that our method outperforms the traditional text-translation methods.
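The abstract does not spell out how the IPC-based mediator is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that Latent Semantic Indexing is run separately on each language's term-document matrix, that each IPC code is represented by the centroid of its training documents in that language's concept space, and that a patent is then described by its cosine similarities to those centroids. The toy matrices, the IPC codes `A61` and `H04`, and all function names are hypothetical.

```python
import numpy as np

def fit_lsi(term_doc, k):
    """Return the top-k left singular vectors (terms x k) of a term-document matrix."""
    u, _, _ = np.linalg.svd(np.asarray(term_doc, dtype=float), full_matrices=False)
    return u[:, :k]

def project(u_k, term_doc):
    """Fold documents (columns of term_doc) into the k-dimensional concept space."""
    return np.asarray(term_doc, dtype=float).T @ u_k

def cosine(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return a @ b.T

def ipc_centroids(doc_vecs, labels, codes):
    """Average the concept vectors of the training documents under each IPC code."""
    labels = np.asarray(labels)
    return np.stack([doc_vecs[labels == c].mean(axis=0) for c in codes])

# Hypothetical toy data: rows are terms, columns are documents.
# The two languages have unrelated vocabularies; only the IPC codes are shared.
ipc_codes = ["A61", "H04"]
en_train = [[2, 1, 0, 0],   # "drug"
            [1, 2, 0, 0],   # "dose"
            [0, 0, 2, 1],   # "antenna"
            [0, 0, 1, 2],   # "signal"
            [1, 1, 1, 1]]   # "device"
en_ipc = ["A61", "A61", "H04", "H04"]
zh_train = [[0, 2, 1, 0],
            [0, 1, 2, 0],
            [2, 0, 0, 1],
            [1, 0, 0, 2],
            [1, 1, 1, 1]]
zh_ipc = ["H04", "A61", "A61", "H04"]

# Two English query patents, expressed over the English vocabulary.
en_queries = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]]

# Step 1 (assumed): fit LSI independently per language.
k = 2
en_u = fit_lsi(en_train, k)
zh_u = fit_lsi(zh_train, k)

# Step 2 (assumed): build IPC centroids in each language's own concept space.
en_cent = ipc_centroids(project(en_u, en_train), en_ipc, ipc_codes)
zh_cent = ipc_centroids(project(zh_u, zh_train), zh_ipc, ipc_codes)

# Step 3 (assumed): describe every patent by its similarity to each IPC centroid.
# These profiles have one entry per IPC code, so they are comparable across languages.
query_profiles = cosine(project(en_u, en_queries), en_cent)
zh_profiles = cosine(project(zh_u, zh_train), zh_cent)

# Step 4: match each English query to the closest Chinese patent in IPC space.
matches = cosine(query_profiles, zh_profiles).argmax(axis=1)
matched_ipc = [zh_ipc[i] for i in matches]
print(matched_ipc)
```

In this reading no translation step or parallel corpus is needed: the only cross-language link is the shared IPC labelling of the training patents. The real system presumably works at a much finer IPC granularity and with far larger vocabularies than this toy example.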