Designers initiating a new design usually begin by searching a database of design documents for historical design solutions, available experience, and techniques. This database is a collection of labeled design documents organized under a few predefined categories. However, little work has been done on labeling a relatively small number of design documents so that most of the design documents in the database can then be categorized automatically. This paper initiates a study on this topic and proposes a four-step methodology: design document collection, document labeling, finalization of document labels, and categorization of the design database. Our discussion focuses on the first three steps. The key to the method is to collect a relatively small number of design documents for manual labeling, and to unify the effective labeling results into final labels by means of labeling agreement analysis and a text classification experiment. These labeled documents are then used as training samples to construct classifiers, which automatically assign appropriate labels to each design document. With this method, design documents are labeled according to the consensus of the operators' understanding, and design information can be organized in a comprehensive and universally accessible way. A case study on labeling robotic design documents demonstrates the proposed methodology. Experimental results show that the method can significantly benefit efficient design information search.
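The labeling agreement analysis and label finalization steps can be illustrated with a minimal sketch. The abstract does not specify the agreement measure or the unification rule, so the code below assumes Cohen's kappa for pairwise agreement and a strict majority vote for the final label. The function names, the three operators, and the category names are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two operators labeling the same documents.

    Assumption: kappa is used here as one common agreement measure;
    the paper's actual measure is not stated in the abstract.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of documents given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each operator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both operators used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def consensus_labels(label_sets):
    """Final label per document by strict majority; None if there is no majority.

    Assumption: majority voting stands in for the paper's unification rule.
    """
    finals = []
    for votes in zip(*label_sets):
        label, count = Counter(votes).most_common(1)[0]
        finals.append(label if count > len(votes) / 2 else None)
    return finals


# Hypothetical labels from three operators for six robotic design documents.
op1 = ["arm", "arm", "gripper", "sensor", "sensor", "arm"]
op2 = ["arm", "gripper", "gripper", "sensor", "arm", "gripper"]
op3 = ["arm", "gripper", "gripper", "sensor", "sensor", "sensor"]

kappa_12 = cohen_kappa(op1, op2)
finals = consensus_labels([op1, op2, op3])
```

Documents whose final label is `None` have no majority among the operators. In a workflow like the one described, they would presumably be relabeled or set aside rather than used as training samples.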