Intelligent data entry assistant for XML using ensemble learning

Authors:
Danico Lee; Costas Tsatsoulis
Affiliations:
University of Kansas, Lawrence, KS; University of Kansas, Lawrence, KS
Venue:
Proceedings of the 10th international conference on Intelligent user interfaces
Year:
2005

Citing (4):
Induction of Decision Trees. Machine Learning.
XEM: Managing the evolution of XML Documents. Eleventh International Workshop on Research Issues in Data Engineering on Document Management for Data Intensive Business and Scientific Applications.
Artificial Intelligence: A Modern Approach.
An empirical evaluation of bagging and boosting. AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence.

Cited by (2):
Domain independent data discrepancy detection using ensemble learning. ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers.
Designing adaptive feedback for improving data entry accuracy. UIST '10 Proceedings of the 23rd annual ACM symposium on User interface software and technology.

Abstract

XML has emerged as the primary standard for data representation and data exchange [13]. Although many software tools exist to assist the XML implementation process, data must still be entered manually into XML documents. Current form-filling technologies mostly target simple data entry and do not support the complexity and nested structures of XML grammars. This paper presents SmartXAutofill, an intelligent data entry assistant that predicts and automates inputs for XML documents based on the contents of historical document collections in the same XML domain. SmartXAutofill incorporates an ensemble classifier, which integrates multiple internal classification algorithms into a single architecture. Each internal classifier uses approximate techniques to propose a value for an empty XML field, and, through voting, the ensemble classifier determines which value to accept. As the system operates, it learns which internal classification algorithms work better for a specific XML document domain and adjusts its weights (its confidence in each algorithm's predictive ability) accordingly. As a result, the ensemble classifier adapts itself to the specific XML domain, without the need to develop special learners for the unbounded number of domains that XML users have created. We evaluated the system's performance using data from eleven different XML domains. The results show that the ensemble classifier adapted itself to different XML document domains and, for 9 of the 11 domains, produced predictive accuracies as good as or better than the best individual classifier for that domain.
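
The voting and weight-adaptation idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SmartXAutofill implementation: the abstract does not specify the internal classifiers or the weight-update rule, so the class name WeightedVoteEnsemble, the two toy classifiers, and the multiplicative penalty of 0.5 are all assumptions chosen for illustration (the update resembles a generic weighted-majority scheme).

```python
from collections import defaultdict


class WeightedVoteEnsemble:
    """Illustrative weighted-voting ensemble (hypothetical, not SmartXAutofill itself)."""

    def __init__(self, classifiers, penalty=0.5):
        # classifiers: dict of name -> callable(history) returning a proposed value or None
        self.classifiers = classifiers
        self.penalty = penalty  # assumed multiplicative penalty for a wrong proposal
        self.weights = {name: 1.0 for name in classifiers}

    def propose(self, history):
        """Return (winning value, per-classifier proposals) for one empty field."""
        proposals = {name: clf(history) for name, clf in self.classifiers.items()}
        votes = defaultdict(float)
        for name, value in proposals.items():
            if value is not None:
                votes[value] += self.weights[name]  # each vote counts by current weight
        winner = max(votes, key=votes.get) if votes else None
        return winner, proposals

    def feedback(self, proposals, accepted_value):
        """Down-weight every classifier whose proposal differs from the accepted value."""
        for name, value in proposals.items():
            if value != accepted_value:
                self.weights[name] *= self.penalty


# Two toy internal classifiers (assumptions): each looks at the values previously
# entered for the same field in historical documents.
def most_frequent(history):
    return max(set(history), key=history.count) if history else None


def most_recent(history):
    return history[-1] if history else None


history = ["Lawrence", "Lawrence", "Topeka"]
ensemble = WeightedVoteEnsemble(
    {"most_frequent": most_frequent, "most_recent": most_recent}
)

# First round: collect proposals, then record that the user kept "Topeka".
_, proposals = ensemble.propose(history)
ensemble.feedback(proposals, accepted_value="Topeka")

# Second round: the classifier that was right now carries more weight.
second_choice, _ = ensemble.propose(history)
```

In this sketch, the classifier that disagreed with the accepted value loses half its weight, so the ensemble's next prediction for the same field leans toward the classifier that has been more reliable in this domain. A real system would also need a tie-breaking policy and a way to keep weights from collapsing toward zero.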