Web pages often embed scripts for a variety of purposes, including advertising and dynamic interaction. Understanding these scripts and their purpose can help interpret a web page or provide crucial information about it. We have developed a functionality-based categorization of JavaScript, the most widely used web page scripting language. We then treat understanding embedded scripts as a text categorization problem. We show how traditional information retrieval methods can be augmented with features distilled from domain knowledge of JavaScript and from software analysis to improve classification performance. We perform experiments on the standard WT10G web page corpus, and show that our techniques eliminate over 50% of the errors made by a standard text classification baseline.
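As a rough illustration of the general idea, and not the implementation used in the experiments, the sketch below combines plain bag-of-words token counts with a few extra features derived from JavaScript-specific patterns. The pattern list, the feature names, and the `extract_features` helper are all assumptions made for this example; the abstract does not list the actual features.

```python
import re
from collections import Counter

# Hypothetical domain cues chosen for illustration only;
# the source does not say which features were actually used.
DOMAIN_PATTERNS = {
    "dom:document.write": r"\bdocument\.write\b",
    "dom:window.open": r"\bwindow\.open\b",
    "dom:event_handler": r"\bon(?:click|load|mouseover)\b",
}


def extract_features(script: str) -> Counter:
    """Return bag-of-words counts plus counts for matching domain patterns."""
    # Baseline text features: lowercased identifier-like tokens.
    features = Counter(t.lower() for t in re.findall(r"[A-Za-z_]\w*", script))
    # Augmentation: add one feature per domain pattern that occurs.
    for name, pattern in DOMAIN_PATTERNS.items():
        hits = len(re.findall(pattern, script))
        if hits:
            features[name] = hits
    return features


sample = 'window.open("ad.html"); document.write("<b>hi</b>");'
features = extract_features(sample)
```

The resulting counts could be passed to any standard text classifier; the `dom:` prefix simply keeps the added features from colliding with ordinary tokens.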