This paper describes a method of comparing corpora which uses frequency profiling. The method can be used to discover key words that differentiate one corpus from another. Applied to annotated corpora, it can also discover key grammatical or word-sense categories. It offers a quick way in to the differences between corpora and is shown to have applications in the study of social differentiation in the use of English vocabulary, the profiling of learner English, and document analysis in the software engineering process.
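
To make the idea of frequency profiling concrete, here is a minimal sketch in Python. The abstract does not name the statistic used to rank words, so this sketch assumes a log-likelihood score computed from each word's frequency in the two corpora and the corpus sizes. The function names `log_likelihood` and `key_words` are hypothetical, and the whitespace tokenization in the usage note is a simplification.

```python
import math
from collections import Counter


def log_likelihood(a, b, n1, n2):
    """Log-likelihood score for one word (an assumed choice of statistic).

    a, b   -- observed frequency of the word in corpus 1 and corpus 2
    n1, n2 -- total number of tokens in corpus 1 and corpus 2
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = n1 * (a + b) / (n1 + n2)
    e2 = n2 * (a + b) / (n1 + n2)
    ll = 0.0
    # A zero observed count contributes nothing (0 * ln 0 is taken as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def key_words(tokens1, tokens2, top=10):
    """Rank words by how strongly their frequency differs between corpora.

    Returns up to `top` tuples of (score, word, freq_in_1, freq_in_2),
    highest score first; ties are broken alphabetically.
    """
    f1, f2 = Counter(tokens1), Counter(tokens2)
    n1, n2 = len(tokens1), len(tokens2)
    scored = []
    for word in set(f1) | set(f2):
        a, b = f1[word], f2[word]
        scored.append((log_likelihood(a, b, n1, n2), word, a, b))
    scored.sort(key=lambda row: (-row[0], row[1]))
    return scored[:top]
```

As a usage example, `key_words("the cat sat on the mat the cat".split(), "the dog sat on the log the dog".split(), top=2)` ranks `cat` and `dog` first, while words used equally often in both texts, such as `the`, score zero. The same profile could be built over grammatical or word-sense tags instead of word forms by passing tag sequences as the token lists.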