In this project, we are developing new text processing tools that help people perform advanced analysis of large collections of text commentary. This task is increasingly faced by the United States federal government's regulation writers, who formulate the rules and regulations that define the details of laws enacted by Congress. Our research focuses on text clustering, text search using information retrieval, near-duplicate detection, opinion identification, stakeholder characterization, and extractive summarization, as well as the impact of such tools on the rulemaking process itself. Versions of a Rule-Writer's Workbench will be built by computer science researchers at ISI and CMU, deployed annually for experimental use by our government partners, and evaluated by social science researchers from the Library and Information Science department at the University of Pittsburgh and the Sociology department at the University of San Francisco. This three-year project started in October 2004 and is funded under the National Science Foundation's Digital Government program.