This paper presents an analysis of a year-long usage log of Koders, the first commercially available Internet-scale code search engine (http://www.koders.com). The usage log comprises about ten million activities from more than three million users. Analysis of the usage data shows that, despite attracting a large number of visitors, Koders sees very sparse usage, and many of its users do not use it regularly. Search behavior in Koders shows many patterns similar to those observed in Web search. A topic modeling analysis of the usage data shows which topics Koders users are looking for. Observations on the prevalence of these topics among users, and on how search and download activities vary across topics, lead to the conclusion that the users who find code search engines usable are those who already know, with a high level of specificity, what they are looking for. This paper also presents a general categorization of these topics that provides insight into the different ways code search engine users express their queries. It identifies various forms of queries in the Koders log and the kinds of results those queries address. It also offers several suggestions for improving code search engines, based on the analysis of usage, topics, and query forms. The work presented in this paper is the first of its kind, revealing several insights into the usage of an Internet-scale code search engine.