Record label companies would like to identify promising artists as early as possible in their careers, before other companies approach them with competing contracts. The vast number of candidates makes identifying those with high success potential time-consuming and laborious. This paper demonstrates how data mining of P2P query strings can be used to mechanize most of this detection process. Using a unique intercepting system over the Gnutella network, we were able to capture an unprecedented number of geographically identified (geo-aware) queries, allowing us to investigate the diffusion of music-related queries in time and space. Our solution is based on the observation that emerging artists, especially rappers, have a discernible stronghold of fans in their hometown area, where they are able to perform and market their music. In a file-sharing network, this is reflected as a spatial distribution of content queries that resembles a delta function: nearly all queries for the artist originate from a single location. Using this observation, we devised a detection algorithm for emerging artists that looks for performers with a sharp increase in popularity in a small geographic region who are still unnoticeable nationwide. The algorithm can suggest a short list of artists with breakthrough potential, of whom we showed that about 30% translate that potential into national success.
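The detection idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. The function name `find_emerging_artists`, the `(artist, region, week)` tuple format, and every threshold value are assumptions chosen for illustration only. The sketch flags an artist when queries are concentrated in one region, that region's query count has grown sharply since the previous week, and the artist's total query volume is still small.

```python
from collections import Counter, defaultdict


def find_emerging_artists(queries, week, min_local_share=0.6,
                          min_growth=2.0, max_national=1000, min_local=5):
    """Flag artists with a sharp, geographically concentrated rise in queries.

    queries: iterable of (artist, region, week) tuples (hypothetical format).
    All thresholds are illustrative assumptions, not values from the paper.
    """
    now = defaultdict(Counter)      # artist -> region -> count in `week`
    before = defaultdict(Counter)   # artist -> region -> count in `week - 1`
    for artist, region, w in queries:
        if w == week:
            now[artist][region] += 1
        elif w == week - 1:
            before[artist][region] += 1

    flagged = []
    for artist, by_region in now.items():
        total = sum(by_region.values())
        # Region that issues the most queries for this artist.
        region, local = by_region.most_common(1)[0]
        # Growth relative to last week; a zero baseline is treated as 1.
        growth = local / max(before[artist][region], 1)
        concentrated = local / total >= min_local_share
        if (local >= min_local and concentrated
                and growth >= min_growth and total <= max_national):
            flagged.append((artist, region, local, growth))

    # Strongest local growth first.
    flagged.sort(key=lambda item: item[3], reverse=True)
    return flagged


sample = (
    [("A", "Atlanta", 1)] * 3 + [("A", "Atlanta", 2)] * 9
    + [("A", "Boston", 2)]
    + [("B", "Atlanta", 2)] * 5 + [("B", "Boston", 2)] * 5
    + [("B", "Chicago", 2)] * 5
)
result = find_emerging_artists(sample, week=2)
```

In this toy data, artist "A" is queried almost exclusively from one region and has tripled its local volume, so it is flagged. Artist "B" is spread evenly across three regions and is not. A real system would also need to normalize by each region's overall query volume, which this sketch omits.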