This paper discusses issues in distributing bundled workflows across ubiquitous peer-to-peer networks for music information retrieval. The work is motivated by the DART project, which aims to develop a novel music recommendation system by gathering statistical data through collaborative filtering and through analysis of the audio itself, in order to build a reliable and comprehensive database of the music that people own and listen to. To achieve this, the DART scientists who create the algorithms need to distribute the Triana workflows that represent the analysis across the network on a regular basis (perhaps even daily), so that the network as a whole is updated with the new workflows to execute. DART's approach is similar to that of BOINC but differs in that workers receive their input in the form of a bundled Triana workflow, which they execute to process the MP3 files on their own machines. Once the analysis is complete, the results are returned to DART's distributed database, which collects and aggregates them. DART uses package repositories to decentralise the distribution of these workflow bundles. This paper validates that approach through simulations, which show that suitable scalability is maintained as the number of participants increases. The results illustrate the effectiveness of the approach.
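The worker cycle described above (fetch a workflow bundle from a package repository, run it over local MP3 files, return the results for aggregation) can be sketched roughly as follows. This is a minimal illustration only: every class and method name here (`WorkflowBundle`, `PackageRepository`, `ResultStore`, `Worker`, `run_cycle`) is hypothetical and is not taken from DART, Triana, or BOINC, and a plain Python callable stands in for a bundled Triana workflow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WorkflowBundle:
    """Hypothetical stand-in for a bundled workflow (not the Triana format)."""
    version: int
    analyse: Callable[[str], Dict[str, float]]  # maps a file path to analysis results


@dataclass
class PackageRepository:
    """Hypothetical repository that serves the most recent bundle."""
    bundle: WorkflowBundle

    def latest(self) -> WorkflowBundle:
        return self.bundle


@dataclass
class ResultStore:
    """Hypothetical stand-in for the database that collects results."""
    records: List[Dict[str, float]] = field(default_factory=list)

    def submit(self, results: List[Dict[str, float]]) -> None:
        self.records.extend(results)


@dataclass
class Worker:
    """A peer that owns some local files and runs whatever bundle it receives."""
    local_files: List[str]
    installed_version: int = 0

    def run_cycle(self, repo: PackageRepository, store: ResultStore) -> int:
        """Fetch the latest bundle; if it is new, run it on local MP3s and report."""
        bundle = repo.latest()
        if bundle.version <= self.installed_version:
            return 0  # nothing new to execute in this cycle
        results = [
            bundle.analyse(path)
            for path in self.local_files
            if path.lower().endswith(".mp3")
        ]
        store.submit(results)
        self.installed_version = bundle.version
        return len(results)
```

In this sketch a worker skips any bundle it has already run, so repeated polling of a repository costs nothing until a new version is published. Whether DART tracks versions this way is an assumption made for the example.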