Large amounts of research data are produced and made publicly available in digital libraries. An important category is bivariate data (measurements of one variable versus another). Examples include observations of temperature and ozone levels (e.g., in environmental observation), domestic production and unemployment (e.g., in economics), or education and income levels (e.g., in the social sciences). For accessing these data, content-based retrieval is an important query modality. It allows researchers to search for specific relationships among data variables (e.g., quadratic dependence of temperature on altitude). However, such retrieval remains a challenge, as it is not clear which similarity measures to apply. Various approaches have been proposed, yet no benchmarks for comparing their retrieval effectiveness have been defined. In this paper, we construct a benchmark for retrieval of bivariate data. It is based on a large collection of bivariate research data. To define similarity classes, we use category information annotated by domain experts. The resulting similarity classes are used to compare several recently proposed content-based retrieval approaches for bivariate data by means of precision and recall. This study is the first to present an encompassing benchmark data set and to compare the performance of the respective techniques. We also identify potential research directions based on the results obtained for bivariate data. The benchmark and the implementations of the similarity functions are made available to foster research in this emerging area of content-based retrieval.
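As an illustration of the evaluation described above, the following minimal sketch computes precision and recall for a single query against annotated similarity classes. It is not the benchmark's own implementation: the function name `precision_recall_at_k`, its parameters, and the example class labels are hypothetical, and it assumes that a retrieved item counts as relevant exactly when it carries the same class label as the query.

```python
from typing import Hashable, Sequence, Tuple


def precision_recall_at_k(
    ranked_labels: Sequence[Hashable],
    query_label: Hashable,
    class_size: int,
    k: int,
) -> Tuple[float, float]:
    """Precision and recall over the top-k results of one query.

    ranked_labels: class labels of the retrieved items, best match first
                   (assumed to exclude the query item itself).
    query_label:   class label of the query item.
    class_size:    number of relevant items in the collection, i.e. members
                   of the query's class other than the query itself.
    k:             cutoff rank.
    """
    if k <= 0 or class_size <= 0:
        raise ValueError("k and class_size must be positive")

    top_k = ranked_labels[:k]
    # A hit is a retrieved item from the same similarity class as the query.
    hits = sum(1 for label in top_k if label == query_label)

    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / class_size
    return precision, recall


# Hypothetical ranking for a query from a class labelled "quadratic",
# where the collection holds 4 other items of that class.
ranked = ["quadratic", "linear", "quadratic", "periodic", "quadratic"]
precision, recall = precision_recall_at_k(ranked, "quadratic", class_size=4, k=3)
print(f"precision@3 = {precision:.3f}, recall@3 = {recall:.3f}")
```

Averaging these values over all queries and sweeping `k` would give one precision-recall curve per similarity function, which is one common way to compare retrieval approaches on a class-labelled collection.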