Virtually everyone, from large Web companies and traditional enterprises to physical science researchers and social scientists, is either already experiencing or anticipating unprecedented growth in the amount of data available in their world, along with new opportunities and great untapped value. This paper reviews big data challenges from a data management perspective. In particular, we discuss big data diversity, big data reduction, big data integration and cleaning, big data indexing and querying, and finally big data analysis and mining. Our survey gives a brief overview of big-data-oriented research and problems.