A method for online analytical processing of text data

Authors:
Akihiro Inokuchi; Koichi Takeda
Affiliations:
Osaka University, Ibaraki, Japan; IBM Japan, Yamato, Japan
Venue:
Proceedings of the sixteenth ACM Conference on Information and Knowledge Management
Year:
2007


Abstract

There are increasingly visible demands for structured/unstructured information integration and advanced analytics. However, conventional database technology has not provided a robust and practical implementation of a truly integrated architecture for such purposes. After working on several industrial applications (in particular, in the healthcare and life sciences area), we have identified fundamental issues and technical approaches to address them. In this paper, we propose data representations and algebraic operations for integrating semantic information (e.g., ontologies) into OLAP systems, which allow us to analyze a huge set of textual documents together with their underlying semantic information. The performance of the prototype implementation has been evaluated using real-world datasets, and the high scalability and flexibility of our approach have been confirmed with respect to computation time.