Bipartite graph partitioning and data clustering
Proceedings of the tenth international conference on Information and knowledge management
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Domain-independent text segmentation using anisotropic diffusion and dynamic programming
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Topic segmentation with shared topic detection and alignment of multiple documents
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Bipartite isoperimetric graph partitioning for data co-clustering
Data Mining and Knowledge Discovery
Extracting shared topics of multiple documents
PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Towards bipartite graph data management
CloudDB '10 Proceedings of the second international workshop on Cloud data management
Hi-index | 0.00 |
There is enormous amount of multilingual documents from various sources and possibly from different countries describing a single event or a set of related events. It is desirable to construct text mining methods that can compare and highlight similarities and differences of those multilingual documents. We discuss our ongoing research that seeks to model a pair of multilingual documents as a weighted bipartite graph with the edge weights computed by means of machine translation. We use spectral method to identify dense subgraphs of the weighted bipartite graph which can be considered as corresponding to sentences that correlate well in textual contents. We illustrate our approach using English and German texts.