Query expansion using an immune-inspired biclustering algorithm

Authors:
Pablo A. Castro; Fabrício O. França; Hamilton M. Ferreira; Guilherme Palermo Coelho; Fernando J. Zuben
Affiliations:
Laboratory of Bioinformatics and Bio-inspired Computing (LBiC), Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University ...;Laboratory of Bioinformatics and Bio-inspired Computing (LBiC), Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University ...;Laboratory of Bioinformatics and Bio-inspired Computing (LBiC), Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University ...;Laboratory of Bioinformatics and Bio-inspired Computing (LBiC), Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University ...;Laboratory of Bioinformatics and Bio-inspired Computing (LBiC), Department of Computer Engineering and Industrial Automation (DCA), School of Electrical and Computer Engineering (FEEC), University ...
Venue:
Natural Computing: an international journal
Year:
2010

Abstract

Query expansion is a technique used to improve the performance of information retrieval systems by automatically adding related terms to the initial query. These additional terms can be obtained from documents stored in a database. Usually, this is done by clustering the documents and then extracting representative terms from the clusters. A new search is then performed over the whole database using the expanded set of terms. Recently, the authors proposed an immune-inspired algorithm, BIC-aiNet, to perform biclustering of texts. Biclustering differs from standard clustering in that it can detect partial similarities, that is, similarities restricted to a subset of the attributes. Preliminary results indicated that the proposal groups similar texts effectively and that the generated biclusters consistently contain words relevant for representing a category of texts. Motivated by these promising results, this paper formalizes the proposal more rigorously and investigates the usefulness of the whole methodology on larger datasets. BIC-aiNet was applied to a set of documents to identify the relevant terms associated with each bicluster, giving rise to a query expansion tool. The results were compared with those produced by two alternative proposals from the literature, and they indicate that these techniques tend to generate complementary results, as a consequence of their use of distinct similarity metrics.
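
The pipeline outlined in the abstract (group documents and terms, extract the terms tied to each group, then search the whole database again with the expanded query) can be illustrated with a small Python sketch. It is not BIC-aiNet: the biclusters here are written by hand rather than discovered by an immune-inspired search, and the names `Bicluster`, `expand_query`, and `search`, the overlap-based scoring rule, and the toy corpus are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bicluster:
    """A subset of documents paired with the subset of terms they share."""
    docs: frozenset   # document ids covered by the bicluster
    terms: frozenset  # terms on which those documents are similar


def expand_query(query, biclusters, max_new_terms=3):
    """Append terms that share a bicluster with at least one query term.

    Assumed scoring rule: a candidate term gains one point per query term
    found in each bicluster that also contains the candidate.
    """
    query_terms = set(query)
    scores = {}
    for bic in biclusters:
        overlap = len(query_terms & bic.terms)
        if overlap == 0:
            continue  # this bicluster is unrelated to the query
        for term in bic.terms - query_terms:
            scores[term] = scores.get(term, 0) + overlap
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return list(query) + ranked[:max_new_terms]


def search(terms, corpus):
    """Return ids of documents matching any term, best match first."""
    wanted = set(terms)
    hits = {doc_id: len(wanted & words) for doc_id, words in corpus.items()}
    return sorted((d for d, n in hits.items() if n > 0),
                  key=lambda d: (-hits[d], d))


# Toy corpus: document id -> set of terms (hypothetical data).
corpus = {
    "d1": {"immune", "network", "antibody", "clone"},
    "d2": {"antibody", "clone", "mutation"},
    "d3": {"query", "retrieval", "index"},
    "d4": {"retrieval", "index", "ranking"},
}

# Hand-written biclusters standing in for the output of a biclustering step.
biclusters = [
    Bicluster(frozenset({"d1", "d2"}), frozenset({"antibody", "clone"})),
    Bicluster(frozenset({"d3", "d4"}), frozenset({"retrieval", "index"})),
]

expanded = expand_query(["retrieval"], biclusters)
results = search(expanded, corpus)
```

Note that each bicluster covers only the terms its documents have in common (for example, `d1` and `d2` agree on `antibody` and `clone` but not on `network` or `mutation`), which is the kind of partial similarity that distinguishes biclustering from clustering on all attributes.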