International Journal of Computer Vision
The functional state of an organism is determined largely by the pattern of expression of its genes. The analysis of gene expression data from gene chips has primarily revolved around clustering and classification of the data using machine learning techniques based on the intensity of expression alone, with the time-varying pattern mostly ignored. In this paper, we present a pattern recognition-based approach to capturing similarity by finding salient changes in the time-varying expression patterns of genes. Such changes can give clues about important events, such as gene regulation by cell-cycle phases, or even signal the onset of a disease. Specifically, we observe that dissimilarity between time series is revealed by the sharp twists and bends produced in a higher-dimensional curve formed from the constituent signals. Scale-space analysis is used to detect the sharp twists and turns, and their relative strength with respect to the component signals is estimated to form a shape similarity measure between time profiles. A clustering algorithm is presented to cluster gene profiles using the scale-space distance as a similarity metric. Multi-dimensional curves formed from time series within clusters are used as cluster prototypes, or indexes, to the gene expression database, and are used to retrieve the genes that are functionally similar to a query gene profile. An extensive comparison of clustering with the scale-space distance against clustering with the traditional Euclidean distance is presented on the yeast genome database.
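The central observation, that two dissimilar time profiles produce sharp turns in the curve traced by plotting one against the other, can be illustrated with a minimal sketch. This is not the algorithm of the paper, whose details the abstract does not give. It assumes two equally sampled profiles, Gaussian smoothing at a few hand-picked scales, and a turn measure based on the change in undirected segment orientation. The names `gaussian_smooth`, `sharp_turn_strength`, and `scale_space_distance`, and the parameters `sigmas` and `min_turn`, are hypothetical.

```python
import math


def gaussian_smooth(values, sigma):
    """Smooth a sequence with a truncated Gaussian kernel (indices clamped at the edges)."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(weights)
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, weights):
            j = min(max(i + k, 0), n - 1)  # clamp the index at the boundaries
            acc += w * values[j]
        smoothed.append(acc / total)
    return smoothed


def sharp_turn_strength(xs, ys, min_turn):
    """Sum the orientation changes of at least min_turn along the curve (xs[t], ys[t])."""
    angles = []
    for i in range(len(xs) - 1):
        dx, dy = xs[i + 1] - xs[i], ys[i + 1] - ys[i]
        if math.hypot(dx, dy) > 1e-12:  # skip degenerate, zero-length segments
            angles.append(math.atan2(dy, dx))
    total = 0.0
    for prev, cur in zip(angles, angles[1:]):
        turn = abs(cur - prev) % math.pi
        # Assumption: use the undirected orientation change, folded into [0, pi/2],
        # so a curve that merely reverses along the same line registers no turn.
        turn = min(turn, math.pi - turn)
        if turn >= min_turn:
            total += turn
    return total


def scale_space_distance(a, b, sigmas=(1.0, 2.0), min_turn=0.1):
    """Average sharp-turn strength of the curve (a, b) over several smoothing scales."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of samples")
    strengths = [
        sharp_turn_strength(gaussian_smooth(a, s), gaussian_smooth(b, s), min_turn)
        for s in sigmas
    ]
    return sum(strengths) / len(strengths)


if __name__ == "__main__":
    profile = [math.sin(2 * math.pi * i / 8) for i in range(32)]
    shifted = [math.cos(2 * math.pi * i / 8) for i in range(32)]
    print("same profile:   ", scale_space_distance(profile, profile))
    print("phase-shifted:  ", scale_space_distance(profile, shifted))
```

Under these assumptions, two profiles that rise and fall together trace a curve that stays on a single line and so accumulate essentially no turn strength, while profiles that change at different times bend the curve and score higher. A value of this kind could then serve as the pairwise distance passed to a clustering routine in place of the Euclidean distance.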