Statistical analysis of anatomical trees is difficult because the trees differ in topological structure. In this paper we define statistical properties of leaf-labeled anatomical trees with geometric edge attributes by treating the trees as points in the geometric space of leaf-labeled trees. This tree-space is a geodesic metric space in which any two trees are connected by a unique shortest path, which corresponds to a tree deformation. However, tree-space is not a manifold, so the usual strategy of performing statistical analysis in a tangent space and projecting back onto tree-space is not available. Using tree-space and its shortest paths, a variety of statistical notions, such as means, principal components, hypothesis tests and linear discriminant analysis, can be defined. For some of these it is still an open problem how to compute them; others, like the mean, can be computed, but efficient alternatives help speed up algorithms that use means iteratively, such as hypothesis testing. We take advantage of a very large dataset (N=8016) to obtain computable approximations, under the assumption that the data trees parametrize the relevant parts of tree-space well. Using the resulting approximate statistics, we illustrate how the structure and geometry of airway trees vary across a population and show that airway trees with Chronic Obstructive Pulmonary Disease come from a different distribution in tree-space than healthy ones. Software is available from http://image.diku.dk/aasa/software.php.
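To make concrete why hypothesis testing "uses means iteratively", the following is a minimal sketch of a generic two-sample permutation test on the distance between group means. It is not the paper's implementation: the function names (`permutation_test`, `euclidean_mean`, `euclidean_dist`) are hypothetical, and plain Euclidean vectors stand in for trees. In a tree-space setting, `mean_fn` and `dist_fn` would be replaced by a tree-space mean (or an approximation of it) and the geodesic distance.

```python
import math
import random


def euclidean_mean(points):
    """Coordinate-wise mean; an assumed stand-in for a mean in tree-space."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]


def euclidean_dist(x, y):
    """Euclidean distance; an assumed stand-in for the geodesic distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def permutation_test(group_a, group_b, mean_fn=euclidean_mean,
                     dist_fn=euclidean_dist, n_perm=999, seed=0):
    """Return (observed statistic, p-value) for the distance between group means.

    Each permutation recomputes two means, which is why a fast (possibly
    approximate) mean matters when the mean itself is expensive.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = dist_fn(mean_fn(group_a), mean_fn(group_b))
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels by shuffling the pooled sample.
        rng.shuffle(pooled)
        stat = dist_fn(mean_fn(pooled[:n_a]), mean_fn(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

With `n_perm` permutations the test calls `mean_fn` a total of `2 * n_perm + 2` times, so any speed-up in the mean computation carries over almost directly to the whole test.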