Hierarchical clustering: Difference between revisions

From Citizendium
{{subpages}}
'''Hierarchical clustering''' is a branch of [[cluster analysis]] that treats clusters hierarchically, i.e. as a set of nested levels. The hierarchy can be constructed using two major approaches, or combinations thereof: in agglomerative hierarchical clustering (a [[bottom-up]] approach), existing clusters are merged [[iteration|iteratively]], while divisive hierarchical clustering (a [[top-down]] approach) starts with all data in one cluster, which is then split iteratively. At each step, a mathematical measure of [[distance]] or [[similarity]] is computed between clusters (agglomerative) or within clusters (divisive) to determine how to merge or split. Several different distance and similarity measures can be used, and these generally yield different hierarchies (especially for agglomerative clusterings, which start out from local information only), thus complicating their interpretation. Nonetheless, hierarchical clustering is more intuitive to interpret than [[flat clustering]], and so it enjoys considerable popularity for multivariate analysis of data, e.g. of [[gene]] or [[protein]] [[sequence]]s.
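The agglomerative (bottom-up) procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not an optimized implementation: it uses Euclidean distance between points and single linkage (the minimum pairwise distance between two clusters) as the between-cluster measure, and all function names and the sample data are chosen for this example only. It records the cluster configuration at every level of the hierarchy, from one cluster per point up to a single all-inclusive cluster.

```python
from itertools import combinations

def euclidean(a, b):
    # Euclidean distance between two points given as coordinate tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(c1, c2, dist=euclidean):
    # Single linkage: distance between two clusters is the minimum
    # distance over all pairs of points drawn from each cluster.
    return min(dist(p, q) for p in c1 for q in c2)

def agglomerative(points, dist=euclidean, linkage=single_linkage):
    """Merge the two closest clusters until one remains.

    Returns the list of cluster configurations, one per level of the
    hierarchy (each cluster is a tuple of point tuples).
    """
    clusters = [[p] for p in points]            # start: one cluster per point
    levels = [[tuple(c) for c in clusters]]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]], dist))
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
        levels.append([tuple(c) for c in clusters])
    return levels

# Illustrative 1-D data: two tight pairs and one outlier.
levels = agglomerative([(0.0,), (0.1,), (5.0,), (5.2,), (9.9,)])
```

Choosing a different linkage (e.g. maximum pairwise distance, "complete linkage") would generally produce a different merge order and hence a different hierarchy, which is exactly the interpretation caveat noted above.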

Revision as of 05:21, 13 November 2009
