What is Cluster Analysis?
Cluster analysis helps risk managers identify distinct groups (clusters) of similar cases within a portfolio's data. Each group is a subset of observations that is homogeneous: its members share similar variable profiles, and those profiles distinguish the group from the others. In a database with variables in columns and observations in rows, cluster analysis aggregates borrowers based on their variables’ profiles.
Broadly, there are two approaches to implementing cluster analysis: hierarchical (aggregative) clustering and divisive (partitioned) clustering.
In hierarchical clustering, we build a hierarchy of clusters by aggregating them on a case-by-case basis. The result is a tree structure in which the clusters are the leaves and the whole population is the root. Combining clusters begins at the leaves, continues along the branches, and finally reaches the root.
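The leaf-to-root merging described above can be illustrated with a minimal sketch in plain Python. The borrower features (debt-to-income ratio and credit utilisation), the sample values, the function names, and the choice of average linkage with Euclidean distance are all assumptions made for illustration; the text does not prescribe a particular distance or linkage rule.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance over all pairs drawn from clusters a and b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerate(points, n_clusters=1):
    """Merge the two closest clusters until n_clusters remain.

    Returns the final clusters (lists of row indices) and the merge history.
    With n_clusters=1 the merging runs all the way to the root.
    """
    # Leaves: every observation starts as its own cluster.
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
        history.append((a, b))  # one step up the tree
    return sorted(clusters), history


# Hypothetical borrower profiles: (debt-to-income ratio, credit utilisation).
borrowers = [
    (0.10, 0.20),
    (0.12, 0.25),
    (0.11, 0.22),
    (0.80, 0.90),
    (0.85, 0.95),
]

clusters, history = agglomerate(borrowers, n_clusters=2)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

Stopping at two clusters separates the three low-leverage borrowers from the two high-leverage ones. Calling `agglomerate(borrowers)` with the default `n_clusters=1` continues merging until a single cluster, the root, contains the whole population.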
Why is it important?
One critical aspect of default predictive modelling is identifying how many groups the portfolio contains, so that a separate model can be built for each. This matters because every group has its own distinct properties.