DAPC example

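Discriminant analysis of principal components (DAPC) is a multivariate method for identifying and describing differences between pre-defined populations (Jombart et al. 2010). The data are first transformed with a principal component analysis (PCA), and a discriminant analysis (DA) is then run on the retained principal components.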

It can be applied to SNP data or to other types of data, such as gene expression.

Figure 1

Fundamental difference between PCA and DA. (a) The diagram shows the essential difference between Principal Component Analysis (PCA) and Discriminant Analysis (DA). Individuals (dots) and groups (colours and ellipses) are positioned on the plane using their values for two variables. In this space, PCA searches for the direction showing the largest total variance (dotted arrow), whereas DA maximizes the separation between groups (plain arrow) while minimizing variation within groups. As a result, PCA fails to discriminate the groups (b), while DA adequately displays group differences. (Jombart et al. 2010)
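
The contrast described in the caption can be reproduced on a small two-variable toy data set. The sketch below is only an illustration in plain Python, not the adegenet implementation used in this example: the coordinates are invented, and the function names (pca_direction, da_direction, separation) are hypothetical. It computes the direction of largest total variance (PCA) and the two-group Fisher discriminant direction (DA), then scores each direction by the ratio of between-group to within-group variation.

```python
import math

# Hypothetical toy data: both groups are spread widely along x
# but are separated from each other along y.
group_a = [(-4, 1.2), (-2, 0.8), (0, 1.1), (2, 0.9), (4, 1.0)]
group_b = [(-4, -1.0), (-2, -1.2), (0, -0.9), (2, -1.1), (4, -0.8)]


def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points):
    """Sums of squares and cross-products about the mean: (sxx, sxy, syy)."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Unit vector along the largest total variance (group labels ignored)."""
    sxx, sxy, syy = scatter(points)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def da_direction(a, b):
    """Two-group Fisher discriminant: w = Sw^-1 (mean_a - mean_b), normalised."""
    sxx, sxy, syy = (u + v for u, v in zip(scatter(a), scatter(b)))
    det = sxx * syy - sxy ** 2
    (ax, ay), (bx, by) = mean(a), mean(b)
    dx, dy = ax - bx, ay - by
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


def separation(direction, a, b):
    """Squared distance between projected group means over within-group scatter."""
    def project(points):
        return [x * direction[0] + y * direction[1] for x, y in points]

    pa, pb = project(a), project(b)
    ma, mb = sum(pa) / len(pa), sum(pb) / len(pb)
    within = sum((v - ma) ** 2 for v in pa) + sum((v - mb) ** 2 for v in pb)
    return (ma - mb) ** 2 / within


pca_dir = pca_direction(group_a + group_b)
da_dir = da_direction(group_a, group_b)
pca_score = separation(pca_dir, group_a, group_b)
da_score = separation(da_dir, group_a, group_b)

print("PCA direction:", pca_dir, "separation:", pca_score)
print("DA direction: ", da_dir, "separation:", da_score)
```

On this toy data the PCA direction lies almost along x, where the total spread is largest but the groups overlap, while the DA direction lies almost along y, where the groups separate. DAPC applies the same discriminant idea to the principal components of the genotype data rather than to the raw variables.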


Run DAPC to discriminate between the nearshore and offshore M. cavernosa samples from the PCA example

cd DAPC_from_vcf


# Run the R script pca_from_snps.R

result