

3.3 QTL Clustering

Here we address the following question: how many “real” QTL do the QTL detected in the different mapping experiments represent (one, two, three, four, ... or as many as were detected across the studies)? The meta-analysis of QTL can therefore be viewed as a clustering problem, and MetaQTL implements two kinds of clustering algorithm for it. Whichever procedure is used, the QTL locations are assumed to be normally distributed around their true locations, with variances derived from the reported CI or r-square values. This unbiased Gaussian approximation follows from the classical asymptotic normality of maximum-likelihood parameter estimates.
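
As an illustration of how a variance can be recovered from a reported CI under the Gaussian assumption, here is a minimal Python sketch. It assumes the CI is a symmetric two-sided interval at a known confidence level (95% by default). The function name sd_from_ci and its parameters are hypothetical, and the conversion actually applied by MetaQTL depends on the --cimode setting, so this may differ from what the software does.

```python
from statistics import NormalDist


def sd_from_ci(ci_start, ci_end, level=0.95):
    """Standard deviation of a Gaussian location estimate, assuming the
    reported CI is a symmetric two-sided interval at the given level."""
    if ci_end <= ci_start:
        raise ValueError("ci_end must be greater than ci_start")
    # Two-sided Gaussian quantile (about 1.96 for a 95% interval).
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # The interval spans z standard deviations on each side of the estimate.
    return (ci_end - ci_start) / (2.0 * z)


# Hypothetical QTL whose reported 95% CI runs from 40 cM to 60 cM.
sd = sd_from_ci(40.0, 60.0)
variance = sd ** 2
```

The variance used for clustering is then the square of this standard deviation.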

3.3.1 ClustQTL

3.3.1.1 Method

ClustQTL implements a clustering procedure based on a Gaussian mixture model whose parameter estimates are obtained with an EM algorithm.
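
The idea can be sketched in Python as follows. This is not the ClustQTL implementation: it assumes that each QTL keeps its own known standard deviation, that only the cluster means and mixing weights are estimated, and that the algorithm starts from evenly spaced means rather than from random starting points. The function name em_known_variances and all parameter names are hypothetical. The eps argument plays the role of a convergence threshold, in the spirit of --emeps.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_known_variances(positions, sds, k, n_iter=200, eps=1e-8):
    """Fit k cluster means and weights; each QTL keeps its own known sd."""
    n = len(positions)
    lo, hi = min(positions), max(positions)
    # Deterministic start: means spread evenly over the observed range.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    weights = [1.0 / k] * k
    resp = []
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each QTL belongs to each cluster.
        resp = []
        ll = 0.0
        for x, s in zip(positions, sds):
            dens = [w * normal_pdf(x, m, s) for w, m in zip(weights, means)]
            # Guard against underflow for a QTL far from every cluster mean.
            total = max(sum(dens), 1e-300)
            ll += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: update the weights, then the precision-weighted means.
        for j in range(k):
            rj = [resp[i][j] for i in range(n)]
            weights[j] = sum(rj) / n
            num = sum(r * x / s ** 2 for r, x, s in zip(rj, positions, sds))
            den = sum(r / s ** 2 for r, s in zip(rj, sds))
            if den > 0:
                means[j] = num / den
        # Stop when the log-likelihood no longer improves noticeably.
        if abs(ll - prev_ll) < eps:
            break
        prev_ll = ll
    return means, weights, resp


# Hypothetical positions (cM) and standard deviations for six QTL.
positions = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]
sds = [2.0, 3.0, 2.5, 2.0, 3.0, 2.5]
means, weights, resp = em_known_variances(positions, sds, k=2)
```

Because the variances are treated as known, each cluster mean is an average of the member positions weighted by their precision (1/variance), so precisely located QTL pull the cluster position more strongly than poorly located ones.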

3.3.1.2 Command Line Options

Option        Usage     Type     Explanation
-q,--qtlmap   required  string   The map with the QTL to cluster (XML format).
-o,--output   required  string   The output file stem.
-t,--tonto    optional  string   The trait ontology.
-k,--kmax     optional  integer  The maximal number of clusters.
-c,--chr      optional  string   The name of the chromosome on which to perform the meta-analysis.
--cimode      optional  integer  The CI computation mode.
--cimiss      optional  integer  The imputation mode for missing CI.
--emrs        optional  integer  The number of random starting points for the EM algorithm.
--emeps       optional  double   The convergence threshold for the EM algorithm.

The option --cimode controls the mode of computation of the variances of the QTL. There are four modes:

The option --cimiss defines how to handle QTL for which no variance can be computed. There are two possibilities:

3.3.1.3 Output

The output of ClustQTL is divided into 3 plain text files:

3.3.2 QTLTree

3.3.2.1 Method

Another way to cluster the observed QTL is to use standard hierarchical clustering procedures. QTLTree implements two kinds of hierarchical clustering algorithm:

3.3.2.2 Command Line Options

Option        Usage     Type     Explanation
-q,--qtlmap   required  string   The map with the QTL to cluster (XML format).
-o,--output   required  string   The output file.
-m,--mode     optional  integer  The clustering mode (default is 2).
-t,--tonto    optional  string   The trait ontology.
--cimode      optional  integer  The variance computation mode.
--cimiss      optional  integer  The imputation mode for missing variances.

The option -m (or --mode) lets the user switch between the two available clustering algorithms:

The options --cimode and --cimiss work as for ClustQTL.

3.3.2.3 Output

The output of QTLTree consists of a single plain text file, organized as follows:

Identifier  Value
CR          The name of the linkage group.
TR          The name of the trait, followed by the number of related QTL on the chromosome.
QT          A QTL involved in the clustering: its identifier, its name, its most probable position on the chromosome, and its estimated standard deviation.
HC          The tree obtained by the clustering algorithm, in Newick format.

For example,

     
     CR 10
     TR FT 16
     QT 0 Ribaut_1996_DPS_6 8.02 7
     QT 1 Bohn_2000_DPS_12 51.68 4.87
     QT 2 Poupard_2001_DPS_13 40.02 3.65
     QT 3 Mechin_2001_HT_5 71 4.26
     QT 4 Lubberstedt_1997_HT_20 59.14 3.65
     QT 5 Groh_1998_HT_7 100.01 12.2
     QT 6 qplht127 52.39 5.25
     QT 7 Rebai_1997_SD_5 66.51 5.17
     QT 8 Blanc_DFflofch10 61.57 2.55
     QT 9 Rebai_1997_SD_25 54.5 12.46
     QT 10 Rebai_1997_SD_19 62.14 10.64
     QT 11 Blanc_FXflofch10 58.17 3.57
     QT 12 Ribaut_1996_SD_6 6.78 10.83
     QT 13 Rebai_1997_SD_33 59.96 9.73
     QT 14 Rebai_1997_SD_12 49.04 14.59
     QT 15 Blanc_SFflofch10 53.13 3.32
     HC ((0:0.16,12:0.16):87.85,((((((1:0.24,((6:0.06,15:0.06):0.11,9:0.11):0.24):0.4,14:0.4)
     HC :7.43,(((4:0.04,13:0.04):0.16,11:0.16):1.11,(8:0.01,10:0.01):1.11):7.43):15.64,
     HC (3:2.43,7:2.43):15.64):24.56,5:24.56):40.9,2:40.9):87.85);
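
A file in this layout can be read back with a few lines of Python. The sketch below is not part of MetaQTL; the function name parse_qtltree and the dictionary keys are hypothetical. It assumes that QTL names contain no whitespace, that every TR record starts a new clustering, and that a tree split over several HC lines is rebuilt by concatenating them in order. The sample values are illustrative only.

```python
def parse_qtltree(text):
    """Parse CR/TR/QT/HC records into one dict per (chromosome, trait)."""
    blocks = []
    chromosome = None
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag, _, rest = line.partition(" ")
        rest = rest.strip()
        if tag == "CR":
            chromosome = rest
        elif tag == "TR":
            # The trait name is followed by the number of related QTL.
            name, count = rest.rsplit(None, 1)
            current = {"chromosome": chromosome, "trait": name,
                       "n_qtl": int(count), "qtl": [], "newick": ""}
            blocks.append(current)
        elif current is None:
            continue  # ignore records that appear before any TR line
        elif tag == "QT":
            ident, name, position, sd = rest.split()
            current["qtl"].append({"id": int(ident), "name": name,
                                   "position": float(position),
                                   "sd": float(sd)})
        elif tag == "HC":
            # A long tree is wrapped over several HC lines; join them.
            current["newick"] += rest
    return blocks


# Small illustrative sample in the same layout (values are made up).
SAMPLE = """
CR 10
TR FT 3
QT 0 qtl_a 8.02 7
QT 1 qtl_b 51.68 4.87
QT 2 qtl_c 40.02 3.65
HC ((1:0.5,2:0.5)
HC :3.0,0:3.0);
"""

blocks = parse_qtltree(SAMPLE)
tree = blocks[0]["newick"]
```

The leaf labels in the Newick string are the QTL identifiers from the QT records, so the "qtl" list can be used to map each leaf back to its name and position.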