Non-homogeneity (BppSuite Manual 2.2.0)

You can specify a wide range of non-homogeneous models, by combining different options.

3.5.2.1 One-per-branch non-homogeneous models

This option shares the same parameters as the homogeneous case, since the same kind of model is used for each branch. The additional options are the following:

3.5.2.2 General non-homogeneous models

You now have to configure each model individually, using the syntax introduced for the homogeneous case, except that the models are numbered, for instance:

model1 = T92(theta=0.39, kappa=2.79)

An additional option, used as model1.nodes_id in the example of section 3.5.2.3, attaches the model to branches of the tree, each branch being specified by the id of its upper node.

3.5.2.3 Paths among non-homogeneous mixture models

To define constraints for sites between submodels, we can set "paths" that any site must follow. For example, in the following description:

nonhomogeneous = general
nonhomogeneous.number_of_models = 3

model1=T92()
model2=MixedModel(model=T92(kappa=Simple(values=(4,10,20),probas=(0.1,0.5,0.4))))
model3=MixedModel(model=TN93(theta1=Simple(values=(0.1,0.5,0.9),probas=(0.3,0.2,0.5))))

model1.nodes_id=0:1
model2.nodes_id=2:3
model3.nodes_id=4:5

In this case, on branches 2 & 3 a site follows any submodel of model 2 (but the same submodel on both branches), and on branches 4 & 5, a site follows any submodel of model 3 (the same on both branches as well). But there is no constraint between models 2 & 3, which means that a site can follow any submodel of model 2 and any submodel of model 3.

If the user wants a site with T92.kappa=4 in model 2 to have TN93.theta1=0.5 in model 3, a site with T92.kappa=10 in model 2 to have TN93.theta1=0.9 in model 3, and the other cases to be free (here this means that T92.kappa=20 in model 2 is linked with TN93.theta1=0.1 in model 3), then the following declarations can be used:

site.number_of_paths=2
site.path1=model2[T92.kappa_1] & model3[TN93.theta1_2]
site.path2=model2[T92.kappa_2] & model3[TN93.theta1_3]
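
The implicit path left over by these two declarations can be worked out by hand, or with a few lines of Python. The following is only an illustrative sketch, not part of BppSuite: the dictionaries and the helper remaining_submodels are hypothetical names, and the submodel indices are assumed to follow the order of the values=(...) lists in the model declarations.

```python
# Submodel values as declared in model2 and model3 of the example above.
kappa_values = {1: 4, 2: 10, 3: 20}        # model2: T92.kappa_i
theta1_values = {1: 0.1, 2: 0.5, 3: 0.9}   # model3: TN93.theta1_j

# Declared paths, written as (kappa index, theta1 index) pairs:
# site.path1 links kappa_1 with theta1_2, site.path2 links kappa_2 with theta1_3.
declared_paths = [(1, 2), (2, 3)]


def remaining_submodels(values, used):
    """Return the indices of submodels not mentioned in any declared path."""
    return sorted(set(values) - set(used))


free_kappa = remaining_submodels(kappa_values, [k for k, _ in declared_paths])
free_theta1 = remaining_submodels(theta1_values, [t for _, t in declared_paths])

for k, t in declared_paths:
    print(f"declared: kappa={kappa_values[k]} <-> theta1={theta1_values[t]}")
print("implicit:", [kappa_values[k] for k in free_kappa],
      "<->", [theta1_values[t] for t in free_theta1])
```

The only submodels left unused are the third kappa value and the first theta1 value, which therefore end up on the same path.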

A path can also list several submodels of the same model. As a second example, the declaration

site.path1=model2[T92.kappa_1] & model3[TN93.theta1_2] & model3[TN93.theta1_3]

means that a site that has T92.kappa=4 in model 2 has either TN93.theta1=0.5 or TN93.theta1=0.9 in model 3.

Because of these constraints, the probabilities of the submodels are linked. In the first example, the probability of T92.kappa=4 in model 2 equals the probability of TN93.theta1=0.5 in model 3. Since this contradicts the probabilities declared in models 2 and 3 (0.1 and 0.2 respectively), the reference probabilities are those of the first mixed model in numbering order, here model 2. In this case, the probabilities in model 3 may have no use. In the second example, however, the probability of submodel T92.kappa=4 equals the sum of the probabilities of submodels TN93.theta1=0.5 and TN93.theta1=0.9, and the relative proportions of those two submodels in the declaration of model 3 are used to split it. Their respective probabilities are then 0.1*0.2/(0.2+0.5)=0.0286 and 0.1*0.5/(0.2+0.5)=0.0714.
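
The arithmetic of the second example can be reproduced with the short Python sketch below. It is not BppSuite code: the function split_path_probability and the dictionaries are hypothetical names introduced for illustration, and the probabilities are those of the model2 and model3 declarations above.

```python
# Probabilities declared in the example above, keyed by parameter value.
p_kappa = {4: 0.1, 10: 0.5, 20: 0.4}        # model2, the reference mixed model
p_theta1 = {0.1: 0.3, 0.5: 0.2, 0.9: 0.5}   # model3


def split_path_probability(reference_prob, linked_probs):
    """Split the reference probability among the linked submodels,
    in proportion to their declared probabilities."""
    total = sum(linked_probs.values())
    return {name: reference_prob * p / total for name, p in linked_probs.items()}


# Second example: kappa=4 is linked with theta1=0.5 or theta1=0.9.
linked = {t: p_theta1[t] for t in (0.5, 0.9)}
result = split_path_probability(p_kappa[4], linked)

for theta1, prob in result.items():
    print(f"theta1={theta1}: {prob:.4f}")
```

The two resulting probabilities sum to 0.1, the probability of T92.kappa=4 in the reference model.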

Concerning the optimization procedure, this choice may entail the non-identifiability of several parameters (here the probabilities in model 3), so the user should be careful about this.

Here is another example, for mixtures of mixed models, where the submodels are designated by their names:

nonhomogeneous = general
nonhomogeneous.number_of_models = 2

model1=LLG08_UL2()
model2=LLG08_UL3()

site.number_of_paths=2
site.path1=model1[LLG08_UL2.M2] & model2[LLG08_UL3.Q1]
site.path2=model1[LLG08_UL2.M1] & model2[LLG08_UL3.Q2] & model2[LLG08_UL3.Q3]
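
As a reading aid for this syntax, the Python sketch below splits a path declaration into the submodels it allows for each model. It is a hypothetical helper, not BppSuite's own parser, and it only assumes the form seen in the examples of this section: terms written as modelN[submodel] and joined by &.

```python
import re

# One path term, e.g. "model2[LLG08_UL3.Q2]".
TERM = re.compile(r"^(model\d+)\[([^\]]+)\]$")


def parse_path(declaration):
    """Parse 'site.pathN=modelA[sub] & modelB[sub] ...' into {model: [submodels]}."""
    _, _, rhs = declaration.partition("=")
    path = {}
    for term in rhs.split("&"):
        match = TERM.match(term.strip())
        if match is None:
            raise ValueError(f"unrecognized path term: {term!r}")
        model, submodel = match.groups()
        # Several submodels of the same model are alternatives on this path.
        path.setdefault(model, []).append(submodel)
    return path


parsed = parse_path(
    "site.path2=model1[LLG08_UL2.M1] & model2[LLG08_UL3.Q2] & model2[LLG08_UL3.Q3]"
)
print(parsed)
```

Read this way, path2 states that a site following submodel M1 of model 1 follows either Q2 or Q3 of model 2.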

When the nonhomogeneous option is set to one_per_branch, each site is constrained to follow the same submodel from leaves to root.

3.5.2.4 Root frequencies

In case of nonstationary models, the ancestral frequencies are distinct parameters. If a model is assumed to be stationary, the “None” parameter value can be used, which is strictly equivalent to setting nonhomogeneous.stationary=yes.

When the model is a mixture model, there is no single set of equilibrium frequencies. With this option, the root frequencies are therefore set to the average of the equilibrium frequencies of the submodels, weighted by the respective probabilities of the submodels.
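
The averaging can be sketched in Python as follows. The two submodels, their probabilities, and their equilibrium frequencies are made-up values chosen for illustration, and average_root_frequencies is a hypothetical helper, not a BppSuite function.

```python
# Hypothetical mixture: (submodel probability, equilibrium frequencies of A, C, G, T).
submodels = [
    (0.25, [0.1, 0.4, 0.4, 0.1]),
    (0.75, [0.2, 0.3, 0.3, 0.2]),
]


def average_root_frequencies(submodels):
    """Average the submodels' equilibrium frequencies,
    weighting each submodel by its probability."""
    n_states = len(submodels[0][1])
    return [sum(p * freqs[i] for p, freqs in submodels) for i in range(n_states)]


root_freqs = average_root_frequencies(submodels)
print([round(f, 4) for f in root_freqs])
```

Since the submodel probabilities sum to one and each frequency vector sums to one, the averaged root frequencies also sum to one.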

Since version 0.4.0, BppSuite uses the keyval syntax to set up root frequencies.

The frequencies set used can be any of those described below (see Frequencies sets), depending on the alphabet used.
