srvbatch-tutorial [2016/04/13 18:02]
srvbatch-tutorial [2016/04/13 18:02] (current)
Line 1: Line 1:
 +===== Simulate Rare Variants Data =====
 +This tutorial covers simulation of DNA sequence data under the following demographic models, each listed with the number of haplotypes it produces:
 +
 + *  Demographic model based on Scott H. Williamson 2005 NIEHS SNPs data and Adam Eyre-Walker 2006 Environmental Genome Project (EGP) data
 +   *  num. haplotypes = 51,340 x 2 = 102,680
 + *  Boyko 2008 European demographic model
 +   *  num. haplotypes = 52,907 x 2 = 105,814
 + *  Boyko 2008 African demographic model
 +   *  num. haplotypes = 25,636 x 2 = 51,272
 + *  Kryukov 2009 European demographic model
 +   *  num. haplotypes = 900,000 x 2 = 1,800,000
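The haplotype counts in the list are simply twice the final population size of each model. A minimal sketch of that arithmetic, assuming the final size is the last entry of each `--N` list used in the commands of this tutorial (the helper name `hap_count` is hypothetical):

```bash
# Hypothetical helper: number of haplotypes for a diploid population of size $1
hap_count() {
  echo $(( $1 * 2 ))
}

# Final population sizes, taken from the last entry of each --N list in this tutorial
for n in 51340 52907 25636 900000; do
  printf '%s individuals -> %s haplotypes\n' "$n" "$(hap_count "$n")"
done
```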
 +
 +The proportions of synonymous and protective variants can be customized; note that the selection coefficients then have to be adjusted accordingly. The other common parameters are set as follows for all commands in this tutorial:
 +
 +<code bash>
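 +lenI=1800   # first value passed to --regRange; lengths of interest: 1800, 5000, 10000
 +lenII=1800  # second value passed to --regRange
 +fn="1800"   # suffix appended to output and log file names
 +reps=200    # value passed to --numReps
 +</code>\\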
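The comment on `lenI` mentions three region lengths of interest. One way to cover all of them is a loop that resets the shared variables per length; the sketch below only prints the settings it would use. The helper name `region_settings` is hypothetical, and setting `lenI` and `lenII` to the same value is an assumption carried over from the block above.

```bash
reps=200

# Hypothetical helper: print the shared settings for one region length
region_settings() {
  printf 'lenI=%s lenII=%s fn="%s" reps=%s\n' "$1" "$1" "$1" "$reps"
}

for len in 1800 5000 10000; do
  region_settings "$len"
done
```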
 +Boyko 2008 European population model, 37% synonymous variants:
 +
 +<code bash>
 +spower simulate --regRange=[$lenI,$lenII] --fileName="Boyko2008European$fn" --numReps=$reps --N=[7947,7947,262,7019,7019,52907,52907] --G=[5000,84,1,5217,1,576] --mutationModel='finite_sites' --mu=1.8e-08 --selModel='multiplicative' --selDist='Boyko_2008_European' --steps=[100] --verbose=0 --revertFixedSites=False > Boyko2008European$fn.log
 +</code>\\
 +Boyko 2008 African population model, 37% synonymous variants:
 +
 +<code bash>
 +spower simulate --regRange=[$lenI,$lenII] --fileName="Boyko2008African$fn" --numReps=$reps --N=[7778,7778,25636,25636] --G=[5000,1,6809] --mutationModel='finite_sites' --mu=1.8e-08 --selModel='multiplicative' --selDist='Boyko_2008_African' --steps=[100] --verbose=0 --revertFixedSites=False > Boyko2008African$fn.log
 +</code>\\
 +Demographic model based on Scott H. Williamson 2005 NIEHS SNPs data and Adam Eyre-Walker 2006 Environmental Genome Project (EGP) data:
 +
 +<code bash>
 +spower simulate --regRange=[$lenI,$lenII] --fileName="Eyre2006Williamson$fn" --numReps=$reps --N=[8211,8211,51340,51340] --G=[5000,1,908] --mutationModel='finite_sites' --mu=1.8e-08 --selModel='multiplicative' --selDist='Eyre-Walker_2006' --steps=[100] --verbose=0 --revertFixedSites=False > Eyre2006Williamson$fn.log
 +</code>\\
 +Kryukov 2009 European population model, 37% synonymous variants:
 +
 +<code bash>
 +spower simulate --regRange=[$lenI,$lenII] --fileName="Kryukov2009European$fn" --numReps=$reps --N=[8100,8100,7900,900000] --G=[5000,10,370] --mutationModel='finite_sites' --mu=1.8e-08 --selModel='multiplicative' --selDist='Kyrukov_2009_European' --steps=[100] --verbose=0 --revertFixedSites=False > Kryukov2009European$fn.log
 +</code>\\
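The four commands above differ only in the file-name stem, the `--N` and `--G` lists, and the `--selDist` name. A minimal sketch of a wrapper that makes this explicit is given here; the function name `spower_cmd` is hypothetical, and it only prints the command line (using the same flags as above) so that it can be reviewed before being run.

```bash
lenI=1800; lenII=1800; fn="1800"; reps=200

# Hypothetical helper: print one "spower simulate" command line
# $1 file-name stem, $2 --N values, $3 --G values, $4 --selDist name
spower_cmd() {
  echo spower simulate "--regRange=[$lenI,$lenII]" "--fileName=$1$fn" \
    "--numReps=$reps" "--N=[$2]" "--G=[$3]" \
    "--mutationModel=finite_sites" "--mu=1.8e-08" \
    "--selModel=multiplicative" "--selDist=$4" \
    "--steps=[100]" "--verbose=0" "--revertFixedSites=False"
}

spower_cmd Boyko2008African 7778,7778,25636,25636 5000,1,6809 Boyko_2008_African
spower_cmd Eyre2006Williamson 8211,8211,51340,51340 5000,1,908 Eyre-Walker_2006
```

Printing first keeps the sketch free of side effects; the printed line can then be pasted into the shell with the log redirection used in the commands above.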
 +Kryukov 2009 European population model with a mixture of protective and deleterious variants: 20% protective, 43% deleterious and 37% synonymous:
 +
 +<code bash>
 +spower simulate --regRange=[$lenI,$lenII] --fileName="Kryukov2009EuropeanProtective$fn" --numReps=$reps --N=[8100,8100,7900,900000] --G=[5000,10,370] --mutationModel='finite_sites' --mu=1.8e-08 --selModel='multiplicative' --selDist='mixed_gamma' --selCoef=[0.37,0.0,0.43,0.184,0.320,0.184,0.320,0.5] --steps=[100] --verbose=0 --revertFixedSites=False > Kryukov2009EuropeanProtective$fn.log
 +</code>\\
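The three percentages quoted for this command have to add up to 100. The sketch below derives the protective share from the other two, assuming (this is not confirmed by the tutorial) that the values 0.37 and 0.43 in `--selCoef` are the 37% and 43% classes from the description; the variable names are illustrative only.

```bash
# Assumed mapping: 0.37 and 0.43 in --selCoef are the 37% and 43% classes
neutral=37
deleterious=43

# The remainder is the protective share
protective=$((100 - neutral - deleterious))
echo "protective = ${protective}%"
```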
 +By default, each command writes ''*.sfs'' and ''*.gdat'' files, which are in the input format expected by the SEQPower power analysis commands.
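After a run, it can be useful to confirm that both kinds of output file were written. A minimal sketch, assuming the files land in the current working directory (the tutorial does not state the output location) and using a hypothetical helper `count_ext`:

```bash
# Hypothetical helper: count files in directory $1 with extension $2
count_ext() {
  n=0
  for f in "$1"/*."$2"; do
    # When nothing matches, the pattern stays literal, so test for existence
    if [ -e "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

echo "sfs files:  $(count_ext . sfs)"
echo "gdat files: $(count_ext . gdat)"
```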