Simulate disease status


--d

Use a dominant disease model. The default disease model is additive. The disease model determines how genotypes are coded in the penetrance functions described below. Under a dominant model, the genotypes at a locus are coded as 0, 2, and 2 for genotypes AA, Aa, and aa, respectively, where a is the minor allele.

--r

Use a recessive disease model. Under a recessive model, the genotypes at a locus are coded as 0, 0, and 2 for genotypes AA, Aa, and aa, respectively, where a is the minor allele.
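The genotype codings above can be summarized in a small sketch. The function name and the `model` argument are hypothetical and not part of SeqSIMLA. The additive coding of 0, 1, 2 is an assumption, since the text above only gives the dominant and recessive codings.

```python
def code_genotype(minor_allele_count, model="additive"):
    """Return the genotype code entering the penetrance function.

    minor_allele_count: 0 for AA, 1 for Aa, 2 for aa (a is the minor allele).
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        # Assumption: the additive model counts copies of the minor allele.
        return minor_allele_count
    if model == "dominant":
        # One copy of the minor allele is coded the same as two copies.
        return 2 if minor_allele_count >= 1 else 0
    if model == "recessive":
        # Only the minor-allele homozygote is coded as non-zero.
        return 2 if minor_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")
```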

--mode-prev

Use the prevalence model. This is the default mode for simulating disease status in SeqSIMLA. To use this model, specify the odds ratios for the disease loci (using -or) and the disease prevalence (using -prev). The penetrance function is a logistic function:

logit(P(Aff)) = α + β1x1 + β2x2 + β3x3 + ...

x1, x2, x3, ... are the genotypes at the disease loci. β1, β2, β3, ... are the logs of the odds ratios specified with -or. SeqSIMLA searches for a value of α such that the resulting disease prevalence is close to the prevalence specified with -prev.
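One way to picture the search for α is the bisection sketch below. It is an illustration only: it assumes Hardy-Weinberg genotype proportions, independent disease loci, and additive coding (0, 1, 2), none of which are stated above, and all function names are hypothetical. The search procedure actually used by SeqSIMLA may differ.

```python
import math
from itertools import product


def expit(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))


def genotype_freqs(maf):
    # Assumption: Hardy-Weinberg proportions for 0, 1, 2 minor alleles.
    return [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]


def prevalence(alpha, betas, mafs):
    """Expected P(Aff) over all genotype combinations at the disease loci."""
    freqs = [genotype_freqs(m) for m in mafs]
    total = 0.0
    for genos in product(range(3), repeat=len(betas)):
        prob = 1.0
        for locus, g in enumerate(genos):
            prob *= freqs[locus][g]  # assumption: loci are independent
        eta = alpha + sum(b * g for b, g in zip(betas, genos))
        total += prob * expit(eta)
    return total


def find_alpha(target_prev, odds_ratios, mafs, tol=1e-10):
    """Bisection on alpha; prevalence is increasing in alpha."""
    betas = [math.log(o) for o in odds_ratios]
    lo, hi = -30.0, 30.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prevalence(mid, betas, mafs) < target_prev:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For example, `find_alpha(0.05, [1.3, 1.5, 1.7], [0.1, 0.2, 0.05])` returns an intercept for which the expected prevalence under these assumptions is about 0.05. The minor allele frequencies here are made-up illustration values.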

-or number,number,...

Odds ratios for the disease loci. The default value is 1 if not specified. This option should be used with --mode-prev. The order of the odds ratios should correspond to the order of disease sites in -site.
For example,

-or 1.3,1.5,1.7

The first, second, and third disease sites have odds ratios of 1.3, 1.5, and 1.7, respectively. Note that the reference allele is the minor allele at each site.

-i number

Odds ratio for pairwise interactions. SeqSIMLA offers a quick way to simulate interaction effects between two SNPs on disease status. When this option is used, SeqSIMLA simulates pairwise interactions for all possible pairs of disease loci, each with the specified odds ratio.
For example, if three disease loci are specified, the penetrance function will be:

logit(P(Aff)) = α + β1x1 + β2x2 + β3x3 + β4x1x2 + β4x1x3 + β4x2x3

where β4 is the log of the odds ratio specified with -i.
For example,

-i 2

β4=log(2) will be used.
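The penetrance function with pairwise interactions can be sketched as follows. The function names are hypothetical, and the genotype values passed in are assumed to be already coded according to the chosen disease model.

```python
import math
from itertools import combinations


def linear_predictor(alpha, odds_ratios, interaction_or, genotypes):
    """alpha + sum(beta_i * x_i) + beta_int * sum over all pairs of x_i * x_j."""
    betas = [math.log(o) for o in odds_ratios]   # main effects from -or
    beta_int = math.log(interaction_or)          # shared interaction effect from -i
    main = sum(b * x for b, x in zip(betas, genotypes))
    inter = sum(beta_int * xi * xj for xi, xj in combinations(genotypes, 2))
    return alpha + main + inter


def penetrance(alpha, odds_ratios, interaction_or, genotypes):
    """P(Aff) obtained by inverting the logit."""
    eta = linear_predictor(alpha, odds_ratios, interaction_or, genotypes)
    return 1.0 / (1.0 + math.exp(-eta))
```

With three disease loci, `combinations(genotypes, 2)` yields the pairs (x1, x2), (x1, x3), and (x2, x3), matching the three interaction terms in the formula above.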

-prev number

Prevalence for the disease. The default value is 0.05. This option should also be used with --mode-prev.
For example,

-prev 0.3

--mode-par

Use the population attributable risk (PAR) mode. The overall PAR for the disease loci should be specified using -par, and the baseline prevalence should be specified using -alpha. Either --fixed-par or --random-par should also be specified. Let Pi be the PAR for site i and Ri the minor allele frequency at site i. The effect size is calculated as βi = log(1 + Pi/((1 - Pi)Ri)). A logistic penetrance function is then used to determine the disease status:

logit(P(Aff)) = α + β1x1 + β2x2 + β3x3 + ...

where α is specified using -alpha.
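The effect size formula for the PAR mode can be evaluated directly, as in the sketch below. The function name is hypothetical, and the input values in the usage note are made up for illustration.

```python
import math


def par_effect_size(par, maf):
    """beta_i = log(1 + P_i / ((1 - P_i) * R_i)) for one disease site.

    par: PAR for the site (P_i), with 0 <= par < 1.
    maf: minor allele frequency at the site (R_i), with maf > 0.
    """
    if not 0 <= par < 1:
        raise ValueError("par must be in [0, 1)")
    if maf <= 0:
        raise ValueError("maf must be positive")
    return math.log(1 + par / ((1 - par) * maf))
```

For a fixed PAR, a rarer variant receives a larger effect size: `par_effect_size(0.05, 0.01)` is greater than `par_effect_size(0.05, 0.1)`.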

-alpha number

This is the baseline penetrance without the effects of the disease loci. Generally, log(alpha) would be close to the disease prevalence when the effects of the disease loci are small.

-par number

The overall PAR. The default value is 0.05.
For example,

-par 0.1

The overall PAR for all disease loci is set to 0.1. The PAR for individual sites is determined by the --fixed-par or --random-par option.

--fixed-par

The PAR is the same for every disease locus. If the overall PAR is P and there are k disease loci, the PAR for each locus is P/k.

--random-par

The PAR for each disease locus is randomly assigned, such that the PARs of all disease loci sum to the overall PAR.
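The two ways of splitting the overall PAR across disease loci can be sketched as follows. The function names are hypothetical. The distribution used for the random split is not stated above, so the normalized uniform weights used here are purely an assumption.

```python
import random


def fixed_par(total_par, k):
    """Equal split: each of the k disease loci gets total_par / k."""
    return [total_par / k] * k


def random_par(total_par, k, rng=None):
    """Random split whose parts sum to total_par.

    Assumption: weights are uniform draws rescaled to sum to 1.
    """
    rng = rng or random.Random()
    # 1 - random() lies in (0, 1], so every weight is strictly positive.
    weights = [1.0 - rng.random() for _ in range(k)]
    scale = total_par / sum(weights)
    return [w * scale for w in weights]
```

For example, `fixed_par(0.1, 4)` assigns 0.025 to each locus, while `random_par(0.1, 4)` returns four unequal values that add up to 0.1.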