In this tutorial, we will use SeqSIMLA to simulate families with disease.
The reference data are the Asian 500 kb region on chromosome 1 (download).
Our example pedigree file (download) contains 20 families with 1,380 individuals in total.
We randomly chose two individuals from each pedigree in the pedigree file and put them in a proband file (download).
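The random selection described above could be sketched in shell as follows. This is only an illustration under stated assumptions: it assumes the pedigree file is whitespace-delimited with no header line, with the family ID in column 1 and the individual ID in column 2, and that a proband file lists one family ID and individual ID per line. Neither format is confirmed here, so check the User Manual before relying on it. The function name pick_probands and the output file name are hypothetical, and GNU shuf is assumed to be installed.

```shell
# Sketch only: assumes column 1 = family ID, column 2 = individual ID,
# no header line, and that GNU shuf is available.
pick_probands() {
  # Shuffle all rows, then keep the first two rows seen for each family,
  # which gives two individuals per family drawn without replacement.
  shuf "$1" | awk 'seen[$1]++ < 2 { print $1, $2 }' | sort -k1,1
}

# Usage (hypothetical output name):
#   pick_probands data/SAP.txt > my_probands.txt
```

Shuffling first and then keeping the first two rows per family avoids writing a separate sampling loop for each pedigree.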
If you don't have a pedigree file, you can use the option "default 3-generation families" to generate fixed 3-generation pedigrees.
See "Output Options: -fam number" in the User Manual.
We simulate one replicate of data (-batch 1).
The four options below are required for SeqSIMLA to generate disease status under the prevalence model.
The --mode-prev option tells SeqSIMLA to use the prevalence model.
The -prev 0.05 option sets the disease prevalence in the general population to 5%.
The -site 1,200,3000 option selects sites 1, 200, and 3000 as the disease sites, and -or 1.2 assumes an odds ratio of 1.2 for each of the three sites.
Optionally, -i 1.2 simulates pairwise interactions for all possible pairs of disease loci with an odds ratio of 1.2.
Collect all files into a folder (named data in the commands below) and place it in the directory from which you run SeqSIMLA.
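As a quick sanity check before running SeqSIMLA, the sketch below reports any input file that is not yet in place. It assumes the folder is named data and holds the four input files named in the commands in this tutorial; the helper name missing_inputs is hypothetical and not part of SeqSIMLA.

```shell
# Print one line for every expected input file that is absent from the folder.
missing_inputs() {
  dir=$1
  for f in ASN_500k.bed.gz ASN_500k.rec SAP.txt probands.txt; do
    [ -f "$dir/$f" ] || echo "missing: $dir/$f"
  done
}

# No output means all four files were found.
missing_inputs data
```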
To simulate without interaction effects, execute the following command:
./SeqSIMLA -popfile data/ASN_500k.bed.gz -recfile data/ASN_500k.rec -famfile data/SAP.txt -proband data/probands.txt -folder test1 -header test -batch 1 -site 1,200,3000 --mode-prev -prev 0.05 -or 1.2
To simulate with interaction effects between pairs of SNPs, execute the following command:
./SeqSIMLA -popfile data/ASN_500k.bed.gz -recfile data/ASN_500k.rec -famfile data/SAP.txt -proband data/probands.txt -folder test1 -header test -batch 1 -site 1,200,3000 --mode-prev -prev 0.05 -or 1.2 -i 1.2
With our example files, this simulation takes about 150 seconds.
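The two commands above differ only in the trailing -i option. A minimal sketch of assembling them from one shared template is shown below; it uses only the options and file names that appear in this tutorial, and the helper name build_cmd is hypothetical. The sketch prints the commands rather than running them, so they can be inspected first.

```shell
# Build the SeqSIMLA command line used in this tutorial.
# Optional argument $1: interaction odds ratio, appended as "-i $1" when given.
build_cmd() {
  cmd="./SeqSIMLA -popfile data/ASN_500k.bed.gz -recfile data/ASN_500k.rec"
  cmd="$cmd -famfile data/SAP.txt -proband data/probands.txt"
  cmd="$cmd -folder test1 -header test -batch 1"
  cmd="$cmd -site 1,200,3000 --mode-prev -prev 0.05 -or 1.2"
  if [ -n "${1:-}" ]; then
    cmd="$cmd -i $1"
  fi
  printf '%s\n' "$cmd"
}

build_cmd       # command without interaction effects
build_cmd 1.2   # command with pairwise interaction odds ratio 1.2
```

Keeping the shared options in one place makes it harder for the two runs to drift apart by accident when a file name or parameter changes.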
Note: If you don't want to write the command yourself, we provide a command-generator user interface on our website.