sim.snps {barkse}    R Documentation

Simulated SNP Data with a Logistic Model

Description

Simulates a 0/1 response and a set of covariates (such as SNPs) from a simple logistic regression model.

Usage

sim.snps(n, simtypes, datatypes, ncats, delta, priors, names)

Arguments

n          number of data points to create
simtypes   vector of the variable types used in the simulation
           -1: continuous; 1: ordered/binary; 2: dominant;
           3: recessive; 4: non-ordered
datatypes  vector of the variable types (blind) passed to the model
           -1: continuous; 1: ordered/binary; 2: non-ordered
ncats      vector of the number of categories for all variables
           an integer for a categorical variable, the scale (sd) for a continuous variable
delta      0/1 vector specifying the simulated model
           (1: the variable enters the logistic model; 0: it does not)
priors     list of priors for all variables
           a vector of probabilities for a categorical variable,
           the center (mean) for a continuous variable (assuming a normal population)
names      vector of strings, the names of the variables
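
The simtypes codes can be read as rules for how a variable's category enters the simulated model. The sketch below shows one plausible reading and is not taken from the package source: the helper code_effect() is hypothetical, and categories are assumed to be coded 1..k with category 1 as the reference (for a SNP, zero copies of the minor allele).

```r
# Hypothetical helper (not part of barkse): map category labels 1..k to a
# model covariate according to a simtypes code. The coding is an assumption.
code_effect <- function(x, simtype, k = 3) {
  switch(as.character(simtype),
    "-1" = x,                    # continuous: use the value as is
    "1"  = x - 1,                # ordered/binary: additive score 0, 1, ..., k-1
    "2"  = as.numeric(x >= 2),   # dominant: any category above the reference
    "3"  = as.numeric(x == k),   # recessive: only the highest category
    "4"  = {                     # non-ordered: one indicator per non-reference level
      f <- factor(x, levels = seq_len(k))
      model.matrix(~ f)[, -1, drop = FALSE]
    },
    stop("unknown simtype: ", simtype)
  )
}

g <- c(1, 2, 3, 2, 1)   # toy genotype categories
code_effect(g, 2)       # dominant coding
code_effect(g, 3)       # recessive coding
```

Under this reading, a dominant and a recessive variable with the same ncats differ only in which categories are grouped with the reference.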

Value

Returns an object of class snpdata, which is a list with the following components:

y        n*1, 0/1 response
xid      n*1, row index (1~m) of each observation in the xunique matrix
xunique  m*p, matrix of unique covariate rows
nys      m*3, number of y=0/1/* for each unique covariate row
types    p*1, variable type (1: ordered; 2: non-ordered)
ncats    p*1, number of categories in each variable
delta    p*1, 0/1 indicator of the true model, if known from the simulation
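
The relationship between xid, xunique and nys can be illustrated in base R. This is a minimal sketch built from toy data, not the package's own code; in particular, treating the third column of nys (the * count) as the number of missing responses is an assumption.

```r
# Toy data: 6 subjects, 2 categorical covariates, one missing response
x <- cbind(SEX  = c(1, 2, 1, 2, 1, 1),
           SNP1 = c(1, 3, 1, 3, 2, 1))
y <- c(0, 1, 1, 1, 0, NA)

# Unique covariate rows (m x p) and each subject's row index into them
key     <- apply(x, 1, paste, collapse = "_")
first   <- !duplicated(key)
xunique <- x[first, , drop = FALSE]
xid     <- match(key, key[first])

# Counts of y = 0, y = 1 and (assumed) missing y per unique row (m x 3)
m   <- nrow(xunique)
nys <- cbind(y0  = tabulate(xid[which(y == 0)], nbins = m),
             y1  = tabulate(xid[which(y == 1)], nbins = m),
             yNA = tabulate(xid[is.na(y)],      nbins = m))

# The full covariate matrix is recovered by indexing xunique with xid
all(xunique[xid, ] == x)
```

Storing only the unique rows keeps the covariate matrix small when many subjects share the same categorical pattern.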

Examples

#  1 Specify the number of categories for all variables
#      Integers for categorical variables, scale (sd) for continuous variables
ncats <- c(2, 4, rep(3, 4))
#  2 Specify the types of the variables in generating the design table
#      -1: continuous; 1: ordered/binary; 2: dominant;
#       3: recessive;  4: non-ordered
simtypes <- c(1, 1, 4, 1, 2, 3)
#  3 Specify the types that go with the snpdata class (blind)
#      -1: continuous; 1: ordered/binary; 2: non-ordered
datatypes <- c(1, 1, 2, 2, 2, 2)
#  4 Specify the prior list for all variables
#      vector of probabilities for categorical variable
#      center (mean) for continuous variable (assuming normal population)
priors <- list(c(.5, .5), c(.2, .3, .4, .1),
               c(.7, .2, .1), c(.4, .5, .1),
               c(.75, .2, .05), c(.6, .35, .05))
names <- c("SEX", "AGE", "RACE", "SNP1", "SNP2", "SNP3")
#  5 Specify which variables come into the logistic model
#      For simplicity, the logistic regression coefficients are 0 or 0.7
#      You may play with this vector to see different behaviour
delta <- c(1, 0, 1, 0, 1, 0)
#  6 Simulate the data from a logistic model
#      You may need to use make.snpdata() to generate the proper data format.
snpdata <- sim.snps(500, simtypes=simtypes,
                    datatypes=datatypes, ncats=ncats,
                    delta=delta, priors=priors, names=names)
summary(snpdata)
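
For readers without the package installed, the kind of simulation described above can be approximated in base R. The following is a rough sketch only, not the algorithm used by sim.snps(): it assumes independent categorical covariates drawn from their priors, an additive score for each variable, and a zero intercept. Only the 0/0.7 coefficients come from the example above.

```r
set.seed(1)                    # only to make the sketch reproducible
n      <- 500
priors <- list(c(.5, .5), c(.7, .2, .1), c(.75, .2, .05))
delta  <- c(1, 0, 1)           # which variables enter the model
beta   <- 0.7 * delta          # coefficients are 0 or 0.7

# Draw each categorical covariate independently from its prior (n x p matrix)
x <- sapply(priors, function(p) {
  sample(seq_along(p), n, replace = TRUE, prob = p)
})

# Linear predictor on category scores 0, 1, ...; the additive coding and the
# zero intercept are assumptions of this sketch
eta <- drop((x - 1) %*% beta)
y   <- rbinom(n, size = 1, prob = plogis(eta))

# Refit a logistic regression to see which variables are picked up
fit <- glm(y ~ x, family = binomial)
coef(fit)
```

Variables with delta equal to 0 have no effect on y, so their fitted coefficients should be close to zero up to sampling noise.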

[Package barkse version 0.1-0 Index]