
Generate correlated, synthetic normal variables in which values are missing completely at random (MCAR) with a user-specified probability. Specify the column length, the correlation coefficients, the standard deviations, the number of columns, and the desired probability of missing values to obtain a data frame of correlated observations with missing values.

Usage

gen.mcar(len, rho, sigma, n_vars, na_prob = 0.1)

Arguments

len

number of rows per column

rho

desired correlation coefficients of the generated variables. The length of rho must equal n_vars * (n_vars - 1) / 2, the number of distinct pairs of variables.

sigma

desired standard deviation for each generated variable

n_vars

total number of variables to be generated. Must be at least 2.

na_prob

desired probability of missingness in each variable. Defaults to 0.1 (10%).
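
The length requirement on rho can be checked before calling the function. The sketch below uses base R only; the helper name `n_pairs` is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package): number of pairwise
# correlations implied by n_vars variables, i.e. n_vars * (n_vars - 1) / 2.
n_pairs <- function(n_vars) {
  stopifnot(n_vars >= 2)
  n_vars * (n_vars - 1) / 2
}

rho <- c(.25, .75, .044)

# For 3 variables, rho needs exactly 3 entries.
length(rho) == n_pairs(3)
```

For four variables the same rule calls for six correlation coefficients, for five variables ten, and so on.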

Value

a data frame with one column per generated variable (at least 2 columns)

Examples

syn_na <- gen.mcar(len = 50, rho = c(.25, .75, .044), sigma = c(1.1, .56, 1.56),
                   n_vars = 3, na_prob = .47)
summary(syn_na)
#>        V1                V2                 V3         
#>  Min.   :-2.6477   Min.   :-0.71730   Min.   :-4.0036  
#>  1st Qu.:-0.7336   1st Qu.:-0.47288   1st Qu.:-1.1928  
#>  Median :-0.1164   Median :-0.05962   Median :-0.4745  
#>  Mean   :-0.3273   Mean   :-0.03817   Mean   :-0.3859  
#>  3rd Qu.: 0.2436   3rd Qu.: 0.34088   3rd Qu.: 0.2435  
#>  Max.   : 0.9038   Max.   : 0.98501   Max.   : 2.8767  
#>  NA's   :26        NA's   :22         NA's   :28
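
In the summary above, the NA counts of 26, 22 and 28 out of 50 rows scatter around the requested probability of 0.47. The following is a minimal base R sketch of what MCAR masking means. It is an illustration under stated assumptions, not the package's own implementation: the data are two independent normal columns chosen only for the example, and the names `dat` and `mask` are hypothetical.

```r
# Sketch only: MCAR masking in base R, not the package's implementation.
set.seed(1)        # arbitrary seed so the sketch is reproducible
len     <- 1000    # rows per column (larger than above to stabilise the share)
na_prob <- 0.47    # same missingness probability as in the example

# Illustrative data (assumption): two independent normal columns.
dat <- data.frame(V1 = rnorm(len, sd = 1.1),
                  V2 = rnorm(len, sd = 0.56))

# Each cell is dropped independently with probability na_prob, regardless
# of its own value or of any other variable. That independence is what
# makes the missingness "completely at random".
mask <- matrix(runif(len * ncol(dat)) < na_prob, nrow = len)
dat[mask] <- NA

# Observed share of NAs per column; should lie close to na_prob.
colMeans(is.na(dat))
```

With only 50 rows, as in the example, the observed share of missing values per column can deviate noticeably from na_prob; the deviation shrinks as len grows.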