library(lsasim)
packageVersion("lsasim")
[1] '2.1.2'
cluster_gen(n, N = 1, cluster_labels = NULL, resp_labels = NULL, cat_prop = NULL, n_X = NULL,
n_W = NULL, c_mean = NULL, sigma = NULL, cor_matrix = NULL, separate_questionnaires = TRUE,
collapse = "none", sum_pop = sapply(N, sum), calc_weights = TRUE, sampling_method = "mixed",
rho = NULL, theta = FALSE, verbose = TRUE, print_pop_structure = verbose)
As its single mandatory argument, cluster_gen requires a numeric list or vector containing the hierarchical structure of the data. As a general rule, for both this first argument (n) and the second argument (N, representing the population structure), vectors can be used to represent symmetric structures and lists can be used for asymmetric structures. What follows are some examples.
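As a minimal sketch of the vector/list distinction, the two objects below describe a symmetric and an asymmetric structure using base R only. The object names are illustrative and not part of lsasim; either object is the kind of value that would be supplied as n.

```r
# Symmetric structure: 3 schools, each with 5 students (a vector suffices).
n_symmetric <- c(school = 3, student = 5)

# Asymmetric structure: 3 schools with 20, 15 and 25 students (a list is needed).
n_asymmetric <- list(3, c(20, 15, 25))

# Sanity check: the list must hold one student count per sampled school.
stopifnot(length(n_asymmetric[[2]]) == n_asymmetric[[1]])

# Total number of sampled students implied by each structure.
total_symmetric  <- unname(prod(n_symmetric))   # 3 * 5
total_asymmetric <- sum(n_asymmetric[[2]])      # 20 + 15 + 25
c(symmetric = total_symmetric, asymmetric = total_asymmetric)
```

The vector form cannot express unequal cluster sizes, which is why the list form carries one count per school.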
The function cluster_gen generates clustered samples that resemble the composition of participants in international large-scale assessments. The only required argument is n. The full set of arguments is:
n: a numeric vector or list with the number of sampled observations (clusters or subjects) on each level.
N: a numeric vector or list with the population size of each sampled cluster element on each level.
cluster_labels: a character vector with the names of each cluster level.
resp_labels: a character vector with the names of the questionnaire respondents on each level.
cat_prop: a list of vectors where each vector contains the cumulative proportions for each category of a given item. If theta = TRUE, the first element of cat_prop must be a scalar 1, which corresponds to theta.
n_X: the number of continuous (X) variables per cluster level.
n_W: the number of ordinal (W) variables per cluster level.
cor_matrix: a correlation matrix between all variables (except weights).
c_mean: the vector of means for the continuous variables, or a list of such vectors, one for each level.
sigma: the vector of standard deviations for the continuous variables, or a list of such vectors, one for each level.
separate_questionnaires: if TRUE, each level will have its own questionnaire. Otherwise, the questionnaire will be labeled ‘q1’.
theta: if TRUE, the latent trait will be generated as the first continuous variable and labeled ‘theta’.
collapse: if TRUE, the function output contains only one data frame with all answers (the default shown in the signature above is "none").
sum_pop: the total population at each level (sampled or not).
calc_weights: if TRUE, sampling weights are calculated.
sampling_method: can be “SRS” for Simple Random Sampling or “PPS” for Probabilities Proportional to Size (the default shown in the signature above is "mixed").
rho: the intraclass correlation to be used.
verbose: if TRUE, messages are printed in the output.
print_pop_structure: if TRUE, the population hierarchical structure is printed out (as long as it differs from the sample structure).
...: additional parameters to be passed to questionnaire_gen().
We can specify a simple structure of 3 schools with 5 students in each school. In the call below, the two levels are labeled n and N, so that n = 3 and N = 5.
set.seed(4388)
cg <- cluster_gen(c(n = 3, N = 5))
── Hierarchical structure ────────────────────────────────────────────────────────────────
n1 (5 Ns)
n2 (5 Ns)
n3 (5 Ns)
── Information on sampling weights ───────────────────────────────────────────────────────
cg$n[[1]]
subject q1 q2 q3 q4 q5 q6 q7 q8 n.weight within.n.weight
1 1 -0.7985768 0.55776842 0.9278102 1 1 4 2 1 1 1
2 2 -1.0486078 2.28259560 -0.2269337 3 1 1 2 3 1 1
3 3 -0.1680413 -0.02049366 -0.7900484 3 1 3 2 1 1 1
4 4 1.4115562 -1.12757547 1.6993672 2 4 3 2 4 1 1
5 5 0.6689374 -1.51117001 -0.1845164 3 4 4 2 2 1 1
final.N.weight uniqueID
1 1 N1_n1
2 1 N2_n1
3 1 N3_n1
4 1 N4_n1
5 1 N5_n1
cg$n[[2]]
subject q1 q2 q3 q4 q5 q6 q7 q8 n.weight within.n.weight
1 1 0.83893595 0.4238664 -0.7212927 2 1 2 1 1 1 1
2 2 0.07260641 -0.3279862 -0.3841153 2 3 4 2 1 1 1
3 3 0.61013495 -1.1129113 -0.6149362 2 2 4 2 1 1 1
4 4 -2.53529525 0.2130771 2.3516784 3 5 1 1 2 1 1
5 5 0.46853204 -0.1806199 -0.4231896 2 3 1 1 4 1 1
final.N.weight uniqueID
1 1 N1_n2
2 1 N2_n2
3 1 N3_n2
4 1 N4_n2
5 1 N5_n2
cg$n[[3]]
subject q1 q2 q3 q4 q5 q6 q7 q8 n.weight within.n.weight
1 1 -1.9544485 -0.7390583 -2.1217731 1 1 1 1 1 1 1
2 2 1.1813684 -0.4654562 0.7780397 3 4 4 1 4 1 1
3 3 -0.4846220 0.5223044 -0.4788684 1 1 1 2 2 1 1
4 4 1.4972177 -0.5354472 -0.4295909 3 1 1 1 2 1 1
5 5 0.0806642 -1.6934998 0.8183471 1 1 4 1 4 1 1
final.N.weight uniqueID
1 1 N1_n3
2 1 N2_n3
3 1 N3_n3
4 1 N4_n3
5 1 N5_n3
We can specify a more complex structure: 3 schools sampled from a population of 5, with different numbers of students per school, sampling weights, and custom numbers of questions.
set.seed(4388)
n <- list(3, c(20, 15, 25))
N <- list(5, c(200, 500, 400, 100, 100))
cg <- cluster_gen(n, N, n_X = 5, n_W = 2)
── Hierarchical structure ────────────────────────────────────────────────────────────────
school1 (200 students)
school2 (500 students)
school3 (400 students)
school4 (100 students)
school5 (100 students)
school1 (20 students)
school2 (15 students)
school3 (25 students)
── Information on sampling weights ───────────────────────────────────────────────────────
str(cg$school[[1]])
'data.frame': 20 obs. of 12 variables:
$ subject : int 1 2 3 4 5 6 7 8 9 10 ...
$ q1 : num -1.351 -0.249 0.241 1.178 -0.104 ...
$ q2 : num -0.672 -0.849 1.678 -0.22 1.848 ...
$ q3 : num 0.175 -0.901 0.961 0.364 1.401 ...
$ q4 : num 0.0527 0.4653 -0.8303 -0.7196 0.1548 ...
$ q5 : num -0.2185 -0.0847 1.2169 -1.363 -1.1152 ...
$ q6 : Factor w/ 2 levels "1","2": 2 1 2 1 2 2 1 1 1 2 ...
$ q7 : Factor w/ 5 levels "1","2","3","4",..: 1 2 4 5 5 3 1 4 5 5 ...
$ school.weight : num 2.17 2.17 2.17 2.17 2.17 ...
$ within.school.weight: num 10 10 10 10 10 10 10 10 10 10 ...
$ final.student.weight: num 21.7 21.7 21.7 21.7 21.7 ...
$ uniqueID : chr "student1_school1" "student2_school1" "student3_school1" "student4_school1" ...
str(cg$school[[2]])
'data.frame': 15 obs. of 12 variables:
$ subject : int 1 2 3 4 5 6 7 8 9 10 ...
$ q1 : num 0.548 -0.51 0.373 0.527 0.163 ...
$ q2 : num 0.0978 -1.6416 0.2355 0.4376 0.0315 ...
$ q3 : num 1.574 0.512 0.49 1.264 -0.279 ...
$ q4 : num -0.646 -1.127 -0.39 0.119 -1.174 ...
$ q5 : num 0.27 0.466 -0.134 -0.326 -0.153 ...
$ q6 : Factor w/ 2 levels "1","2": 2 1 2 1 1 2 2 1 2 2 ...
$ q7 : Factor w/ 3 levels "1","2","4": 1 2 2 2 2 3 1 1 3 2 ...
$ school.weight : num 0.867 0.867 0.867 0.867 0.867 ...
$ within.school.weight: num 33.3 33.3 33.3 33.3 33.3 ...
$ final.student.weight: num 28.9 28.9 28.9 28.9 28.9 ...
$ uniqueID : chr "student1_school2" "student2_school2" "student3_school2" "student4_school2" ...
str(cg$school[[3]])
'data.frame': 25 obs. of 12 variables:
$ subject : int 1 2 3 4 5 6 7 8 9 10 ...
$ q1 : num 1.405 0.273 -0.911 0.237 -0.35 ...
$ q2 : num -1.4873 0.5872 0.8679 0.5469 -0.0578 ...
$ q3 : num -0.987 -0.034 -2.169 0.486 -0.273 ...
$ q4 : num -0.95 -0.143 0.692 -0.853 0.761 ...
$ q5 : num 0.965 -0.707 0.578 0.197 -0.944 ...
$ q6 : Factor w/ 2 levels "1","2": 1 1 2 1 2 1 1 1 1 1 ...
$ q7 : Factor w/ 4 levels "1","2","3","4": 1 4 4 4 2 4 4 4 4 1 ...
$ school.weight : num 1.08 1.08 1.08 1.08 1.08 ...
$ within.school.weight: num 16 16 16 16 16 16 16 16 16 16 ...
$ final.student.weight: num 17.3 17.3 17.3 17.3 17.3 ...
$ uniqueID : chr "student1_school3" "student2_school3" "student3_school3" "student4_school3" ...
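The weights printed above can be reproduced by hand, which helps when checking a design. The sketch below uses base R only and assumes that schools are drawn with probability proportional to size and students by simple random sampling within each school. That assumption is consistent with the printed values, but it is not taken from the lsasim source, and all variable names are illustrative.

```r
# Population and sample sizes taken from the example above.
N_students <- c(200, 500, 400, 100, 100)  # students in each population school
n_students <- c(20, 15, 25)               # students sampled in schools 1 to 3
n_schools  <- length(n_students)          # 3 schools sampled out of 5

# Assumption: PPS selection of schools, so the inclusion probability of a
# school is proportional to its share of the total student population.
p_school      <- n_schools * N_students[1:3] / sum(N_students)
school_weight <- 1 / p_school

# Assumption: simple random sampling of students within each sampled school.
within_school_weight <- N_students[1:3] / n_students

# The final student weight is the product of the two stage weights.
final_student_weight <- school_weight * within_school_weight
round(cbind(school_weight, within_school_weight, final_student_weight), 2)
```

Rounded to the precision shown by str(), these are 2.17, 0.867 and 1.08 for the school weights, 10, 33.3 and 16 within schools, and 21.7, 28.9 and 17.3 for the final student weights.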
We can also control the intra-class correlations and the grand mean.
set.seed(4388)
cg <- cluster_gen(c(5, 1000), rho = 0.9, n_X = 2, n_W = 0, c_mean = 10)
── Hierarchical structure ────────────────────────────────────────────────────────────────
school1 (1000 students)
school2 (1000 students)
school3 (1000 students)
school4 (1000 students)
school5 (1000 students)
── Information on sampling weights ───────────────────────────────────────────────────────
sapply(1:5, function(s) mean(cg$school[[s]]$q1)) # means per school != 10
[1] 4.929322 4.037097 10.849103 6.516552 6.440307
mean(sapply(1:5, function(s) mean(cg$school[[s]]$q1))) # closer to c_mean
[1] 6.554476
str(cg)
List of 1
$ school:List of 5
..$ :'data.frame': 1000 obs. of 7 variables:
.. ..$ subject : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ q1 : num [1:1000] 4.79 4.68 3.7 6.23 4.27 ...
.. ..$ q2 : num [1:1000] 9.74 9.76 10.94 8.27 10.37 ...
.. ..$ school.weight : num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ within.school.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ final.student.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ uniqueID : chr [1:1000] "student1_school1" "student2_school1" "student3_school1" "student4_school1" ...
..$ :'data.frame': 1000 obs. of 7 variables:
.. ..$ subject : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ q1 : num [1:1000] 4.99 2.95 5.69 4.06 5.98 ...
.. ..$ q2 : num [1:1000] 4.77 6.1 8.14 5.35 9.23 ...
.. ..$ school.weight : num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ within.school.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ final.student.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ uniqueID : chr [1:1000] "student1_school2" "student2_school2" "student3_school2" "student4_school2" ...
..$ :'data.frame': 1000 obs. of 7 variables:
.. ..$ subject : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ q1 : num [1:1000] 10.1 10.3 10.7 13.2 10.4 ...
.. ..$ q2 : num [1:1000] 14.5 14.1 14.9 19.3 15.3 ...
.. ..$ school.weight : num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ within.school.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ final.student.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ uniqueID : chr [1:1000] "student1_school3" "student2_school3" "student3_school3" "student4_school3" ...
..$ :'data.frame': 1000 obs. of 7 variables:
.. ..$ subject : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ q1 : num [1:1000] 6.59 7.78 5.26 7.74 6.19 ...
.. ..$ q2 : num [1:1000] 11.7 10.4 13.6 11.4 12 ...
.. ..$ school.weight : num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ within.school.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ final.student.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ uniqueID : chr [1:1000] "student1_school4" "student2_school4" "student3_school4" "student4_school4" ...
..$ :'data.frame': 1000 obs. of 7 variables:
.. ..$ subject : int [1:1000] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ q1 : num [1:1000] 4.77 6.24 6.04 8.05 6 ...
.. ..$ q2 : num [1:1000] 17.6 16.2 18.1 15.4 12 ...
.. ..$ school.weight : num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ within.school.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ final.student.weight: num [1:1000] 1 1 1 1 1 1 1 1 1 1 ...
.. ..$ uniqueID : chr [1:1000] "student1_school5" "student2_school5" "student3_school5" "student4_school5" ...
- attr(*, "class")= chr [1:2] "lsasimcluster" "list"
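To see what an intraclass correlation of 0.9 means, it can help to simulate clustered data directly. The following base R sketch does not use lsasim and makes no claim about how cluster_gen generates its data. It assumes a simple random-intercept model with total variance 1, of which a share rho lies between clusters, and then recovers rho with the one-way ANOVA estimator. All names and the choice of 200 clusters of 50 are illustrative.

```r
set.seed(1)
rho <- 0.9   # target intraclass correlation
k   <- 200   # number of clusters (many, so the estimate is stable)
m   <- 50    # observations per cluster

# Random-intercept model: between-cluster variance rho, within-cluster 1 - rho.
cluster_effect <- rnorm(k, mean = 0, sd = sqrt(rho))
cluster        <- rep(seq_len(k), each = m)
y <- 10 + cluster_effect[cluster] + rnorm(k * m, sd = sqrt(1 - rho))

# One-way ANOVA estimator for balanced data.
msw <- mean(tapply(y, cluster, var))     # within-cluster mean square
msb <- m * var(tapply(y, cluster, mean)) # between-cluster mean square
var_between <- (msb - msw) / m
icc_hat <- var_between / (var_between + msw)
icc_hat
```

With only 5 clusters, as in the example above, the between-cluster variance is estimated from 5 cluster means, so both the cluster means and the estimated correlation can sit far from their targets.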
We can make the intraclass variance explode by forcing “incompatible” rho and c_mean.
x <- cluster_gen(c(5, 1000), rho = 0.5, n_X = 2, n_W = 0, c_mean = 1:5)
── Hierarchical structure ────────────────────────────────────────────────────────────────
school1 (1000 students)
school2 (1000 students)
school3 (1000 students)
school4 (1000 students)
school5 (1000 students)
── Information on sampling weights ───────────────────────────────────────────────────────
anova(x)
ANOVA estimators
Source Sample.statistic Population.estimate
1 Within-group variance 2504.179 2504.179
2 Between-group variance 4853.708 4851.203
3 Total variance 6385.918 NA
Intraclass correlation
Estimated Standard.error
q1 0.6595447 0.1589391
Testing for group differences
F-statistic: 1938.243 on 4 and 4995 DF. p-value: 0
ANOVA estimators
Source Sample.statistic Population.estimate
1 Within-group variance 2419.402 2419.402
2 Between-group variance 1117.501 1115.081
3 Total variance 3311.645 NA
Intraclass correlation
Estimated Standard.error
q2 0.3154864 0.153111
Testing for group differences
F-statistic: 461.8914 on 4 and 4995 DF. p-value: 0
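The intraclass correlations reported above follow directly from the variance estimates: each one is the between-group population estimate divided by the sum of the between-group and within-group estimates. The check below copies the printed numbers into base R; the variable names are illustrative.

```r
# Population estimates copied from the ANOVA output above.
within_var  <- c(q1 = 2504.179, q2 = 2419.402)
between_var <- c(q1 = 4851.203, q2 = 1115.081)

# Intraclass correlation: share of total variance that lies between groups.
icc <- between_var / (between_var + within_var)
round(icc, 4)
#     q1     q2
# 0.6595 0.3155
```

Neither value equals the requested rho = 0.5, which is the sense in which this combination of rho and c_mean is "incompatible".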
cluster_gen
The named vector below represents a sampling structure of 1 country, 2 schools, and 5 students per school. Naming the vector is optional.
set.seed(4388)
n <- c(cnt = 1, sch = 2, stu = 5)
cg <- cluster_gen(n = n)
── Hierarchical structure ────────────────────────────────────────────────────────────────
cnt1
├─cnt1_sch1 (5 stus)
└─cnt1_sch2 (5 stus)
── Information on sampling weights ───────────────────────────────────────────────────────
cg
$cnt
$cnt[[1]]
subject q1 q2 q3 q4 q5 q6 q7 q8 q9 cnt.weight
1 1 -0.1857563 0.5894552 -0.004878196 4 1 4 3 3 1 1
2 2 -1.3145390 -0.6319421 0.682163912 4 2 3 3 3 3 1
within.cnt.weight final.sch.weight uniqueID
1 1 1 sch1_cnt1
2 1 1 sch2_cnt1
$sch
$sch[[1]]
subject q1 q2 q3 q4 q5 sch.weight within.sch.weight final.stu.weight
1 1 1.50583778 3 1 1 2 1 1 1
2 2 0.06801399 2 2 4 3 1 1 1
3 3 -1.50350211 2 3 1 4 1 1 1
4 4 -0.31483916 2 2 1 3 1 1 1
5 5 0.92196178 2 2 1 4 1 1 1
uniqueID
1 stu1_sch1_cnt1
2 stu2_sch1_cnt1
3 stu3_sch1_cnt1
4 stu4_sch1_cnt1
5 stu5_sch1_cnt1
$sch[[2]]
subject q1 q2 q3 q4 q5 sch.weight within.sch.weight final.stu.weight
1 1 3.6056537 3 3 1 4 1 1 1
2 2 0.9469375 3 3 2 4 1 1 1
3 3 -0.2483334 3 3 4 4 1 1 1
4 4 0.4508385 2 3 1 4 1 1 1
5 5 -1.2610979 1 2 2 3 1 1 1
uniqueID
1 stu1_sch2_cnt1
2 stu2_sch2_cnt1
3 stu3_sch2_cnt1
4 stu4_sch2_cnt1
5 stu5_sch2_cnt1
attr(,"class")
[1] "lsasimcluster" "list"
The named vector below represents a sampling structure of 1 country, 2 schools, and 5 students per school. In this example, the number of continuous variables is set to n_X = 10, but only 5 means are supplied: c_mean = c(0.3, 0.4, 0.5, 0.6, 0.7). The function still runs, recycling the 5 means over the remaining 5 variables, and reports the warning Warning: c_mean recycled to fit all continuous variables.
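Recycling here has the usual R meaning: the shorter vector is repeated until it covers every variable. The base R sketch below shows the resulting assignment of means to the 10 continuous variables. It illustrates the idea only; whether lsasim uses rep_len() internally is an assumption.

```r
n_X    <- 10
c_mean <- c(0.3, 0.4, 0.5, 0.6, 0.7)

# Repeat the 5 supplied means until there is one per continuous variable.
recycled <- rep_len(c_mean, n_X)
recycled
# [1] 0.3 0.4 0.5 0.6 0.7 0.3 0.4 0.5 0.6 0.7
```

To avoid the warning and make the intent explicit, a vector of length n_X can be supplied directly.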
set.seed(4388)
n <- c(cnt = 1, sch = 2, stu = 5)
cg <- cluster_gen(n = n, n_X = 10, c_mean = c(0.3, 0.4, 0.5, 0.6, 0.7))
── Hierarchical structure ────────────────────────────────────────────────────────────────
cnt1
├─cnt1_sch1 (5 stus)
└─cnt1_sch2 (5 stus)
── Information on sampling weights ───────────────────────────────────────────────────────
Warning: c_mean recycled to fit all continuous variables
Warning: c_mean recycled to fit all continuous variables
Warning: c_mean recycled to fit all continuous variables
cg
$cnt
$cnt[[1]]
subject q1 q2 q3 q4 q5 q6 q7
1 1 0.1493738 -0.2941853 -0.1356861 1.0942713 0.0007041074 -0.403876 -1.8308171
2 2 1.4250256 0.4499239 0.3912278 -0.5754092 1.6994017416 1.321272 0.7349227
q8 q9 q10 q11 q12 cnt.weight within.cnt.weight final.sch.weight
1 1.7664319 -0.5383132 1.3683602 1 1 1 1 1
2 0.2770982 0.9291194 0.7627068 1 2 1 1 1
uniqueID
1 sch1_cnt1
2 sch2_cnt1
$sch
$sch[[1]]
subject q1 q2 q3 q4 q5 q6 q7
1 1 0.7107490 -0.068956415 1.6418185 -0.2682343 2.23323556 0.1442393 0.8616713
2 2 -0.5855985 -0.585710878 -0.5414412 0.2357256 0.07303423 -1.3597101 -0.9964523
3 3 -1.0835479 1.880217061 -0.1643727 0.1607227 -0.78261691 1.1395126 -0.2356088
4 4 0.5342037 0.007799633 1.0866804 1.1594587 1.03261534 0.2807431 -0.2725278
5 5 -0.2916682 0.884475474 2.3551113 0.6455583 1.48256259 0.6658749 0.7246589
q8 q9 q10 q11 q12 q13 q14 sch.weight within.sch.weight
1 1.6934789 1.0571774 2.438611779 3 5 4 2 1 1
2 -0.3013998 0.1720372 0.001787841 5 5 2 3 1 1
3 0.8925784 0.6388118 -0.015964416 5 3 1 2 1 1
4 1.8254067 -0.7224324 0.934325677 4 5 2 2 1 1
5 0.2864523 -0.6233521 1.339058111 5 5 2 1 1 1
final.stu.weight uniqueID
1 1 stu1_sch1_cnt1
2 1 stu2_sch1_cnt1
3 1 stu3_sch1_cnt1
4 1 stu4_sch1_cnt1
5 1 stu5_sch1_cnt1
$sch[[2]]
subject q1 q2 q3 q4 q5 q6 q7
1 1 0.8857965 0.6805171 1.70146414 0.8126441 1.21032818 2.24556963 0.5436328
2 2 0.4251834 -0.9716282 -0.14179050 1.6298518 0.69780010 -0.81250703 0.9940499
3 3 -2.6623576 -0.9154311 1.00495412 -2.4114249 -0.50579666 -2.79170273 0.4186921
4 4 -0.4276990 1.1312695 -0.03547911 2.1496012 0.08720867 -0.05207824 0.6496843
5 5 1.0640377 0.5922904 -0.15503371 0.2651547 2.00796094 1.47916002 0.8115288
q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 sch.weight
1 0.739390885 0.4754976 -0.006058309 5 1 4 4 2 2 3 1
2 -0.110809246 0.8357513 1.108207060 5 4 1 1 3 1 3 1
3 -0.274077470 2.3694774 2.700262947 5 5 5 2 3 1 3 1
4 0.007779354 0.6725442 0.339168074 5 1 2 2 1 1 2 1
5 2.031369897 0.1953666 1.566951833 5 4 5 1 2 1 3 1
within.sch.weight final.stu.weight uniqueID
1 1 1 stu1_sch2_cnt1
2 1 1 stu2_sch2_cnt1
3 1 1 stu3_sch2_cnt1
4 1 1 stu4_sch2_cnt1
5 1 1 stu5_sch2_cnt1
attr(,"class")
[1] "lsasimcluster" "list"
The named vector below represents a sampling structure of 3 schools, 2 classes per school, and 5 students per class. Again, naming the vector is optional. Here, n_X and sigma are specified per level (i.e., schools and classes): n_X = c(1, 2) requests 1 continuous variable at the school level and 2 at the class level, and sigma = list(.1, c(1, 2)) gives their standard deviations. That is, at level 1 (school) the standard deviation is .1, whereas at level 2 (class) the standard deviations are 1 and 2.
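Before calling the function, it can be worth checking that the per-level arguments line up. The base R sketch below tabulates the specification and verifies that sigma has one element per level and one standard deviation per continuous variable. The helper objects are illustrative and not part of lsasim.

```r
n     <- c(school = 3, class = 2, student = 5)
n_X   <- c(1, 2)               # continuous variables at school and class level
sigma <- list(0.1, c(1, 2))    # standard deviations at school and class level

# Questionnaires exist for every level except the last (the respondents).
questionnaire_levels <- names(n)[-length(n)]

# One sigma element per level, one standard deviation per continuous variable.
stopifnot(length(sigma) == length(n_X), all(lengths(sigma) == n_X))

spec <- data.frame(
  level = questionnaire_levels,
  n_X   = n_X,
  sigma = sapply(sigma, paste, collapse = ", ")
)
spec
```

A mismatch such as sigma = list(0.1, 1) with n_X = c(1, 2) would fail the check above, flagging the inconsistency before any data are generated.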
set.seed(4388)
n <- c(school = 3, class = 2, student = 5)
cg <- cluster_gen(n, n_X = c(1, 2), sigma = list(0.1, c(1, 2)))
── Hierarchical structure ────────────────────────────────────────────────────────────────
school1
├─school1_class1 (5 students)
└─school1_class2 (5 students)
school2
├─school2_class1 (5 students)
└─school2_class2 (5 students)
school3
├─school3_class1 (5 students)
└─school3_class2 (5 students)
── Information on sampling weights ───────────────────────────────────────────────────────
summary(cg)
──────────────────────────────────────────────────────────────────────────────────────────
[[1]]
q1 q2
Min. :-0.099338 1:5
Mean : 0.007348 2:1
Max. : 0.102284
Prop.
Stddev.: 0.0795 1:0.8333
2:0.1667
q1 q2
q1 1.0000000 0.7913169
q2 0.7913169 1.0000000
[[1]]
q1 q2 q3 q4 q5 q6
Min. :-2.3736 Min. :-3.609443 2: 5 1:19 1:8 1: 9
Mean :-0.1235 Mean :-0.268586 3: 4 2:11 2:7 2:10
Max. : 2.0970 Max. : 3.514859 4: 4 4:9 4: 5
5:13 Prop. 3:6 3: 6
Stddev.: 1.1163 Stddev.: 2.0337 1: 4 1:0.6333
2:0.3667 Prop. Prop.
Prop. 1:0.2667 1:0.3
2:0.1667 2:0.2333 2:0.3333
3:0.1333 4:0.3 4:0.1667
4:0.1333 3:0.2 3:0.2
5:0.4333
1:0.1333
q1 q2 q3 q4 q5 q6
q1 1.00000000 -0.08248772 0.49946142 0.2345035 -0.01698463 -0.2893691
q2 -0.08248772 1.00000000 -0.09477988 0.4691688 0.20661211 0.1793382
q3 0.49946142 -0.09477988 1.00000000 0.3187024 -0.13300839 0.1824814
q4 0.23450350 0.46916880 0.31870236 1.0000000 -0.21498773 0.2768035
q5 -0.01698463 0.20661211 -0.13300839 -0.2149877 1.00000000 -0.1618769
q6 -0.28936909 0.17933817 0.18248136 0.2768035 -0.16187694 1.0000000
The named vector below represents a sampling structure of 3 schools, 2 classes per school, and 5 students per class. Again, naming the vector is optional. Like sigma, c_mean can be expressed as a list whose elements coincide with the different levels (i.e., schools and classes). For example, c_mean = list(.1, c(0.55, 0.32)) means that at level 1 (school) the mean of the continuous variable is .1, whereas at level 2 (class) the means are 0.55 and 0.32.
set.seed(4388)
n <- c(school = 3, class = 2, student = 5)
cg <- cluster_gen(n, n_X = c(1, 2), n_W = c(0, 1), c_mean = list(0.1, c(0.55, 0.32)))
── Hierarchical structure ────────────────────────────────────────────────────────────────
school1
├─school1_class1 (5 students)
└─school1_class2 (5 students)
school2
├─school2_class1 (5 students)
└─school2_class2 (5 students)
school3
├─school3_class1 (5 students)
└─school3_class2 (5 students)
── Information on sampling weights ───────────────────────────────────────────────────────
cg
$school
$school[[1]]
subject q1 school.weight within.school.weight final.class.weight uniqueID
1 1 -0.2247919 1 1 1 class1_school1
2 2 -2.0902316 1 1 1 class2_school1
$school[[2]]
subject q1 school.weight within.school.weight final.class.weight uniqueID
1 1 1.9632649 1 1 1 class1_school2
2 2 0.2743303 1 1 1 class2_school2
$school[[3]]
subject q1 school.weight within.school.weight final.class.weight uniqueID
1 1 1.339111 1 1 1 class1_school3
2 2 -1.577978 1 1 1 class2_school3
$class
$class[[1]]
subject q1 q2 q3 class.weight within.class.weight final.student.weight
1 1 0.6939464 1.429263 3 1 1 1
2 2 0.6444333 -1.340612 2 1 1 1
3 3 0.7724895 1.191870 2 1 1 1
4 4 2.5861532 1.496156 3 1 1 1
5 5 0.9868058 1.428098 2 1 1 1
uniqueID
1 student1_class1_school1
2 student2_class1_school1
3 student3_class1_school1
4 student4_class1_school1
5 student5_class1_school1
$class[[2]]
subject q1 q2 q3 class.weight within.class.weight final.student.weight
1 1 0.04328771 0.9667577 1 1 1 1
2 2 0.85682751 -0.5343172 3 1 1 1
3 3 0.95461020 0.9208733 3 1 1 1
4 4 0.03384666 0.6000454 2 1 1 1
5 5 -0.01723126 2.6036897 1 1 1 1
uniqueID
1 student1_class2_school1
2 student2_class2_school1
3 student3_class2_school1
4 student4_class2_school1
5 student5_class2_school1
$class[[3]]
subject q1 q2 q3 class.weight within.class.weight
1 1 0.2135740 -0.230983361 3 1 1
2 2 0.7499582 1.431284170 1 1 1
3 3 -1.3288479 0.761635852 2 1 1
4 4 1.7718949 -0.501747308 5 1 1
5 5 0.6906895 0.004222373 3 1 1
final.student.weight uniqueID
1 1 student1_class1_school2
2 1 student2_class1_school2
3 1 student3_class1_school2
4 1 student4_class1_school2
5 1 student5_class1_school2
$class[[4]]
subject q1 q2 q3 class.weight within.class.weight
1 1 2.563471234 -0.05330786 3 1 1
2 2 -1.059654042 0.32407999 1 1 1
3 3 -0.336184584 -0.72432619 1 1 1
4 4 -0.004385578 0.58409392 4 1 1
5 5 0.296826906 0.10677686 1 1 1
final.student.weight uniqueID
1 1 student1_class2_school2
2 1 student2_class2_school2
3 1 student3_class2_school2
4 1 student4_class2_school2
5 1 student5_class2_school2
$class[[5]]
subject q1 q2 q3 class.weight within.class.weight final.student.weight
1 1 0.33690460 -1.19476433 1 1 1 1
2 2 1.56177909 -0.26130250 1 1 1 1
3 3 2.76962281 -1.28985141 1 1 1 1
4 4 1.29857289 -0.02952429 1 1 1 1
5 5 0.07510387 1.01647674 3 1 1 1
uniqueID
1 student1_class1_school3
2 student2_class1_school3
3 student3_class1_school3
4 student4_class1_school3
5 student5_class1_school3
$class[[6]]
subject q1 q2 q3 class.weight within.class.weight final.student.weight
1 1 0.1379774 0.99118261 2 1 1 1
2 2 2.1143383 -0.08084345 3 1 1 1
3 3 1.0878675 0.38597927 3 1 1 1
4 4 0.4066554 0.48029670 1 1 1 1
5 5 -0.4821055 -0.04994921 1 1 1 1
uniqueID
1 student1_class2_school3
2 student2_class2_school3
3 student3_class2_school3
4 student4_class2_school3
5 student5_class2_school3
attr(,"class")
[1] "lsasimcluster" "list"