Functions in this package provide solution to classical problem in survey methodology - an optimum sample allocation in stratified sampling schemes. In this context, the optimal allocation is in the classical Tschuprov-Neyman’s sense and it satisfies additional either lower or upper bounds restrictions imposed on sample sizes in strata. There are few different algorithms available to use, and one them is based on popular sample allocation method that applies Neyman allocation to recursively reduced set of strata.
A minor modification of the classical optimium sample allocation problem leads to the minimum sample size allocation. This problems lies in the determination of a vector of strata sample sizes that minimizes total sample size, under assumed fixed level of the pi-estimator’s variance. As in the case of the classical optimal allocation, the problem of minimum sample size allocation can be complemented by imposing upper bounds constraints on sample sizes in strata.
Stratallo provides two user functions, dopt
and nopt
that solve sample allocation problems briefly characterized above. In this context, it is assumed that the sampling designs in strata are chosen so that the variance of the pi-estimator of the population total is of the following generic form:
where H denotes total number of strata, x1,…,xH are the strata sample sizes, and parameters b and aw > 0 do not depend on xw, w = 1,…,H.
Apart from dopt
and nopt
, stratallo provides var_tst
and var_tst_si
functions that compute a value of variance D2st. The var_tst_si
is a simple wrapper of var_tst
that is dedicated for the case of simple random sampling without replacement design in each stratum. Furthermore, the package comes with two predefined, artificial populations with 507 and 969 strata. These are stored in pop507
and pop969
objects respectively.
See package’s vignette for more details.
You can install the released version of stratallo package from CRAN with:
These are basic examples that show how to use dopt
and nopt
functions to solve optimal sample allocation problems for an example population with 4 strata.
dopt
# Example population.
N <- c(3000, 4000, 5000, 2000) # Strata sizes.
S <- c(48, 79, 76, 17) # Standard deviations of a study variable in strata.
a <- N * S
M <- c(100, 90, 70, 80) # Upper bounds constraints imposed on sample sizes in strata.
all(M <= N)
n <- 190 # Total sample size.
n < sum(M)
# Optimal allocation under one-sided upper bounds constraints.
opt <- dopt(n = n, a = a, M = M)
opt
sum(opt) # Equals to n.
all(opt <= M) # Does not violate upper bounds constraints.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
m <- c(50, 120, 1, 1) # Lower bounds constraints imposed on sample sizes in strata.
n > sum(m)
# Optimal allocation under one-sided lower bounds constraints.
opt <- dopt(n = n, a = a, m = m)
opt
sum(opt) # Equals to n.
all(opt >= m) # Does not violate lower bounds constraints.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
# Tschuprov-Neyman allocation (no inequality constraints).
opt <- dopt(n = n, a = a)
opt
sum(opt) # Equals to n.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
nopt