Performance

Eunseop Kim

All the tests were done on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz). We first load the necessary packages.

library(melt)
library(microbenchmark)
library(ggplot2)

Empirical likelihood computation

We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.

Increasing the number of observations

We fix the number of parameters at \(p = 10\), and simulate the parameter value and \(n \times p\) matrices using rnorm(). In order to ensure convergence with a large \(n\), we set a large threshold value using el_control().

set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min         lq        mean     median         uq        max neval
#>  n1e2    478.486    567.466    655.2995    610.193    664.700   1283.147   100
#>  n1e3   1431.359   1686.219   2018.6906   1892.880   2133.096   3783.401   100
#>  n1e4  13955.585  18157.292  20678.7358  20452.990  22623.294  34640.493   100
#>  n1e5 288719.574 357686.361 435098.9835 445522.969 490486.745 720364.080   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result)

Increasing the number of parameters

This time we fix the number of observations at \(n = 1000\), and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min         lq        mean     median          uq        max neval
#>    p5    747.099    828.274    965.7782    884.664    990.1295   1807.282   100
#>   p25   2927.671   3153.000   3558.2610   3357.575   3552.0815   7679.986   100
#>  p100  24171.973  27815.401  31560.4120  30185.583  34235.3110  49560.778   100
#>  p400 301981.597 332774.619 374275.9353 356880.586 382856.2170 564443.300   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result2)

On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second.