Performance

All the tests were done on an Arch Linux x86_64 machine with an Intel(R) Core(TM) i7 CPU (1.90GHz).

Empirical likelihood computation

We show the performance of computing empirical likelihood with el_mean(). We test the computation speed with simulated data sets in two different settings: 1) the number of observations increases with the number of parameters fixed, and 2) the number of parameters increases with the number of observations fixed.

Increasing the number of observations

We fix the number of parameters at p = 10, and simulate the parameter value and n × p matrices using rnorm(). In order to ensure convergence with a large n, we set a large threshold value using el_control().

library(ggplot2)
library(microbenchmark)
set.seed(3175775)
p <- 10
par <- rnorm(p, sd = 0.1)
ctrl <- el_control(th = 1e+10)
result <- microbenchmark(
  n1e2 = el_mean(matrix(rnorm(100 * p), ncol = p), par = par, control = ctrl),
  n1e3 = el_mean(matrix(rnorm(1000 * p), ncol = p), par = par, control = ctrl),
  n1e4 = el_mean(matrix(rnorm(10000 * p), ncol = p), par = par, control = ctrl),
  n1e5 = el_mean(matrix(rnorm(100000 * p), ncol = p), par = par, control = ctrl)
)

Below are the results:

result
#> Unit: microseconds
#>  expr        min          lq        mean     median          uq        max
#>  n1e2    436.735    469.4955    509.8965    490.660    548.3175    819.829
#>  n1e3   1191.562   1359.6805   1588.1001   1449.614   1650.8245   5582.150
#>  n1e4  10420.193  12382.9380  14520.1949  14776.578  15705.6650  28453.073
#>  n1e5 159142.973 181864.8840 214987.5162 210712.082 243056.3325 375463.484
#>  neval cld
#>    100 a  
#>    100 a  
#>    100  b 
#>    100   c
autoplot(result)

Increasing the number of parameters

This time we fix the number of observations at n = 1000, and evaluate empirical likelihood at zero vectors of different sizes.

n <- 1000
result2 <- microbenchmark(
  p5 = el_mean(matrix(rnorm(n * 5), ncol = 5),
    par = rep(0, 5),
    control = ctrl
  ),
  p25 = el_mean(matrix(rnorm(n * 25), ncol = 25),
    par = rep(0, 25),
    control = ctrl
  ),
  p100 = el_mean(matrix(rnorm(n * 100), ncol = 100),
    par = rep(0, 100),
    control = ctrl
  ),
  p400 = el_mean(matrix(rnorm(n * 400), ncol = 400),
    par = rep(0, 400),
    control = ctrl
  )
)
result2
#> Unit: microseconds
#>  expr        min         lq        mean      median          uq       max neval
#>    p5    714.051    755.739    826.7829    790.8055    832.5525   3689.10   100
#>   p25   2732.957   2778.617   3006.2785   2807.8065   2871.2555   9866.17   100
#>  p100  21143.551  23643.669  25765.0891  24130.3925  28691.7330  44744.18   100
#>  p400 239251.732 263440.302 296609.6124 284875.0475 309075.4735 474184.04   100
#>  cld
#>  a  
#>  a  
#>   b 
#>    c
autoplot(result2)

On average, evaluating empirical likelihood with a 100000×10 or 1000×400 matrix at a parameter value satisfying the convex hull constraint takes less than a second.