R - Passing mclapply() a parameter from for (i in range)


I'm trying this:

    nmf.sub <- function(n){
        ## index is a permutation of the original matrix at 0.8
        ## resampling proportion (doesn't matter)
        sub.data.matrix <- data.matrix[, (index[n, ])]
        ## I want to change this 2
        temp.result <- nmf(sub.data.matrix, rank = 2, seed = 12345)
        return(temp.result)
    }

    class.list <- list()
    for (i in nmf.rank){  ## nmf.rank is 2:4
        ## resamp.iterations is 10, nmf.sub is defined above
        results.list <- mclapply(mc.cores = 16, 1:resamp.iterations,
                                 function(n) nmf.sub(n))
    }

But instead of having rank = 2 in the nmf() call that produces temp.result, I want to have rank = i.

Any idea how I can pass that parameter? Passing it through mclapply with function(n, i) doesn't work.

You seemingly have two loops: one over i in nmf.rank and one over n in 1:resamp.iterations. Therefore, you need to pass both i and n to nmf.sub, e.g. as in:

    nmf.sub <- function(n, i){
        ## index is a permutation of the original matrix at 0.8
        ## resampling proportion (doesn't matter)
        sub.data.matrix <- data.matrix[, (index[n, ])]

        ## the rank now comes from the argument i instead of the fixed 2
        temp.result <- nmf(sub.data.matrix, rank = i, seed = 12345)
        return(temp.result)
    }

    resamp.iterations <- 10
    nmf.rank <- 2:4

    res <- lapply(nmf.rank, function(i){
        results.list <- mclapply(mc.cores = 16, 1:resamp.iterations,
                                 function(n) nmf.sub(n, i))
    })
    ## You can then flatten/reshape res
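As a variation on the code above, here is a minimal sketch of two things it leaves open: forwarding i through the `...` argument of mclapply() instead of an anonymous wrapper, and flattening res afterwards. To keep it runnable without the NMF package, `toy.sub` is a hypothetical stand-in for nmf.sub, and `n.cores` is an assumed core count, not something from the question.

```r
library(parallel)

## Hypothetical stand-in for nmf.sub(): returns a small list instead of
## fitting an NMF model, so the sketch runs without the NMF package.
toy.sub <- function(n, i) list(n = n, rank = i)

resamp.iterations <- 10
nmf.rank <- 2:4

## mclapply() relies on forking, which Windows does not support,
## so fall back to a single core there (assumed value, adjust freely).
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

## Extra named arguments given to mclapply() are forwarded to the
## function, so i reaches toy.sub() without a wrapper function.
res <- lapply(nmf.rank, function(i) {
    mclapply(1:resamp.iterations, toy.sub, i = i, mc.cores = n.cores)
})
names(res) <- paste0("rank", nmf.rank)

## Flatten the nested list (3 ranks x 10 resamples) by one level,
## giving a single list of 30 results.
flat <- unlist(res, recursive = FALSE)
length(flat)
```

With `recursive = FALSE`, unlist() removes only the outer level, so each element of `flat` is still one complete result.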

Regarding your comment (below) about efficiency: the bulk of the numerical calculations is performed within the nmf() function, so the loop is set up correctly, in the sense that each process/core gets a numerically intensive job. However, to speed up the computation you might consider using the previously computed result as the starting point, instead of the seed 12345 (unless using the latter seed is mandatory for some reason related to your problem). In the following example I get a 30-40% reduction in execution time:

    library(NMF)
    RNGkind("L'Ecuyer-CMRG") ## use this when using mclapply()
    nr <- 19
    nc <- 2e2
    set.seed(123)
    data.matrix <- matrix(rexp(nc*nr), nr, nc)

    resamp.iterations <- 10
    nmf.rank <- 2:4

    index <- t(sapply(1:resamp.iterations, function(n) sample.int(nc, nc*0.8)))

    nmf.sub <- function(n, i){
        sub.data.matrix <- data.matrix[ , index[n, ]]
        temp.result <- nmf(sub.data.matrix, rank = i, seed = 12345)
        return(temp.result)
    }

    ## version 1
    system.time({
        res <- lapply(nmf.rank, function(i){
            results.list <- mclapply(mc.cores = 16, 1:resamp.iterations,
                                     function(n) nmf.sub(n, i))
        })
    })

    ## version 2: swap the internal and external loops
    system.time({
        res <-
            mclapply(mc.cores = 16, 1:resamp.iterations, function(n){
                res2 <- nmf(data.matrix[ , index[n, ]], rank = 2, seed = 12345)
                res3 <- nmf(data.matrix[ , index[n, ]], rank = 3, seed = 12345)
                res4 <- nmf(data.matrix[ , index[n, ]], rank = 4, seed = 12345)
                list(res2, res3, res4)
            })
    })

    ## version 3: use the previous calculation as the starting point
    ##   ==> 30-40% reduction in computing time
    system.time({
        res <-
            mclapply(mc.cores = 16, 1:resamp.iterations, function(n){
                res2 <- nmf(data.matrix[ , index[n, ]], rank = 2, seed = 12345)
                res3 <- nmf(data.matrix[ , index[n, ]], rank = 3, seed = res2)
                res4 <- nmf(data.matrix[ , index[n, ]], rank = 4, seed = res3)
                list(res2, res3, res4)
            })
    })
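One further layout that might be worth trying when the fixed seed has to be kept: versions 1 and 2 hand mclapply() only 10 jobs at a time, so with mc.cores = 16 some cores stay idle. The sketch below builds one job per (n, i) pair so that all 30 fits are distributed in a single call. It does not apply to version 3, where each rank depends on the previous result. `toy.sub` is again a hypothetical stand-in for nmf.sub, and `n.cores` is an assumed value; whether this is actually faster on your data is untested here.

```r
library(parallel)

## Hypothetical stand-in for nmf.sub(), so the sketch runs without NMF.
toy.sub <- function(n, i) list(n = n, rank = i)

resamp.iterations <- 10
nmf.rank <- 2:4

## One row per (n, i) job: 10 resamples x 3 ranks = 30 rows.
grid <- expand.grid(n = 1:resamp.iterations, i = nmf.rank)

## Forking is unavailable on Windows, so use one core there (assumed value).
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

## A single parallel loop over all rows of the grid.
res.grid <- mclapply(seq_len(nrow(grid)), function(k) {
    toy.sub(grid$n[k], grid$i[k])
}, mc.cores = n.cores)

## Regroup the flat result list by rank: a list with elements "2", "3", "4",
## each holding the 10 resampling results for that rank.
by.rank <- split(res.grid, grid$i)
```

mclapply() returns its results in the order of the input, so row k of `grid` always describes element k of `res.grid`, which is what makes the split() by `grid$i` safe.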
