* using log directory ‘/data/gannet/ripley/R/packages/tests-MKL/kbal.Rcheck’
* using R Under development (unstable) (2025-04-06 r88113)
* using platform: x86_64-pc-linux-gnu
* R was compiled by
    gcc (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3)
    GNU Fortran (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3)
* running under: Fedora Linux 40 (Workstation Edition)
* using session charset: UTF-8
* using option ‘--no-stop-on-test-error’
* checking for file ‘kbal/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘kbal’ version ‘0.1.2’
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘kbal’ can be installed ... [25s/30s] OK
* used C++ compiler: ‘g++ (GCC) 14.2.1 20240912 (Red Hat 14.2.1-3)’
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking code files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... [16s/19s] OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking LazyData ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking pragmas in C/C++ headers and code ... OK
* checking compilation flags used ... OK
* checking compiled code ... OK
* checking examples ... [17s/22s] ERROR
Running examples in ‘kbal-Ex.R’ failed
The error most likely occurred in:

> ### Name: kbal
> ### Title: Kernel Balancing
> ### Aliases: kbal
>
> ### ** Examples
>
> #----------------------------------------------------------------
> # Example 1: Reweight a control group to a treated to estimate ATT.
> # Benchmark using Lalonde et al.
> #----------------------------------------------------------------
> #1. Rerun Lalonde example with settings as in Hazlett, C (2017). Statistica Sinica paper:
> set.seed(123)
> data("lalonde")
> # Select a random subset of 500 rows
> lalonde_sample <- sample(1:nrow(lalonde), 500, replace = FALSE)
> lalonde <- lalonde[lalonde_sample, ]
>
> xvars=c("age","black","educ","hisp","married","re74","re75","nodegr","u74","u75")
>
> #2. Lalonde with categorical data only: u74, u75, nodegree, race, married
> cat_vars=c("race_ethnicity","married","nodegr","u74","u75")
>
> #3. Lalonde with mixed categorical and continuous data
> cat_vars=c("race_ethnicity", "married")
> all_vars= c("age","educ","re74","re75","married", "race_ethnicity")
>
> #----------------------------------------------------------------
> # Example 1B: Reweight a control group to a treated to esimate ATT.
> # Benchmark using Lalonde et al. -- but just mean balancing now
> # via "linkernel".
> #----------------------------------------------------------------
>
> # Rerun Lalonde example with settings as in Hazlett, C (2017). Statistica paper:
> kbalout.lin= kbal(allx=lalonde[,xvars],
+                  b=length(xvars),
+                  treatment=lalonde$nsw,
+                  linkernel=TRUE,
+                  fullSVD=TRUE)
No multicollinear columns detected. Matrix is already full rank.
Warning in kbal(allx = lalonde[, xvars], b = length(xvars), treatment = lalonde$nsw, :
  One or more columns of "allx" contain less than 10 unique values, but "cat_data" and "mixed_data" are set to FALSE. Are you sure "allx" contains only continuous data?
Building kernel matrix
Running full SVD on kernel matrix
Without balancing, biasbound (norm=1) is 4.03513 and the L1 discrepancy is 0
With 1 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 2.32558
With 2 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 1.97083
With 3 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.55838
With 4 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.57854
With 5 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.55377
With 6 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.49756
With 7 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.48767
With 8 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.21908
With 9 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0.1794
With 10 dimensions of K, ebalance convergence is FALSE yielding biasbound (norm=1) of 4.03513
Disregarding ebalance convergence and re-running at optimal choice of numdims, 9
>
> # Check balance with and without these weights:
> dimw(X=lalonde[,xvars], w=kbalout.lin$w, target=lalonde$nsw)
$dim
          age         black          educ          hisp       married
-9.875976e+00  6.523195e-01 -1.742601e+00  1.827940e-02 -7.499336e-01
         re74          re75        nodegr           u74           u75
-1.693487e+04 -1.709223e+04  4.201605e-01  6.267071e-01  5.295712e-01

$dimw
             age        black        educ         hisp      married     re74
[1,] -0.02877836 -0.002877797 0.007592949 0.0001114375 -0.001007108 1318.643
          re75        nodegr        u74         u75
[1,] -1354.461 -0.0006438384 0.02466336 -0.02688349

>
> summary(lm(re78~nsw,w=kbalout.lin$w, data = lalonde))
Warning in summary.lm(lm(re78 ~ nsw, w = kbalout.lin$w, data = lalonde)) :
  essentially perfect fit: summary may be unreliable

Call:
lm(formula = re78 ~ nsw, data = lalonde, weights = kbalout.lin$w)

Weighted Residuals:
   Min     1Q Median     3Q    Max
-50264      0      0      0  82196

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3912.8      321.4  12.173   <2e-16 ***
nsw           2259.0     1122.5   2.013   0.0447 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6886 on 493 degrees of freedom
Multiple R-squared:  0.008149, Adjusted R-squared:  0.006137
F-statistic:  4.05 on 1 and 493 DF,  p-value: 0.0447

>
> #----------------------------------------------------------------
> # Example 2: Reweight a sample to a target population.
> #----------------------------------------------------------------
> # Suppose a population consists of four groups in equal shares:
> # white republican, non-white republican, white non-republicans,
> # and non-white non-republicans. A given policy happens to be supported
> # by all white republicans, and nobody else. Thus the mean level of
> # support in the population should be 25%.
> #
> # Further, the sample is surveyed in such a way that was careful
> # to quota on party and race, obtaining 50% republican and 50% white.
> # However, among republicans three-quarters are white and among non-republicans,
> # three quarters are non-white. This biases the average level of support
> # despite having a sample that matches the population on its marginal distributions. #'
> # We'd like to reweight the sample so it resembles the population not
> # just on the margins, but in the joint distribution of characteristics.
>
> pop <- data.frame(
+     republican = c(rep(0,400), rep(1,400)),
+     white = c(rep(1,200), rep(0,200), rep(1,200), rep(0,200)),
+     support = c(rep(1,200), rep(0,600)))
>
> mean(pop$support) # Target value
[1] 0.25
>
> # Survey sample: correct margins/means, but wrong joint distribution
> samp <- data.frame( republican = c(rep(1, 40), rep(0,40)),
+     white = c(rep(1,30), rep(0,10), rep(1,10), rep(0,30)),
+     support = c(rep(1,30), rep(0,50)))
>
> mean(samp$support) # Appears that support is 37.5% instead of 25%.
[1] 0.375
>
> # Mean Balancing -----------------------------------------
> # Sample is already mean-balanced to the population on each
> # characteristic. However for illustrative purposes, use ebal()
> dat <- rbind(pop,samp)
>
> # Indicate which units are sampled (1) and which are population units(0)
> sampled <- c(rep(0,800), rep(1,80))
>
> # Run ebal (treatment = population units = 1-sampled)
> ebal_out <- ebalance_custom(Treatment = 1-sampled,
+                             X=dat[,1:2],
+                             constraint.tolerance=1e-6,
+                             print.level=-1)
>
> # We can see everything gets even weights, since already mean balanced.
> length(unique(ebal_out$w))
[1] 1
>
> # And we end up with the same estimate we started with
> weighted.mean(samp[,3], w = ebal_out$w)
[1] 0.375
>
> # We see that, because the margins are correct, all weights are equal
> unique(cbind(samp, e_bal_weight = ebal_out$w))
   republican white support e_bal_weight
1           1     1       1           10
31          1     0       0           10
41          0     1       0           10
51          0     0       0           10
>
> # Kernel balancing for weighting to a population (i.e. kpop) -------
> kbalout = kbal(allx=dat[,1:2],
+                useasbases=rep(1,nrow(dat)),
+                sampled = sampled,
+                b = 1,
+                sampledinpop = FALSE)
No multicollinear columns detected. Matrix is already full rank.
Warning in kbal(allx = dat[, 1:2], useasbases = rep(1, nrow(dat)), sampled = sampled, :
  One or more columns of "allx" contain less than 10 unique values, but "cat_data" and "mixed_data" are set to FALSE. Are you sure "allx" contains only continuous data?
Building kernel matrix
Running truncated SVD on kernel matrix up to 704 dimensions
Warning in value[[3L]](cond) :
  Truncated SVD failed with error: TridiagEigen: eigen decomposition failed
Falling back to full SVD on kernel matrix.
Without balancing, biasbound (norm=1) is 0.2454 and the L1 discrepancy is 0.454
With 1 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0
With 2 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0
With 3 dimensions of K, ebalance convergence is TRUE yielding biasbound (norm=1) of 0
With 4 dimensions of K, ebalance convergence is FALSE yielding biasbound (norm=1) of 0.2454
Disregarding ebalance convergence and re-running at optimal choice of numdims, 3
Error in kbal(allx = dat[, 1:2], useasbases = rep(1, nrow(dat)), sampled = sampled, :
  object 'var_explained' not found
Execution halted
* checking PDF version of manual ... [9s/13s] OK
* checking HTML version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE

Status: 1 ERROR
See
  ‘/data/gannet/ripley/R/packages/tests-MKL/kbal.Rcheck/00check.log’
for details.

Command exited with non-zero status 1
Time 2:18.11, 98.09 + 11.20