Description
Hello,
I’d like to report a bug in the tbl_regression function when creating regression tables for survey designs with many independent variables. The issue is that the confidence intervals and p-values are missing from the table.
My R data are attached:
data_frame.zip
Below is a simplified version of my analysis code:
library(gtsummary)
library(survey)
data_frame <- data_to_save
# Constructing a complex survey design object
svy_design <- survey::svydesign(strata = ~SDMVSTRA, id = ~SDMVPSU, weights = ~WTINT2YR, nest = TRUE, data = data_frame)
# Fitting a survey-weighted logistic regression model
result <- survey::svyglm(RIAGENDR ~ DMDHRGND + DMDCITZN + SIALANG + DMDHRAGZ + RIDRETH3 + DMDHRMAZ + INDFMPIR, family = binomial(), design = svy_design)
tbl_regression(result, exponentiate = TRUE)
Here’s an example of the output, where the confidence intervals and p-values are missing:
Interestingly, when we examine the model summary, the standard errors are present, but the p-values are missing:
# Print model summary with statistical measures
summary(result)
According to the svyglm function documentation, specifying the degrees of freedom brings back the p-values:
# Print model summary using the design degrees of freedom (df.resid = degf of the design)
summary(result, df.resid = degf(result$survey.design))
# Print model summary using infinite degrees of freedom (df.resid = Inf, i.e. normal approximation)
summary(result, df.resid = Inf)
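To check whether the degrees of freedom really are the cause, it may help to compare the design degrees of freedom with the number of estimated coefficients. The following is a minimal sketch: `svyglm_df_check` is a hypothetical helper (not part of survey or gtsummary), and it assumes that `summary()` falls back on the fit's `df.residual` component when `df.resid` is not supplied.

```r
library(survey)

# Hypothetical helper (not part of survey or gtsummary): report the design
# degrees of freedom, the number of coefficients, and the residual degrees
# of freedom stored on a svyglm fit.
svyglm_df_check <- function(fit) {
  stopifnot(inherits(fit, "svyglm"))
  data.frame(
    design_df   = survey::degf(fit$survey.design),  # design degrees of freedom
    n_coef      = length(stats::coef(fit)),         # coefficients, incl. intercept
    residual_df = fit$df.residual                   # df stored on the fitted model
  )
}

# Usage with the model from this report:
# svyglm_df_check(result)
```

If `residual_df` comes out at zero or below for the model above, that would be consistent with the p-values disappearing only when many independent variables are included.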
In summary, when the model has many independent variables, the tbl_regression output lacks confidence intervals and p-values. I suspect the confidence intervals are missing because tbl_regression extracts profile-likelihood CIs; switching to Wald CIs might resolve this. The missing p-values seem to be related to the degrees-of-freedom setting: they appear when df.resid = Inf or df.resid = degf(result$survey.design) is passed to summary(). How can this setting be passed to tbl_regression?
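One possible workaround is the `tidy_fun` argument of `tbl_regression()`, which accepts a custom tidier. The sketch below is untested against the attached data. `tidy_svyglm_wald` is a hypothetical function written for this report, not a gtsummary or broom function. It assumes the tidier is called with `exponentiate`, `conf.int`, and `conf.level` arguments and is expected to return the usual broom-style columns. It computes Wald intervals from the `summary()` coefficient table, which is a swapped-in technique and not necessarily what `tbl_regression()` does by default.

```r
library(survey)

# Hypothetical custom tidier (an assumption, not a gtsummary function):
# Wald p-values and confidence intervals with a user-chosen denominator df.
tidy_svyglm_wald <- function(x, exponentiate = FALSE, conf.int = TRUE,
                             conf.level = 0.95, df.resid = Inf, ...) {
  # Coefficient table: estimate, standard error, t value, p-value
  coefs <- summary(x, df.resid = df.resid)$coefficients
  est <- coefs[, 1]
  se <- coefs[, 2]

  # t critical value; with df = Inf this equals the normal quantile
  crit <- stats::qt(1 - (1 - conf.level) / 2, df = df.resid)

  out <- data.frame(
    term = rownames(coefs),
    estimate = est,
    std.error = se,
    statistic = coefs[, 3],
    p.value = coefs[, 4],
    conf.low = est - crit * se,
    conf.high = est + crit * se,
    row.names = NULL,
    stringsAsFactors = FALSE
  )

  # Report odds ratios and their interval limits when requested
  if (exponentiate) {
    out$estimate <- exp(out$estimate)
    out$conf.low <- exp(out$conf.low)
    out$conf.high <- exp(out$conf.high)
  }
  out
}

# Usage with the model from this report (normal approximation, df = Inf):
# gtsummary::tbl_regression(result, exponentiate = TRUE,
#                           tidy_fun = tidy_svyglm_wald)
```

To use the design degrees of freedom instead, the tidier could be wrapped, for example `tidy_fun = function(x, ...) tidy_svyglm_wald(x, df.resid = degf(x$survey.design), ...)`.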
Could you please look into this issue? It's becoming a significant limitation as most of my analyses involve many independent variables, making the tbl_regression results unusable.
I've attached my R dataset for your reference.
Thank you!



