Engagement in Integrative and Nonpharmacologic Pain Management Modalities Among Adults with Chronic Pain

Analysis of the 2019 National Health Interview Survey

Author

Samuel N. Rodgers-Melnick, MPH, MT-BC

Published

2024-01-22

You can download the .qmd file, NHISstudy2024-01-07GitHub.qmd, from my GitHub page here.

You can read the full research article in Journal of Pain Research here

Rodgers-Melnick SN, Trager RJ, Love TE, Dusek JA. Engagement in Integrative and Nonpharmacologic Pain Management Modalities Among Adults with Chronic Pain: Analysis of the 2019 National Health Interview Survey. J Pain Res. Jan 16 2024;17:253-264. doi:10.2147/jpr.S439682

R Packages and Setup

I loaded the following packages to ensure I had the necessary data cleaning, analysis, and visualization tools needed for this project. Given the number of packages, I used conflict_prefer() to ensure functions would not interfere with each other. I also loaded uhred for use in plots.

Code
knitr::opts_chunk$set(comment = NA) # do not remove this

library(janitor)
library(gt)
library(naniar)
library(rms)
library(patchwork)
library(Epi)
library(car)
library(caret)
library(GGally)
library(pscl)
library(VGAM)
library(MASS)
library(flextable)
library(mosaic)
library(pROC)
library(ROCR)
library(gtsummary)
library(simputation)
library(ggrepel)
library(nnet)
library(kableExtra)
library(conflicted)
library(survey)
library(tidymodels)
library(tidyverse)

conflict_prefer("select", "dplyr")
conflict_prefer("summarize", "dplyr")
conflict_prefer("filter", "dplyr")
conflict_prefer("mean", "base")
conflict_prefer("sum", "base")
conflict_prefer("vif", "car")
conflict_prefer("count", "dplyr")

uhred <- "#B70022" 

theme_set(theme_bw())

1 Data Source

1.1 National Health Interview Survey

The data for this project comes from the 2019 National Health Interview Survey (NHIS) Website (National Center for Health Statistics, 2021). To obtain the data, I downloaded the CSV data file for the Sample Adult Interview to my local hard drive and unzipped the adult19.csv file.

According to the National Health Interview Survey website:

Purpose: “The main objective of the NHIS is to monitor the health of the United States population through the collection and analysis of data on a broad range of health topics. NHIS data are used widely throughout the Department of Health and Human Services (HHS) to monitor trends in illness and disability and to track progress toward achieving national health objectives. The data are also used by the public health research community for epidemiological and policy analysis of such timely issues as characterizing those with various health problems, determining barriers to accessing and using appropriate health care, and evaluating Federal health programs.”

Design: The NHIS is a cross-sectional household interview survey.

Target population: “The target population for the NHIS is the civilian non-institutionalized population residing within the 50 states and the District of Columbia at the time of the interview. Persons excluded from the universe are those with no fixed household address (e.g., homeless and/or transient persons not residing in shelters), active duty military personnel, and civilians living on military bases, persons in long-term care institutions (e.g., nursing homes for the elderly, hospitals for the chronically ill or physically or intellectually disabled, and wards for abused or neglected children), persons in correctional facilities (e.g., prisons or jails, juvenile detention centers, and halfway houses), and U.S. nationals living in foreign countries. While active-duty Armed Forces personnel cannot be sampled for inclusion in the survey, any civilians residing with Armed Forces personnel in non-military housing are eligible to be sampled.”

Sampling strategy: “To keep survey operations manageable, cost-effective, and timely, the NHIS uses geographically clustered sampling techniques to select the sample of dwelling units for the NHIS. The sample is designed in such a way that each month’s sample is nationally representative. Data collection on the NHIS is continuous, i.e., from January to December each year.”

“One ‘sample adult’ aged 18 years or older and one ‘sample child’ aged 17 years or younger (if any children live in the household) are randomly selected from each household following a brief initial interview that identifies everyone who usually lives or stays in the household. Information about the sample adult is collected from the sample adult herself or himself unless she or he is physically or mentally unable to do so, in which case a knowledgeable proxy can answer for the sample adult. Information about the sample child is collected from a parent or adult who is knowledgeable about and responsible for the health care of the sample child. This respondent may or may not also be the sample adult.”

Data collection procedures: “The U.S. Census Bureau, under a contractual agreement, is the data collection agent for the NHIS. NHIS data are collected continuously throughout the year by Census interviewers. Nationally, about 750 interviewers (also called ‘Field Representatives’) are trained and directed by health survey supervisors in the U.S. Census Bureau Regional Offices to conduct interviews for NHIS.

The NHIS is conducted using computer-assisted personal interviewing. Face-to-face interviews are conducted in respondents’ homes, but follow-ups to complete interviews may be conducted over the telephone. A telephone interview may also be conducted when the respondent requests a telephone interview or when road conditions or travel distances would make it difficult to schedule a personal visit before the required completion date. In 2019, 34.3% of the Sample Adult interviews and 31.7% of the Sample Child interviews were conducted at least partially by telephone.”

1.2 The Subjects

Subjects included in this project are adults (18 years and older) with chronic pain as defined by responding “most days” or “every day” to the survey question, “In the past 3 months, how often did you have pain?” (Zelaya et al., 2020).

2 Research Questions

Which demographic, mental health, and pain-related variables are associated with engagement in: (1) the number of IHM modalities used for pain and (2) exclusively nonpharmacologic modalities rather than opioid utilization among U.S. adults with chronic pain.

3 Data Ingest and Management

3.1 Loading the Raw Data

I ingested the adult19.csv raw data that I downloaded and unzipped from the 2019 National Health Interview Survey (NHIS) Website as projectb_raw using the read_csv() function.

Code
projectb_raw <- read_csv("data/adult19.csv", show_col_types = FALSE)

3.2 Cleaning the Data

In the following code, I created a new tibble, projectb_clean, limited to respondents who stated that they had pain on most days (PAIFRQ3M_A = 3) or everyday (PAIFRQ3M_A = 4) in response to the question PAIFRQ3M_A, “In the past three months, how often did you have pain?” as this matched the chronic pain definition reported by the National Center for Health Statistics (Zelaya et al., 2020). I also set the filter at PAIFRQ3M_A < 7 to exclude responses that were missing, refused, non ascertained, or “don’t know” in response to the chronic pain question.

After filtering to the chronic pain cohort, I selected the variables from the NHIS that I needed for this analysis.

Code
projectb_clean <- projectb_raw |>
  filter(PAIFRQ3M_A >= 3 & PAIFRQ3M_A < 7) |> #Limited to pain on most days or everyday per Zelaya et al. (2020)
  select(subject_id = HHX, age = AGEP_A, sex = SEX_A,
         wt_final = WTFA_A,
         education = EDUC_A, urban_rural = URBRRL,
         painlimit_lifework = PAIWKLM3M_A,
         anymeds12mo = RX12M_A, opioids12mo = OPD12M_A, opioids3mo = OPD3M_A, 
         opioids_chonpain3mo = OPDCHRONIC_A, opioid_freq3mo = OPDFREQ_A,
         GAD71_A, GAD72_A, GAD73_A, GAD74_A, 
         GAD75_A, GAD76_A, GAD77_A, anx_severity = GADCAT_A,
         back_pain = PAIBACK3M_A, upper_ext_pain = PAIULMB3M_A,
         lower_ext_pain = PAILLMB3M_A, headache_migraine = PAIHDFC3M_A,
         abd_pelvic_pain = PAIAPG3M_A, tooth_jaw_pain = PAITOOTH3M_A,
         chiro = PAICHIRO_A, yoga_taichi = PAIYOGA_A, 
         massage = PAIMASSAGE_A, meditation = PAIMEDITAT_A,
         race_ethnicity = HISPALLP_A, fam_income = FAMINCTC_A,
         pt_rehab_ot = PAIPHYSTPY_A, talk_therapy = PAITALKTPY_A,
         chron_pain_selfmgmt = PAIPROGRAM_A, peer_support_grp = PAIGROUP_A,
         PHQ81_A, PHQ82_A, PHQ83_A, PHQ84_A, PHQ85_A, 
         PHQ86_A, PHQ87_A, PHQ88_A, dep_severity = PHQCAT_A)

3.2.1 Recode Refused, Not Ascertained, and Don’t Know as Missing

Using the 2019 NHIS Codebook, I recoded all responses that were Refused, Not Ascertained, or Don’t Know as NA.

Code
projectb_clean$age[projectb_clean$age >= 97] <- NA
projectb_clean$sex[projectb_clean$sex >= 7] <- NA
projectb_clean$education[projectb_clean$education >= 97] <- NA
projectb_clean$painlimit_lifework[projectb_clean$painlimit_lifework >= 7] <- NA
projectb_clean$anymeds12mo[projectb_clean$anymeds12mo >= 7] <- NA
projectb_clean$opioids12mo[projectb_clean$opioids12mo >= 7] <- NA
projectb_clean$opioids3mo[projectb_clean$opioids3mo >= 7] <- NA
projectb_clean$opioids_chonpain3mo[projectb_clean$opioids_chonpain3mo >= 7] <- NA
projectb_clean$opioid_freq3mo[projectb_clean$opioid_freq3mo >= 7] <- NA
projectb_clean$GAD71_A[projectb_clean$GAD71_A >= 7] <- NA
projectb_clean$GAD72_A[projectb_clean$GAD72_A >= 7] <- NA
projectb_clean$GAD73_A[projectb_clean$GAD73_A >= 7] <- NA
projectb_clean$GAD74_A[projectb_clean$GAD74_A >= 7] <- NA
projectb_clean$GAD75_A[projectb_clean$GAD75_A >= 7] <- NA
projectb_clean$GAD76_A[projectb_clean$GAD76_A >= 7] <- NA
projectb_clean$GAD77_A[projectb_clean$GAD77_A >= 7] <- NA
projectb_clean$anx_severity[projectb_clean$anx_severity >= 8] <- NA


projectb_clean$back_pain[projectb_clean$back_pain >= 7] <- NA
projectb_clean$upper_ext_pain[projectb_clean$upper_ext_pain >= 7] <- NA
projectb_clean$lower_ext_pain[projectb_clean$lower_ext_pain >= 7] <- NA
projectb_clean$headache_migraine[projectb_clean$headache_migraine >= 7] <- NA
projectb_clean$abd_pelvic_pain[projectb_clean$abd_pelvic_pain >= 7] <- NA
projectb_clean$tooth_jaw_pain[projectb_clean$tooth_jaw_pain >= 7] <- NA

projectb_clean$chiro[projectb_clean$chiro >= 7] <- NA
projectb_clean$yoga_taichi[projectb_clean$yoga_taichi >= 7] <- NA
projectb_clean$massage[projectb_clean$massage >= 7] <- NA
projectb_clean$meditation[projectb_clean$meditation >= 7] <- NA

projectb_clean$race_ethnicity[projectb_clean$race_ethnicity >= 97] <- NA

projectb_clean$pt_rehab_ot[projectb_clean$pt_rehab_ot >= 7] <- NA
projectb_clean$talk_therapy[projectb_clean$talk_therapy >= 7] <- NA
projectb_clean$chron_pain_selfmgmt[projectb_clean$chron_pain_selfmgmt >= 7] <- NA
projectb_clean$peer_support_grp[projectb_clean$peer_support_grp >= 7] <- NA

projectb_clean$PHQ81_A[projectb_clean$PHQ81_A >= 7] <- NA
projectb_clean$PHQ82_A[projectb_clean$PHQ82_A >= 7] <- NA
projectb_clean$PHQ83_A[projectb_clean$PHQ83_A >= 7] <- NA
projectb_clean$PHQ84_A[projectb_clean$PHQ84_A >= 7] <- NA
projectb_clean$PHQ85_A[projectb_clean$PHQ85_A >= 7] <- NA
projectb_clean$PHQ86_A[projectb_clean$PHQ86_A >= 7] <- NA
projectb_clean$PHQ87_A[projectb_clean$PHQ87_A >= 7] <- NA
projectb_clean$PHQ88_A[projectb_clean$PHQ88_A >= 7] <- NA
projectb_clean$dep_severity[projectb_clean$dep_severity >= 8] <- NA

3.2.2 Recode Binary Variables From (1 = Yes, 2 = No) to (1 = Yes, 0 = No)

In the following code, I recoded binary variables with initial responses in the format, 1 = Yes, 2 = No, to binary variables coded as 1 = Yes, 0 = No using the mutate() function, subtracting 2 from the variables chiro, yoga_taichi, massage, meditation, anymeds12mo, opioids12mo, opioids3mo, opioids_chonpain3mo, pt_rehab_ot, talk_therapy, chron_pain_selfmgmt, and peer_support_grp.

Code
projectb_clean <- projectb_clean |>
  mutate(chiro = as.numeric(2 - chiro),
         yoga_taichi = as.numeric(2 - yoga_taichi),
         massage = as.numeric(2 - massage),
         meditation = as.numeric(2 - meditation),
         anymeds12mo = as.numeric(2 - anymeds12mo),
         opioids12mo = as.numeric(2 - opioids12mo),
         opioids3mo = as.numeric(2 - opioids3mo),
         opioids_chonpain3mo = as.numeric(2 - opioids_chonpain3mo),
         pt_rehab_ot = as.numeric(2 - pt_rehab_ot),
         talk_therapy = as.numeric(2 - talk_therapy),
         chron_pain_selfmgmt = as.numeric(2 - chron_pain_selfmgmt),
         peer_support_grp = as.numeric(2 - peer_support_grp))

3.2.3 Recode Categorical Variables to Meaningful Factors

Using the 2019 NHIS Codebook, I recoded the following variables to meaningful, ascending, and ordered factors using the mutate() and factor() functions. To make race_ethnicity 6 categories, I combined “NH Amer Indian Alaska Natv only” and “NH Amer Indian Alaska Natv and any other group” into “NH Amer Indian Alaska Natv.”

Code
projectb_clean <- projectb_clean |>
  mutate(sex = factor(case_when(
    sex == 1 ~ "Male",
    sex == 2 ~ "Female"))) |>
  mutate(race_ethnicity = case_when(
    race_ethnicity == 1 ~ "Hispanic",
    race_ethnicity == 2 ~ "NH White only",
    race_ethnicity == 3 ~ "NH Black/African American only",
    race_ethnicity == 4 ~ "NH Asian only",
    race_ethnicity == 5 ~ "NH Amer Indian Alaska Natv",
    race_ethnicity == 6 ~ "NH Amer Indian Alaska Natv",
    race_ethnicity == 7 ~ "Other single and multiple races")) |>
  mutate(race_ethnicity = 
           factor(race_ethnicity, 
                  levels = c("NH White only", 
                             "Hispanic", 
                             "NH Black/African American only", 
                             "NH Asian only", 
                             "NH Amer Indian Alaska Natv",
                             "Other single and multiple races"))) |>
  mutate(painlimit_lifework = case_when(
    painlimit_lifework == 1 ~ "Never",
    painlimit_lifework == 2 ~ "Some days",
    painlimit_lifework == 3 ~ "Most days",
    painlimit_lifework == 4 ~ "Every day")) |>
  mutate(painlimit_lifework = factor(painlimit_lifework, 
                                     levels = c("Never", 
                                                "Some days", 
                                                "Most days", 
                                                "Every day"))) |>
  mutate(anx_severity = case_when(
    anx_severity == 1 ~ "None/Minimal",
    anx_severity == 2 ~ "Mild",
    anx_severity == 3 ~ "Moderate",
    anx_severity == 4 ~ "Severe")) |>
  mutate(anx_severity = factor(anx_severity, 
                               levels = c("None/Minimal", 
                                          "Mild", 
                                          "Moderate", 
                                          "Severe"))) |>
  mutate(dep_severity = case_when(
    dep_severity == 1 ~ "None/Minimal",
    dep_severity == 2 ~ "Mild",
    dep_severity == 3 ~ "Moderate",
    dep_severity == 4 ~ "Severe")) |>
  mutate(dep_severity = factor(dep_severity, 
                               levels = c("None/Minimal", 
                                          "Mild", 
                                          "Moderate", 
                                          "Severe"))) |>
  mutate(opioid_freq3mo = case_when(
    opioid_freq3mo == 1 ~ "Some days",
    opioid_freq3mo == 2 ~ "Most days",
    opioid_freq3mo == 3 ~ "Every day")) |>
  mutate(opioid_freq3mo = factor(opioid_freq3mo, levels = c("Some days", 
                                                        "Most days", 
                                                        "Every day"))) |>
  mutate(education = case_when(
    education == 0 ~ "Less than high school",
    education == 1 ~ "Less than high school",
    education == 2 ~ "Less than high school",
    education == 3 ~ "High school graduate/GED",
    education == 4 ~ "High school graduate/GED",
    education == 5 ~ "Some college, no degree",
    education == 6 ~ "Associate's or bachelor's degree",
    education == 7 ~ "Associate's or bachelor's degree",
    education == 8 ~ "Associate's or bachelor's degree",
    education == 9 ~ "Master's, doctoral, or professional degree",
    education == 10 ~ "Master's, doctoral, or professional degree",
    education == 11 ~ "Master's, doctoral, or professional degree")) |>
  mutate(education = factor(education, levels = c("Less than high school", 
                                                  "High school graduate/GED",
                                                  "Some college, no degree",
                                                  "Associate's or bachelor's degree",
                                                  "Master's, doctoral, or professional degree"))) |>
  mutate(urban_rural = case_when(
    urban_rural == 1 ~ "Large central metro",
    urban_rural == 2 ~ "Large fringe metro",
    urban_rural == 3 ~ "Medium and small metro",
    urban_rural == 4 ~ "Nonmetropolitan")) |>
  mutate(urban_rural = factor(urban_rural, levels = c("Nonmetropolitan", 
                                                      "Medium and small metro", 
                                                      "Large fringe metro",
                                                      "Large central metro")))

3.2.4 Calculate GAD-7 total

In the following code, I recoded the original NHIS responses to conform to the actual scoring method for the GAD-7 (Löwe et al., 2008). I then calculated gad7total by adding each of the individual GAD-7 responses. Finally, to verify that gad7total was scored correctly, I summarized gad7total by anx_severity using favstats() to ensure that the values were in range. Interestingly, in the 2019 NHIS, it appears that they scored subjects even if they had missing individual items on the GAD-7.

Code
projectb_clean <- projectb_clean |>
  mutate(GAD71_A = case_when(
    GAD71_A == 1 ~ 0,
    GAD71_A == 2 ~ 1,
    GAD71_A == 3 ~ 2,
    GAD71_A == 4 ~ 3)) |>
  mutate(GAD72_A = case_when(
    GAD72_A == 1 ~ 0,
    GAD72_A == 2 ~ 1,
    GAD72_A == 3 ~ 2,
    GAD72_A == 4 ~ 3)) |>
  mutate(GAD73_A = case_when(
    GAD73_A == 1 ~ 0,
    GAD73_A == 2 ~ 1,
    GAD73_A == 3 ~ 2,
    GAD73_A == 4 ~ 3)) |>
  mutate(GAD74_A = case_when(
    GAD74_A == 1 ~ 0,
    GAD74_A == 2 ~ 1,
    GAD74_A == 3 ~ 2,
    GAD74_A == 4 ~ 3)) |>
  mutate(GAD75_A = case_when(
    GAD75_A == 1 ~ 0,
    GAD75_A == 2 ~ 1,
    GAD75_A == 3 ~ 2,
    GAD75_A == 4 ~ 3)) |>
  mutate(GAD76_A = case_when(
    GAD76_A == 1 ~ 0,
    GAD76_A == 2 ~ 1,
    GAD76_A == 3 ~ 2,
    GAD76_A == 4 ~ 3)) |>
  mutate(GAD77_A = case_when(
    GAD77_A == 1 ~ 0,
    GAD77_A == 2 ~ 1,
    GAD77_A == 3 ~ 2,
    GAD77_A == 4 ~ 3)) |>
  mutate(gad7total = GAD71_A + GAD72_A + GAD73_A + 
           GAD74_A + GAD75_A + GAD76_A + GAD77_A)

favstats(gad7total ~ anx_severity, data = projectb_clean) |> 
  flextable() |>
  colformat_double(digits = 2) |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

anx_severity

min

Q1

median

Q3

max

mean

sd

n

missing

None/Minimal

0.00

0.00

0.00

2.00

4.00

1.00

1.31

4,870

27

Mild

5.00

5.00

6.00

8.00

9.00

6.60

1.31

1,141

12

Moderate

10.00

11.00

12.00

13.00

14.00

11.83

1.42

574

4

Severe

15.00

16.00

18.00

19.00

21.00

17.82

2.13

508

5

3.2.5 Calculate PHQ-8 Total

In the following code, I recoded the original NHIS responses to conform to the actual scoring method for the PHQ-8 (Kroenke et al., 2001). I then calculated phq8total by adding each of the individual PHQ-8 responses.

Finally, to verify that phq8total was scored correctly, I summarized phq8total by dep_severity using favstats() to ensure that the values were in range. Interestingly, in the 2019 NHIS, it appears that they dropped “moderate-to-severe depression” as a value given that they did not use the full PHQ-9, and they scored subjects even if they had missing individual items on the PHQ-8.

Code
projectb_clean <- projectb_clean |>
  mutate(PHQ81_A = case_when(
    PHQ81_A == 1 ~ 0,
    PHQ81_A == 2 ~ 1,
    PHQ81_A == 3 ~ 2,
    PHQ81_A == 4 ~ 3)) |>
  mutate(PHQ82_A = case_when(
    PHQ82_A == 1 ~ 0,
    PHQ82_A == 2 ~ 1,
    PHQ82_A == 3 ~ 2,
    PHQ82_A == 4 ~ 3)) |>
  mutate(PHQ83_A = case_when(
    PHQ83_A == 1 ~ 0,
    PHQ83_A == 2 ~ 1,
    PHQ83_A == 3 ~ 2,
    PHQ83_A == 4 ~ 3)) |>
  mutate(PHQ84_A = case_when(
    PHQ84_A == 1 ~ 0,
    PHQ84_A == 2 ~ 1,
    PHQ84_A == 3 ~ 2,
    PHQ84_A == 4 ~ 3)) |>
  mutate(PHQ85_A = case_when(
    PHQ85_A == 1 ~ 0,
    PHQ85_A == 2 ~ 1,
    PHQ85_A == 3 ~ 2,
    PHQ85_A == 4 ~ 3)) |>
  mutate(PHQ86_A = case_when(
    PHQ86_A == 1 ~ 0,
    PHQ86_A == 2 ~ 1,
    PHQ86_A == 3 ~ 2,
    PHQ86_A == 4 ~ 3)) |>
  mutate(PHQ87_A = case_when(
    PHQ87_A == 1 ~ 0,
    PHQ87_A == 2 ~ 1,
    PHQ87_A == 3 ~ 2,
    PHQ87_A == 4 ~ 3)) |>
  mutate(PHQ88_A = case_when(
    PHQ88_A == 1 ~ 0,
    PHQ88_A == 2 ~ 1,
    PHQ88_A == 3 ~ 2,
    PHQ88_A == 4 ~ 3)) |>
  mutate(phq8total = PHQ81_A + PHQ82_A + PHQ83_A + PHQ84_A + 
           PHQ85_A + PHQ86_A + PHQ87_A + PHQ88_A)

favstats(phq8total ~ dep_severity, data = projectb_clean) |> 
  flextable() |>
  colformat_double(digits = 2) |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

dep_severity

min

Q1

median

Q3

max

mean

sd

n

missing

None/Minimal

0.00

0.00

1.00

3.00

4.00

1.46

1.42

4,163

34

Mild

5.00

5.00

6.00

8.00

9.00

6.66

1.40

1,585

17

Moderate

10.00

10.00

11.00

13.00

14.00

11.68

1.39

749

13

Severe

15.00

16.00

18.00

20.00

24.00

18.32

2.80

566

9

3.2.6 Integrative Health and Medicine (IHM) Variables

In the following code, I performed the following steps to create the 5-level count outcome, ihm4pain_num, and the binary variable, ihm4pain_bin.

  • First, I used the mutate() function to create ihm4pain_num as the sum of engaging in chiro, yoga_taichi, massage, or meditation to manage pain in the past 3 months.
  • Then, I created the binary outcome, ihm4pain_bin, which was equal to 1 if ihm4pain_num >= 1 and equal to 0 if ihm4pain_num == 0.
Code
projectb_clean <- projectb_clean |>
  mutate(ihm4pain_num = chiro + yoga_taichi + massage + meditation,
         ihm4pain_bin = case_when(ihm4pain_num >= 1 ~ 1,
                                  ihm4pain_num == 0 ~ 0))

3.2.7 Opioid frequency

To create a single variable describing frequency of opioid use in the past 3 months to manage chronic pain, I performed the following steps:

  • First, since 1) the opioid frequency question was only asked to subjects who were taking prescription medications and opioid medications and 2) subjects responding “No” to the prescription medication and opioid use questions would be assumed to have never taken opioids in the past 3 months, I coded opioid_freq3mo_final as equal to “Never” if the subject answered “No” to anymeds12mo, opioids12mo, opioids3mo, or opioids_chonpain3mo.
  • Next, if subjects responded “Some days”, “Most days”, “Every day” to the opioid_freq3mo question, I coded the opioid_freq3mo_final variable as equal to that response.
  • Finally, I coded opioid_freq3mo_final as a meaningfully ordered factor.
Code
projectb_clean <- projectb_clean |>
  mutate(opioid_freq3mo_final = case_when(
    anymeds12mo == 0|
      opioids12mo == 0|
      opioids3mo == 0|
      opioids_chonpain3mo == 0 ~ "Never",
    opioid_freq3mo %in% c("Some days", "Most days", "Every day") ~ 
      opioid_freq3mo),
  opioid_freq3mo_final = factor(opioid_freq3mo_final, 
                                levels = c("Never",
                                           "Some days",
                                           "Most days",
                                           "Every day")))

3.2.8 Number of Pain Locations

To create a single variable quantifying the total number of locations in which subjects reported being bothered at least “a little” by pain in the past three months, I performed the following steps:

  • I converted the 6 individual pain variables to binary variables such that 1 indicated being bothered at least “a little” (>= 2), and 0 meant “Not at all.”
  • I then calculated num_pain_locations as the sum of those binary variables, ignoring missingness using the rowSums() function.
  • Finally, I used mutate() and if_else() to mark num_pain_locations as missing if all pain locations were missing.
Code
projectb_clean <- projectb_clean |>
  mutate(back_pain = if_else(back_pain >= 2, 1, 0),
         upper_ext_pain = if_else(upper_ext_pain >= 2, 1, 0),
         lower_ext_pain = if_else(lower_ext_pain >= 2, 1, 0),
         headache_migraine = if_else(headache_migraine >= 2, 1, 0),
         abd_pelvic_pain = if_else(abd_pelvic_pain >= 2, 1, 0),
         tooth_jaw_pain = if_else(tooth_jaw_pain >= 2, 1, 0))

projectb_clean$num_pain_locations <- rowSums(projectb_clean[,c("back_pain",
                                             "upper_ext_pain",
                                             "lower_ext_pain",
                                             "headache_migraine",
                                             "abd_pelvic_pain",
                                             "tooth_jaw_pain")], na.rm=TRUE)

projectb_clean <- projectb_clean |>
  mutate(num_pain_locations = if_else(
    is.na(back_pain) & is.na(upper_ext_pain) &
      is.na(lower_ext_pain) & is.na(headache_migraine) &
      is.na(abd_pelvic_pain) & is.na(tooth_jaw_pain), NA, num_pain_locations))

3.2.9 Chronic Pain Treatment Category - SUPPORT

In the following code, I performed the following steps to create the 4-category outcome, pain_tx_type.

  • First, I used the mutate() function to create support4pain_num as the sum of engaging in pt_rehab_ot, talk_therapy, chron_pain_selfmgmt, or peer_support_grp to manage pain in the past 3 months. In this case, SUPPORT stands for SUpportive, Physical, Psychological, Occupational, or Rehabilitative Therapies.
  • Next, I created the binary outcome, support4pain_bin, which was equal to 1 if support4pain_num >= 1 and equal to 0 if support4pain_num == 0.
  • Then, I created the outcome variable pain_tx_type as equal to None, IHM only, SUPPORT only, or SUPPORT and IHM based on the the support4pain_num and ihm4pain_bin variables.
  • Finally, I ensured pain_tx_type was coded as a factor using the factor function, with “None” as the reference category.
Code
projectb_clean <- projectb_clean |>
  mutate(support4pain_num = pt_rehab_ot + talk_therapy + 
           chron_pain_selfmgmt + peer_support_grp,
         support4pain_bin = case_when(support4pain_num >= 1 ~ 1,
                                  support4pain_num == 0 ~ 0),
         pain_tx_type = case_when(
           ihm4pain_bin == 0 & support4pain_bin == 0 ~ "None",
           ihm4pain_bin == 1 & support4pain_bin == 0 ~ "IHM only",
           ihm4pain_bin == 0 & support4pain_bin == 1 ~ "SUPPORT only",
           ihm4pain_bin == 1 & support4pain_bin == 1 ~ "SUPPORT and IHM"),
         pain_tx_type = factor(pain_tx_type, levels = c("None", 
                                                        "IHM only", 
                                                        "SUPPORT only", 
                                                        "SUPPORT and IHM")))

3.2.10 Chronic Pain Treatment Category - Opioids

Code
projectb_clean <- projectb_clean |>
  mutate(nonpharm_opioid = case_when(
    opioid_freq3mo_final == "Never" &
      ihm4pain_bin == 0 & support4pain_bin == 0 ~ "None",
    opioid_freq3mo_final %in% c("Some days",
                                "Most days",
                                "Every day") &
      ihm4pain_bin == 0 & support4pain_bin == 0 ~ "Opioids only",
    opioid_freq3mo_final == "Never" &
      (ihm4pain_bin == 1|support4pain_bin == 1) ~ "Nonpharm only",
    opioid_freq3mo_final %in% c("Some days",
                                "Most days",
                                "Every day") &
      (ihm4pain_bin == 1|support4pain_bin == 1) ~ "Opioids + Nonpharm"),
    nonpharm_opioid = factor(nonpharm_opioid, levels = c("None", 
                                                     "Opioids only", 
                                                     "Nonpharm only", 
                                                     "Opioids + Nonpharm")))


projectb_clean |> tabyl(nonpharm_opioid) |>
  adorn_pct_formatting() |>
  flextable() |>
  colformat_double(digits = 0) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

nonpharm_opioid

n

percent

valid_percent

None

3,201

44.6%

45.0%

Opioids only

510

7.1%

7.2%

Nonpharm only

2,791

38.9%

39.2%

Opioids + Nonpharm

615

8.6%

8.6%

67

0.9%

-

3.2.11 IHM Category

Code
projectb_clean <- projectb_clean |>
  mutate(manualtx4pain = case_when(
    chiro + massage >= 1 ~ 1,
    chiro + massage == 0 ~ 0),
    mindbodytx4pain = case_when(
      yoga_taichi + meditation >= 1 ~ 1,
      yoga_taichi + meditation  == 0 ~ 0),
    ihm4pain_cat = case_when(
      manualtx4pain + mindbodytx4pain == 0 ~ "None",
      manualtx4pain == 1 & mindbodytx4pain == 0 ~ "Manual only",
      manualtx4pain == 0 & mindbodytx4pain == 1 ~ "Mind-body only",
      manualtx4pain == 1 & mindbodytx4pain == 1 ~ "Manual and mind-body"),
    ihm4pain_cat = factor(ihm4pain_cat, levels = c("None", 
                                                   "Manual only", 
                                                   "Mind-body only", 
                                                   "Manual and mind-body")))
projectb_clean |> tabyl(ihm4pain_cat) |>
  adorn_pct_formatting() |>
  flextable() |>
  colformat_double(digits = 0) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

ihm4pain_cat

n

percent

valid_percent

None

4,587

63.9%

64.0%

Manual only

1,125

15.7%

15.7%

Mind-body only

818

11.4%

11.4%

Manual and mind-body

633

8.8%

8.8%

21

0.3%

-

3.2.12 Selecting Variables I Used and Arranging the Tibble

With all variables recoded, I created the projectb_data tibble as a subset of projectb_clean, limited to the variables needed for analysis and subjects with complete data on the outcomes, ihm4pain_num and nonpharm_opioid. I also renamed opioid_freq3mo_final as opioid_freq3mo.

Code
projectb_data <- projectb_clean |>
  select(subject_id, age, sex, education, urban_rural,
         painlimit_lifework, opioid_freq3mo = opioid_freq3mo_final,
         gad7total, phq8total, num_pain_locations,
         ihm4pain_num, race_ethnicity, fam_income, 
         nonpharm_opioid, ihm4pain_bin, ihm4pain_cat, wt_final) |>
  filter(!is.na(ihm4pain_num), !is.na(nonpharm_opioid))

3.2.13 Working with Categorical Predictors

To examine distribution of categorical predictors, I used the tbl_summary() function to summarize the factor variables in projectb_data.

Code
projectb_data |> select(where(~ is.factor(.x))) |>
  tbl_summary() |>
  modify_header(label ~ "**Variable**") |>
  bold_labels()
Variable N = 7,1141
sex
    Female 4,050 (57%)
    Male 3,063 (43%)
    Unknown 1
education
    Less than high school 870 (12%)
    High school graduate/GED 2,057 (29%)
    Some college, no degree 1,325 (19%)
    Associate's or bachelor's degree 2,193 (31%)
    Master's, doctoral, or professional degree 628 (8.9%)
    Unknown 41
urban_rural
    Nonmetropolitan 1,491 (21%)
    Medium and small metro 2,500 (35%)
    Large fringe metro 1,456 (20%)
    Large central metro 1,667 (23%)
painlimit_lifework
    Never 1,999 (28%)
    Some days 2,480 (35%)
    Most days 1,088 (15%)
    Every day 1,543 (22%)
    Unknown 4
opioid_freq3mo
    Never 5,991 (84%)
    Some days 342 (4.8%)
    Most days 133 (1.9%)
    Every day 648 (9.1%)
race_ethnicity
    NH White only 5,421 (76%)
    Hispanic 605 (8.5%)
    NH Black/African American only 743 (10%)
    NH Asian only 113 (1.6%)
    NH Amer Indian Alaska Natv 168 (2.4%)
    Other single and multiple races 64 (0.9%)
nonpharm_opioid
    None 3,201 (45%)
    Opioids only 510 (7.2%)
    Nonpharm only 2,790 (39%)
    Opioids + Nonpharm 613 (8.6%)
ihm4pain_cat
    None 4,554 (64%)
    Manual only 1,114 (16%)
    Mind-body only 815 (11%)
    Manual and mind-body 631 (8.9%)
1 n (%)

In examining the table above, I decided that the variable race_ethnicity required some collapsing of categories to ensure all categories contained at least 25 subjects in the testing and training samples. Therefore, I combined the race_ethnicity categories “NH Asian only”, “NH Amer Indian Alaska Natv”, and “Other single and multiple races” into “Other single and multiple races.”

Code
projectb_data <- projectb_data |>
  mutate(race_ethnicity = case_when(
    race_ethnicity == "NH White only" ~ "NH White only",
    race_ethnicity == "Hispanic" ~ "Hispanic",
    race_ethnicity == "NH Black/African American only" ~ 
      "NH Black/African American only",
    race_ethnicity == "NH Asian only" ~ "Other single and multiple races",
    race_ethnicity == "NH Amer Indian Alaska Natv" ~ 
      "Other single and multiple races",
    race_ethnicity == "Other single and multiple races" ~ 
      "Other single and multiple races")) |>
  mutate(race_ethnicity = factor(race_ethnicity, 
                                 levels = c("NH White only", 
                                            "Hispanic", 
                                            "NH Black/African American only", 
                                            "Other single and multiple races")))

4 Code Book and Description

4.1 Defining the Variables

7,114 adults ages 18-85 with pain on most days or every day in the past 3 months and participating in the 2019 NHIS with data on the variables listed in the table below.

For each of my 17 selected variables, I have included the original variable name from the raw NHIS file.

Variable Role Type Description (units/levels) Original NHIS Variable Name
subject_id identifier - Randomly assigned household number unique to a household (H000074-H064180) HHX
age input quant Age of subject AGEP_A
sex input 2-cat Sex (Male or Female) SEX_A
education input 5-cat Education level (Less than high school; High school graduate/GED; Some college, no degree; Associate’s or bachelor’s degree; Master’s, doctoral, or professional degree) EDUC_A
urban_rural input 4-cat Urban-rural classification scheme for counties (Large central metro, Large fringe metro, Medium and small metro, Non-metropolitan) URBRRL
painlimit_lifework input 4-cat How often pain limited the subject’s life or work activities in the past 3 months (Never, Some days, Most days, or Every day) PAIWKLM3M_A
opioid_freq3mo input 4-cat How often the subject took a prescription opioid for chronic pain in the past three months (Never, Some days, Most days, or Every day) RX12M_A, OPD12M_A, OPD3M_A, OPDCHRONIC_A, OPDFREQ_A
gad7total input quant Subject’s total score on the GAD-7 anxiety scale (scale units 0 - 21) SUM(GAD71_A, GAD72_A, GAD73_A, GAD74_A, GAD75_A, GAD76_A, GAD77_A)
phq8total input quant Subject’s total score on the PHQ-8 depression scale (scale units 0 - 24) SUM(PHQ81_A, PHQ82_A, PHQ83_A, PHQ84_A, PHQ85_A, PHQ86_A, PHQ87_A, PHQ88_A)
num_pain_locations input quant Total number of locations in which subject reported being bothered at least “a little” by pain in the past three months (0 - 6) PAIBACK3M_A, PAIULMB3M_A, PAILLMB3M_A, PAIHDFC3M_A, PAIAPG3M_A, PAITOOTH3M_A
ihm4pain_num outcome quant How many integrative health and medicine (IHM) modalities (i.e., chiropractic care, yoga/Tai Chi, meditation/guided imagery, or massage) subject used to manage pain in the past three months (0, 1, 2, 3, or 4) PAICHIRO_A, PAIYOGA_A, PAIMASSAGE_A, PAIMEDITAT_A
race_ethnicity input 4-cat Race and Hispanic origin (NH White only, Hispanic, NH Black/African American only, or Other single and multiple races) HISPALLP_A
fam_income input quant Adult annual family income FAMINCTC_A
nonpharm_opioid outcome 4-cat Engagement in one of the following categories of care for pain in the past three months (None, Opioids only, Nonpharm only, Opioids + Nonpharm). PAICHIRO_A, PAIYOGA_A, PAIMASSAGE_A, PAIMEDITAT_A, PAIPHYSTPY_A, PAITALKTPY_A, PAIPROGRAM_A, PAIGROUP_A, RX12M_A, OPD12M_A, OPD3M_A, OPDCHRONIC_A, OPDFREQ_A
ihm4pain_bin outcome quant Whether or not subject engaged in integrative health and medicine (IHM) modalities (i.e., chiropractic care, yoga/Tai Chi, meditation/guided imagery, or massage) to manage pain in the past three months (0, 1, 2, 3, or 4) PAICHIRO_A, PAIYOGA_A, PAIMASSAGE_A, PAIMEDITAT_A
ihm4pain_cat outcome quant Which type of (IHM) modalities (i.e., chiropractic care, yoga/Tai Chi, meditation/guided imagery, or massage) subject used to manage pain in the past three months (None, Manual only, Mind-body only, Manual and mind-body) PAICHIRO_A, PAIYOGA_A, PAIMASSAGE_A, PAIMEDITAT_A
wt_final weight quant Weight - Final Annual WTFA_A

4.2 Numerical Description

The following code provides a numerical summary of all variables included in the projectb_data dataset using the describe() function from the Hmisc() library.

In examining the data, I confirm that:

  • There are no impossible values for age (i.e., > 85 or < 18), fam_income, or phq8total (i.e., > 24 or < 0).
  • Each factor variable is in an appropriate ascending order.
  • All categorical and binary variables have at least 30 observations in each category.
Code
describe(projectb_data)
projectb_data 

 17  Variables      7114  Observations
--------------------------------------------------------------------------------
subject_id 
       n  missing distinct 
    7114        0     7114 

lowest : H000003 H000005 H000010 H000023 H000040
highest: H064153 H064154 H064160 H064165 H064180
--------------------------------------------------------------------------------
age 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
    7107        7       68        1    58.68    18.56       29       35 
     .25      .50      .75      .90      .95 
      48       60       71       80       85 

lowest : 18 19 20 21 22, highest: 81 82 83 84 85
--------------------------------------------------------------------------------
sex 
       n  missing distinct 
    7113        1        2 
                        
Value      Female   Male
Frequency    4050   3063
Proportion  0.569  0.431
--------------------------------------------------------------------------------
education 
       n  missing distinct 
    7073       41        5 

lowest : Less than high school                      High school graduate/GED                   Some college, no degree                    Associate's or bachelor's degree           Master's, doctoral, or professional degree
highest: Less than high school                      High school graduate/GED                   Some college, no degree                    Associate's or bachelor's degree           Master's, doctoral, or professional degree
--------------------------------------------------------------------------------
urban_rural 
       n  missing distinct 
    7114        0        4 
                                                                               
Value             Nonmetropolitan Medium and small metro     Large fringe metro
Frequency                    1491                   2500                   1456
Proportion                  0.210                  0.351                  0.205
                                 
Value         Large central metro
Frequency                    1667
Proportion                  0.234
--------------------------------------------------------------------------------
painlimit_lifework 
       n  missing distinct 
    7110        4        4 
                                                  
Value          Never Some days Most days Every day
Frequency       1999      2480      1088      1543
Proportion     0.281     0.349     0.153     0.217
--------------------------------------------------------------------------------
opioid_freq3mo 
       n  missing distinct 
    7114        0        4 
                                                  
Value          Never Some days Most days Every day
Frequency       5991       342       133       648
Proportion     0.842     0.048     0.019     0.091
--------------------------------------------------------------------------------
gad7total 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
    7033       81       22    0.946    3.977    5.245        0        0 
     .25      .50      .75      .90      .95 
       0        2        6       12       16 

lowest :  0  1  2  3  4, highest: 17 18 19 20 21
--------------------------------------------------------------------------------
phq8total 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
    7007      107       25    0.985    5.058    5.674        0        0 
     .25      .50      .75      .90      .95 
       1        3        8       13       17 

lowest :  0  1  2  3  4, highest: 20 21 22 23 24
--------------------------------------------------------------------------------
num_pain_locations 
       n  missing distinct     Info     Mean      Gmd 
    7106        8        7    0.954    3.071    1.514 
                                                    
Value          0     1     2     3     4     5     6
Frequency     82   850  1554  2034  1469   819   298
Proportion 0.012 0.120 0.219 0.286 0.207 0.115 0.042

For the frequency table, variable is rounded to the nearest 0
--------------------------------------------------------------------------------
ihm4pain_num 
       n  missing distinct     Info     Mean      Gmd 
    7114        0        5    0.725    0.545   0.7913 
                                        
Value          0     1     2     3     4
Frequency   4554  1602   667   223    68
Proportion 0.640 0.225 0.094 0.031 0.010

For the frequency table, variable is rounded to the nearest 0
--------------------------------------------------------------------------------
race_ethnicity 
       n  missing distinct 
    7114        0        4 
                                                                          
Value                        NH White only                        Hispanic
Frequency                             5421                             605
Proportion                           0.762                           0.085
                                                                          
Value       NH Black/African American only Other single and multiple races
Frequency                              743                             345
Proportion                           0.104                           0.048
--------------------------------------------------------------------------------
fam_income 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
    7114        0     1225        1    57217    51257     8741    11000 
     .25      .50      .75      .90      .95 
   20110    40434    79000   125000   160000 

lowest :      0      1      4      5     10, highest: 209500 210000 215341 218000 220000
--------------------------------------------------------------------------------
nonpharm_opioid 
       n  missing distinct 
    7114        0        4 
                                                                   
Value                    None       Opioids only      Nonpharm only
Frequency                3201                510               2790
Proportion              0.450              0.072              0.392
                             
Value      Opioids + Nonpharm
Frequency                 613
Proportion              0.086
--------------------------------------------------------------------------------
ihm4pain_bin 
       n  missing distinct     Info      Sum     Mean      Gmd 
    7114        0        2    0.691     2560   0.3599   0.4608 

--------------------------------------------------------------------------------
ihm4pain_cat 
       n  missing distinct 
    7114        0        4 
                                                                         
Value                      None          Manual only       Mind-body only
Frequency                  4554                 1114                  815
Proportion                0.640                0.157                0.115
                               
Value      Manual and mind-body
Frequency                   631
Proportion                0.089
--------------------------------------------------------------------------------
wt_final 
       n  missing distinct     Info     Mean      Gmd      .05      .10 
    7114        0     7072        1     6986     4729     2341     2761 
     .25      .50      .75      .90      .95 
    3692     5837     8509    12623    16424 

lowest : 733.485 738.112 762.185 763.376 822.349
highest: 45380.6 48013.1 49073.1 51662.9 78079.9
--------------------------------------------------------------------------------

5 Single Imputation

5.1 Missingness

For this analysis, I assumed that the data were Missing at Random (MAR). Specifically, I assumed that the probability of a predictor missing data depended upon available information from the other predictors. For example, missing phq8total data may be related to other variables such as how much pain limited the subject’s life or work activities (painlimit_lifework) or race_ethnicity rather than their level of depression. In this situation, certain subjects may have been less likely to rate their anxiety based on their other characteristics. Thus, single imputation was warranted and required as missingness could be accounted for by studying variables with complete information. Missing NOT at Random (MNAR) was not an appropriate assumption here as there was no indication in the data or the 2019 NHIS Codebook that data were missing due to the variables themselves.

5.1.1 Missing Values

To explore missingness within the projectb_data, I first used the n_miss() to quantify total missing values.

Code
projectb_data |> n_miss()
[1] 249

5.1.2 Missing Values Within Variables

I then used the miss_var_summary() to determine which variables contained those 249 missing values. These variables included phq8total, gad7total, education, num_pain_locations, age, painlimit_lifework, and sex. Given the relative infrequency of missing data (highest = 1.50%), I expected that performing imputation would not alter subsequent models in any meaningful way.

Code
miss_var_summary(projectb_data) |>   flextable() |>
  colformat_double(digits = 2) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

variable

n_miss

pct_miss

phq8total

107

1.50

gad7total

81

1.14

education

41

0.58

num_pain_locations

8

0.11

age

7

0.10

painlimit_lifework

4

0.06

sex

1

0.01

subject_id

0

0.00

urban_rural

0

0.00

opioid_freq3mo

0

0.00

ihm4pain_num

0

0.00

race_ethnicity

0

0.00

fam_income

0

0.00

nonpharm_opioid

0

0.00

ihm4pain_bin

0

0.00

ihm4pain_cat

0

0.00

wt_final

0

0.00

5.1.3 Missing Values within Cases

Finally, to understand missing values within cases, I used the miss_case_table() function, which revealed that 159 cases were missing 1 value, 42 cases were missing 2 values, and 2 cases were missing 3 values

Code
miss_case_table(projectb_data) |> flextable() |>
  colformat_double(digits = 0) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

n_miss_in_case

n_cases

pct_cases

0

6,911

97

1

159

2

2

42

1

3

2

0

5.2 Generating the Single Imputation Dataset

To perform single imputation and generate the projectb_si dataset, I performed the following steps:

  • Set a seed of 19900317 using the set.seed() function
  • Created projectb_si as a data.frame() derived from projectb_data
  • Used robust linear models to impute the quantitative variable gad7total based on the predictors with no missing values: urban_rural, race_ethnicity, fam_income and opioid_freq3mo,
  • Used robust linear models to impute the quantitative variable phq8total based on gad7total and the predictors with no missing values: urban_rural, race_ethnicity, fam_income, and opioid_freq3mo.
  • Used classification and regression trees (impute_cart()) to impute the categorical variable education based on the now complete variable, phq8total, and the other complete predictors.
  • Used predictive mean matching (impute_pmm()) to impute the quantitative integer variable num_pain_locations based on the now complete variable, education, and the other complete predictors as I wanted the imputation to be an integer.
  • Used robust linear models to impute the quantitative variable age based on the now complete variable, num_pain_locations, and the other complete predictors.
  • Used classification and regression trees to impute the categorical variable painlimit_lifework based on the now complete variable, age, and the other complete predictors.
  • Used classification and regression trees to impute the categorical variable sex based on the now complete variable, painlimit_lifework, and the other complete predictors.
  • Used the as_tibble() function to ensure projectb_si was stored as a tibble.
  • I performed mutations to use the variable one in the survey design and transform covariates so that coefficients would represent meaningful increases in (1) age10 [10 unit increment increase in age]; (2) gad7total5unit [5-unit increases in GAD-7]; (3) phq8total5unit [5-unit increases in PHQ-8]; (4) fam_income10k [10,000 dollar increase in family income]; (5) fam_income10k [1,000 dollar increase in family income].
  • I created a meaningful factor for the ihm4pain_bin_fact variable (i.e., IHM vs. No IHM)
  • Used the n_miss() function to verify that projectb_si contained no missing values.
Code
set.seed(19900317)
projectb_si <- projectb_data |> data.frame() |>
  impute_rlm(gad7total ~ urban_rural + race_ethnicity +
               fam_income + opioid_freq3mo) |>
  impute_rlm(phq8total ~ gad7total + urban_rural + race_ethnicity +
               fam_income + opioid_freq3mo) |>
  impute_cart(education ~ phq8total + gad7total + urban_rural + race_ethnicity +
               fam_income + opioid_freq3mo) |>
  impute_pmm(num_pain_locations ~ phq8total + opioid_freq3mo + education + gad7total + 
               urban_rural + race_ethnicity + fam_income) |>
  impute_rlm(age ~ phq8total + num_pain_locations + opioid_freq3mo + education + 
               gad7total + urban_rural + race_ethnicity + 
               fam_income) |>
  impute_cart(painlimit_lifework ~ phq8total + age + num_pain_locations + 
                opioid_freq3mo + education + gad7total + 
                urban_rural + race_ethnicity + fam_income) |>
  impute_cart(sex ~ phq8total + painlimit_lifework + age + num_pain_locations + 
                opioid_freq3mo + education + gad7total + 
                urban_rural + race_ethnicity + fam_income) |>
  as_tibble() |>
  mutate(one = 1,
         age10 = age/10,
         gad7total5unit = gad7total/5,
         phq8total5unit = phq8total/5,
         fam_income10k = round(fam_income/10000, digits = 2),
         fam_income1k = round(fam_income/1000, digits = 2),
         ihm4pain_bin_fact = factor(case_when(ihm4pain_bin == 0 ~ "No IHM",
                                              ihm4pain_bin == 1 ~ "IHM")))
n_miss(projectb_si)
[1] 0

6 Model 1: Engagement in IHM

For Model 1, I sought to answer the question:

Which demographic, mental health, and pain-related variables are associated with engagement in the number of IHM modalities used for pain in the past 3 months among adults with chronic pain?

In this analysis, I chose to predict ihm4pain_num based on 10 variables: age10, sex, education, urban_rural, painlimit_lifework, opioid_freq3mo, phq8total5unit, num_pain_locations, race_ethnicity, and fam_income10k.

6.1 Table 1: Bivariate Analysis

The following code details the steps taken to generate Table 1.

6.1.1 Create Survey Design

Code
nhis_des <- svydesign(data = projectb_si, id = ~ subject_id,
                        weights = ~ wt_final, nest = TRUE)

6.1.2 Total population represented

Code
total_n <- svytotal(~ one, nhis_des, na.rm = FALSE) |> as_tibble() |>
  mutate(Variable = "Total, n",
         total = floor(total),
         total = formatC(total, format = "d", big.mark = ","),
         p.value = "") |>
  select(Variable, All = total, p.value)
Code
ihm_n <- svyby(~ one, ~ ihm4pain_bin, nhis_des, svytotal) |> as_tibble() |> filter(ihm4pain_bin == 1) |> select(IHM = one) |>
  mutate(IHM = round(IHM),
         IHM = formatC(IHM, format = "d", big.mark = ","))
no_ihm_n <- svyby(~ one, ~ ihm4pain_bin, nhis_des, svytotal) |> as_tibble() |> filter(ihm4pain_bin == 0) |> select(NoIHM = one) |>
  mutate(NoIHM = round(NoIHM),
         NoIHM = formatC(NoIHM, format = "d", big.mark = ","))
total_n <- bind_cols(total_n, ihm_n, no_ihm_n) |>
  mutate(All = as.character(All),
         IHM = as.character(IHM),
         NoIHM = as.character(NoIHM))
Code
total_n <- total_n |> select(Variable, All, IHM, NoIHM, p.value) 

total_n |>
  flextable() |>
  autofit() |>
  bold(bold = TRUE, part = "header") |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Variable

All

IHM

NoIHM

p.value

Total, n

49,695,212

17,610,400

32,084,812

6.1.3 Crosstabulation

Code
style_percent_1digit <- purrr::partial(gtsummary::style_percent, digits = 1)
style_number_1digit <- purrr::partial(gtsummary::style_number, digits = 1)

table1_cross <- nhis_des |>
  tbl_svysummary(
    include = c(age, sex, race_ethnicity, education, 
                fam_income1k, urban_rural, num_pain_locations,
                painlimit_lifework, opioid_freq3mo, phq8total),
    type = c(num_pain_locations = "continuous",
             fam_income = "continuous"),
    statistic = list(all_continuous()  ~ "{mean}",
                     all_categorical() ~ "{p}"),
    label = list(age ~ "Age in years, mean (95% CI)",
                 sex ~ "Sex, % (95% CI)",
                 race_ethnicity ~ "Race/Ethnicity, % (95% CI)",
                 education ~ "Education, % (95% CI)",
                 fam_income1k ~ "Family income in $1000s, mean (95% CI)",
                 urban_rural ~ "Proximity to metro area, % (95% CI)",
                 num_pain_locations ~ "Number of pain locations, mean (95% CI)",
                 painlimit_lifework ~ "Pain limiting life/work, % (95% CI)",
                 opioid_freq3mo ~ "Opioid use in past 3 months, % (95% CI)",
                 phq8total ~ "PHQ-8 total, mean (95% CI)"),
    digits = list(all_continuous() ~ 1,
                  all_categorical() ~ 1),
    by = ihm4pain_bin_fact) |>
  add_overall() |>
  add_ci(pattern = "{stat} ({ci})", 
         style_fun = list(all_categorical() ~ style_percent_1digit, 
                          all_continuous() ~ style_number_1digit),
         statistic = list(all_categorical() ~ "{conf.low}-{conf.high}", 
                          all_continuous() ~ "{conf.low}-{conf.high}")) |>
  add_p(pvalue_fun = function(x) style_pvalue(x, digits = 3)) |>
  modify_header(label = "Variable",
                stat_0 ~ "All",
                stat_1 ~ "IHM",
                stat_2 ~ "NoIHM",
                p.value ~ "p.value")

table1_cross
Variable All1,2 IHM1,2 NoIHM1,2 p.value3
Age in years, mean (95% CI) 55.4 (54.9-55.9) 51.8 (51.0-52.6) 57.3 (56.7-57.9) <0.001
Sex, % (95% CI)


<0.001
    Female 54.9 (53.5-56.3) 60.3 (57.9-62.6) 52.0 (50.2-53.7)
    Male 45.1 (43.7-46.5) 39.7 (37.4-42.1) 48.0 (46.3-49.8)
Race/Ethnicity, % (95% CI)


0.005
    NH White only 73.2 (71.9-74.5) 74.8 (72.7-76.8) 72.3 (70.6-73.9)
    Hispanic 10.5 (9.59-11.5) 10.1 (8.79-11.6) 10.7 (9.52-12.0)
    NH Black/African American only 11.0 (10.1-11.9) 9.0 (7.73-10.5) 12.1 (11.0-13.3)
    Other single and multiple races 5.3 (4.66-6.03) 6.1 (5.01-7.33) 4.9 (4.11-5.81)
Education, % (95% CI)


<0.001
    Less than high school 15.5 (14.4-16.7) 8.8 (7.45-10.4) 19.2 (17.7-20.7)
    High school graduate/GED 30.4 (29.1-31.7) 25.2 (23.1-27.5) 33.2 (31.6-34.9)
    Some college, no degree 19.1 (18.0-20.2) 21.3 (19.4-23.4) 17.8 (16.5-19.2)
    Associate's or bachelor's degree 28.2 (27.0-29.4) 34.3 (32.2-36.6) 24.8 (23.4-26.3)
    Master's, doctoral, or professional degree 6.9 (6.31-7.53) 10.3 (9.11-11.6) 5.0 (4.42-5.71)
Family income in $1000s, mean (95% CI) 63.1 (61.6-64.6) 71.7 (69.0-74.3) 58.4 (56.7-60.2) <0.001
Proximity to metro area, % (95% CI)


<0.001
    Nonmetropolitan 19.9 (18.8-21.0) 16.1 (14.5-17.9) 22.0 (20.6-23.5)
    Medium and small metro 33.9 (32.6-35.3) 33.1 (30.9-35.3) 34.4 (32.8-36.1)
    Large fringe metro 21.7 (20.5-22.9) 23.6 (21.6-25.8) 20.6 (19.2-22.1)
    Large central metro 24.4 (23.2-25.7) 27.2 (25.1-29.4) 22.9 (21.4-24.5)
Number of pain locations, mean (95% CI) 3.1 (3.1-3.1) 3.4 (3.3-3.4) 3.0 (2.9-3.0) <0.001
Pain limiting life/work, % (95% CI)


<0.001
    Never 28.2 (27.0-29.5) 22.9 (21.1-24.9) 31.1 (29.5-32.8)
    Some days 35.6 (34.2-36.9) 38.6 (36.2-40.9) 33.9 (32.2-35.6)
    Most days 15.4 (14.4-16.5) 17.1 (15.2-19.0) 14.5 (13.3-15.8)
    Every day 20.8 (19.7-22.0) 21.5 (19.6-23.5) 20.5 (19.1-21.9)
Opioid use in past 3 months, % (95% CI)


0.026
    Never 84.8 (83.7-85.8) 85.5 (83.7-87.2) 84.4 (83.1-85.7)
    Some days 4.7 (4.12-5.38) 5.1 (3.97-6.45) 4.5 (3.86-5.27)
    Most days 1.8 (1.47-2.19) 2.2 (1.63-2.91) 1.6 (1.20-2.09)
    Every day 8.7 (7.91-9.53) 7.2 (6.12-8.49) 9.5 (8.47-10.6)
PHQ-8 total, mean (95% CI) 5.3 (5.1-5.4) 5.7 (5.4-6.0) 5.0 (4.8-5.2) <0.001
1 Mean; %
2 CI = Confidence Interval
3 Wilcoxon rank-sum test for complex survey samples; chi-squared test with Rao & Scott’s second-order correction

6.1.4 Assemble Table 1 Tibble

Code
table1_cross <- table1_cross |> as.data.frame() |> as_tibble()

table1_cross[is.na(table1_cross)] <- ""

table1_cross$p.value <- sub("0\\.",".", table1_cross$p.value)
table1_cross$p.value <- sub(" ","", table1_cross$p.value)

table1 <- bind_rows(total_n, table1_cross) |>
  rename("No IHM" = NoIHM,
         "All with chronic pain" = All,
         "Weighted variable" = Variable,
         "P value" = p.value)

6.2 Splitting into Training and Testing Samples

I then used the initial_split(), training(), and testing() functions to generate projectb_si_train (training data) and projectb_si_test (testing data), each with n = 3557 subjects.

Code
set.seed(19900317)

projectb_split <- initial_split(projectb_si, prop = 0.5)

projectb_si_train <- training(projectb_split)
projectb_si_test <- testing(projectb_split)

nrow(projectb_si_train)
[1] 3557
Code
nrow(projectb_si_test)
[1] 3557

6.3 Scatterplot Matrix and Collinearity

6.3.1 Scatterplot Matrix of Quantitative Predictors with Outcome

To investigate issues of collinearity prior to fitting models, I generated a series of scatterplot matrices using ggpairs(). First I assessed the relationship between the quantitative outcomes and ihm4pain_num.

Code
temp1quant <- projectb_si_train |> 
  select(age10, phq8total5unit, num_pain_locations, fam_income10k, ihm4pain_num)

ggpairs(temp1quant, 
        title = "Scatterplot Matrix of Quantitative Variables with Outcome",
        lower = list(combo = wrap("facethist", bins = 20)))

Code
rm(temp1quant)

In examining the plot above, there do not appear to be any strong (i.e., > 0.6) correlations between quantitative predictors and the outcome.

6.3.2 Scatterplot Matrix of Categorical Variables with Outcome

Then, I looked at the relationships between three of the categorical predictors and ihm4pain_num using ggpairs().

Code
temp2cat1 <- projectb_si_train |> 
  select(sex, education, urban_rural, ihm4pain_num)

ggpairs(temp2cat1, 
        title = "Scatterplot Matrix of Categorical Variables with Outcome",
        lower = list(combo = wrap("facethist", bins = 20)))

Code
rm(temp2cat1)

6.3.3 Scatterplot Matrix of Remaining Categorical Variables with Outcome

Finally, I looked at the relationships between the two remaining categorical predictors and ihm4pain_num using ggpairs().

Code
temp2cat2 <- projectb_si_train |> 
  select(painlimit_lifework, opioid_freq3mo, race_ethnicity, ihm4pain_num)

ggpairs(temp2cat2, 
        title = "Scatterplot Matrix of Categorical Variables with Outcome",
        lower = list(combo = wrap("facethist", bins = 20)))

Code
rm(temp2cat2)

To investigate this further, I calculated variance inflation factors (VIF) for each variable using the vif() function. Given the recommendation to look closely at any VIF value that is greater than 5, there does not seem to be an issue with collinearity in this full model as all VIF values are less than 1.4.

Code
vif(lm(ihm4pain_num ~ age10 + sex + education +
                     urban_rural + painlimit_lifework + opioid_freq3mo +
                     phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity, 
                   data = projectb_si_train))
                       GVIF Df GVIF^(1/(2*Df))
age10              1.141259  1        1.068297
sex                1.044327  1        1.021923
education          1.282077  4        1.031548
urban_rural        1.129965  3        1.020573
painlimit_lifework 1.294856  3        1.044007
opioid_freq3mo     1.093742  3        1.015046
phq8total5unit     1.396814  1        1.181869
num_pain_locations 1.242368  1        1.114616
fam_income10k      1.305455  1        1.142565
race_ethnicity     1.179024  3        1.027828

6.4 Considering Non-Linear Terms

To determine how best to potentially spend degrees of freedom on non-linear terms, I generated a Spearman \(\rho^2\) plot using the plot() and spearman2()functions.

Code
plot(spearman2(ihm4pain_num ~ age10 + sex + education +
                 urban_rural + painlimit_lifework + opioid_freq3mo +
                 phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity, 
               data = projectb_si_train))

The plot suggests that I consider, for example, an interaction between education and age10 as a non-linear term. However, given that 1) this model already contains 21 degrees of freedom, I opted not to include any non-linear terms.

6.5 Fitting Models for Comparison

I chose to compare the following 6 modeling approaches for m1 before choosing a final version.

  1. Poisson
  2. Negative binomial regression
  3. Zero-inflated Poisson
  4. Zero-inflated negative binomial
  5. Hurdle with Poisson
  6. Hurdle with negative binomial

In the following series of code chunks, I performed the following

  • Fit each model using the appropriate function
  • Generated a summary of fit statistics including \(R^2\), RMSE, MAE, AIC, and BIC using various functions including augment(), glance(), metric_set(), and predict().
  • Used functions including mutate(), group_by(), and summarize() to pivot the fit statistics from each model into a tidy tibble for combination later.

6.5.1 Poisson (m1_pois)

Code
m1_pois <- glm(ihm4pain_num ~ age10 + sex + education +
                 urban_rural + painlimit_lifework + opioid_freq3mo +
                 phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity, 
                 data = projectb_si_train, family = "poisson")

m1_pois_aug <- augment(m1_pois, projectb_si_train, type.predict = "response")
m1_pois_glance <- glance(m1_pois)

mets <- metric_set(rsq, rmse, mae)

m1_pois_summary <-
  mets(m1_pois_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_pois",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_pois_summary <- m1_pois_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_pois_summary <- bind_cols(m1_pois_summary, m1_pois_glance) |>
  select(-c(logLik))
summary(m1_pois)

Call:
glm(formula = ihm4pain_num ~ age10 + sex + education + urban_rural + 
    painlimit_lifework + opioid_freq3mo + phq8total5unit + num_pain_locations + 
    fam_income10k + race_ethnicity, family = "poisson", data = projectb_si_train)

Coefficients:
                                                     Estimate Std. Error
(Intercept)                                         -1.183452   0.162138
age10                                               -0.131602   0.014550
sexMale                                             -0.275074   0.047688
educationHigh school graduate/GED                    0.273482   0.105837
educationSome college, no degree                     0.656055   0.105739
educationAssociate's or bachelor's degree            0.812995   0.101206
educationMaster's, doctoral, or professional degree  0.960610   0.117198
urban_ruralMedium and small metro                    0.023538   0.068379
urban_ruralLarge fringe metro                        0.064347   0.075702
urban_ruralLarge central metro                       0.251440   0.071819
painlimit_lifeworkSome days                          0.205155   0.060383
painlimit_lifeworkMost days                          0.256226   0.074951
painlimit_lifeworkEvery day                          0.260852   0.075321
opioid_freq3moSome days                              0.036669   0.099685
opioid_freq3moMost days                              0.072801   0.166465
opioid_freq3moEvery day                             -0.182815   0.091255
phq8total5unit                                      -0.029353   0.024111
num_pain_locations                                   0.147973   0.018430
fam_income10k                                        0.027594   0.004528
race_ethnicityHispanic                              -0.008236   0.082486
race_ethnicityNH Black/African American only        -0.203382   0.084115
race_ethnicityOther single and multiple races       -0.180839   0.107077
                                                    z value Pr(>|z|)    
(Intercept)                                          -7.299 2.90e-13 ***
age10                                                -9.045  < 2e-16 ***
sexMale                                              -5.768 8.01e-09 ***
educationHigh school graduate/GED                     2.584 0.009766 ** 
educationSome college, no degree                      6.204 5.49e-10 ***
educationAssociate's or bachelor's degree             8.033 9.51e-16 ***
educationMaster's, doctoral, or professional degree   8.196 2.48e-16 ***
urban_ruralMedium and small metro                     0.344 0.730672    
urban_ruralLarge fringe metro                         0.850 0.395324    
urban_ruralLarge central metro                        3.501 0.000464 ***
painlimit_lifeworkSome days                           3.398 0.000680 ***
painlimit_lifeworkMost days                           3.419 0.000629 ***
painlimit_lifeworkEvery day                           3.463 0.000534 ***
opioid_freq3moSome days                               0.368 0.712985    
opioid_freq3moMost days                               0.437 0.661868    
opioid_freq3moEvery day                              -2.003 0.045140 *  
phq8total5unit                                       -1.217 0.223444    
num_pain_locations                                    8.029 9.82e-16 ***
fam_income10k                                         6.094 1.10e-09 ***
race_ethnicityHispanic                               -0.100 0.920464    
race_ethnicityNH Black/African American only         -2.418 0.015610 *  
race_ethnicityOther single and multiple races        -1.689 0.091246 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 4380.5  on 3556  degrees of freedom
Residual deviance: 3834.2  on 3535  degrees of freedom
AIC: 6841.7

Number of Fisher Scoring iterations: 6

6.5.2 Negative Binomial Regression (m1_nb)

Code
m1_nb <- glm.nb(ihm4pain_num ~ age10 + sex + education +
                 urban_rural + painlimit_lifework + opioid_freq3mo +
                 phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity, 
               data = projectb_si_train)

m1_nb_aug <- augment(m1_nb, projectb_si_train, type.predict = "response")
m1_nb_glance <- glance(m1_nb)

m1_nb_summary <-
  mets(m1_nb_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_nb",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_nb_summary <- m1_nb_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_nb_summary <- bind_cols(m1_nb_summary, m1_nb_glance) |>
  select(-c(logLik))
summary(m1_nb)

Call:
glm.nb(formula = ihm4pain_num ~ age10 + sex + education + urban_rural + 
    painlimit_lifework + opioid_freq3mo + phq8total5unit + num_pain_locations + 
    fam_income10k + race_ethnicity, data = projectb_si_train, 
    init.theta = 3.17461915, link = log)

Coefficients:
                                                      Estimate Std. Error
(Intercept)                                         -1.1668329  0.1757511
age10                                               -0.1340332  0.0160376
sexMale                                             -0.2773424  0.0523129
educationHigh school graduate/GED                    0.2712388  0.1116211
educationSome college, no degree                     0.6551017  0.1123832
educationAssociate's or bachelor's degree            0.8073798  0.1074271
educationMaster's, doctoral, or professional degree  0.9724037  0.1264575
urban_ruralMedium and small metro                    0.0128675  0.0743288
urban_ruralLarge fringe metro                        0.0597904  0.0826633
urban_ruralLarge central metro                       0.2508966  0.0786869
painlimit_lifeworkSome days                          0.2030962  0.0661423
painlimit_lifeworkMost days                          0.2544446  0.0825732
painlimit_lifeworkEvery day                          0.2508763  0.0825099
opioid_freq3moSome days                              0.0368888  0.1107997
opioid_freq3moMost days                              0.0825150  0.1823585
opioid_freq3moEvery day                             -0.1805531  0.0989803
phq8total5unit                                      -0.0297361  0.0266371
num_pain_locations                                   0.1490524  0.0203585
fam_income10k                                        0.0280486  0.0050948
race_ethnicityHispanic                               0.0009119  0.0910714
race_ethnicityNH Black/African American only        -0.2054487  0.0914048
race_ethnicityOther single and multiple races       -0.1778063  0.1179438
                                                    z value Pr(>|z|)    
(Intercept)                                          -6.639 3.16e-11 ***
age10                                                -8.357  < 2e-16 ***
sexMale                                              -5.302 1.15e-07 ***
educationHigh school graduate/GED                     2.430  0.01510 *  
educationSome college, no degree                      5.829 5.57e-09 ***
educationAssociate's or bachelor's degree             7.516 5.66e-14 ***
educationMaster's, doctoral, or professional degree   7.690 1.48e-14 ***
urban_ruralMedium and small metro                     0.173  0.86256    
urban_ruralLarge fringe metro                         0.723  0.46950    
urban_ruralLarge central metro                        3.189  0.00143 ** 
painlimit_lifeworkSome days                           3.071  0.00214 ** 
painlimit_lifeworkMost days                           3.081  0.00206 ** 
painlimit_lifeworkEvery day                           3.041  0.00236 ** 
opioid_freq3moSome days                               0.333  0.73919    
opioid_freq3moMost days                               0.452  0.65092    
opioid_freq3moEvery day                              -1.824  0.06813 .  
phq8total5unit                                       -1.116  0.26428    
num_pain_locations                                    7.321 2.45e-13 ***
fam_income10k                                         5.505 3.68e-08 ***
race_ethnicityHispanic                                0.010  0.99201    
race_ethnicityNH Black/African American only         -2.248  0.02460 *  
race_ethnicityOther single and multiple races        -1.508  0.13167    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(3.1746) family taken to be 1)

    Null deviance: 3741.6  on 3556  degrees of freedom
Residual deviance: 3278.2  on 3535  degrees of freedom
AIC: 6799.9

Number of Fisher Scoring iterations: 1

              Theta:  3.175 
          Std. Err.:  0.576 

 2 x log-likelihood:  -6753.854 

6.5.3 Zero-inflated Poisson (m1_zip)

Code
m1_zip <- zeroinfl(ihm4pain_num ~ age10 + sex + education +
                 urban_rural + painlimit_lifework + opioid_freq3mo +
                 phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity, 
                 data = projectb_si_train)

m1_zip_aug <- projectb_si_train |>
    mutate(".fitted" = predict(m1_zip, type = "response"),
           ".resid" = resid(m1_zip, type = "response"))

m1_zip_summary <-
  mets(m1_zip_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_zip",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_zip_summary <- m1_zip_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_zip_summary <- m1_zip_summary |>
  mutate(AIC = AIC(m1_zip),
         BIC = BIC(m1_zip))
summary(m1_zip)

Call:
zeroinfl(formula = ihm4pain_num ~ age10 + sex + education + urban_rural + 
    painlimit_lifework + opioid_freq3mo + phq8total5unit + num_pain_locations + 
    fam_income10k + race_ethnicity, data = projectb_si_train)

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.3529 -0.6426 -0.4657  0.5057  7.1811 

Count model coefficients (poisson with log link):
                                                     Estimate Std. Error
(Intercept)                                         -0.415196   0.261037
age10                                               -0.135764   0.024135
sexMale                                             -0.210056   0.077075
educationHigh school graduate/GED                    0.282405   0.167720
educationSome college, no degree                     0.635911   0.165162
educationAssociate's or bachelor's degree            0.619413   0.160655
educationMaster's, doctoral, or professional degree  0.761770   0.176987
urban_ruralMedium and small metro                    0.012747   0.111366
urban_ruralLarge fringe metro                        0.063042   0.119672
urban_ruralLarge central metro                       0.186333   0.115089
painlimit_lifeworkSome days                          0.204220   0.098190
painlimit_lifeworkMost days                          0.319729   0.112174
painlimit_lifeworkEvery day                          0.201525   0.119923
opioid_freq3moSome days                              0.040076   0.141439
opioid_freq3moMost days                              0.221426   0.248309
opioid_freq3moEvery day                             -0.075506   0.156667
phq8total5unit                                      -0.056037   0.038500
num_pain_locations                                   0.082879   0.026625
fam_income10k                                        0.017314   0.006237
race_ethnicityHispanic                              -0.359617   0.095201
race_ethnicityNH Black/African American only        -0.110367   0.145961
race_ethnicityOther single and multiple races       -0.217057   0.206899
                                                    z value Pr(>|z|)    
(Intercept)                                          -1.591 0.111707    
age10                                                -5.625 1.85e-08 ***
sexMale                                              -2.725 0.006423 ** 
educationHigh school graduate/GED                     1.684 0.092223 .  
educationSome college, no degree                      3.850 0.000118 ***
educationAssociate's or bachelor's degree             3.856 0.000115 ***
educationMaster's, doctoral, or professional degree   4.304 1.68e-05 ***
urban_ruralMedium and small metro                     0.114 0.908874    
urban_ruralLarge fringe metro                         0.527 0.598340    
urban_ruralLarge central metro                        1.619 0.105441    
painlimit_lifeworkSome days                           2.080 0.037541 *  
painlimit_lifeworkMost days                           2.850 0.004368 ** 
painlimit_lifeworkEvery day                           1.680 0.092869 .  
opioid_freq3moSome days                               0.283 0.776915    
opioid_freq3moMost days                               0.892 0.372536    
opioid_freq3moEvery day                              -0.482 0.629839    
phq8total5unit                                       -1.455 0.145531    
num_pain_locations                                    3.113 0.001853 ** 
fam_income10k                                         2.776 0.005501 ** 
race_ethnicityHispanic                               -3.777 0.000158 ***
race_ethnicityNH Black/African American only         -0.756 0.449562    
race_ethnicityOther single and multiple races        -1.049 0.294133    

Zero-inflation model coefficients (binomial with logit link):
                                                      Estimate Std. Error
(Intercept)                                          6.712e-01  7.979e-01
age10                                               -4.693e-03  7.769e-02
sexMale                                              2.826e-01  2.410e-01
educationHigh school graduate/GED                   -2.112e-02  4.783e-01
educationSome college, no degree                    -9.915e-02  4.802e-01
educationAssociate's or bachelor's degree           -7.535e-01  5.073e-01
educationMaster's, doctoral, or professional degree -8.290e-01  5.714e-01
urban_ruralMedium and small metro                   -1.014e-02  3.239e-01
urban_ruralLarge fringe metro                       -7.261e-04  3.504e-01
urban_ruralLarge central metro                      -2.918e-01  3.640e-01
painlimit_lifeworkSome days                          8.064e-03  3.156e-01
painlimit_lifeworkMost days                          2.778e-01  3.413e-01
painlimit_lifeworkEvery day                         -1.917e-01  3.966e-01
opioid_freq3moSome days                              1.341e-02  4.731e-01
opioid_freq3moMost days                              3.755e-01  6.382e-01
opioid_freq3moEvery day                              3.516e-01  4.513e-01
phq8total5unit                                      -1.027e-01  1.431e-01
num_pain_locations                                  -2.714e-01  8.580e-02
fam_income10k                                       -4.744e-02  2.318e-02
race_ethnicityHispanic                              -1.585e+01  1.817e+03
race_ethnicityNH Black/African American only         2.591e-01  3.857e-01
race_ethnicityOther single and multiple races       -1.622e-01  8.016e-01
                                                    z value Pr(>|z|)   
(Intercept)                                           0.841  0.40024   
age10                                                -0.060  0.95184   
sexMale                                               1.173  0.24089   
educationHigh school graduate/GED                    -0.044  0.96478   
educationSome college, no degree                     -0.206  0.83644   
educationAssociate's or bachelor's degree            -1.485  0.13749   
educationMaster's, doctoral, or professional degree  -1.451  0.14687   
urban_ruralMedium and small metro                    -0.031  0.97502   
urban_ruralLarge fringe metro                        -0.002  0.99835   
urban_ruralLarge central metro                       -0.802  0.42270   
painlimit_lifeworkSome days                           0.026  0.97961   
painlimit_lifeworkMost days                           0.814  0.41575   
painlimit_lifeworkEvery day                          -0.483  0.62889   
opioid_freq3moSome days                               0.028  0.97738   
opioid_freq3moMost days                               0.588  0.55624   
opioid_freq3moEvery day                               0.779  0.43597   
phq8total5unit                                       -0.718  0.47280   
num_pain_locations                                   -3.163  0.00156 **
fam_income10k                                        -2.046  0.04073 * 
race_ethnicityHispanic                               -0.009  0.99304   
race_ethnicityNH Black/African American only          0.672  0.50171   
race_ethnicityOther single and multiple races        -0.202  0.83960   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 86 
Log-likelihood: -3347 on 44 Df

6.5.4 Zero-inflated Negative Binomial (m1_zinb)

Code
m1_zinb <- zeroinfl(ihm4pain_num ~ age10 + sex + education +
                          urban_rural + painlimit_lifework + opioid_freq3mo +
                          phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity,
                        dist = "negbin", data = projectb_si_train)

m1_zinb_aug <- projectb_si_train |>
    mutate(".fitted" = predict(m1_zinb, type = "response"),
           ".resid" = resid(m1_zinb, type = "response"))

m1_zinb_summary <-
  mets(m1_zinb_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_zinb",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_zinb_summary <- m1_zinb_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_zinb_summary <- m1_zinb_summary |>
  mutate(AIC = AIC(m1_zinb),
         BIC = BIC(m1_zinb))
summary(m1_zinb)

Call:
zeroinfl(formula = ihm4pain_num ~ age10 + sex + education + urban_rural + 
    painlimit_lifework + opioid_freq3mo + phq8total5unit + num_pain_locations + 
    fam_income10k + race_ethnicity, data = projectb_si_train, dist = "negbin")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.3529 -0.6426 -0.4657  0.5057  7.1811 

Count model coefficients (negbin with log link):
                                                     Estimate Std. Error
(Intercept)                                         -0.415181   0.261036
age10                                               -0.135766   0.024135
sexMale                                             -0.210059   0.077075
educationHigh school graduate/GED                    0.282398   0.167720
educationSome college, no degree                     0.635914   0.165161
educationAssociate's or bachelor's degree            0.619416   0.160655
educationMaster's, doctoral, or professional degree  0.761771   0.176987
urban_ruralMedium and small metro                    0.012739   0.111366
urban_ruralLarge fringe metro                        0.063033   0.119671
urban_ruralLarge central metro                       0.186324   0.115088
painlimit_lifeworkSome days                          0.204214   0.098190
painlimit_lifeworkMost days                          0.319721   0.112174
painlimit_lifeworkEvery day                          0.201516   0.119922
opioid_freq3moSome days                              0.040082   0.141439
opioid_freq3moMost days                              0.221426   0.248310
opioid_freq3moEvery day                             -0.075485   0.156665
phq8total5unit                                      -0.056035   0.038500
num_pain_locations                                   0.082879   0.026625
fam_income10k                                        0.017315   0.006237
race_ethnicityHispanic                              -0.359612   0.095200
race_ethnicityNH Black/African American only        -0.110358   0.145960
race_ethnicityOther single and multiple races       -0.217029   0.206893
Log(theta)                                          15.442114  25.722300
                                                    z value Pr(>|z|)    
(Intercept)                                          -1.591 0.111719    
age10                                                -5.625 1.85e-08 ***
sexMale                                              -2.725 0.006422 ** 
educationHigh school graduate/GED                     1.684 0.092230 .  
educationSome college, no degree                      3.850 0.000118 ***
educationAssociate's or bachelor's degree             3.856 0.000115 ***
educationMaster's, doctoral, or professional degree   4.304 1.68e-05 ***
urban_ruralMedium and small metro                     0.114 0.908932    
urban_ruralLarge fringe metro                         0.527 0.598391    
urban_ruralLarge central metro                        1.619 0.105455    
painlimit_lifeworkSome days                           2.080 0.037545 *  
painlimit_lifeworkMost days                           2.850 0.004369 ** 
painlimit_lifeworkEvery day                           1.680 0.092881 .  
opioid_freq3moSome days                               0.283 0.776883    
opioid_freq3moMost days                               0.892 0.372537    
opioid_freq3moEvery day                              -0.482 0.629929    
phq8total5unit                                       -1.455 0.145539    
num_pain_locations                                    3.113 0.001853 ** 
fam_income10k                                         2.776 0.005497 ** 
race_ethnicityHispanic                               -3.777 0.000158 ***
race_ethnicityNH Black/African American only         -0.756 0.449599    
race_ethnicityOther single and multiple races        -1.049 0.294182    
Log(theta)                                            0.600 0.548280    

Zero-inflation model coefficients (binomial with logit link):
                                                      Estimate Std. Error
(Intercept)                                          6.712e-01  7.979e-01
age10                                               -4.698e-03  7.769e-02
sexMale                                              2.826e-01  2.410e-01
educationHigh school graduate/GED                   -2.113e-02  4.783e-01
educationSome college, no degree                    -9.913e-02  4.802e-01
educationAssociate's or bachelor's degree           -7.535e-01  5.073e-01
educationMaster's, doctoral, or professional degree -8.290e-01  5.714e-01
urban_ruralMedium and small metro                   -1.017e-02  3.239e-01
urban_ruralLarge fringe metro                       -7.511e-04  3.504e-01
urban_ruralLarge central metro                      -2.918e-01  3.640e-01
painlimit_lifeworkSome days                          8.048e-03  3.156e-01
painlimit_lifeworkMost days                          2.778e-01  3.413e-01
painlimit_lifeworkEvery day                         -1.917e-01  3.966e-01
opioid_freq3moSome days                              1.343e-02  4.731e-01
opioid_freq3moMost days                              3.755e-01  6.382e-01
opioid_freq3moEvery day                              3.516e-01  4.513e-01
phq8total5unit                                      -1.027e-01  1.431e-01
num_pain_locations                                  -2.714e-01  8.580e-02
fam_income10k                                       -4.744e-02  2.318e-02
race_ethnicityHispanic                              -1.685e+01  2.986e+03
race_ethnicityNH Black/African American only         2.591e-01  3.857e-01
race_ethnicityOther single and multiple races       -1.622e-01  8.015e-01
                                                    z value Pr(>|z|)   
(Intercept)                                           0.841  0.40021   
age10                                                -0.060  0.95178   
sexMale                                               1.173  0.24091   
educationHigh school graduate/GED                    -0.044  0.96477   
educationSome college, no degree                     -0.206  0.83645   
educationAssociate's or bachelor's degree            -1.485  0.13750   
educationMaster's, doctoral, or professional degree  -1.451  0.14687   
urban_ruralMedium and small metro                    -0.031  0.97496   
urban_ruralLarge fringe metro                        -0.002  0.99829   
urban_ruralLarge central metro                       -0.802  0.42265   
painlimit_lifeworkSome days                           0.026  0.97965   
painlimit_lifeworkMost days                           0.814  0.41578   
painlimit_lifeworkEvery day                          -0.483  0.62884   
opioid_freq3moSome days                               0.028  0.97735   
opioid_freq3moMost days                               0.588  0.55624   
opioid_freq3moEvery day                               0.779  0.43589   
phq8total5unit                                       -0.718  0.47282   
num_pain_locations                                   -3.163  0.00156 **
fam_income10k                                        -2.046  0.04073 * 
race_ethnicityHispanic                               -0.006  0.99550   
race_ethnicityNH Black/African American only          0.672  0.50167   
race_ethnicityOther single and multiple races        -0.202  0.83968   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Theta = 5086566.3791 
Number of iterations in BFGS optimization: 86 
Log-likelihood: -3347 on 45 Df

6.5.5 Hurdle with Poisson (m1_hurdle_pois)

Code
m1_hurdle_pois <- hurdle(ihm4pain_num ~ age10 + sex + race_ethnicity + education +
                           fam_income10k + urban_rural + num_pain_locations +
                           painlimit_lifework + opioid_freq3mo + phq8total5unit,
                        dist = "poisson", zero.dist = "binomial", 
                        data = projectb_si_train)

m1_hurdle_pois_aug <- projectb_si_train |>
    mutate(".fitted" = predict(m1_hurdle_pois, type = "response"),
           ".resid" = resid(m1_hurdle_pois, type = "response"))

m1_hurdle_pois_summary <-
  mets(m1_hurdle_pois_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_hurdle_pois",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_hurdle_pois_summary <- m1_hurdle_pois_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_hurdle_pois_summary <- m1_hurdle_pois_summary |>
  mutate(AIC = AIC(m1_hurdle_pois),
         BIC = BIC(m1_hurdle_pois))
summary(m1_hurdle_pois)

Call:
hurdle(formula = ihm4pain_num ~ age10 + sex + race_ethnicity + education + 
    fam_income10k + urban_rural + num_pain_locations + painlimit_lifework + 
    opioid_freq3mo + phq8total5unit, data = projectb_si_train, dist = "poisson", 
    zero.dist = "binomial")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.3352 -0.6384 -0.4592  0.5140  7.1628 

Count model coefficients (truncated poisson with log link):
                                                     Estimate Std. Error
(Intercept)                                         -0.045206   0.281619
age10                                               -0.143934   0.024059
sexMale                                             -0.178949   0.078531
race_ethnicityHispanic                              -0.220647   0.140983
race_ethnicityNH Black/African American only        -0.101348   0.141085
race_ethnicityOther single and multiple races       -0.098327   0.170501
educationHigh school graduate/GED                    0.054218   0.206801
educationSome college, no degree                     0.414146   0.200328
educationAssociate's or bachelor's degree            0.402233   0.194865
educationMaster's, doctoral, or professional degree  0.462490   0.213386
fam_income10k                                        0.014675   0.006929
urban_ruralMedium and small metro                    0.005048   0.114664
urban_ruralLarge fringe metro                       -0.009438   0.125995
urban_ruralLarge central metro                       0.148443   0.117913
num_pain_locations                                   0.079246   0.029779
painlimit_lifeworkSome days                          0.179310   0.099253
painlimit_lifeworkMost days                          0.248306   0.121008
painlimit_lifeworkEvery day                          0.148881   0.127017
opioid_freq3moSome days                              0.059283   0.158583
opioid_freq3moMost days                              0.113070   0.279708
opioid_freq3moEvery day                             -0.160421   0.164993
phq8total5unit                                      -0.085068   0.040396
                                                    z value Pr(>|z|)    
(Intercept)                                          -0.161  0.87247    
age10                                                -5.982  2.2e-09 ***
sexMale                                              -2.279  0.02268 *  
race_ethnicityHispanic                               -1.565  0.11757    
race_ethnicityNH Black/African American only         -0.718  0.47254    
race_ethnicityOther single and multiple races        -0.577  0.56415    
educationHigh school graduate/GED                     0.262  0.79319    
educationSome college, no degree                      2.067  0.03870 *  
educationAssociate's or bachelor's degree             2.064  0.03900 *  
educationMaster's, doctoral, or professional degree   2.167  0.03021 *  
fam_income10k                                         2.118  0.03418 *  
urban_ruralMedium and small metro                     0.044  0.96488    
urban_ruralLarge fringe metro                        -0.075  0.94029    
urban_ruralLarge central metro                        1.259  0.20806    
num_pain_locations                                    2.661  0.00779 ** 
painlimit_lifeworkSome days                           1.807  0.07083 .  
painlimit_lifeworkMost days                           2.052  0.04017 *  
painlimit_lifeworkEvery day                           1.172  0.24115    
opioid_freq3moSome days                               0.374  0.70853    
opioid_freq3moMost days                               0.404  0.68603    
opioid_freq3moEvery day                              -0.972  0.33091    
phq8total5unit                                       -2.106  0.03522 *  
Zero hurdle model coefficients (binomial with logit link):
                                                     Estimate Std. Error
(Intercept)                                         -1.479732   0.249322
age10                                               -0.142153   0.023765
sexMale                                             -0.375766   0.076124
race_ethnicityHispanic                               0.139327   0.135109
race_ethnicityNH Black/African American only        -0.272954   0.128870
race_ethnicityOther single and multiple races       -0.244884   0.174931
educationHigh school graduate/GED                    0.355758   0.142993
educationSome college, no degree                     0.759842   0.149059
educationAssociate's or bachelor's degree            1.039827   0.141786
educationMaster's, doctoral, or professional degree  1.328910   0.180035
fam_income10k                                        0.042015   0.008126
urban_ruralMedium and small metro                    0.027743   0.105044
urban_ruralLarge fringe metro                        0.114360   0.118742
urban_ruralLarge central metro                       0.346088   0.114916
num_pain_locations                                   0.212404   0.030285
painlimit_lifeworkSome days                          0.239884   0.096082
painlimit_lifeworkMost days                          0.275086   0.122223
painlimit_lifeworkEvery day                          0.325687   0.118433
opioid_freq3moSome days                              0.043253   0.167722
opioid_freq3moMost days                              0.107665   0.265744
opioid_freq3moEvery day                             -0.194347   0.137609
phq8total5unit                                      -0.001161   0.039121
                                                    z value Pr(>|z|)    
(Intercept)                                          -5.935 2.94e-09 ***
age10                                                -5.982 2.21e-09 ***
sexMale                                              -4.936 7.97e-07 ***
race_ethnicityHispanic                                1.031  0.30244    
race_ethnicityNH Black/African American only         -2.118  0.03417 *  
race_ethnicityOther single and multiple races        -1.400  0.16155    
educationHigh school graduate/GED                     2.488  0.01285 *  
educationSome college, no degree                      5.098 3.44e-07 ***
educationAssociate's or bachelor's degree             7.334 2.24e-13 ***
educationMaster's, doctoral, or professional degree   7.381 1.57e-13 ***
fam_income10k                                         5.170 2.34e-07 ***
urban_ruralMedium and small metro                     0.264  0.79170    
urban_ruralLarge fringe metro                         0.963  0.33550    
urban_ruralLarge central metro                        3.012  0.00260 ** 
num_pain_locations                                    7.013 2.33e-12 ***
painlimit_lifeworkSome days                           2.497  0.01254 *  
painlimit_lifeworkMost days                           2.251  0.02440 *  
painlimit_lifeworkEvery day                           2.750  0.00596 ** 
opioid_freq3moSome days                               0.258  0.79650    
opioid_freq3moMost days                               0.405  0.68537    
opioid_freq3moEvery day                              -1.412  0.15786    
phq8total5unit                                       -0.030  0.97632    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 30 
Log-likelihood: -3346 on 44 Df

6.5.6 Hurdle with Negative Binomial (m1_hurdle_nb)

Code
m1_hurdle_nb <- hurdle(ihm4pain_num ~ age10 + sex + education +
                          urban_rural + painlimit_lifework + opioid_freq3mo +
                          phq8total5unit + num_pain_locations + fam_income10k + race_ethnicity,
                       dist = "negbin", zero.dist = "binomial", 
                       data = projectb_si_train)

m1_hurdle_nb_aug <- projectb_si_train |>
    mutate(".fitted" = predict(m1_hurdle_nb, type = "response"),
           ".resid" = resid(m1_hurdle_nb, type = "response"))

m1_hurdle_nb_summary <-
  mets(m1_hurdle_nb_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_hurdle_nb",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_hurdle_nb_summary <- m1_hurdle_nb_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_hurdle_nb_summary <- m1_hurdle_nb_summary |>
  mutate(AIC = AIC(m1_hurdle_nb),
         BIC = BIC(m1_hurdle_nb))
summary(m1_hurdle_nb)

Call:
hurdle(formula = ihm4pain_num ~ age10 + sex + education + urban_rural + 
    painlimit_lifework + opioid_freq3mo + phq8total5unit + num_pain_locations + 
    fam_income10k + race_ethnicity, data = projectb_si_train, dist = "negbin", 
    zero.dist = "binomial")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.3352 -0.6384 -0.4592  0.5140  7.1628 

Count model coefficients (truncated negbin with log link):
                                                     Estimate Std. Error
(Intercept)                                         -0.045178   0.281617
age10                                               -0.143937   0.024059
sexMale                                             -0.178955   0.078531
educationHigh school graduate/GED                    0.054212   0.206800
educationSome college, no degree                     0.414166   0.200326
educationAssociate's or bachelor's degree            0.402225   0.194864
educationMaster's, doctoral, or professional degree  0.462490   0.213385
urban_ruralMedium and small metro                    0.005048   0.114663
urban_ruralLarge fringe metro                       -0.009442   0.125995
urban_ruralLarge central metro                       0.148444   0.117912
painlimit_lifeworkSome days                          0.179297   0.099253
painlimit_lifeworkMost days                          0.248299   0.121008
painlimit_lifeworkEvery day                          0.148878   0.127017
opioid_freq3moSome days                              0.059280   0.158583
opioid_freq3moMost days                              0.113059   0.279709
opioid_freq3moEvery day                             -0.160405   0.164991
phq8total5unit                                      -0.085068   0.040396
num_pain_locations                                   0.079248   0.029779
fam_income10k                                        0.014675   0.006929
race_ethnicityHispanic                              -0.220641   0.140982
race_ethnicityNH Black/African American only        -0.101337   0.141084
race_ethnicityOther single and multiple races       -0.098315   0.170499
Log(theta)                                          19.343411        NaN
                                                    z value Pr(>|z|)    
(Intercept)                                          -0.160  0.87255    
age10                                                -5.983  2.2e-09 ***
sexMale                                              -2.279  0.02268 *  
educationHigh school graduate/GED                     0.262  0.79321    
educationSome college, no degree                      2.067  0.03869 *  
educationAssociate's or bachelor's degree             2.064  0.03900 *  
educationMaster's, doctoral, or professional degree   2.167  0.03020 *  
urban_ruralMedium and small metro                     0.044  0.96488    
urban_ruralLarge fringe metro                        -0.075  0.94026    
urban_ruralLarge central metro                        1.259  0.20805    
painlimit_lifeworkSome days                           1.806  0.07085 .  
painlimit_lifeworkMost days                           2.052  0.04018 *  
painlimit_lifeworkEvery day                           1.172  0.24115    
opioid_freq3moSome days                               0.374  0.70854    
opioid_freq3moMost days                               0.404  0.68606    
opioid_freq3moEvery day                              -0.972  0.33095    
phq8total5unit                                       -2.106  0.03522 *  
num_pain_locations                                    2.661  0.00779 ** 
fam_income10k                                         2.118  0.03419 *  
race_ethnicityHispanic                               -1.565  0.11758    
race_ethnicityNH Black/African American only         -0.718  0.47259    
race_ethnicityOther single and multiple races        -0.577  0.56419    
Log(theta)                                              NaN      NaN    
Zero hurdle model coefficients (binomial with logit link):
                                                     Estimate Std. Error
(Intercept)                                         -1.479732   0.249322
age10                                               -0.142153   0.023765
sexMale                                             -0.375766   0.076124
educationHigh school graduate/GED                    0.355758   0.142993
educationSome college, no degree                     0.759842   0.149059
educationAssociate's or bachelor's degree            1.039827   0.141786
educationMaster's, doctoral, or professional degree  1.328910   0.180035
urban_ruralMedium and small metro                    0.027743   0.105044
urban_ruralLarge fringe metro                        0.114360   0.118742
urban_ruralLarge central metro                       0.346088   0.114916
painlimit_lifeworkSome days                          0.239884   0.096082
painlimit_lifeworkMost days                          0.275086   0.122223
painlimit_lifeworkEvery day                          0.325687   0.118433
opioid_freq3moSome days                              0.043253   0.167722
opioid_freq3moMost days                              0.107665   0.265744
opioid_freq3moEvery day                             -0.194347   0.137609
phq8total5unit                                      -0.001161   0.039121
num_pain_locations                                   0.212404   0.030285
fam_income10k                                        0.042015   0.008126
race_ethnicityHispanic                               0.139327   0.135109
race_ethnicityNH Black/African American only        -0.272954   0.128870
race_ethnicityOther single and multiple races       -0.244884   0.174931
                                                    z value Pr(>|z|)    
(Intercept)                                          -5.935 2.94e-09 ***
age10                                                -5.982 2.21e-09 ***
sexMale                                              -4.936 7.97e-07 ***
educationHigh school graduate/GED                     2.488  0.01285 *  
educationSome college, no degree                      5.098 3.44e-07 ***
educationAssociate's or bachelor's degree             7.334 2.24e-13 ***
educationMaster's, doctoral, or professional degree   7.381 1.57e-13 ***
urban_ruralMedium and small metro                     0.264  0.79170    
urban_ruralLarge fringe metro                         0.963  0.33550    
urban_ruralLarge central metro                        3.012  0.00260 ** 
painlimit_lifeworkSome days                           2.497  0.01254 *  
painlimit_lifeworkMost days                           2.251  0.02440 *  
painlimit_lifeworkEvery day                           2.750  0.00596 ** 
opioid_freq3moSome days                               0.258  0.79650    
opioid_freq3moMost days                               0.405  0.68537    
opioid_freq3moEvery day                              -1.412  0.15786    
phq8total5unit                                       -0.030  0.97632    
num_pain_locations                                    7.013 2.33e-12 ***
fam_income10k                                         5.170 2.34e-07 ***
race_ethnicityHispanic                                1.031  0.30244    
race_ethnicityNH Black/African American only         -2.118  0.03417 *  
race_ethnicityOther single and multiple races        -1.400  0.16155    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Theta: count = 251615122.8885
Number of iterations in BFGS optimization: 49 
Log-likelihood: -3346 on 45 Df

6.6 Comparing m1 Versions

6.6.1 Comparing with Rootograms

To begin a comparison of the 4 variations of Model 1, I generated four rootogram plots using the countreg::rootogram() function.

Code
par(mfrow = c(3,2))
countreg::rootogram(m1_pois, main = "Rootogram for m1_pois")
countreg::rootogram(m1_nb, main = "Rootogram for m1_nb")
countreg::rootogram(m1_zip, main = "Rootogram for m1_zip")
countreg::rootogram(m1_zinb, main = "Rootogram for m1_zinb")
countreg::rootogram(m1_hurdle_pois, main = "Rootogram for m1_hurdle_pois")
countreg::rootogram(m1_hurdle_nb, main = "Rootogram for m1_hurdle_nb")

Code
par(mfrow = c(1,1))

Looking at the plots above, there is a clear improvement observed between 1) the Poisson (m1_pois) and negative binomial (m1_nb) models; and 2) the zero-inflated (m1_zip and m1_zinb) and hurdle (m1_hurdle_pois and mt_hurdle_nb) models as the latter appear to fit 0 counts perfectly and lack the over/under-fitting of counts 1-3. However, there do not appear to be any noticeable differences in predicting counts between the bottom four models. Furthermore, it appears each model predicts counts of ihm4pain_num > 4, but not to a substantial degree. Certainly, some subjects may have engaged in more than four IHM modalities for pain, but these were not measured in the 2019 NHIS.

6.6.2 Comparing with Fit Statistics

To further examine differences, I generated m1_summary as a combination of the 6 previous model summaries using the bind_rows() function. I then selected the Model, \(R^2\), RMSE, MAE, AIC, and BIC values for output.

Code
m1_summary <- bind_rows(m1_pois_summary, m1_nb_summary, 
                        m1_zip_summary, m1_zinb_summary, 
                        m1_hurdle_pois_summary, m1_hurdle_nb_summary)

m1_summary |> select(Model, R2, RMSE, MAE, AIC, BIC) |>
  flextable() |>
  colformat_double(j = 2:4, digits = 3) |>
  colformat_double(j = 5:6, digits = 0) |>
  bold(i = 5, j = 1:6, bold = TRUE) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Model

R2

RMSE

MAE

AIC

BIC

m1_pois

0.120

0.805

0.624

6,842

6,978

m1_nb

0.119

0.806

0.624

6,800

6,942

m1_zip

0.120

0.806

0.622

6,783

7,055

m1_zinb

0.120

0.806

0.622

6,785

7,063

m1_hurdle_pois

0.121

0.805

0.622

6,780

7,052

m1_hurdle_nb

0.121

0.805

0.622

6,782

7,060

Similar to the rootograms, there is a clear improvement in \(R^2\) between 1) the Poisson (m1_pois) and negative binomial (m1_nb) models; and 2) the zero-inflated (m1_zip and m1_zinb) and hurdle (m1_hurdle_pois and mt_hurdle_nb) models. m1_hurdle_pois appears to be the best fit based on it’s \(R^2\) (0.121), RMSE (0.805), and AIC (6780 [lowest overall]). Importantly, m1_hurdle_pois has the lowest BIC among the four models that had better rootograms in the plot above.

6.6.3 Comparing with Vuong tests

Finally, given the inclination to use m1_hurdle_pois, I used the vuong() function to compare it to the other models.

m1_hurdle_pois v. m1_pois

Code
vuong(m1_hurdle_pois, m1_pois)
Vuong Non-Nested Hypothesis Test-Statistic: 
(test-statistic is asymptotically distributed N(0,1) under the
 null that the models are indistinguishible)
-------------------------------------------------------------
              Vuong z-statistic             H_A    p-value
Raw                    4.991730 model1 > model2  2.992e-07
AIC-corrected          2.913802 model1 > model2 0.00178528
BIC-corrected         -3.503540 model2 > model1 0.00022956

m1_hurdle_pois is a better fit than m1_pois based on small p-values from the the Raw, and AIC-corrected Vuong tests.

m1_hurdle_pois v. m1_nb

Code
vuong(m1_hurdle_pois, m1_nb)
Vuong Non-Nested Hypothesis Test-Statistic: 
(test-statistic is asymptotically distributed N(0,1) under the
 null that the models are indistinguishible)
-------------------------------------------------------------
              Vuong z-statistic             H_A    p-value
Raw                    4.754520 model1 > model2 9.9459e-07
AIC-corrected          1.369766 model1 > model2    0.08538
BIC-corrected         -9.083491 model2 > model1 < 2.22e-16

m1_hurdle_pois is a better fit than m1_nb based on small p-values from the the Raw Vuong test.

m1_hurdle_pois v. m1_zip

Code
vuong(m1_hurdle_pois, m1_zip)
Vuong Non-Nested Hypothesis Test-Statistic: 
(test-statistic is asymptotically distributed N(0,1) under the
 null that the models are indistinguishible)
-------------------------------------------------------------
              Vuong z-statistic             H_A p-value
Raw                   0.3795831 model1 > model2 0.35213
AIC-corrected         0.3795831 model1 > model2 0.35213
BIC-corrected         0.3795831 model1 > model2 0.35213

m1_hurdle_pois is not meaningfully different than m1_zip (p = .352).

m1_hurdle_pois v. m1_zinb

Code
vuong(m1_hurdle_pois, m1_zinb)
Vuong Non-Nested Hypothesis Test-Statistic: 
(test-statistic is asymptotically distributed N(0,1) under the
 null that the models are indistinguishible)
-------------------------------------------------------------
              Vuong z-statistic             H_A p-value
Raw                   0.3795879 model1 > model2 0.35213
AIC-corrected         0.3795879 model1 > model2 0.35213
BIC-corrected         0.3795879 model1 > model2 0.35213

m1_hurdle_pois is not meaningfully different than m1_zinb (p = .352).

m1_hurdle_pois v. m1_hurdle_nb

Code
vuong(m1_hurdle_pois, m1_hurdle_nb)
Vuong Non-Nested Hypothesis Test-Statistic: 
(test-statistic is asymptotically distributed N(0,1) under the
 null that the models are indistinguishible)
-------------------------------------------------------------
              Vuong z-statistic             H_A p-value
Raw                -0.007316386 model2 > model1 0.49708
AIC-corrected      -0.007316386 model2 > model1 0.49708
BIC-corrected      -0.007316386 model2 > model1 0.49708

m1_hurdle_pois is not meaningfully different than m1_hurdle_nb (p = .497).

6.7 Final Model Choice

Based on a combination of the rootograms, the fit statistics, and the Vuong tests, I chose m1_hurdle_pois as the model for the count outcome.

The hurdle model also makes sense given its description (Ford 2016).

The hurdle model is a two-part model that specifies one process for zero counts and another process for positive counts. The idea is that positive counts occur once a threshold is crossed, or put another way, a hurdle is cleared. If the hurdle is not cleared, then we have a count of 0.

Conceptually, moving from no engagement in IHM modalities for pain to engaging in any number of IHM modalities for pain seems like a hurdle to model independently of the count data.

6.8 Fitting Weighted Hurdle Model

In the following code, I added weights to m1_hurdle_pois model using the code, weights = wt_final/mean(wt_final). I then printed a summary of the model.

Code
m1_hurdle_pois <- hurdle(ihm4pain_num ~ age10 + sex + race_ethnicity + education +
                           fam_income10k + urban_rural + num_pain_locations +
                           painlimit_lifework + opioid_freq3mo + phq8total5unit,
                        dist = "poisson", zero.dist = "binomial", 
                        weights = wt_final/mean(wt_final),
                        data = projectb_si_train)

mets <- metric_set(rsq, rmse, mae)

m1_hurdle_pois_aug <- projectb_si_train |>
    mutate(".fitted" = predict(m1_hurdle_pois, type = "response"),
           ".resid" = resid(m1_hurdle_pois, type = "response"))

m1_hurdle_pois_summary <-
  mets(m1_hurdle_pois_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_hurdle_pois",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1_hurdle_pois_summary <- m1_hurdle_pois_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

m1_hurdle_pois_summary <- m1_hurdle_pois_summary |>
  mutate(AIC = AIC(m1_hurdle_pois),
         BIC = BIC(m1_hurdle_pois))

summary(m1_hurdle_pois)

Call:
hurdle(formula = ihm4pain_num ~ age10 + sex + race_ethnicity + education + 
    fam_income10k + urban_rural + num_pain_locations + painlimit_lifework + 
    opioid_freq3mo + phq8total5unit, data = projectb_si_train, weights = wt_final/mean(wt_final), 
    dist = "poisson", zero.dist = "binomial")

Pearson residuals:
    Min      1Q  Median      3Q     Max 
-1.8226 -0.5770 -0.3611  0.4732  8.1198 

Count model coefficients (truncated poisson with log link):
                                                     Estimate Std. Error
(Intercept)                                         -0.288706   0.286847
age10                                               -0.143633   0.025058
sexMale                                             -0.254302   0.080070
race_ethnicityHispanic                              -0.187190   0.136711
race_ethnicityNH Black/African American only        -0.096093   0.139174
race_ethnicityOther single and multiple races       -0.306117   0.184700
educationHigh school graduate/GED                    0.239399   0.208846
educationSome college, no degree                     0.638642   0.202086
educationAssociate's or bachelor's degree            0.572008   0.199347
educationMaster's, doctoral, or professional degree  0.648524   0.222696
fam_income10k                                        0.010163   0.006892
urban_ruralMedium and small metro                    0.066435   0.124326
urban_ruralLarge fringe metro                        0.056516   0.132087
urban_ruralLarge central metro                       0.216033   0.125488
num_pain_locations                                   0.097209   0.029301
painlimit_lifeworkSome days                          0.190048   0.101775
painlimit_lifeworkMost days                          0.273623   0.122793
painlimit_lifeworkEvery day                          0.165813   0.129089
opioid_freq3moSome days                             -0.118804   0.180672
opioid_freq3moMost days                              0.235398   0.269783
opioid_freq3moEvery day                             -0.265695   0.189130
phq8total5unit                                      -0.104129   0.039421
                                                    z value Pr(>|z|)    
(Intercept)                                          -1.006 0.314184    
age10                                                -5.732 9.92e-09 ***
sexMale                                              -3.176 0.001493 ** 
race_ethnicityHispanic                               -1.369 0.170925    
race_ethnicityNH Black/African American only         -0.690 0.489910    
race_ethnicityOther single and multiple races        -1.657 0.097443 .  
educationHigh school graduate/GED                     1.146 0.251674    
educationSome college, no degree                      3.160 0.001576 ** 
educationAssociate's or bachelor's degree             2.869 0.004112 ** 
educationMaster's, doctoral, or professional degree   2.912 0.003589 ** 
fam_income10k                                         1.475 0.140336    
urban_ruralMedium and small metro                     0.534 0.593090    
urban_ruralLarge fringe metro                         0.428 0.668743    
urban_ruralLarge central metro                        1.722 0.085154 .  
num_pain_locations                                    3.318 0.000908 ***
painlimit_lifeworkSome days                           1.867 0.061854 .  
painlimit_lifeworkMost days                           2.228 0.025859 *  
painlimit_lifeworkEvery day                           1.284 0.198972    
opioid_freq3moSome days                              -0.658 0.510815    
opioid_freq3moMost days                               0.873 0.382911    
opioid_freq3moEvery day                              -1.405 0.160072    
phq8total5unit                                       -2.641 0.008255 ** 
Zero hurdle model coefficients (binomial with logit link):
                                                     Estimate Std. Error
(Intercept)                                         -1.678505   0.240666
age10                                               -0.152575   0.023318
sexMale                                             -0.218202   0.076494
race_ethnicityHispanic                              -0.064007   0.129418
race_ethnicityNH Black/African American only        -0.380766   0.128323
race_ethnicityOther single and multiple races       -0.353984   0.170466
educationHigh school graduate/GED                    0.415864   0.135167
educationSome college, no degree                     0.785447   0.141858
educationAssociate's or bachelor's degree            1.053611   0.135944
educationMaster's, doctoral, or professional degree  1.396878   0.184676
fam_income10k                                        0.036441   0.007846
urban_ruralMedium and small metro                    0.064509   0.110355
urban_ruralLarge fringe metro                        0.262959   0.120586
urban_ruralLarge central metro                       0.463635   0.118048
num_pain_locations                                   0.180661   0.029999
painlimit_lifeworkSome days                          0.327998   0.097323
painlimit_lifeworkMost days                          0.410528   0.122285
painlimit_lifeworkEvery day                          0.472318   0.120034
opioid_freq3moSome days                              0.139740   0.168191
opioid_freq3moMost days                              0.161674   0.270131
opioid_freq3moEvery day                             -0.344714   0.145339
phq8total5unit                                       0.071419   0.037659
                                                    z value Pr(>|z|)    
(Intercept)                                          -6.974 3.07e-12 ***
age10                                                -6.543 6.01e-11 ***
sexMale                                              -2.853 0.004337 ** 
race_ethnicityHispanic                               -0.495 0.620900    
race_ethnicityNH Black/African American only         -2.967 0.003005 ** 
race_ethnicityOther single and multiple races        -2.077 0.037841 *  
educationHigh school graduate/GED                     3.077 0.002093 ** 
educationSome college, no degree                      5.537 3.08e-08 ***
educationAssociate's or bachelor's degree             7.750 9.16e-15 ***
educationMaster's, doctoral, or professional degree   7.564 3.91e-14 ***
fam_income10k                                         4.644 3.41e-06 ***
urban_ruralMedium and small metro                     0.585 0.558844    
urban_ruralLarge fringe metro                         2.181 0.029207 *  
urban_ruralLarge central metro                        3.928 8.58e-05 ***
num_pain_locations                                    6.022 1.72e-09 ***
painlimit_lifeworkSome days                           3.370 0.000751 ***
painlimit_lifeworkMost days                           3.357 0.000788 ***
painlimit_lifeworkEvery day                           3.935 8.32e-05 ***
opioid_freq3moSome days                               0.831 0.406064    
opioid_freq3moMost days                               0.599 0.549505    
opioid_freq3moEvery day                              -2.372 0.017702 *  
phq8total5unit                                        1.896 0.057901 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Number of iterations in BFGS optimization: 29 
Log-likelihood: -3275 on 44 Df

6.9 Model Coefficients

To generate a singular summary of coefficients from m1_hurdle_pois, I performed the following

  • Generated an initial m1_hurdle_pois_coef tibble containing the variables Type (i.e., count or non-zero), Group (i.e., each variable group), and Comparison (e.g., Age +10yrs, Male v. Female) to match the output from m1_hurdle_pois
  • Generated m1_hurdle_pois_pe as a tibble of the exponentiated coefficients from m1_hurdle_pois
  • Generated m1_hurdle_pois_ci as a tibble of the exponentiated 95% CI from m1_hurdle_pois
  • Bound all of the above into the tibble, m1_hurdle_pois_coef.
Code
m1_hurdle_pois_coef <- tibble(Type = c("count", "count", "count", 
                                       "count", "count", "count",
                                       "count", "count", "count",
                                       "count", "count", "count",
                                       "count", "count", "count", 
                                       "count", "count", "count",
                                       "count", "count", "count", "count",
                                       "non-zero", "non-zero", "non-zero", 
                                       "non-zero", "non-zero", "non-zero",
                                       "non-zero", "non-zero", "non-zero", 
                                       "non-zero", "non-zero", "non-zero",
                                       "non-zero", "non-zero", "non-zero", 
                                       "non-zero", "non-zero", "non-zero",
                                       "non-zero", "non-zero", "non-zero", "non-zero"),
                              Group = c("(Intercept)", "Age", "Sex", 
                                        "Race/Ethnicity", "Race/Ethnicity", "Race/Ethnicity",
                                        "Education", "Education", 
                                        "Education", "Education", 
                                        "Family income", 
                                        "Proximity to metro area", "Proximity to metro area",
                                        "Proximity to metro area", 
                                        "Number of pain locations", 
                                        "Pain limiting life or work", 
                                        "Pain limiting life or work",
                                        "Pain limiting life or work", 
                                        "Opioid use past 3 mo", 
                                        "Opioid use past 3 mo",
                                        "Opioid use past 3 mo", 
                                        "PHQ-8", 
                                        "(Intercept)", "Age", "Sex", 
                                        "Race/Ethnicity", "Race/Ethnicity", "Race/Ethnicity",
                                        "Education", "Education", 
                                        "Education", "Education", 
                                        "Family income", 
                                        "Proximity to metro area", "Proximity to metro area",
                                        "Proximity to metro area", 
                                        "Number of pain locations", 
                                        "Pain limiting life or work", 
                                        "Pain limiting life or work",
                                        "Pain limiting life or work", 
                                        "Opioid use past 3 mo", 
                                        "Opioid use past 3 mo",
                                        "Opioid use past 3 mo", 
                                        "PHQ-8"),
                              Comparison = c("(Intercept)", "Age +10yrs", 
                                             "Male v. Female", 
                                             "Hispanic v. NH White",
                                             "NH Black v. NH White",
                                             "Other v. NH White",
                                             "HS/GED v. < HS", 
                                             "Some College v. < HS", 
                                             "Assoc/Bach v. < HS",
                                             "Masters/PhD v. < HS", 
                                             "Fam income +$10,000",
                                             "M/S metro v. Non-metro", 
                                             "L fringe metro v. Non-metro",
                                             "L central metro v. Non-metro", 
                                             "Pain locations +1", 
                                             "Pain limit some days v. Never", 
                                             "Pain limit most days v. Never",
                                             "Pain limit every day v. Never", 
                                             "Opioids some days v. Never", 
                                             "Opioids most days v. Never",
                                             "Opioids every day v. Never", 
                                             "PHQ-8 +5 units", 
                                             "(Intercept)", "Age +10yrs", 
                                             "Male v. Female", 
                                             "Hispanic v. NH White",
                                             "NH Black v. NH White",
                                             "Other v. NH White",
                                             "HS/GED v. < HS", 
                                             "Some College v. < HS", 
                                             "Assoc/Bach v. < HS",
                                             "Masters/PhD v. < HS", 
                                             "Fam income +$10,000",
                                             "M/S metro v. Non-metro", 
                                             "L fringe metro v. Non-metro",
                                             "L central metro v. Non-metro", 
                                             "Pain locations +1", 
                                             "Pain limit some days v. Never", 
                                             "Pain limit most days v. Never",
                                             "Pain limit every day v. Never", 
                                             "Opioids some days v. Never", 
                                             "Opioids most days v. Never",
                                             "Opioids every day v. Never", 
                                             "PHQ-8 +5 units"))

m1_hurdle_pois_pe <- exp(coef(m1_hurdle_pois)) |> as_tibble()
m1_hurdle_pois_ci <- exp(confint(m1_hurdle_pois)) |> as_tibble() |>
  clean_names() |>
  rename(Conf.low = "x2_5_percent",
         Conf.high = "x97_5_percent")

m1_hurdle_pois_coef <- bind_cols(m1_hurdle_pois_coef, 
                                 m1_hurdle_pois_pe, 
                                 m1_hurdle_pois_ci) |>
  rename(Estimate = value)

6.9.1 Numeric Summary of Odds of ihm4pain_num != 0

In the following code, I generated a numeric summary of the exponentiated coefficients from m1_hurdle_pois describing the odds of ihm4pain_num != 0.

Code
m1_hurdle_pois_coef |> 
  filter(Type == "non-zero") |>
  select(Group, Comparison, Estimate, Conf.low, Conf.high) |> 
  mutate(Estimate = round(Estimate, digits = 3),
         Conf.low = round(Conf.low, digits = 3),
         Conf.high = round(Conf.high, digits = 3)) |>
  rename(aOR = Estimate,
         "95% CI Low" = Conf.low,
         "95% CI High" = Conf.high) |>
  flextable() |>
  bold(bold = TRUE, part = "header") |>
  add_header_lines(values = "Supplemental Table 1: aOR of Engagement in IHM Modalities for Pain") |>
  autofit() |>
  add_footer_lines(values = "Abbreviations: Assoc, associate’s; aOR, adjusted odds ratio; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; mo, months; NH, Non-Hispanic; S, small; v., versus; yrs, years", top = FALSE) |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Supplemental Table 1: aOR of Engagement in IHM Modalities for Pain

Group

Comparison

aOR

95% CI Low

95% CI High

(Intercept)

(Intercept)

0.187

0.116

0.299

Age

Age +10yrs

0.858

0.820

0.899

Sex

Male v. Female

0.804

0.692

0.934

Race/Ethnicity

Hispanic v. NH White

0.938

0.728

1.209

Race/Ethnicity

NH Black v. NH White

0.683

0.531

0.879

Race/Ethnicity

Other v. NH White

0.702

0.503

0.980

Education

HS/GED v. < HS

1.516

1.163

1.975

Education

Some College v. < HS

2.193

1.661

2.896

Education

Assoc/Bach v. < HS

2.868

2.197

3.744

Education

Masters/PhD v. < HS

4.043

2.815

5.806

Family income

Fam income +$10,000

1.037

1.021

1.053

Proximity to metro area

M/S metro v. Non-metro

1.067

0.859

1.324

Proximity to metro area

L fringe metro v. Non-metro

1.301

1.027

1.648

Proximity to metro area

L central metro v. Non-metro

1.590

1.261

2.004

Number of pain locations

Pain locations +1

1.198

1.130

1.271

Pain limiting life or work

Pain limit some days v. Never

1.388

1.147

1.680

Pain limiting life or work

Pain limit most days v. Never

1.508

1.186

1.916

Pain limiting life or work

Pain limit every day v. Never

1.604

1.268

2.029

Opioid use past 3 mo

Opioids some days v. Never

1.150

0.827

1.599

Opioid use past 3 mo

Opioids most days v. Never

1.175

0.692

1.996

Opioid use past 3 mo

Opioids every day v. Never

0.708

0.533

0.942

PHQ-8

PHQ-8 +5 units

1.074

0.998

1.156

Abbreviations: Assoc, associate’s; aOR, adjusted odds ratio; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; mo, months; NH, Non-Hispanic; S, small; v., versus; yrs, years

In examining the effects table above, I will summarize the effect of Masters/PhD v. < HS using two fictional cases: Luke Skywalker and Din Djarin

Suppose Luke Skywalker and Din Djarin 1) have the same demographic profile (i.e., same age, male); 2) reside in rural areas; 3) rated similar challenges related to pain interfering with life or work in the past 3 months; 4) used the same amount of opioids for chronic pain within the past 3 months; 5) reported the same score on the PHQ-8; 6) reported the same number of locations where they were bothered at least “a little” by pain in the past three months; 7) have the same family income; and 8) are of the same race and ethnicity. However, Luke Skywalker has a PhD in being a Jedi while Din Djarin did not graduate “This is the Way” High School. In this scenario, m1_hurdle_pois predicts that Luke Skywalker’s odds of ihm4pain_num != 0 will be 4.043 times that of Din Djarin.

The 95% confidence interval for this estimated odds ratio for the effect of Masters/PhD v. < HS on ihm4pain_num != 0 is (2.815, 5.806). This confidence interval does not include one. Thus, holding all other variables constant, having a Masters, PhD, or professional degree is associated with higher odds of having engaged in an IHM modality to manage pain in the past three months.

6.9.2 Numeric Summary of Relative Change in Count of ihm4pain_num

In the following code, I generated a numeric summary of the exponentiated coefficients from m1_hurdle_pois describing the predicted relative change in the count of ihm4pain_num.

Code
m1_hurdle_pois_coef |> 
  filter(Type == "count") |>
  select(Group, Comparison, Estimate, Conf.low, Conf.high) |> 
  mutate(Estimate = round(Estimate, digits = 3),
         Conf.low = round(Conf.low, digits = 3),
         Conf.high = round(Conf.high, digits = 3)) |>
  rename("Relative change" = Estimate,
         "95% CI Low" = Conf.low,
         "95% CI High" = Conf.high) |>
  flextable() |>
  bold(bold = TRUE, part = "header") |>
  add_header_lines(values = "Supplemental Table 2: Relative Change in Count of IHM Modalities Used for Pain") |>
  autofit() |>
  add_footer_lines(values = "Abbreviations: Assoc, associate’s; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; mo, months; NH, Non-Hispanic; S, small; v., versus; yrs, years", top = FALSE) |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Supplemental Table 2: Relative Change in Count of IHM Modalities Used for Pain

Group

Comparison

Relative change

95% CI Low

95% CI High

(Intercept)

(Intercept)

0.749

0.427

1.315

Age

Age +10yrs

0.866

0.825

0.910

Sex

Male v. Female

0.775

0.663

0.907

Race/Ethnicity

Hispanic v. NH White

0.829

0.634

1.084

Race/Ethnicity

NH Black v. NH White

0.908

0.692

1.193

Race/Ethnicity

Other v. NH White

0.736

0.513

1.057

Education

HS/GED v. < HS

1.270

0.844

1.913

Education

Some College v. < HS

1.894

1.275

2.814

Education

Assoc/Bach v. < HS

1.772

1.199

2.619

Education

Masters/PhD v. < HS

1.913

1.236

2.959

Family income

Fam income +$10,000

1.010

0.997

1.024

Proximity to metro area

M/S metro v. Non-metro

1.069

0.838

1.364

Proximity to metro area

L fringe metro v. Non-metro

1.058

0.817

1.371

Proximity to metro area

L central metro v. Non-metro

1.241

0.971

1.587

Number of pain locations

Pain locations +1

1.102

1.041

1.167

Pain limiting life or work

Pain limit some days v. Never

1.209

0.991

1.476

Pain limiting life or work

Pain limit most days v. Never

1.315

1.034

1.672

Pain limiting life or work

Pain limit every day v. Never

1.180

0.916

1.520

Opioid use past 3 mo

Opioids some days v. Never

0.888

0.623

1.265

Opioid use past 3 mo

Opioids most days v. Never

1.265

0.746

2.147

Opioid use past 3 mo

Opioids every day v. Never

0.767

0.529

1.111

PHQ-8

PHQ-8 +5 units

0.901

0.834

0.973

Abbreviations: Assoc, associate’s; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; mo, months; NH, Non-Hispanic; S, small; v., versus; yrs, years

In examining the effects table above, I will summarize the effect of Age using two fictional cases: Han Solo and Luke Skywalker

Per the guidance by Yoshida, given the response is positive (among those who have positive counts for engaging in IHM modalities), the average count is 0.701 (Intercept).

Suppose Han Solo and Luke Skywalker 1) are both male and of the same race and ethnicity; 2) have the same level of education; 3) reside in rural areas; 4) rated similar challenges related to pain interfering with life or work in the past 3 months; 5) used the same amount of opioids for chronic pain within the past 3 months; 6) reported the same score on the PHQ-8; 7) reported the same number of locations where they were bothered at least “a little” by pain in the past three months; and 8) have the same family income. However, Han Solo is 10 years older than Luke Skywalker. In this scenario, m1_hurdle_pois predicts that the number of IHM modalities that Han Solo engaged in for pain in the past 3 months will be .866 times that of Luke’s (13.4% lower than Luke’s).

The 95% confidence interval for this estimated effect of age10 on ihm4pain_num is (0.825, 0.910). This confidence interval does not include one. Thus, holding all other variables constant, being 10 years older is associated with lower counts of IHM modalities used to manage pain in the past three months.

6.9.3 Visual Summary of Odds of ihm4pain_num != 0

To visualize the effects of the covariates on odds of ihm4pain_num != 0, I first generated the m1_coef_non0 tibble, limited to the “non-zero” coefficients. I then used ggplot() to plot the effects. Education covariates are placed on a separate axis given their size to not obscure the other covariates.

Code
m1_coef_non0 <- m1_hurdle_pois_coef |> 
  filter(Type == "non-zero", Comparison != "(Intercept)", Group != "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

m1_coef_non0educ <- m1_hurdle_pois_coef |> 
  filter(Type == "non-zero", Comparison != "(Intercept)", Group == "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

p1 <- ggplot(m1_coef_non0, aes(x = Estimate, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Conf.high, xmin = Conf.low), linewidth = .5, 
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "aOR",
       y = "",
       title = "aOR of Engaging in ≥ 1 IHM Modality for Pain") +
  scale_x_continuous(breaks = seq(0,4,0.25))

p2 <- ggplot(m1_coef_non0educ, aes(x = Estimate, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Conf.high, xmin = Conf.low), linewidth = .5,
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "aOR",
       y = "") +
  scale_x_continuous(breaks = seq(0,8,0.5))

p1/p2 + plot_layout(heights = c(5,1))

6.9.4 Visual Summary of Estimated Relative Change in Count of ihm4pain_num

To visualize the effects of the covariates on the estimated relative change in count of ihm4pain_num, I first generated the m1_coef_count tibble, limited to the “count” coefficients. I then used ggplot() to plot the effects. Education covariates are placed on a separate axis given their size to not obscure the other covariates.

Code
m1_coef_count <- m1_hurdle_pois_coef |> 
  filter(Type == "count", Comparison != "(Intercept)", Group != "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

m1_coef_count_educ <- m1_hurdle_pois_coef |> 
  filter(Type == "count", Comparison != "(Intercept)",
         Group == "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

p1 <- ggplot(m1_coef_count, aes(x = Estimate, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Conf.high, xmin = Conf.low), linewidth = .5, 
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "Relative Change in Count",
       y = "",
       title = "Relative Change in Count of IHM Modalities Used for Pain",
       subtitle = "Assuming Engagement in > 0 IHM Modalities") +
  scale_x_continuous(breaks = seq(0,7,0.25))

p2 <- ggplot(m1_coef_count_educ, aes(x = Estimate, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Conf.high, xmin = Conf.low), linewidth = .5, 
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "Relative Change in Count",
       y = "") +
  scale_x_continuous(breaks = seq(0,7,0.5))

p1/p2 + plot_layout(heights = c(5,1))

6.10 Check Model Assumptions

To generate a residuals vs. fitted plot from m1_hurdle_pois, I used ggplot() and geom_point().

Code
ggplot(m1_hurdle_pois_aug, aes(x = .fitted, y = .resid)) +
    geom_point() +
    labs(title = "Residuals vs. Fitted `ihm4pain_num`",
         subtitle = "Hurdle model with Poisson counts")

In examining the plot above, I see no major issues of concern.

6.11 Validating m1_hurdle_pois

6.11.1 Fit Summary in Test Data

To validate m1_hurdle_pois in the projectb_si_test data, I performed the following to generate m1test_hurdle_pois_summary, a summary of fit statistics.

  • I created m1test_hurdle_pois_aug as an augmentation of m1_hurdle_pois applied to the projectb_si_test data. This time I used mutate() to obtain .fitted and .resid values.
  • I generated, \(R^2\), RMSE, and MAE values using the mets() function.
  • I used group_by() and summarize() to pivot m1test_hurdle_pois_summary such that there would be one column for each fit statistic.
Code
m1test_hurdle_pois_aug <- projectb_si_test |>
    mutate(".fitted" = predict(m1_hurdle_pois, type = "response"),
           ".resid" = resid(m1_hurdle_pois, type = "response"))

m1test_hurdle_pois_summary <-
  mets(m1test_hurdle_pois_aug, truth = ihm4pain_num, estimate = .fitted) |>
  mutate(Model = "m1_hurdle_pois",
         R2 = if_else(.metric == "rsq", .estimate, NA),
         RMSE = if_else(.metric == "rmse", .estimate, NA),
         MAE = if_else(.metric == "mae", .estimate, NA)) |> relocate(Model)

m1test_hurdle_pois_summary <- m1test_hurdle_pois_summary |>
  group_by(Model) |>
  summarize(R2 = mean(R2, na.rm = TRUE),
                   RMSE = mean(RMSE, na.rm = TRUE),
                   MAE = mean(MAE, na.rm = TRUE))

6.11.2 Comparing Fits in Datasets

I then bound together m1_hurdle_pois_summary and m1test_hurdle_pois_summary into m1_summaryfinal to present a table comparing the \(R^2\), RMSE, and MAE values from the training and testing data.

Code
m1_summaryfinal <- bind_rows(m1_hurdle_pois_summary, 
                             m1test_hurdle_pois_summary) |>
  mutate(Sample = c("Training", "Testing"))

m1_summaryfinal |> select(Sample, R2, RMSE, MAE) |>
  flextable() |>
  colformat_double(digits = 4) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Sample

R2

RMSE

MAE

Training

0.1182

0.8069

0.6137

Testing

0.0001

0.9067

0.6929

In examining the table above, it appears the model does not perform as well in the testing data given the lower \(R^2\) (0.0001 vs. 0.1182). m1_hurdle_pois also has higher RMSE and MAE in the testing data.

6.11.3 Table Comparing Observed and Predicted Counts

To further explore the accuracy of m1_hurdle_pois, I created a table (m1_countsum) summarizing the observed and expected counts from the testing and training datasets as well as their respective m1_hurdle_pois applications.

Code
m1_countsum <- tibble(Source = c("Training data observations", 
                                 "m1_hurdle_pois in training", 
                                 "Test data observations", 
                                 "m1_hurdle_pois in test"),
                      ZeroIHM = c(sum(projectb_si_train$ihm4pain_num == 0),
                                  round(sum(predict(m1_hurdle_pois, 
                                                    type = "prob")[,1]),0),
                                  sum(projectb_si_test$ihm4pain_num == 0),
                                  round(sum(predict(m1_hurdle_pois, 
                                                    newdata = projectb_si_test, 
                                                    type = "prob")[,1]),0)),
                      OneIHM = c(sum(projectb_si_train$ihm4pain_num == 1),
                                 round(sum(predict(m1_hurdle_pois, 
                                                   type = "prob")[,2]),0),
                                 sum(projectb_si_test$ihm4pain_num == 1),
                                 round(sum(predict(m1_hurdle_pois, 
                                                   newdata = projectb_si_test, 
                                                   type = "prob")[,2]),0)),
                      TwoIHM = c(sum(projectb_si_train$ihm4pain_num == 2),
                                 round(sum(predict(m1_hurdle_pois, 
                                                   type = "prob")[,3]),0),
                                 sum(projectb_si_test$ihm4pain_num == 2),
                                 round(sum(predict(m1_hurdle_pois, 
                                                   newdata = projectb_si_test, 
                                                   type = "prob")[,3]),0)),
                      ThreeIHM = c(sum(projectb_si_train$ihm4pain_num == 3),
                                   round(sum(predict(m1_hurdle_pois, 
                                                     type = "prob")[,4]),0),
                                   sum(projectb_si_test$ihm4pain_num == 3),
                                   round(sum(predict(m1_hurdle_pois, 
                                                     newdata = projectb_si_test, 
                                                     type = "prob")[,4]),0)),
                      FourIHM = c(sum(projectb_si_train$ihm4pain_num == 4),
                                  round(sum(predict(m1_hurdle_pois, 
                                                    type = "prob")[,5]),0),
                                  sum(projectb_si_test$ihm4pain_num == 4),
                                  round(sum(predict(m1_hurdle_pois, 
                                                    newdata = projectb_si_test, 
                                                    type = "prob")[,5]),0))) |>
  mutate(MoreThan4IHM = nrow(projectb_si_train) - (ZeroIHM + OneIHM + TwoIHM + ThreeIHM + FourIHM),
         Total = (ZeroIHM + OneIHM + TwoIHM + ThreeIHM + 
                    FourIHM + MoreThan4IHM)) |>
  rename("0" = ZeroIHM,
         "1" = OneIHM,
         "2" = TwoIHM,
         "3" = ThreeIHM,
         "4" = FourIHM,
         ">4" = MoreThan4IHM)

6.11.4 Plot Comparing Observed and Predicted Counts

To facilitate a visual comparison, I generated m1_pivot and a pair of plots comparing observed vs. predicted values in the training and test samples.

Code
m1_pivot <- m1_countsum |>
  select(-c(Total)) |>
  pivot_longer(!Source, names_to = "Count", values_to = "Value") |>
  mutate(Count = factor(Count, levels = c("0", "1", "2", "3", "4", ">4")))

m1_pivot_train <- m1_pivot |>
  filter(Source %in% c("Training data observations", 
                       "m1_hurdle_pois in training")) |>
  mutate(Source = if_else(Source == "Training data observations", 
                          "Observed", "Predicted"),
         Source = factor(Source, levels = c("Observed", "Predicted")))

m1_pivot_test <- m1_pivot |>
  filter(Source %in% c("Test data observations", 
                       "m1_hurdle_pois in test")) |>
  mutate(Source = if_else(Source == "Test data observations", 
                          "Observed", "Predicted"),
         Source = factor(Source, levels = c("Observed", "Predicted")))

p1 <- ggplot(data = m1_pivot_train, aes(x = Count, y = Value, fill = Source)) +
  geom_bar(stat = "identity", position = position_dodge(), alpha = 0.75) +
  ylim(0, 2500) +
  labs(x = "Count of IHM Modalities",
       y = "Number of Subjects",
       title = "Training Data (n = 3557)") + 
  theme(legend.position="bottom",
        legend.title = element_blank())

p2 <- ggplot(data = m1_pivot_test, aes(x = Count, y = Value, fill = Source)) +
  geom_bar(stat = "identity", position = position_dodge(), alpha = 0.75) +
  ylim(0, 2500) +
  labs(x = "Count of IHM Modalities",
       y = "Number of Subjects",
       title = "Test Data (n = 3557)") +
  guides(fill="none")

p1 + p2 +
  plot_annotation("Comparing Observed v. Predicted Values in Weighted Hurdle Poisson Model")

Source

0

1

2

3

4

>4

Total

Training data observations

2,257

811

343

116

30

0

3,557

m1_hurdle_pois in training

2,329

787

312

96

25

8

3,557

Test data observations

2,297

791

324

107

38

0

3,557

m1_hurdle_pois in test

2,354

778

303

92

24

6

3,557

In examining the plots above, while there is slightly greater deviation from the observed counts in the test data, it appears the model performs quite well in predicting each level of the count of ihm4pain_num in both the training and test datasets.

7 Model 2: Nonpharmacologic Modalities Only vs. Opioids Only

For Model 2, I sought to answer the question:

Which demographic, mental health, and pain-related variables are associated with engagement in exclusively nonpharmacologic modalities rather than opioid utilization in the past 3 months among U.S. adults with chronic pain?

7.1 Create Dataset

For this model, the outcome was defined as engagement in nonpharmacologic modalities only (ie, engaging in IHM, a chronic pain self-management program, support groups, or physical, rehabilitative, occupational, or talk therapy while never using opioids) rather than opioids only (ie, using opioids on at least some days while not using any nonpharmacologic modalities). The already single-imputed data, projectb_si, were filtered to include the 3300 subjects engaging in either nonpharmacologic only or opioids only. With this new m2_dataset_si dataset, the outcome nonpharm_over_opioid was defined as 1 if “Nonpharm only” and 0 if “Opioids only.”

Code
m2_dataset_si <- projectb_si |>
  filter(nonpharm_opioid %in% 
           c("Opioids only", "Nonpharm only")) |>
  mutate(nonpharm_opioid = factor(nonpharm_opioid, 
                                  levels = c("Nonpharm only", 
                                             "Opioids only")),
         nonpharm_over_opioid = 
           if_else(nonpharm_opioid == "Nonpharm only", 1, 0))

m2_dataset_si |> tabyl(nonpharm_opioid, nonpharm_over_opioid)
 nonpharm_opioid   0    1
   Nonpharm only   0 2790
    Opioids only 510    0

7.2 Create Survey Design

I then used the svydesign() function to generate the survey design.

Code
nhis_des2 <- svydesign(data = m2_dataset_si, id = ~ subject_id,
                        weights = ~ wt_final, nest = TRUE)

7.3 Total population represented

The following code details the steps taken to generate a combined Table 2.

Code
total_n2 <- svytotal(~ one, nhis_des2, na.rm = FALSE) |> as_tibble() |>
  mutate(Variable = "Total, n",
         total = floor(total),
         total = formatC(total, format = "d", big.mark = ","),
         p.value = "") |>
  select(Variable, All = total, p.value)
Code
nonpharm_n <- svyby(~ one, ~ nonpharm_opioid, nhis_des2, svytotal) |> as_tibble() |> filter(nonpharm_opioid == "Nonpharm only") |> select(Nonpharm = one) |>
  mutate(Nonpharm = round(Nonpharm),
         Nonpharm = formatC(Nonpharm, format = "d", big.mark = ","))
opioids_n <- svyby(~ one, ~ nonpharm_opioid, nhis_des2, svytotal) |> as_tibble() |> filter(nonpharm_opioid == "Opioids only") |> select(Opioids = one) |>
  mutate(Opioids = round(Opioids),
         Opioids = formatC(Opioids, format = "d", big.mark = ","))
total_n2 <- bind_cols(total_n2, nonpharm_n, opioids_n) |>
  mutate(All = as.character(All),
         Nonpharm = as.character(Nonpharm),
         Opioids = as.character(Opioids))
Code
total_n2 <- total_n2 |> select(Variable, All, Nonpharm, Opioids, p.value) 

total_n2 |>
  flextable() |>
  autofit() |>
  bold(bold = TRUE, part = "header") |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Variable

All

Nonpharm

Opioids

p.value

Total, n

22,707,275

19,099,216

3,608,059

7.4 Crosstabulation

Code
table2_cross <- nhis_des2 |>
  tbl_svysummary(
    include = c(age, sex, race_ethnicity, education, 
                fam_income1k, urban_rural, num_pain_locations,
                painlimit_lifework, phq8total),
    type = c(num_pain_locations = "continuous",
             fam_income = "continuous"),
    statistic = list(all_continuous()  ~ "{mean}",
                     all_categorical() ~ "{p}"),
    label = list(age ~ "Age in years, mean (95% CI)",
                 sex ~ "Sex, % (95% CI)",
                 race_ethnicity ~ "Race/Ethnicity, % (95% CI)",
                 education ~ "Education, % (95% CI)",
                 fam_income1k ~ "Family income in $1000s, mean (95% CI)",
                 urban_rural ~ "Proximity to metro area, % (95% CI)",
                 num_pain_locations ~ "Number of pain locations, mean (95% CI)",
                 painlimit_lifework ~ "Pain limiting life/work, % (95% CI)",
                 phq8total ~ "PHQ-8 total, mean (95% CI)"),
    digits = list(all_continuous() ~ 1,
                  all_categorical() ~ 1),
    by = nonpharm_opioid) |>
  add_overall() |>
  add_ci(pattern = "{stat} ({ci})", 
         style_fun = list(all_categorical() ~ style_percent_1digit, 
                          all_continuous() ~ style_number_1digit),
         statistic = list(all_categorical() ~ "{conf.low}-{conf.high}", 
                          all_continuous() ~ "{conf.low}-{conf.high}")) |>
  add_p(pvalue_fun = function(x) style_pvalue(x, digits = 3)) |>
  modify_header(label = "Variable",
                stat_0 ~ "All",
                stat_1 ~ "Nonpharm",
                stat_2 ~ "Opioids",
                p.value ~ "p.value")

table2_cross
Variable All1,2 Nonpharm1,2 Opioids1,2 p.value3
Age in years, mean (95% CI) 54.0 (53.2-54.7) 53.0 (52.2-53.8) 59.2 (57.6-60.8) <0.001
Sex, % (95% CI)


0.797
    Female 59.2 (57.1-61.3) 59.4 (57.1-61.6) 58.6 (53.0-63.9)
    Male 40.8 (38.7-42.9) 40.6 (38.4-42.9) 41.4 (36.1-47.0)
Race/Ethnicity, % (95% CI)


0.221
    NH White only 74.3 (72.4-76.1) 74.6 (72.6-76.6) 72.6 (67.2-77.3)
    Hispanic 10.1 (8.91-11.5) 10.0 (8.73-11.5) 10.7 (7.59-14.8)
    NH Black/African American only 9.8 (8.60-11.1) 9.2 (7.96-10.6) 12.8 (9.77-16.5)
    Other single and multiple races 5.8 (4.82-6.92) 6.1 (5.11-7.31) 4.0 (1.84-8.57)
Education, % (95% CI)


<0.001
    Less than high school 12.1 (10.7-13.7) 9.6 (8.25-11.2) 25.0 (20.3-30.2)
    High school graduate/GED 26.7 (24.8-28.6) 25.4 (23.5-27.5) 33.4 (28.4-38.8)
    Some college, no degree 20.4 (18.8-22.2) 20.6 (18.8-22.5) 19.4 (15.4-24.2)
    Associate's or bachelor's degree 31.2 (29.4-33.1) 33.7 (31.6-35.8) 18.3 (14.8-22.4)
    Master's, doctoral, or professional degree 9.6 (8.54-10.7) 10.6 (9.47-11.9) 3.9 (2.47-6.07)
Family income in $1000s, mean (95% CI) 68.5 (66.2-70.8) 72.8 (70.3-75.4) 45.6 (41.6-49.6) <0.001
Proximity to metro area, % (95% CI)


<0.001
    Nonmetropolitan 18.0 (16.5-19.6) 16.5 (15.0-18.2) 25.9 (21.7-30.7)
    Medium and small metro 33.0 (31.1-35.0) 32.8 (30.8-34.9) 33.9 (28.9-39.3)
    Large fringe metro 22.1 (20.4-23.9) 23.3 (21.5-25.3) 15.7 (12.4-19.7)
    Large central metro 26.8 (25.0-28.8) 27.3 (25.3-29.4) 24.5 (19.8-29.8)
Number of pain locations, mean (95% CI) 3.2 (3.2-3.3) 3.2 (3.1-3.3) 3.3 (3.2-3.5) 0.086
Pain limiting life/work, % (95% CI)


<0.001
    Never 22.1 (20.5-23.8) 24.2 (22.4-26.1) 10.9 (8.13-14.5)
    Some days 38.5 (36.4-40.6) 39.3 (37.1-41.5) 34.3 (29.1-39.9)
    Most days 17.2 (15.8-18.8) 16.9 (15.3-18.7) 18.9 (15.3-23.1)
    Every day 22.2 (20.5-24.0) 19.6 (17.9-21.5) 36.0 (30.9-41.3)
PHQ-8 total, mean (95% CI) 5.6 (5.3-5.8) 5.4 (5.1-5.6) 6.5 (5.9-7.2) <0.001
1 Mean; %
2 CI = Confidence Interval
3 Wilcoxon rank-sum test for complex survey samples; chi-squared test with Rao & Scott’s second-order correction

7.5 Assemble Table 2 Tibble

Code
table2_cross <- table2_cross |> as.data.frame() |> as_tibble()

table2_cross[is.na(table2_cross)] <- ""

table2_cross$p.value <- sub("0\\.",".", table2_cross$p.value)
table2_cross$p.value <- sub(" ","", table2_cross$p.value)

table2 <- bind_rows(total_n2, table2_cross) |>
  rename("Weighted variable" = Variable,
         "Combined groups" = All,
         "Nonpharm only" = Nonpharm,
         "Opioids only" = Opioids,
         "P value" = p.value)

7.7 Fit Weighted Logistic Model with Survey Weights

I fit the model m2_lrm_wt with the lrm() function, defining weights with weights = wt_final.

In this analysis, I chose to predict the outcome nonpharm_over_opioid based on 9 variables: age, sex, race_ethnicity, education, fam_income1k urban_rural, num_pain_locations, painlimit_lifework, and phq8total.

Code
set.seed(19900317)
dd <- datadist(m2_dataset_si, adjto.cat = "first")
options(datadist = "dd")

m2_lrm_wt <- lrm(nonpharm_over_opioid ~ 
                age + sex + race_ethnicity + education +
                               fam_income1k + urban_rural + num_pain_locations +
                               painlimit_lifework + phq8total,
                weights = wt_final, normwt = TRUE,
              data = m2_dataset_si, x = TRUE, y = TRUE)

m2_lrm_wt
Logistic Regression Model

lrm(formula = nonpharm_over_opioid ~ age + sex + race_ethnicity + 
    education + fam_income1k + urban_rural + num_pain_locations + 
    painlimit_lifework + phq8total, data = m2_dataset_si, x = TRUE, 
    y = TRUE, weights = wt_final, normwt = TRUE)


Sum of Weights by Response Category

        0         1 
 524.3515 2775.6485 

                        Model Likelihood        Discrimination    Rank Discrim.    
                              Ratio Test               Indexes          Indexes    
 Obs          3300    LR chi2     326.37        R2       0.161    C       0.737    
  0            510    d.f.            18      R2(18,3300)0.089    Dxy     0.475    
  1           2790    Pr(> chi2) <0.0001    R2(18,1323.1)0.208    gamma   0.475    
Sum of weights3300                              Brier    0.121    tau-a   0.124    
 max |deriv| 1e-07                                                                 

                                                     Coef    S.E.   Wald Z
Intercept                                             2.1393 0.3310  6.46 
age                                                  -0.0212 0.0033 -6.50 
sex=Male                                             -0.1856 0.1038 -1.79 
race_ethnicity=Hispanic                               0.2904 0.1785  1.63 
race_ethnicity=NH Black/African American only        -0.1415 0.1636 -0.87 
race_ethnicity=Other single and multiple races        0.4662 0.2511  1.86 
education=High school graduate/GED                    0.4940 0.1454  3.40 
education=Some college, no degree                     0.7391 0.1627  4.54 
education=Associate's or bachelor's degree            1.1555 0.1644  7.03 
education=Master's, doctoral, or professional degree  1.3719 0.2696  5.09 
fam_income1k                                          0.0074 0.0013  5.54 
urban_rural=Medium and small metro                    0.2832 0.1362  2.08 
urban_rural=Large fringe metro                        0.6227 0.1623  3.84 
urban_rural=Large central metro                       0.3709 0.1525  2.43 
num_pain_locations                                    0.0004 0.0411  0.01 
painlimit_lifework=Some days                         -0.6143 0.1669 -3.68 
painlimit_lifework=Most days                         -0.7064 0.1879 -3.76 
painlimit_lifework=Every day                         -1.0145 0.1773 -5.72 
phq8total                                            -0.0070 0.0097 -0.72 
                                                     Pr(>|Z|)
Intercept                                            <0.0001 
age                                                  <0.0001 
sex=Male                                             0.0739  
race_ethnicity=Hispanic                              0.1038  
race_ethnicity=NH Black/African American only        0.3869  
race_ethnicity=Other single and multiple races       0.0634  
education=High school graduate/GED                   0.0007  
education=Some college, no degree                    <0.0001 
education=Associate's or bachelor's degree           <0.0001 
education=Master's, doctoral, or professional degree <0.0001 
fam_income1k                                         <0.0001 
urban_rural=Medium and small metro                   0.0376  
urban_rural=Large fringe metro                       0.0001  
urban_rural=Large central metro                      0.0150  
num_pain_locations                                   0.9913  
painlimit_lifework=Some days                         0.0002  
painlimit_lifework=Most days                         0.0002  
painlimit_lifework=Every day                         <0.0001 
phq8total                                            0.4712  

7.8 Numeric Summary of Coefficients (m2_coef_wt)

The following code generates a numeric summary of coefficients from the model and adds meaningful labels for comparison.

Code
m2_coef_wt <- summary(m2_lrm_wt, conf.int = 0.95) |> as_tibble() |>
  filter(Type == 2) |>
  mutate(Group = c("Age", "Family income",
                   "Number of pain locations",
                   "PHQ-8", 
                   "Sex", 
                   "Race/Ethnicity", "Race/Ethnicity", "Race/Ethnicity",
                   "Education", "Education", 
                   "Education", "Education", 
                   "Proximity to metro area", "Proximity to metro area",
                   "Proximity to metro area", 
                   "Pain limiting life or work", 
                   "Pain limiting life or work",
                   "Pain limiting life or work"),
         Comparison = c("Age 69 v. 46", 
                        "Family income $85k v. $22k",
                        "4 pain locations v. 2",
                        "PHQ-8 score 8 v. 1",
                        "Male v. Female", 
                        "Hispanic v. NH White",
                        "NH Black v. NH White",
                        "Other v. NH White",
                        "HS/GED v. < HS", 
                        "Some College v. < HS", 
                        "Assoc/Bach v. < HS",
                        "Masters/PhD v. < HS", 
                        "M/S metro v. Non-metro", 
                        "L fringe metro v. Non-metro",
                        "L central metro v. Non-metro", 
                        "Pain limit some days v. Never", 
                        "Pain limit most days v. Never",
                        "Pain limit every day v. Never"),
         Effect = as.numeric(Effect)) |>
  rename(Lower0.95 = "Lower 0.95", Upper0.95 = "Upper 0.95") |>
  mutate(Lower0.95 = as.numeric(Lower0.95), Upper0.95 = as.numeric(Upper0.95))

m2_coef_wt <- m2_coef_wt |>
  mutate(Comparison = factor(Comparison, levels = c("Age 69 v. 46",
                                                    "Male v. Female",
                                                    "Hispanic v. NH White",
                                                    "NH Black v. NH White",
                                                    "Other v. NH White",
                                                    "HS/GED v. < HS", 
                                                    "Some College v. < HS", 
                                                    "Assoc/Bach v. < HS",
                                                    "Masters/PhD v. < HS", 
                                                    "Family income $85k v. $22k",
                                                    "M/S metro v. Non-metro", 
                                                    "L fringe metro v. Non-metro",
                                                    "L central metro v. Non-metro", 
                                                    "4 pain locations v. 2",
                                                    "Pain limit some days v. Never", 
                                                    "Pain limit most days v. Never",
                                                    "Pain limit every day v. Never",
                                                    "PHQ-8 score 8 v. 1"))) |>
  arrange(Comparison)

The following code generates Supplemental Table 3, providing a numeric summmary of the coefficients from m2_lrm_wt.

Code
m2_coef_wt |> 
  select(Group, Comparison, Effect, Lower0.95, Upper0.95) |> 
  mutate(Effect = round(Effect, digits = 3),
         Lower0.95 = round(Lower0.95, digits = 3),
         Upper0.95 = round(Upper0.95, digits = 3)) |>
  rename(aOR = Effect,
         "95% CI Low" = Lower0.95,
         "95% CI High" = Upper0.95) |>
  flextable() |> 
  bold(bold = TRUE, part = "header") |>
  add_header_lines(values = "Supplemental Table 3: aOR of Engagement in Nonpharmacologic Only vs. Opioids Only") |>
  autofit() |>
  add_footer_lines(values = "Abbreviations: Assoc, associate’s; aOR, adjusted odds ratio; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; NH, Non-Hispanic; S, small; v., versus; yrs, years", top = FALSE) |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Supplemental Table 3: aOR of Engagement in Nonpharmacologic Only vs. Opioids Only

Group

Comparison

aOR

95% CI Low

95% CI High

Age

Age 69 v. 46

0.614

0.530

0.711

Sex

Male v. Female

0.831

0.678

1.018

Race/Ethnicity

Hispanic v. NH White

1.337

0.942

1.897

Race/Ethnicity

NH Black v. NH White

0.868

0.630

1.196

Race/Ethnicity

Other v. NH White

1.594

0.974

2.607

Education

HS/GED v. < HS

1.639

1.232

2.179

Education

Some College v. < HS

2.094

1.522

2.881

Education

Assoc/Bach v. < HS

3.176

2.301

4.383

Education

Masters/PhD v. < HS

3.943

2.325

6.687

Family income

Family income $85k v. $22k

1.596

1.352

1.883

Proximity to metro area

M/S metro v. Non-metro

1.327

1.016

1.734

Proximity to metro area

L fringe metro v. Non-metro

1.864

1.356

2.562

Proximity to metro area

L central metro v. Non-metro

1.449

1.075

1.954

Number of pain locations

4 pain locations v. 2

1.001

0.852

1.176

Pain limiting life or work

Pain limit some days v. Never

0.541

0.390

0.750

Pain limiting life or work

Pain limit most days v. Never

0.493

0.341

0.713

Pain limiting life or work

Pain limit every day v. Never

0.363

0.256

0.513

PHQ-8

PHQ-8 score 8 v. 1

0.952

0.834

1.088

Abbreviations: Assoc, associate’s; aOR, adjusted odds ratio; Bach, bachelor’s; CI, confidence interval; fam, family; PHQ-8, Patient Health Questionnaire-8; GED, general equivalency diploma; HS, high school; L, large; M, medium; metro, metropolitan; NH, Non-Hispanic; S, small; v., versus; yrs, years

7.9 Effects Plot (m2_coef_wt)

To visualize the effects of the covariates on odds of nonpharm_over_opioid = 1, I used ggplot() to plot the effects. Education covariates are placed on a separate axis given their size to not obscure the other covariates.

Code
m2_coef_plot_wt <- m2_coef_wt |> 
  filter(Comparison != "(Intercept)", Group != "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

m2_coef_plot_educ_wt <- m2_coef_wt |> 
  filter(Comparison != "(Intercept)", Group == "Education") |>
  mutate(Comparison = fct_inorder(Comparison, ordered = NA),
         Comparison = fct_rev(Comparison))

p1 <- ggplot(m2_coef_plot_wt, aes(x = Effect, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Upper0.95, xmin = Lower0.95), linewidth = .5,
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "aOR",
       y = "",
       title = "aOR of Using Nonpharmacologic Only vs. Opioids Only") +
  scale_x_continuous(breaks = seq(0,8,0.5))

p2 <- ggplot(m2_coef_plot_educ_wt, aes(x = Effect, y = Comparison)) +
  geom_vline(aes(xintercept = 1), linewidth = .25, linetype = 'dashed') +
  geom_errorbarh(aes(xmax = Upper0.95, xmin = Lower0.95), linewidth = .5,
                 height = .2, color = 'gray50') +
  geom_point(size = 2.5, color = uhred) +
  labs(x = "aOR",
       y = "") +
  scale_x_continuous(breaks = seq(0,8,0.5))

p1/p2 + plot_layout(heights = c(5,1))

7.10 Summarizing Fit (m2_coef_wt)

To summarize the fit of m2_lrm_wt, I simply entered m2_lrm_wt within the code. From the output, the Nagelkerke \(R^2\) is 0.161 and the unvalidated area under the ROC curve (aka c-statistic) is 0.737.

Code
m2_lrm_wt
Logistic Regression Model

lrm(formula = nonpharm_over_opioid ~ age + sex + race_ethnicity + 
    education + fam_income1k + urban_rural + num_pain_locations + 
    painlimit_lifework + phq8total, data = m2_dataset_si, x = TRUE, 
    y = TRUE, weights = wt_final, normwt = TRUE)


Sum of Weights by Response Category

        0         1 
 524.3515 2775.6485 

                        Model Likelihood        Discrimination    Rank Discrim.    
                              Ratio Test               Indexes          Indexes    
 Obs          3300    LR chi2     326.37        R2       0.161    C       0.737    
  0            510    d.f.            18      R2(18,3300)0.089    Dxy     0.475    
  1           2790    Pr(> chi2) <0.0001    R2(18,1323.1)0.208    gamma   0.475    
Sum of weights3300                              Brier    0.121    tau-a   0.124    
 max |deriv| 1e-07                                                                 

                                                     Coef    S.E.   Wald Z
Intercept                                             2.1393 0.3310  6.46 
age                                                  -0.0212 0.0033 -6.50 
sex=Male                                             -0.1856 0.1038 -1.79 
race_ethnicity=Hispanic                               0.2904 0.1785  1.63 
race_ethnicity=NH Black/African American only        -0.1415 0.1636 -0.87 
race_ethnicity=Other single and multiple races        0.4662 0.2511  1.86 
education=High school graduate/GED                    0.4940 0.1454  3.40 
education=Some college, no degree                     0.7391 0.1627  4.54 
education=Associate's or bachelor's degree            1.1555 0.1644  7.03 
education=Master's, doctoral, or professional degree  1.3719 0.2696  5.09 
fam_income1k                                          0.0074 0.0013  5.54 
urban_rural=Medium and small metro                    0.2832 0.1362  2.08 
urban_rural=Large fringe metro                        0.6227 0.1623  3.84 
urban_rural=Large central metro                       0.3709 0.1525  2.43 
num_pain_locations                                    0.0004 0.0411  0.01 
painlimit_lifework=Some days                         -0.6143 0.1669 -3.68 
painlimit_lifework=Most days                         -0.7064 0.1879 -3.76 
painlimit_lifework=Every day                         -1.0145 0.1773 -5.72 
phq8total                                            -0.0070 0.0097 -0.72 
                                                     Pr(>|Z|)
Intercept                                            <0.0001 
age                                                  <0.0001 
sex=Male                                             0.0739  
race_ethnicity=Hispanic                              0.1038  
race_ethnicity=NH Black/African American only        0.3869  
race_ethnicity=Other single and multiple races       0.0634  
education=High school graduate/GED                   0.0007  
education=Some college, no degree                    <0.0001 
education=Associate's or bachelor's degree           <0.0001 
education=Master's, doctoral, or professional degree <0.0001 
fam_income1k                                         <0.0001 
urban_rural=Medium and small metro                   0.0376  
urban_rural=Large fringe metro                       0.0001  
urban_rural=Large central metro                      0.0150  
num_pain_locations                                   0.9913  
painlimit_lifework=Some days                         0.0002  
painlimit_lifework=Most days                         0.0002  
painlimit_lifework=Every day                         <0.0001 
phq8total                                            0.4712  

7.11 Validating m2_coef_wt

To validate m2_lrm_wt, I first set a seed of 19900317 using the set.seed() function. I then used the validate() function with 40 bootstrap repetitions to generate validated (index.corrected) \(R^2\) and Dxy values. I stored this validated summary as val_m2 so that I could obtain specific values for a summary table below.

Code
set.seed(19900317)
val_m2 <- validate(m2_lrm_wt, method = "boot", B = 40)
Code
fit_report <- tibble(Model = "m2") |>
  mutate("Validated R2" = case_when(
    Model == "m2" ~ val_m2[2,5]),
    "Validated C stat" = case_when(
      Model == "m2" ~  0.5 + (val_m2[1,5]/2)),
    AIC = AIC(m2_lrm_wt),
    BIC = BIC(m2_lrm_wt))
    
fit_report |> 
  select(Model, "Validated R2", "Validated C stat", AIC, BIC) |>
  flextable() |>
  colformat_double(j = 2:3, digits = 3) |>
  colformat_double(j = 4:5, digits = 0) |>
  autofit() |>
  theme_zebra(odd_header = "white") |>
  hline_top(part="body")

Model

Validated R2

Validated C stat

AIC

BIC

m2

0.148

0.728

2,601

2,717

7.12 References

Ford, C. (2016). Getting Started with Hurdle Models. University of Virginia Library. https://data.library.virginia.edu/getting-started-with-hurdle-models/

Kroenke, K., Spitzer, R. L., & Williams, J. B. (2001). The PHQ-9: validity of a brief depression severity measure. Journal of General Internal Medicine, 16(9), 606-613. https://doi.org/10.1046/j.1525-1497.2001.016009606.x

Löwe, B., Decker, O., Müller, S., Brähler, E., Schellberg, D., Herzog, W., & Herzberg, P. Y. (2008). Validation and standardization of the Generalized Anxiety Disorder Screener (GAD-7) in the general population. Medical Care, 46(3), 266-274. https://doi.org/10.1097/MLR.0b013e318160d093

National Center for Health Statistics. (2021, 2021-04-05). 2019 National Health Interview Survey. National Center for Health Statistics. https://www.cdc.gov/nchs/nhis/2019nhis.htm

Yoshida, K. (2014). Models for excess zeros using pscl package (Hurdle and zero-inflated regression models) and their interpretations. Retrieved May 1 from https://rpubs.com/kaz_yos/pscl-2

Zelaya, C. E., Dahlhamer, J. M., Lucas, J. W., & Connor, E. M. (2020). Chronic Pain and High-impact Chronic Pain Among U.S. Adults, 2019.

7.13 Acknowledgments

  • Robert J. Trager, DC (methodology, interpretation, literature review, first draft of introduction)
  • Jeffery A. Dusek, PhD (methodology, interpretation)
  • The National Center for Health Statistics (data collection)
  • Thomas E. Love, PhD (methodology instruction)

8 Session Information

Code
xfun::session_info()
R version 4.3.2 (2023-10-31)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6.3

Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8

Package version:
  abind_1.4-5              askpass_1.2.0            backports_1.4.1         
  base64enc_0.1-3          bigD_0.2.0               bit_4.0.5               
  bit64_4.0.5              bitops_1.0.7             blob_1.2.4              
  boot_1.3.28.1            brio_1.1.4               broom_1.0.5             
  broom.helpers_1.14.0     bslib_0.6.1              cachem_1.0.8            
  callr_3.7.3              car_3.1-2                carData_3.0-5           
  caret_6.0-94             caTools_1.18.2           cellranger_1.1.0        
  checkmate_2.3.1          class_7.3-22             cli_3.6.2               
  clipr_0.8.0              clock_0.7.0              cluster_2.1.6           
  cmprsk_2.2-11            codetools_0.2-19         colorspace_2.1-0        
  commonmark_1.9.0         compiler_4.3.2           conflicted_1.2.0        
  countreg_0.2-1           cpp11_0.4.7              crayon_1.5.2            
  crul_1.4.0               curl_5.2.0               data.table_1.14.10      
  datasets_4.3.2           DBI_1.2.0                dbplyr_2.4.0            
  DEoptimR_1.1.3           desc_1.4.3               diagram_1.6.5           
  dials_1.2.0              DiceDesign_1.10          diffobj_0.3.5           
  digest_0.6.34            doRNG_1.8.6              dplyr_1.1.4             
  dtplyr_1.3.1             e1071_1.7.14             ellipsis_0.3.2          
  Epi_2.47.1               etm_1.1.1                evaluate_0.23           
  fansi_1.0.6              farver_2.1.1             fastmap_1.1.1           
  flextable_0.9.4          fontawesome_0.5.2        fontBitstreamVera_0.1.1 
  fontLiberation_0.1.0     fontquiver_0.2.1         forcats_1.0.0           
  foreach_1.5.2            foreign_0.8-86           Formula_1.2-5           
  fs_1.6.3                 furrr_0.3.1              future_1.33.1           
  future.apply_1.11.1      gargle_1.5.2             gdtools_0.3.5           
  generics_0.1.3           gfonts_0.2.0             GGally_2.2.0            
  ggformula_0.12.0         ggplot2_3.4.4            ggrepel_0.9.5           
  ggridges_0.5.5           ggstats_0.5.1            glmnet_4.1.8            
  globals_0.16.2           glue_1.7.0               googledrive_2.1.1       
  googlesheets4_1.1.1      gower_1.0.1              GPfit_1.0-8             
  gplots_3.1.3             graphics_4.3.2           grDevices_4.3.2         
  grid_4.3.2               gridExtra_2.3            gt_0.10.0               
  gtable_0.3.4             gtools_3.9.5             gtsummary_1.7.2         
  hardhat_1.3.0            haven_2.5.4              highr_0.10              
  Hmisc_5.1-1              hms_1.1.3                htmlTable_2.4.2         
  htmltools_0.5.7          htmlwidgets_1.6.4        httpcode_0.3.0          
  httpuv_1.6.13            httr_1.4.7               ids_1.0.1               
  infer_1.0.5              ipred_0.9-14             isoband_0.2.7           
  iterators_1.0.14         itertools_0.1.3          janitor_2.2.0           
  jquerylib_0.1.4          jsonlite_1.8.8           juicyjuice_0.1.0        
  kableExtra_1.3.4         KernSmooth_2.23.22       knitr_1.45              
  labeling_0.4.3           labelled_2.12.0          laeken_0.5.2            
  later_1.3.2              lattice_0.22-5           lava_1.7.3              
  lhs_1.1.6                lifecycle_1.0.4          listenv_0.9.0           
  lme4_1.1.35.1            lmtest_0.9.40            lubridate_1.9.3         
  magrittr_2.0.3           markdown_1.12            MASS_7.3-60             
  Matrix_1.6-5             MatrixModels_0.5-3       memoise_2.0.1           
  methods_4.3.2            mgcv_1.9-1               mime_0.12               
  minqa_1.2.6              missForest_1.5           mitools_2.4             
  modeldata_1.2.0          modelenv_0.1.1           ModelMetrics_1.2.2.2    
  modelr_0.1.11            mosaic_1.9.0             mosaicCore_0.9.4.0      
  mosaicData_0.20.4        multcomp_1.4-25          munsell_0.5.0           
  mvtnorm_1.2-4            naniar_1.0.0             nlme_3.1-164            
  nloptr_2.0.3             nnet_7.3-19              norm_1.0.11.1           
  numDeriv_2016.8-1.1      officer_0.6.3            openssl_2.1.1           
  parallel_4.3.2           parallelly_1.36.0        parsnip_1.1.1           
  patchwork_1.2.0          pbkrtest_0.5.2           pillar_1.9.0            
  pkgbuild_1.4.3           pkgconfig_2.0.3          pkgload_1.3.3           
  plyr_1.8.9               polspline_1.1.24         praise_1.0.0            
  prettyunits_1.2.0        pROC_1.18.5              processx_3.8.3          
  prodlim_2023.08.28       progress_1.2.3           progressr_0.14.0        
  promises_1.2.1           proxy_0.4.27             ps_1.7.5                
  pscl_1.5.5.1             purrr_1.0.2              quantreg_5.97           
  R6_2.5.1                 ragg_1.2.7               randomForest_4.7.1.1    
  ranger_0.16.0            rappdirs_0.3.3           RColorBrewer_1.1-3      
  Rcpp_1.0.12              RcppArmadillo_0.12.6.6.1 RcppEigen_0.3.3.9.4     
  reactable_0.4.4          reactR_0.5.0             readr_2.1.5             
  readxl_1.4.3             recipes_1.0.9            rematch_2.0.0           
  rematch2_2.1.2           reprex_2.1.0             reshape2_1.4.4          
  rlang_1.1.3              rmarkdown_2.25           rms_6.7-1               
  rngtools_1.5.2           robustbase_0.99.1        ROCR_1.0-11             
  rpart_4.1.23             rprojroot_2.0.4          rsample_1.2.0           
  rstudioapi_0.15.0        rvest_1.0.3              sandwich_3.1-0          
  sass_0.4.8               scales_1.3.0             selectr_0.4.2           
  shape_1.4.6              shiny_1.8.0              simputation_0.2.8       
  slider_0.3.1             snakecase_0.11.1         sourcetools_0.1.7.1     
  sp_2.1.2                 SparseM_1.81             splines_4.3.2           
  SQUAREM_2021.1           stats_4.3.2              stats4_4.3.2            
  stringi_1.8.3            stringr_1.5.1            survey_4.2-1            
  survival_3.5-7           svglite_2.1.3            sys_3.4.2               
  systemfonts_1.0.5        testthat_3.2.1           textshaping_0.3.7       
  TH.data_1.1-2            tibble_3.2.1             tidymodels_1.1.1        
  tidyr_1.3.0              tidyselect_1.2.0         tidyverse_2.0.0         
  timechange_0.2.0         timeDate_4032.109        tinytex_0.49            
  tools_4.3.2              triebeard_0.4.1          tune_1.1.2              
  tzdb_0.4.0               UpSetR_1.4.0             urltools_1.7.3          
  utf8_1.2.4               utils_4.3.2              uuid_1.1-1              
  V8_4.4.1                 vcd_1.4.12               vctrs_0.6.5             
  VGAM_1.1-9               VIM_6.2.2                viridis_0.6.4           
  viridisLite_0.4.2        visdat_0.6.0             vroom_1.6.5             
  waldo_0.5.2              warp_0.2.1               webshot_0.5.5           
  withr_2.5.2              workflows_1.1.3          workflowsets_1.0.1      
  xfun_0.41                xml2_1.3.6               xtable_1.8-4            
  yaml_2.3.8               yardstick_1.2.0          zip_2.3.0               
  zoo_1.8-12