Title: | Core Functionality for Simulating Quantities of Interest from Generalised Linear Models |
---|---|
Description: | Core functions for simulating quantities of interest from generalised linear models (GLM). This package will form the backbone of a series of other packages that improve the interpretation of GLM estimates. |
Authors: | Christopher Gandrud [aut, cre] |
Maintainer: | Christopher Gandrud <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.2.4 |
Built: | 2024-11-10 05:15:45 UTC |
Source: | https://github.com/christophergandrud/coresim |
A data set containing 400 graduate school admissions decisions.
Admission
A data set with 400 rows and 4 variables.
UCLA IDRE http://stats.idre.ucla.edu/r/dae/logit-regression/
Simulate coefficients from a GLM by making draws from the multivariate normal distribution
b_sim(obj, mu, Sigma, nsim = 1000)
obj |
a fitted model object. |
mu |
an optional vector giving the means of the variables. If |
Sigma |
an optional positive-definite symmetric matrix specifying the
covariance matrix of the variables. If |
nsim |
number of simulations to draw. |
A data frame of simulated coefficients from obj.
library(car)
library(coresim)

# Estimate model
m1 <- lm(prestige ~ education + type, data = Prestige)

# Simulate coefficients
prestige_sims <- b_sim(m1)

# Manually supply coefficient means and covariance matrix
coefs <- coef(m1)
vcov_matrix <- vcov(m1)

prestige_sims_manual <- b_sim(mu = coefs, Sigma = vcov_matrix)
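As a rough illustration of what drawing coefficients from a multivariate normal distribution involves, the sketch below uses only base R and the built-in mtcars data. It is not the package's implementation: the Cholesky-based sampler and the names m, draws, and sims are assumptions made for this example.

```r
set.seed(100)

# Fit a linear model on a built-in data set
m <- lm(mpg ~ wt + hp, data = mtcars)

mu <- coef(m)     # means of the sampling distribution
Sigma <- vcov(m)  # covariance matrix of the coefficient estimates
nsim <- 1000
p <- length(mu)

# Standard normal draws, rotated by the Cholesky factor of Sigma
# and shifted by mu, are draws from N(mu, Sigma)
z <- matrix(rnorm(nsim * p), nrow = nsim)
draws <- sweep(z %*% chol(Sigma), 2, mu, "+")
colnames(draws) <- names(mu)

sims <- as.data.frame(draws)
dim(sims)       # nsim rows, one column per coefficient
colMeans(sims)  # should be close to coef(m)
```

Each row of sims is one plausible coefficient vector, so uncertainty in the estimates carries through to anything computed from those rows.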
Find the systematic component in the linear form for fitted values across each simulation (note: largely for internal use by qi_builder)
linear_systematic(b_sims, newdata, inc_intercept = TRUE)
b_sims |
a data frame created by |
newdata |
a data frame of fitted values with column names corresponding
to variable names in |
inc_intercept |
logical whether to include the intercept in the linear systematic component. |
A data frame of fitted values supplied in newdata and associated linear systematic component estimates for all simulated coefficient estimates. The linear systematic components are included in a column named ls_.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. "Making the Most of Statistical Analyses: Improving Interpretation and Presentation." American Journal of Political Science 44(2): 341-55.
library(car)
library(coresim)

# Estimate model
m1 <- lm(prestige ~ education + type, data = Prestige)

# Create fitted values
fitted_df <- expand.grid(education = 6:16, typewc = 1)

# Simulate coefficients
m1_sims <- b_sim(m1, nsim = 1000)

# Find linear systematic component for fitted values
ls <- linear_systematic(b_sims = m1_sims, newdata = fitted_df)
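The linear systematic component is the product of each fitted scenario with each simulated coefficient vector. The sketch below works this through in base R on made-up numbers. The objects b_draws, scenarios, and ls_long are hypothetical, and the long output layout is an assumption that only mirrors the ls_ column described above.

```r
# Hypothetical simulated coefficients: 3 draws of an intercept and one slope
b_draws <- data.frame(intercept = c(1, 2, 3), x = c(0.5, 1, 1.5))

# Two fitted scenarios for x
scenarios <- data.frame(x = c(0, 2))

# Design matrix with a leading column of 1s for the intercept
X <- cbind(intercept = 1, as.matrix(scenarios))

# Rows are scenarios, columns are coefficient draws
ls_matrix <- X %*% t(as.matrix(b_draws))

# Long format: one row per scenario and draw
ls_long <- data.frame(
  x = rep(scenarios$x, times = nrow(b_draws)),
  ls_ = as.vector(ls_matrix)
)
ls_long
```

Dropping the column of 1s from X corresponds to excluding the intercept, which is what inc_intercept = FALSE suggests.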
Find quantities of interest from generalized linear models
qi_builder(obj, newdata, FUN, ci = 0.95, nsim = 1000, slim = FALSE, large_computation = FALSE, original_order = FALSE, b_sims, mu, Sigma, verbose = TRUE, ...)
obj |
a fitted model object on which to base coefficient simulations. |
newdata |
an optional data frame of fitted values with column names
corresponding to coefficient names in |
FUN |
a function for calculating how to find the quantity of interest
from a vector of the fitted linear systematic component. It must return
a numeric vector. If |
ci |
the proportion of the central interval of the simulations to
return. Must be in (0, 1] or equivalently (0, 100]. Note: if |
nsim |
number of simulations to draw. |
slim |
logical indicating whether to (if |
large_computation |
logical. If |
original_order |
logical whether or not to keep the original scenario
order when |
b_sims |
an optional data frame created by |
mu |
an optional vector giving the means of the variables. If |
Sigma |
an optional positive-definite symmetric matrix specifying the
covariance matrix of the variables. If |
verbose |
logical. Whether to include full set of messages or not. |
... |
arguments to be passed to |
If slim = FALSE, a data frame of fitted values supplied in newdata and associated simulated quantities of interest for all simulations in the central interval specified by ci. The quantities of interest are in a column named qi_.
If slim = TRUE, a data frame of fitted values supplied in newdata and the minimum, median, and maximum values of the central interval specified by ci for each scenario, returned in three columns named qi_min, qi_median, and qi_max, respectively.
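To make the idea of a central interval concrete, the following base R sketch keeps only the simulated values that fall between the lower and upper quantiles implied by ci within each scenario. It is an illustration under assumed inputs, not the package's code: the data frame sims and its values are invented, and using quantile() for the cut-offs is an assumption.

```r
set.seed(100)

# Hypothetical simulated quantities of interest for two scenarios
sims <- data.frame(
  scenario_ = rep(1:2, each = 1000),
  qi_ = c(rnorm(1000, mean = 0), rnorm(1000, mean = 5))
)

ci <- 0.95
lower_p <- (1 - ci) / 2
upper_p <- 1 - lower_p

# Per-scenario cut-offs, repeated so they line up with each row of sims
lower <- ave(sims$qi_, sims$scenario_, FUN = function(q) quantile(q, lower_p))
upper <- ave(sims$qi_, sims$scenario_, FUN = function(q) quantile(q, upper_p))

# Keep only the draws inside the central interval
central <- sims[sims$qi_ >= lower & sims$qi_ <= upper, ]
table(central$scenario_)  # roughly ci * 1000 draws kept per scenario
```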
library(car)
library(coresim)

## Normal linear model
m1 <- lm(prestige ~ education + type, data = Prestige)

# Using observed data as scenarios
linear_qi_obs <- qi_builder(m1)

# Create fitted values
fitted_df_1 <- expand.grid(education = 6:16, typewc = 1)

linear_qi <- qi_builder(m1, newdata = fitted_df_1)

# Manually supply coefficient means and covariance matrix
coefs <- coef(m1)
vcov_matrix <- vcov(m1)

linear_qi_custom_mu_Sigma <- qi_builder(mu = coefs, Sigma = vcov_matrix,
                                        newdata = fitted_df_1)

## Logistic regression
# Load data
data(Admission)
Admission$rank <- as.factor(Admission$rank)

# Estimate model
m2 <- glm(admit ~ gre + gpa + rank, data = Admission, family = 'binomial')

# Specify fitted values
m2_fitted <- expand.grid(gre = seq(220, 800, by = 10), gpa = c(2, 4),
                         rank = '4')

# Function to find predicted probabilities from logistic regression models
pr_function <- function(x) 1 / (1 + exp(-x))

# Find quantity of interest
logistic_qi_1 <- qi_builder(m2, m2_fitted, FUN = pr_function)

logistic_qi_2 <- qi_builder(m2, m2_fitted, FUN = pr_function, slim = TRUE)
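The FUN argument maps the linear systematic component to the scale of the quantity of interest. As a small check of what pr_function in the logistic example does, the sketch below applies it to a few made-up log-odds values (ls_values is hypothetical) and compares the result with base R's plogis().

```r
# Inverse logit, as in the logistic regression example
pr_function <- function(x) 1 / (1 + exp(-x))

# Hypothetical linear systematic component values on the log-odds scale
ls_values <- c(-2, 0, 2)

# Convert log-odds to predicted probabilities
probs <- pr_function(ls_values)
probs

# Base R's plogis() computes the same transformation
all.equal(probs, plogis(ls_values))
```

Any function that takes a numeric vector and returns a numeric vector of the same length follows the same pattern, for example exp for a model with a log link.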
Find maximum, minimum, and median values for each scenario found using qi_builder
qi_slimmer(df, scenario_var = "scenario_", qi_var = "qi_")
df |
a data frame of simulated quantities of interest created by
|
scenario_var |
character string of the variable name marking the scenarios. |
qi_var |
character string of the name of the variable with the simulated quantity of interest values. |
This function slims down a simulation data set to its key features (the minimum, median, and maximum value for each fitted scenario) so that it takes up less memory and can be plotted easily.
The function is incorporated into qi_builder and can be run using slim = TRUE.
A data frame with the fitted values and the minimum (qi_min), median (qi_median), and maximum (qi_max) values from the central interval specified with ci in qi_builder.
library(car)
library(coresim)

# Normal linear model
m1 <- lm(prestige ~ education + type, data = Prestige)

# Simulate coefficients
m1_sims <- b_sim(m1)

# Create fitted values
fitted_df <- expand.grid(education = 6:16, typewc = 1)

# Find predicted outcomes (95% central interval, by default)
linear_qi <- qi_builder(b_sims = m1_sims, newdata = fitted_df, slim = FALSE)

# Slim data set
linear_slim <- qi_slimmer(linear_qi)
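The slimming step amounts to a grouped summary. The base R sketch below shows that summary on invented data. The column names follow the defaults documented above, while the data frame sims and the use of tapply() are assumptions for illustration rather than the package's own code.

```r
# Hypothetical simulation output: 2 scenarios with 5 draws each
sims <- data.frame(
  scenario_ = rep(1:2, each = 5),
  qi_ = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
)

# Summarise each scenario by the min, median, and max of its draws
slim <- data.frame(
  scenario_ = sort(unique(sims$scenario_)),
  qi_min = as.vector(tapply(sims$qi_, sims$scenario_, min)),
  qi_median = as.vector(tapply(sims$qi_, sims$scenario_, median)),
  qi_max = as.vector(tapply(sims$qi_, sims$scenario_, max))
)
slim
```

The result has one row per scenario instead of one row per draw, which is why the slimmed data set is smaller and simpler to plot.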