Title: | Dedicated 'ggplot2' Methods for 'fixest' Objects |
---|---|
Description: | Provides 'ggplot2' equivalents of fixest::coefplot() and fixest::iplot(), for producing nice coefficient plots and interaction plots. Enables some additional functionality and convenience features, including grouped multi-'fixest' object faceting and programmatic updates to existing plots (e.g., themes and aesthetics). |
Authors: | Grant McDermott [aut, cre] , Laurent Berge [ctb] |
Maintainer: | Grant McDermott <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0.9001 |
Built: | 2024-09-09 03:25:25 UTC |
Source: | https://github.com/grantmcdermott/ggfixest |
Aggregates post- (and/or pre-) treatment effects of an
"event-study" estimation, also known as a dynamic difference-in-differences
(DDiD) model. The event-study should have been estimated using the fixest
package, which provides a specialised i() operator for this class
of models. By default, the function will return the average post-treatment
effect (i.e. across multiple periods). However, it can also return the
cumulative post-treatment effect and can be used to aggregate pre-treatment
effects too. At its heart, aggr_es() is a convenience wrapper around
marginaleffects::hypotheses(), which is used to perform the underlying
joint hypothesis test.
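To make the difference between the "mean" and "cumulative" aggregations concrete, here is a minimal base R sketch that treats each one as a linear combination of period coefficients. The coefficient values, the covariance matrix, the normal-approximation p-value and the helper name `aggregate_effect()` are all made up for illustration. aggr_es() itself hands the test to marginaleffects::hypotheses(), so this shows only the arithmetic one would expect, not the package's implementation.

```r
# Hypothetical post-treatment coefficients and covariance matrix
# (made-up numbers, for illustration only)
beta <- c(p6 = 1.0, p7 = 2.0, p8 = 3.0)
V <- diag(c(0.04, 0.09, 0.16))

# Hypothetical helper: aggregate coefficients as a weighted sum w'beta,
# with standard error sqrt(w' V w)
aggregate_effect <- function(beta, V, type = c("mean", "cumulative"), rhs = 0) {
  type <- match.arg(type)
  k <- length(beta)
  # Mean: equal weights summing to one. Cumulative: unit weights.
  w <- if (type == "mean") rep(1 / k, k) else rep(1, k)
  est <- sum(w * beta)
  se <- sqrt(drop(t(w) %*% V %*% w))
  z <- (est - rhs) / se  # test statistic against the null value `rhs`
  data.frame(
    type = type,
    estimate = est,
    std.error = se,
    statistic = z,
    p.value = 2 * pnorm(-abs(z))  # assumes a normal approximation
  )
}

mean_eff <- aggregate_effect(beta, V, "mean")
cum_eff <- aggregate_effect(beta, V, "cumulative")
rbind(mean_eff, cum_eff)
```

The cumulative estimate is the mean estimate multiplied by the number of periods, and the same scaling applies to the standard error, so the two test statistics coincide when the null value is 0.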
aggr_es( object, rhs = 0, period = "post", aggregation = c("mean", "cumulative"), abbr_term = TRUE, ... )
object |
A model object of class |
rhs |
Numeric. The null hypothesis value. Defaults to 0. |
period |
Keyword string or numeric sequence. Which group of periods are we aggregating? Can be one of three convenience strings, i.e. "post" (the default), "pre", or "both", or a numeric sequence that matches a subset of periods in the data (e.g. 6:8). |
aggregation |
Character string. The aggregation type. Either "mean" (the default) or "cumulative". |
abbr_term |
Logical. Should the leading "term" column of the return
data frame be abbreviated? The default is TRUE. If FALSE, then the term
column will retain the full hypothesis test string as per usual with
|
... |
Additional arguments passed to |
A "tidy" data frame of aggregated (pre and/or post) treatment effects, plus inferential information about standard errors, confidence intervals, etc. Potentially useful information about the underlying hypothesis test is also provided as an attribute. See Examples.
library(ggfixest) ## Will load fixest too

est = feols(y ~ x1 + i(period, treat, 5) | id + period, base_did)

# Default hypothesis test is a null mean post-treatment effect
(post_mean = aggr_es(est))
# The underlying hypothesis is saved as an attribute
attributes(post_mean)["hypothesis"]

# Other hypothesis and aggregation options
aggr_es(est, aggregation = "cumulative") # cumulative instead of mean effects
aggr_es(est, period = "pre")             # pre period instead of post
aggr_es(est, period = "both")            # pre & post periods separately
aggr_es(est, period = 6:8)               # specific subset of periods
aggr_es(est, rhs = -1, period = "pre")   # pre period with H0 value of -1
# Etc.
fixest regression objects.
Draws the ggplot2 equivalents of fixest::coefplot and fixest::iplot.
These "gg" versions do their best to recycle the same
arguments and plotting logic as their original base counterparts. But they
also support additional features via the ggplot2 API and infrastructure.
The overall goal remains the same as the original functions. To wit:
ggcoefplot plots the results of estimations (coefficients and confidence
intervals). The function ggiplot restricts the output to variables
created with i, either interactions with factors or raw factors.
ggcoefplot(
  object,
  geom_style = c("pointrange", "errorbar"),
  multi_style = c("dodge", "facet"),
  facet_args = NULL,
  theme = NULL,
  ...
)

ggiplot(
  object,
  geom_style = c("pointrange", "errorbar", "ribbon"),
  multi_style = c("dodge", "facet"),
  aggr_eff = NULL,
  aggr_eff.par = list(col = "grey50", lwd = 1, lty = 1),
  facet_args = NULL,
  theme = NULL,
  ...
)
object |
A model object of class |
geom_style |
Character string. One of |
multi_style |
Character string. One of |
facet_args |
A list of arguments passed down to |
theme |
ggplot2 theme. Defaults to |
... |
Arguments passed down to, or equivalent to, the corresponding
|
aggr_eff |
A keyword string or numeric sequence, indicating whether
mean treatment effects for some subset of the model should be displayed as
part of the plot. For example, the "post" keyword means that the mean
post-treatment effect will be plotted alongside the individual period
effects. Passed to |
aggr_eff.par |
List. Parameters of the aggregated treatment effect line,
if plotted. The default values are |
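As a brief illustration of the aggr_eff and aggr_eff.par arguments, the sketch below overlays the mean post-treatment effect on an event-study plot. It reuses the base_did model from the aggr_es() examples. The object names are arbitrary, and the styling list spells out all three elements shown in the usage signature (col, lwd, lty) because it is an assumption here, not something this page states, that a partial list would be merged with the defaults.

```r
library(ggfixest)  # attaches fixest as well, as noted in the package examples

est <- feols(y ~ x1 + i(period, treat, 5) | id + period, base_did)

# Overlay the mean post-treatment effect on the period-by-period estimates
p_post <- ggiplot(est, aggr_eff = "post")

# Same plot, but restyle the aggregate line (all three elements given)
p_styled <- ggiplot(
  est,
  aggr_eff = "post",
  aggr_eff.par = list(col = "orange", lwd = 1, lty = 2)
)

p_post
p_styled
```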
These functions generally try to mimic the functionality and (where
appropriate) arguments of fixest::coefplot and fixest::iplot as
closely as possible. However, by leveraging the ggplot2 API and
infrastructure, they are able to support some more complex plot
arrangements out-of-the-box that would be more difficult to achieve using
the base coefplot/iplot alternatives.
A ggplot2 object.
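Because the return value is a regular ggplot2 object, it can be stored and then extended with ordinary ggplot2 layers. The following is a minimal sketch of that workflow using the base_did data from fixest. The object names and label text are arbitrary choices for illustration.

```r
library(ggfixest)  # attaches fixest as well, as noted in the package examples
library(ggplot2)   # attached explicitly so that labs() and theme_*() are found

est <- feols(y ~ x1 + i(period, treat, 5) | id + period, base_did)

# Store the plot instead of printing it straight away
p <- ggiplot(est, geom_style = "errorbar")

# Update labels and theme after the fact, as with any other ggplot2 object
p_custom <- p +
  labs(x = "Period", y = "Estimate", caption = "Reference period: 5") +
  theme_minimal()

p_custom
```

Updating a stored plot this way leaves the original `p` untouched, which is convenient when the same estimates need to be shown with different themes.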
ggiplot(): This function plots the results of estimations
(coefficients and confidence intervals). The function ggiplot restricts
the output to variables created with i, either interactions with factors or
raw factors.
fixest::coefplot(), fixest::iplot().
library(ggfixest)

##
# Author note: The examples that follow deliberately follow the original
#   examples from the coefplot/iplot help pages. A few "gg-" specific
#   features are sprinkled within, with the final set of examples in
#   particular highlighting unique features of this package.

#
# Example 1: Basic use and stacking two sets of results on the same graph
#

# Estimation on Iris data with one fixed-effect (Species)
est = feols(Petal.Length ~ Petal.Width + Sepal.Length + Sepal.Width | Species, iris)

ggcoefplot(est)

# Show multiple CIs
ggcoefplot(est, ci_level = c(0.8, 0.95))

# By default, fixest model standard errors are clustered by the first fixed
# effect (here: Species).
# But we can easily switch to "regular" standard-errors
est_std = summary(est, se = "iid")

# You can plot both results at once in the same plot frame...
ggcoefplot(list("Clustered" = est, "IID" = est_std))
# ... or as separate facets
ggcoefplot(list("Clustered" = est, "IID" = est_std), multi_style = "facet") +
  theme(legend.position = "none")

#
# Example 2: Interactions
#

# Now we estimate and plot the "yearly" treatment effects

data(base_did)
base_inter = base_did

# We interact the variable 'period' with the variable 'treat'
est_did = feols(y ~ x1 + i(period, treat, 5) | id + period, base_inter)

# In the estimation, the variable treat is interacted
#  with each value of period but 5, set as a reference

# ggcoefplot will show all the coefficients:
ggcoefplot(est_did)
# Note that the grouping of the coefficients is due to 'group = "auto"'

# If you want to keep only the coefficients
# created with i() (ie the interactions), use ggiplot
ggiplot(est_did)

# We can see that the graph is different from before:
#  - only interactions are shown,
#  - the reference is present,
# => this is fully flexible

ggiplot(est_did, ci_level = c(0.8, 0.95))
ggiplot(est_did, ref.line = FALSE, pt.join = TRUE, geom_style = "errorbar")
ggiplot(est_did, geom_style = "ribbon", col = "orange")
# etc

# We can also use a dictionary to replace label values. The dictionary should
# take the form of a named vector or list, e.g. c("old_lab1" = "new_lab1", ...)

# Let's create a "month" variable
all_months = c("aug", "sept", "oct", "nov", "dec", "jan",
               "feb", "mar", "apr", "may", "jun", "jul")
# Turn into a dictionary by providing the old names
# Note the implication that treatment occurred here in December (5th month in our series)
dict = all_months; names(dict) = 1:12
# Pass our new dictionary to our ggiplot call
ggiplot(est_did, pt.join = TRUE, geom_style = "errorbar", dict = dict)

#
# What if the interacted variable is not numeric?

# Let's re-use our all_months vector from the previous example, but add it
# directly to the dataset
base_inter$period_month = all_months[base_inter$period]

# The new estimation
est = feols(y ~ x1 + i(period_month, treat, "oct") | id+period, base_inter)
# Since 'period_month' is of type character, iplot/coefplot both sort it
ggiplot(est)

# To respect a plotting order, use a factor
base_inter$month_factor = factor(base_inter$period_month, levels = all_months)
est = feols(y ~ x1 + i(month_factor, treat, "oct") | id + period, base_inter)
ggiplot(est)

# dict -> c("old_name" = "new_name")
dict = all_months; names(dict) = 1:12; dict
ggiplot(est_did, dict = dict)

#
# Example 3: Setting defaults
#

# The customization logic of ggcoefplot/ggiplot works differently than the
# original base fixest counterparts, so we don't have "gg" equivalents of
# setFixest_coefplot and setFixest_iplot. However, you can still invoke some
# global fixest settings like setFixest_dict(). Simple example:

base_inter$letter = letters[base_inter$period]
est_letters = feols(y ~ x1 + i(letter, treat, 'e') | id+letter, base_inter)

# Set global dictionary for capitalising the letters
dict = LETTERS[1:10]; names(dict) = letters[1:10]
setFixest_dict(dict)

ggiplot(est_letters)

setFixest_dict() # reset

#
# Example 4: group + cleaning
#

# You can use the argument group to group variables
# You can further use the special character "^^" to clean
#  the beginning of the coef. name: particularly useful for factors

est = feols(Petal.Length ~ Petal.Width + Sepal.Length +
              Sepal.Width + Species, iris)

# No grouping:
ggcoefplot(est)

# now we group by Sepal and Species
ggcoefplot(est, group = list(Sepal = "Sepal", Species = "Species"))

# now we group + clean the beginning of the names using the special character ^^
ggcoefplot(est, group = list(Sepal = "^^Sepal.", Species = "^^Species"))

#
# Example 5: Some more ggcoefplot/ggiplot extras
#

# We'll demonstrate using the staggered treatment example from the
# introductory fixest vignette.

data(base_stagg)
est_twfe = feols(
  y ~ x1 + i(time_to_treatment, treated, ref = c(-1, -1000)) | id + year,
  base_stagg
)
est_sa20 = feols(
  y ~ x1 + sunab(year_treated, year) | id + year,
  data = base_stagg
)

# Plot both regressions in a faceted plot
ggiplot(
  list('TWFE' = est_twfe, 'Sun & Abraham (2020)' = est_sa20),
  main = 'Staggered treatment', ref.line = -1, pt.join = TRUE
)

# So far that's no different than base iplot (automatic legend aside). But an
# area where ggiplot shines is in complex multiple estimation cases, such as
# lists of fixest_multi objects. To illustrate, let's add a split variable
# (group) to our staggered dataset.
base_stagg_grp = base_stagg
base_stagg_grp$grp = ifelse(base_stagg_grp$id %% 2 == 0, 'Evens', 'Odds')

# Now re-run our two regressions from earlier, but splitting the sample to
# generate fixest_multi objects.
est_twfe_grp = feols(
  y ~ x1 + i(time_to_treatment, treated, ref = c(-1, -1000)) | id + year,
  data = base_stagg_grp, split = ~ grp
)
est_sa20_grp = feols(
  y ~ x1 + sunab(year_treated, year) | id + year,
  data = base_stagg_grp, split = ~ grp
)

# ggiplot combines the list of multi-estimation objects without a problem...
ggiplot(list('TWFE' = est_twfe_grp, 'Sun & Abraham (2020)' = est_sa20_grp),
        ref.line = -1, main = 'Staggered treatment: Split multi-sample')

# ... but is even better when we use facets instead of dodged errorbars.
# Let's use this as an opportunity to construct a fancy plot that invokes some
# additional arguments and ggplot theming.
ggiplot(
  list('TWFE' = est_twfe_grp, 'Sun & Abraham (2020)' = est_sa20_grp),
  ref.line = -1,
  main = 'Staggered treatment: Split multi-sample',
  xlab = 'Time to treatment',
  multi_style = 'facet',
  geom_style = 'ribbon',
  facet_args = list(labeller = labeller(id = \(x) gsub(".*: ", "", x))),
  theme = theme_minimal() +
    theme(
      text = element_text(family = 'HersheySans'),
      plot.title = element_text(hjust = 0.5),
      legend.position = 'none'
    )
)

#
# Aside on theming and scale adjustments
#

# Setting the theme inside the `ggiplot()` call is optional and not strictly
# necessary, since the ggplot2 API allows programmatic updating of existing
# plots. E.g.
last_plot() +
  labs(caption = 'Note: Super fancy plot brought to you by ggiplot')
last_plot() +
  theme_grey() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Set1', aesthetics = c("colour", "fill"))
# etc.
Grabs the underlying data used to construct fixest::iplot,
with some added functionality and tweaks for the ggiplot equivalents.
iplot_data(
  object,
  .ci_level = 0.95,
  .keep = NULL,
  .drop = NULL,
  .dict = fixest::getFixest_dict(),
  .internal.only.i = TRUE,
  .i.select = 1,
  .aggr_es = NULL,
  .group = "auto",
  .vcov = NULL,
  .cluster = NULL,
  .se = NULL
)

coefplot_data(
  object,
  .ci_level = 0.95,
  .keep = NULL,
  .drop = NULL,
  .group = "auto",
  .dict = fixest::getFixest_dict(),
  .internal.only.i = FALSE,
  .i.select = 1,
  .aggr_es = "none",
  .vcov = NULL,
  .cluster = NULL,
  .se = NULL
)
object |
A model object of class |
.ci_level |
A number between 0 and 1 indicating the desired confidence level. Defaults to 0.95. |
.keep |
Character vector used to subset the coefficients of interest.
Passed down to |
.drop |
Character vector used to subset the coefficients of interest
(complement of |
.dict |
A dictionary (i.e. named character vector or a logical scalar).
Used for changing coefficient names. Defaults to the values in
|
.internal.only.i |
Logical variable used for some internal function handling when passing on to coefplot/iplot. |
.i.select |
Integer scalar, default is 1. In (gg)iplot, selects which variable
created with i() to use. Only used when there are several variables
created with i(). This is an index; try increasing numbers until you
obtain the variable you want. Passed down to
|
.aggr_es |
A keyword string or numeric sequence indicating whether the
aggregated mean treatment effects for some subset of the model should be
added as a column to the returned data frame. Passed to
|
.group |
A list, default is missing. Each element of the list reports
the coefficients to be grouped while the name of the element is the group
name. Passed down to
|
.vcov , .cluster , .se
|
Alternative options for adjusting the standard
errors of the model object on the fly. See |
This function is a wrapper around
fixest::iplot(..., only.params = TRUE), but with various checks and tweaks
to better facilitate plotting with ggplot2 and handling of complex object
types (e.g. lists of fixest_multi models).
A data frame consisting of estimate values, confidence intervals, relative x-axis positions, and other aesthetic information needed to draw a ggplot2 object.
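The returned data frame can be inspected directly, which helps when building a custom plot or checking what ggiplot() will draw. The sketch below compares two confidence levels and one standard-error adjustment. It assumes that `.vcov = "iid"` is forwarded to fixest in the same way as the `se = "iid"` call in the ggcoefplot examples. That forwarding is an assumption here, as is every object name.

```r
library(ggfixest)  # attaches fixest as well, as noted in the package examples

est <- feols(y ~ x1 + i(period, treat, 5) | id + period, base_did)

# Default 95% intervals versus narrower 80% intervals
d95 <- iplot_data(est)
d80 <- iplot_data(est, .ci_level = 0.8)

# Inspect the available columns before relying on any of them
str(d95)

# Assumption: "iid" is accepted here as a fixest vcov keyword
d_iid <- iplot_data(est, .vcov = "iid")
```

Only the interval columns should differ between `d95` and `d80`; the number of rows is the same because both calls describe the same set of coefficients.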
coefplot_data()
: Internal function for grabbing and preparing coefplot data
library(ggfixest) # Also loads fixest; needed for iplot_data()

est_did = feols(y ~ x1 + i(period, treat, 5) | id+period, data = base_did)
iplot(est_did, only.params = TRUE) # The "base" version
iplot_data(est_did)                # The wrapper provided by this package

# Illustrative fixest_multi case, where the sample has been split by odd and
# even ID numbers.
est_split = feols(
  y ~ x1 + i(period, treat, 5) | id+period,
  data = base_did,
  split = ~id%%2
)
iplot(est_split, only.params = TRUE) # The "base" version
iplot_data(est_split)                # The wrapper provided by this package