Start by loading this package. Note that the example scripts that follow assume you have lfe, fixest, and modelsummary installed on your system.
library(lfe2fixest) ## This package
## Aside: Make sure you have the following packages installed on your system if
## you want to run the example scripts below:
## library(lfe); library(fixest); library(modelsummary)
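If you’re not sure whether those packages are available, a quick check along the following lines may help. This is only a sketch: the names pkgs, installed and missing_pkgs are my own and are not part of lfe2fixest, and the snippet only reports what is missing rather than installing anything.

```r
## Illustrative names only (pkgs, installed, missing_pkgs)
pkgs = c('lfe', 'fixest', 'modelsummary')

## requireNamespace() returns FALSE (instead of erroring) if a package is absent
installed = vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs = pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message('Please install: ', paste(missing_pkgs, collapse = ', '))
}
```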
Let’s create an lfe-based R script that is deliberately messy (inconsistent formatting, etc.) to pose an additional challenge.
lfe_string = "
library(lfe)
library(modelsummary)
## Toy dataset
aq = airquality
names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')
## Simple OLS
mod1 = felm(y ~ x1 + x2, aq)
## Add FE & cluster var
mod2 = felm(y ~ x1 + x2 |
dy |
0 |
mnth, aq)
## Add 2nd cluster var & some estimation options
mod3 = felm(y ~ x1 + x2 |
dy |
0 |
dy + mnth,
cmethod = 'reghdfe',
exactDOF = TRUE, ## Irrelevant for feols (should be ignored)
aq)
## IV reg with weights
mod4 = felm(y ~ 1 |
dy |
(x1 ~ x3) |
mnth,
weights = aq$x2,
data = aq
)
## Multiple IV
mod5 = felm(y ~ 1 |
0 |
(x1|x2 ~ x3 + dy + mnth) |
dy,
data = aq
)
## Regression table
mods = list(mod1, mod2, mod3, mod4, mod5)
msummary(mods, gof_omit = 'Pseudo|Within|Log|IC', output = 'markdown')
"
writeLines(lfe_string, 'lfe_script.R')
We can now convert this script to the fixest equivalent using the package’s main function, lfe2fixest(), or its alias, felm2feols(). While the function(s) accept several arguments, the only required argument is an input file. If no output file argument is provided, then the function(s) will just print the conversion results to screen.
# felm2feols('lfe_script.R') ## same thing
lfe2fixest('lfe_script.R')
#>
#> library(fixest)
#> library(modelsummary)
#>
#> ## Toy dataset
#> aq = airquality
#> names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')
#>
#> ## Simple OLS
#> mod1 = feols(y ~ x1 + x2, data = aq)
#>
#> ## Add FE & cluster var
#> mod2 = feols(y ~ x1 + x2 | dy, cluster = ~mnth, data = aq)
#>
#> ## Add 2nd cluster var & some estimation options
#> mod3 = feols(y ~ x1 + x2 | dy, cluster = ~dy + mnth, data =
#> aq)
#>
#> ## IV reg with weights
#> mod4 = feols(y ~ 1 | dy | x1 ~ x3, cluster = ~mnth, weights = aq$x2, data = aq )
#>
#> ## Multiple IV
#> mod5 = feols(y ~ 1 | x1 + x2 ~ x3 + dy + mnth, cluster = ~dy, data = aq )
#>
#> ## Regression table
#> mods = list(mod1, mod2, mod3, mod4, mod5)
#> msummary(mods, gof_omit = 'Pseudo|Within|Log|IC', output = 'markdown')
Looks good. Note that the feols (previously felm) model syntax has been cleaned up, with comments removed and everything collapsed onto a single line.[1] Let’s write it to disk by supplying an output file this time.
# felm2feols(infile = 'lfe_script.R', outfile = 'fixest_script.R') ## same thing
lfe2fixest(infile = 'lfe_script.R', outfile = 'fixest_script.R')
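To confirm that the output file was actually written, something like the following could be used. It is a rough sketch rather than anything the package provides: outfile, converted and has_felm are illustrative names, and the check simply scans the file for any leftover felm( text.

```r
## Sketch only: 'outfile', 'converted' and 'has_felm' are illustrative names
outfile = 'fixest_script.R'
has_felm = NA  ## stays NA if the file has not been written

if (file.exists(outfile)) {
  converted = readLines(outfile)
  ## TRUE if any line still contains the literal text 'felm('
  has_felm = any(grepl('felm(', converted, fixed = TRUE))
}
has_felm
```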
Note that lfe2fixest() is a pure conversion function. It never actually runs anything from either the input or output files. That being said, here’s a quick comparison of the resulting regressions, i.e. what we get if we actually do run the scripts. As an aside, my scripts make use of the excellent modelsummary package to generate the simple regression tables that you see below, although we’re really not showing off its functionality here.
First, the original lfe version:
source('lfe_script.R', print.eval = TRUE)
#>
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (1) | (2) | (3) | (4) | (5) |
#> +=============+=========+==========+===============+==========+==========+
#> | (Intercept) | 77.246 | | | | 93.452 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (9.068) | | | | (65.757) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x1 | 0.100 | 0.099 | 0.099 | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (0.026) | (0.031) | (0.029) | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x2 | -5.402 | -5.577 | -5.577 | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (0.673) | (1.100) | (1.053) | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x1(fit) | | | | 0.733 | 0.236 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | | | | (0.254) | (0.179) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x2(fit) | | | | | -9.558 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | | | | | (3.510) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | Num.Obs. | 111 | 111 | 111 | 111 | 111 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | R2 | 0.449 | 0.665 | 0.665 | -2.658 | 0.071 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | R2 Adj. | 0.439 | 0.527 | 0.527 | -4.093 | 0.054 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | RMSE | 24.58 | 19.18 | 19.18 | 54.07 | 31.92 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | Std.Errors | | by: mnth | by: dy & mnth | by: mnth | by: dy |
#> +-------------+---------+----------+---------------+----------+----------+
Second, the fixest conversion:
source('fixest_script.R', print.eval = TRUE)
#>
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (1) | (2) | (3) | (4) | (5) |
#> +=============+=========+==========+===============+==========+==========+
#> | (Intercept) | 77.246 | | | | 93.452 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (9.068) | | | | (65.757) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x1 | 0.100 | 0.099 | 0.099 | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (0.026) | (0.031) | (0.029) | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | x2 | -5.402 | -5.577 | -5.577 | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | (0.673) | (1.100) | (1.053) | | |
#> +-------------+---------+----------+---------------+----------+----------+
#> | fit_x1 | | | | 0.733 | 0.236 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | | | | (0.254) | (0.179) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | fit_x2 | | | | | -9.558 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | | | | | | (3.510) |
#> +-------------+---------+----------+---------------+----------+----------+
#> | Num.Obs. | 111 | 111 | 111 | 111 | 111 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | R2 | 0.449 | 0.665 | 0.665 | -2.658 | 0.071 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | R2 Adj. | 0.439 | 0.527 | 0.527 | -4.093 | 0.054 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | RMSE | 24.58 | 19.18 | 19.18 | 54.07 | 31.92 |
#> +-------------+---------+----------+---------------+----------+----------+
#> | Std.Errors | IID | by: mnth | by: dy & mnth | by: mnth | by: dy |
#> +-------------+---------+----------+---------------+----------+----------+
#> | FE: dy | | X | X | X | |
#> +-------------+---------+----------+---------------+----------+----------+
Some minor formatting differences aside, looks like it worked and we get the exact same results from both scripts. Great!
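Rather than eyeballing the tables, the point estimates can also be compared programmatically. The snippet below is a hedged sketch that re-estimates only the second model (fixed effects plus one cluster variable) with both packages. The object names m_lfe, m_fixest and coefs_match, as well as the tolerance, are my own choices, and the check is skipped if either package is not installed.

```r
## Sketch only: compares the FE + cluster model across the two packages
coefs_match = NA  ## stays NA if lfe or fixest is unavailable

if (requireNamespace('lfe', quietly = TRUE) &&
    requireNamespace('fixest', quietly = TRUE)) {
  ## Same toy dataset as in the scripts
  aq = airquality
  names(aq) = c('y', 'x1', 'x2', 'x3', 'mnth', 'dy')

  m_lfe    = lfe::felm(y ~ x1 + x2 | dy | 0 | mnth, data = aq)
  m_fixest = fixest::feols(y ~ x1 + x2 | dy, cluster = ~mnth, data = aq)

  ## as.numeric() drops names and dimensions so only the estimates are compared
  coefs_match = isTRUE(all.equal(as.numeric(coef(m_lfe)),
                                 as.numeric(coef(m_fixest)),
                                 tolerance = 1e-6))
}
coefs_match
```

This only checks coefficients. Standard errors would need a separate comparison, since the two packages expose them through different accessors.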
Let’s clean up before closing.
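A minimal way to do that, assuming the two scripts were written to the current working directory as in the steps above, is to delete them. I use unlink() here because it does not complain if a file is already gone.

```r
## Remove the example scripts created earlier (no error if they are absent)
unlink(c('lfe_script.R', 'fixest_script.R'))
```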
[1] The loss of inline comments is a little unfortunate, but necessary given the way that the function parses input.