estimation.api.fepois.fepois

estimation.api.fepois.fepois(
    fml,
    data,
    vcov=None,
    vcov_kwargs=None,
    weights=None,
    weights_type='aweights',
    ssc=None,
    fixef_rm='singleton',
    iwls_tol=1e-08,
    iwls_maxiter=25,
    collin_tol=1e-09,
    separation_check=None,
    solver='scipy.linalg.solve',
    demeaner=None,
    demeaner_backend=None,
    fixef_tol=None,
    fixef_maxiter=None,
    drop_intercept=False,
    copy_data=True,
    store_data=True,
    lean=False,
    context=None,
    split=None,
    fsplit=None,
)

Estimate a Poisson regression model with fixed effects using the ppmlhdfe algorithm.
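As a rough illustration of the iteratively reweighted least squares (IWLS) loop that iwls_tol and iwls_maxiter control, the sketch below fits a Poisson model with a log link and no fixed effects, using NumPy only. The function name poisson_iwls, the starting values, and the deviance-based stopping rule are assumptions made for this example. It is not pyfixest's implementation; ppmlhdfe-style estimators additionally partial out the fixed effects within each iteration.

```python
import numpy as np


def poisson_iwls(X, y, tol=1e-8, maxiter=25):
    """Hypothetical helper: Poisson regression (log link) fitted by IWLS."""
    # Start from a mean that is strictly positive even where y == 0.
    mu = (y + y.mean()) / 2.0
    eta = np.log(mu)
    deviance_old = np.inf
    for iteration in range(1, maxiter + 1):
        # Working response and working weights for the log link.
        z = eta + (y - mu) / mu
        w = mu
        # Weighted least squares step: solve (X'WX) beta = X'Wz.
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta
        mu = np.exp(eta)
        # Poisson deviance, with the convention 0 * log(0) = 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            term = np.where(y > 0, y * np.log(y / mu), 0.0)
        deviance = 2.0 * np.sum(term - (y - mu))
        # Assumed stopping rule: relative change in deviance below tol.
        if abs(deviance - deviance_old) / (0.1 + abs(deviance)) < tol:
            return beta, iteration
        deviance_old = deviance
    raise RuntimeError("IWLS did not converge within maxiter iterations")


# Simulated data with known coefficients (intercept 0.5, slope -0.3).
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -0.3])
y = rng.poisson(np.exp(X @ beta_true)).astype(float)

beta_hat, n_iter = poisson_iwls(X, y)
print(beta_hat, n_iter)
```

A tighter tol or a smaller maxiter trades accuracy against the risk of hitting the iteration limit, which is the same trade-off the two arguments expose in fepois().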

Parameters

Name Type Description Default
fml str A two-sided formula string using fixest formula syntax. Syntax: “Y ~ X1 + X2 | FE1 + FE2”. “|” separates the regressors from the fixed effects. Special syntax includes: - Stepwise regressions (sw, sw0) - Cumulative stepwise regressions (csw, csw0) - Multiple dependent variables (Y1 + Y2 ~ X) - Interaction of variables (i(X1,X2)) - Interacted fixed effects (fe1^fe2) Compatible with formula parsing via the formulaic module. required
data DataFrameType A pandas or polars dataframe containing the variables in the formula. required
vcov Union[VcovTypeOptions, dict[str, str]] Type of variance-covariance matrix for inference. Options include “iid”, “hetero”, “HC1”, “HC2”, “HC3”, “NW” for Newey-West HAC standard errors, “DK” for Driscoll-Kraay HAC standard errors, or a dictionary for CRV1/CRV3 inference. NW and DK require additional keyword arguments passed via vcov_kwargs: for time-series HAC, pass the ‘time_id’ column; for panel HAC, pass both ‘time_id’ and ‘panel_id’. See vcov_kwargs for details. None
vcov_kwargs Optional[dict[str, any]] Additional keyword arguments to pass to the vcov function. These keywords include “lag” for the number of lags to use in the Newey-West (NW) and Driscoll-Kraay (DK) HAC standard errors, “time_id” for the time identifier, and “panel_id” for the panel identifier, both used for NW and DK standard errors. Currently, the time difference between consecutive time periods is always treated as 1. More flexible time-step selection is a work in progress. None
weights Union[None, str] Weights for weighted Poisson regression. If None, all observations are weighted equally. If a string, the name of the column in data that contains the weights. None
weights_type WeightsTypeOptions Either aweights or fweights. aweights implement analytic or precision weights, while fweights implement frequency weights. Frequency weights are useful for compressed count data where identical observations are aggregated. For details, see this blog post: https://notstatschat.rbind.io/2020/08/04/weights-in-statistics/. 'aweights'
ssc str An ssc object specifying the small-sample correction for inference. None
fixef_rm FixedRmOptions Specifies whether to drop singleton fixed effects. Can be equal to “singleton” (default) or “none”. “singleton” drops singleton fixed effects. This does not affect point estimates, but it does affect standard errors. 'singleton'
iwls_tol Optional[float] Tolerance for IWLS convergence, by default 1e-08. 1e-08
iwls_maxiter Optional[float] Maximum number of iterations for IWLS convergence, by default 25. 25
collin_tol float Tolerance for collinearity check, by default 1e-09. 1e-09
separation_check list[str] | None Methods to identify and drop separated observations. Either “fe” or “ir”. Executes “fe” by default (when None). None
solver SolverOptions The solver to use for the regression. Can be “np.linalg.lstsq”, “np.linalg.solve”, “scipy.linalg.solve”, “scipy.sparse.linalg.lsqr”, or “jax”. Defaults to “scipy.linalg.solve”. 'scipy.linalg.solve'
demeaner AnyDemeaner | None Typed demeaner configuration. Controls the fixed-effects demeaning backend, tolerance, and iteration limits. Accepts a MapDemeaner, WithinDemeaner, or LsmrDemeaner instance. Defaults to MapDemeaner() (numba MAP algorithm, tol=1e-6, maxiter=10_000). None
drop_intercept bool Whether to drop the intercept from the model, by default False. False
copy_data bool Whether to copy the data before estimation, by default True. If set to False, the data is not copied, which can save memory but may lead to unintended changes in the input data outside of fepois. For example, the input data set is re-indexed within the function. The only other known case arises with interacted fixed effects, where a column holding the interacted fixed effects is added to the data set. True
store_data bool Whether to store the data in the model object, by default True. If set to False, the data is not stored in the model object, which can improve performance and save memory. However, it will no longer be possible to access the data via the data attribute of the model object. This affects post-estimation capabilities that rely on the data, e.g. predict() or vcov(). True
lean bool False by default. If True, all large objects are removed from the returned result. This saves memory but makes many methods unavailable. It is recommended to use the vcov argument to obtain the appropriate standard errors at estimation time, since obtaining different standard errors will not be possible afterwards. False
context int or Mapping[str, Any] A dictionary containing additional context variables to be used by formulaic during the creation of the model matrix. This can include custom factorization functions, transformations, or any other variables that need to be available in the formula environment. None
split str | None The name of a variable in data, e.g. split='var'. If provided, the sample is split according to the variable and one estimation is performed for each value of that variable. If you also want to include the estimation for the full sample, use the argument fsplit instead. None
fsplit str | None This argument is the same as split but also includes the full sample as the first estimation. None
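To make fixef_rm and separation_check more concrete, the sketch below flags the rows that such checks would drop for a single fixed effect, using pandas only. It assumes that the “fe” separation check targets fixed-effect levels in which the dependent variable is zero for every observation. The toy data and variable names are hypothetical, and this is a single-pass illustration rather than pyfixest's own code.

```python
import pandas as pd

# Toy data: one outcome and one fixed effect (hypothetical values).
df = pd.DataFrame({
    "Y":  [0, 0, 2, 1, 0, 4, 3],
    "f1": ["a", "a", "b", "b", "c", "d", "d"],
})

# Singleton: a fixed-effect level observed exactly once.
group_size = df.groupby("f1")["Y"].transform("size")
is_singleton = group_size == 1

# Assumed "fe" separation rule: the outcome is zero for the whole level,
# so the level's fixed effect would have to diverge to minus infinity.
group_total = df.groupby("f1")["Y"].transform("sum")
is_separated = group_total == 0

# Rows that survive both checks.
kept = df[~(is_singleton | is_separated)]
print(kept)
```

With several fixed effects, removing rows can create new singletons, so a full implementation would repeat the check until no further rows are dropped.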

Returns

Name Type Description
object An instance of the Fepois class or an instance of class FixestMulti for multiple models specified via fml.

Examples

The fepois() function estimates Poisson models with the same formula interface as feols(). Fixed effects are specified after the | symbol.

import pyfixest as pf

data = pf.get_data(model="Fepois")
fit = pf.fepois("Y ~ X1 + X2 | f1 + f2", data)
fit.summary()
###

Estimation:  Poisson
Dep. var.: Y, Fixed effects: f1 + f2
sample: None = all
Inference:  iid
Observations:  995

| Coefficient   |   Estimate |   Std. Error |   t value |   Pr(>|t|) |   2.5% |   97.5% |
|:--------------|-----------:|-------------:|----------:|-----------:|-------:|--------:|
| X1            |     -0.007 |        0.042 |    -0.157 |      0.875 | -0.089 |   0.076 |
| X2            |     -0.015 |        0.011 |    -1.317 |      0.188 | -0.037 |   0.007 |
---
Deviance: 1068.169 

Cluster-robust inference uses the same vcov syntax as feols():

fit_crv = pf.fepois("Y ~ X1 + X2 | f1 + f2", data, vcov={"CRV1": "f1"})
fit_crv.tidy()
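| Coefficient   |   Estimate |   Std. Error |   t value |   Pr(>|t|) |      2.5% |    97.5% |
|:--------------|-----------:|-------------:|----------:|-----------:|----------:|---------:|
| X1            |  -0.006591 |     0.035301 | -0.186711 |   0.851887 | -0.075780 | 0.062598 |
| X2            |  -0.014924 |     0.010467 | -1.425778 |   0.153932 | -0.035439 | 0.005591 |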
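The columns of the tidy() output are linked by simple arithmetic. The sketch below recomputes the test statistic, p-value, and confidence bounds for X1 from the printed estimate and standard error, using only the Python standard library. It assumes a standard normal reference distribution, which is inferred from the printed numbers and not taken from pyfixest's source.

```python
from statistics import NormalDist

# Values copied from the X1 row of the cluster-robust output.
estimate = -0.006591
std_error = 0.035301

# Test statistic: estimate divided by its standard error.
z = estimate / std_error

# Two-sided p-value under a standard normal (assumption, see lead-in).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence bounds from the 97.5% normal quantile.
crit = NormalDist().inv_cdf(0.975)
lower = estimate - crit * std_error
upper = estimate + crit * std_error

print(round(z, 4), round(p_value, 4), round(lower, 4), round(upper, 4))
```

Small differences in the last digits are expected because the inputs are themselves rounded to six decimals.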

Multiple-estimation and sample-splitting features also work as in feols():

fits = pf.fepois("Y ~ X1 | sw0(f1, f2)", data)
pf.etable(fits)
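| Y            | (1)           | (2)           | (3)           |
|:-------------|--------------:|--------------:|--------------:|
| coef         |               |               |               |
| X1           |  0.009 (0.04) | 0.002 (0.041) | 0.004 (0.041) |
| Intercept    | -0.03 (0.053) |               |               |
| fe           |               |               |               |
| f1           |             - |             x |             - |
| f2           |             - |             - |             x |
| stats        |               |               |               |
| Observations |           998 |           996 |           997 |
| R2           |             - |             - |             - |

Format of coefficient cell: Coefficient (Std. Error)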
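The three columns of the table correspond to three separate formulas: one without fixed effects, one with f1, and one with f2. The sketch below spells out that expansion with plain string handling. The helper expand_sw0 is hypothetical and exists only for this illustration; pyfixest parses the formula itself and does not expose a function of this name.

```python
def expand_sw0(base_formula, candidates):
    """Hypothetical helper: list the formulas implied by `| sw0(...)`."""
    # sw0 starts with the model that includes none of the candidates ...
    formulas = [base_formula]
    # ... and then adds one model per candidate fixed effect.
    formulas += [f"{base_formula} | {fe}" for fe in candidates]
    return formulas


formulas = expand_sw0("Y ~ X1", ["f1", "f2"])
for formula in formulas:
    print(formula)
```

Each resulting string could be passed to fepois() on its own, which is a convenient way to check a single column of a multiple-estimation table.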

Shared arguments such as vcov, ssc, split, fsplit, context, and typed demeaners are documented in the feols() reference. For applied examples, see the Poisson & GLMs tutorial.