Poisson & GLMs

Core Estimation

Poisson for nonnegative outcomes, DiD with counts, and logit/probit for propensity-score based AB testing.

Scope

This vignette is practical by design. We focus on how to use pyfixest and when these models are useful in applied work.

1. Why Poisson? (and Why Wooldridge Likes It)

A common motivation for Poisson pseudo-maximum-likelihood (PPML) is that it works well for nonnegative outcomes and remains valid under fairly general forms of heteroskedasticity when the conditional mean is correctly specified.

In practice, that makes Poisson attractive for many applied settings where:

outcomes are counts or nonnegative flows,
zeros are common,
log-linear OLS can be fragile.

pyfixest provides PPML with high-dimensional fixed effects via pf.fepois().

2. A Standard Gravity-Style Trade Example (Boring but Useful)

Below is a simple gravity-style trade panel (exporter-importer-year). This is a synthetic dataset, but the specification mirrors common trade applications.

import numpy as np
import pandas as pd
import pyfixest as pf

def make_trade_data(seed=123, n_exporters=15, n_importers=15, years=range(2010, 2016)):
    rng = np.random.default_rng(seed)

    exporters = [f"E{i}" for i in range(n_exporters)]
    importers = [f"I{j}" for j in range(n_importers)]

    rows = []
    for y in years:
        for e in exporters:
            for i in importers:
                dist = rng.uniform(200, 9000)
                fta = rng.binomial(1, 0.15)
                exporter_fe = hash(e) % 7
                importer_fe = hash(i) % 6
                year_fe = y - min(years)

                # mean function for PPML
                eta = 2.5 - 0.35 * np.log(dist) + 0.25 * fta + 0.08 * exporter_fe + 0.06 * importer_fe + 0.05 * year_fe
                mu = np.exp(eta)
                trade = rng.poisson(mu)

                rows.append((y, e, i, dist, fta, trade))

    return pd.DataFrame(rows, columns=["year", "exporter", "importer", "distance", "fta", "trade"])

trade = make_trade_data()
trade.head()

	year	exporter	importer	distance	fta	trade
0	2010	E0	I0	6204.696397	0	0
1	2010	E0	I1	1822.471934	0	2
2	2010	E0	I2	7413.840142	1	1
3	2010	E0	I3	7453.326046	0	3
4	2010	E0	I4	7232.301132	0	0

fit_trade = pf.fepois(
    "trade ~ np.log(distance) + fta | exporter + importer + year",
    data=trade,
    vcov={"CRV1": "exporter"},
)
fit_trade.summary()

###

Estimation:  Poisson
Dep. var.: trade, Fixed effects: exporter + importer + year
sample: None = all
Inference:  CRV1
Observations:  1350

| Coefficient      |   Estimate |   Std. Error |   t value |   Pr(>|t|) |   2.5% |   97.5% |
|:-----------------|-----------:|-------------:|----------:|-----------:|-------:|--------:|
| np.log(distance) |     -0.358 |        0.031 |   -11.653 |      0.000 | -0.419 |  -0.298 |
| fta              |      0.263 |        0.060 |     4.420 |      0.000 |  0.147 |   0.380 |
---
Deviance: 1457.805

This is the basic PPML gravity workflow in pyfixest.

3. Handling Zeros: “The Log of Zeros”

One reason PPML is popular is straightforward handling of zero outcomes without ad-hoc transformations like log(y + 1).

This is central in modern work on nonnegative outcomes, including recent discussion in:

Bellégo, Benatia, and Pape (2024), Dealing with Logs and Zeros in Regression Models (QJE).

Quick contrast:

share_zeros = (trade["trade"] == 0).mean()
share_zeros

np.float64(0.2874074074074074)

With PPML, zero observations are naturally part of the likelihood through the mean function.

4. Poisson for a Simple DiD

Poisson DiD is useful when the outcome is a count or nonnegative rate and treatment is staggered or panel-based.

def make_count_did(seed=42, n_states=30, years=range(2010, 2018)):
    rng = np.random.default_rng(seed)
    states = [f"S{s}" for s in range(n_states)]

    rows = []
    treated_states = set(states[: n_states // 2])
    post_start = 2014

    for y in years:
        for s in states:
            treated = int(s in treated_states)
            post = int(y >= post_start)
            did = treated * post

            # latent state/year effects + treatment effect in post
            state_fe = (hash(s) % 9) / 10
            year_fe = (y - min(years)) / 10
            eta = 1.2 + state_fe + year_fe + 0.20 * did
            mu = np.exp(eta)
            y_count = rng.poisson(mu)

            rows.append((s, y, treated, post, did, y_count))

    return pd.DataFrame(rows, columns=["state", "year", "treated", "post", "did", "y_count"])

did_df = make_count_did()

fit_did_pois = pf.fepois(
    "y_count ~ did | state + year",
    data=did_df,
    vcov={"CRV1": "state"},
)
fit_did_pois.summary()

###

Estimation:  Poisson
Dep. var.: y_count, Fixed effects: state + year
sample: None = all
Inference:  CRV1
Observations:  240

| Coefficient   |   Estimate |   Std. Error |   t value |   Pr(>|t|) |   2.5% |   97.5% |
|:--------------|-----------:|-------------:|----------:|-----------:|-------:|--------:|
| did           |      0.099 |        0.118 |     0.837 |      0.403 | -0.132 |   0.330 |
---
Deviance: 216.056

If your outcome is nonnegative and has many zeros, this can be a practical alternative to log-linear OLS DiD.

5. Logit/Probit for Doubly Debiased AB Testing with Propensity Scores

For observational AB tests (or experiments with imperfect randomization), a common strategy is to combine:

a propensity score model P(T=1|X), and
an outcome model E[Y|T,X].

This yields doubly robust / doubly debiased estimators (AIPW-style scores).

Simulate an AB setting

def make_ab_data(seed=1, n=5000):
    rng = np.random.default_rng(seed)

    segment = rng.integers(0, 20, size=n)
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)

    # non-random treatment assignment
    p = 1 / (1 + np.exp(-(-0.2 + 0.7 * x1 - 0.4 * x2 + 0.15 * (segment % 4))))
    t = rng.binomial(1, p)

    # outcome with true effect tau
    tau = 0.25
    y = tau * t + 0.6 * x1 + 0.3 * x2 + 0.2 * (segment % 5) + rng.normal(scale=1.0, size=n)

    return pd.DataFrame({"y": y, "t": t, "x1": x1, "x2": x2, "segment": segment})

ab = make_ab_data()
ab.head()

	y	t	x1	x2	segment
0	1.403683	1	0.247452	0.620339	9
1	-0.698392	1	0.710294	1.163892	10
2	0.504507	0	-0.650713	-0.347768	15
3	1.283318	0	-0.229100	0.122556	19
4	-0.181560	0	-0.781772	0.866387	0

Propensity via logit/probit (with fixed effects)

# both support fixed effects via the | syntax
ps_logit = pf.feglm("t ~ x1 + x2 | segment", data=ab, family="logit")
ps_probit = pf.feglm("t ~ x1 + x2 | segment", data=ab, family="probit")

ab["ehat_logit"] = np.clip(ps_logit.predict(type="response"), 0.01, 0.99)
ab["ehat_probit"] = np.clip(ps_probit.predict(type="response"), 0.01, 0.99)

Outcome regressions and a simple AIPW score

mu1 = pf.feols("y ~ x1 + x2 | segment", data=ab.loc[ab["t"] == 1]).predict(newdata=ab)
mu0 = pf.feols("y ~ x1 + x2 | segment", data=ab.loc[ab["t"] == 0]).predict(newdata=ab)

ab["mu1"] = mu1
ab["mu0"] = mu0

# AIPW score for ATE using logit propensity
ab["score_aipw_logit"] = (
    ab["mu1"] - ab["mu0"]
    + ab["t"] * (ab["y"] - ab["mu1"]) / ab["ehat_logit"]
    - (1 - ab["t"]) * (ab["y"] - ab["mu0"]) / (1 - ab["ehat_logit"])
)

ate_aipw_logit = ab["score_aipw_logit"].mean()
ate_aipw_logit

np.float64(0.2504501667772698)

This is the basic doubly debiased pattern: estimate treatment propensities with feglm (logit/probit), combine with outcome models, and form an orthogonal score.

6. GLMs with Marginal Effects

After feglm(), coefficients live on the link scale. For applied work, average marginal effects are often easier to interpret.

from marginaleffects import avg_slopes

avg_slopes(ps_logit, variables="x1")
avg_slopes(ps_logit, variables="x1", by="segment")

shape: (20, 4)

segment	term	contrast	estimate
i64	str	str	f64
0	"x1"	"dY/dX"	0.764503
1	"x1"	"dY/dX"	0.764503
2	"x1"	"dY/dX"	0.764503
3	"x1"	"dY/dX"	0.764503
4	"x1"	"dY/dX"	0.764503
…	…	…	…
15	"x1"	"dY/dX"	0.764503
16	"x1"	"dY/dX"	0.764503
17	"x1"	"dY/dX"	0.764503
18	"x1"	"dY/dX"	0.764503
19	"x1"	"dY/dX"	0.764503

This reports the change in the predicted treatment probability associated with x1, averaged either over the full sample or within groups.

For a compact workflow covering marginal effects and hypothesis tests with PyFixest, see the dedicated guide on Marginal Effects & Hypothesis Testing.

7. Fixed Effects Support (Summary)

Both Poisson and GLM estimators in pyfixest support fixed effects using the same formula syntax:

Poisson: pf.fepois("y ~ x | fe1 + fe2", ...)
GLM families (logit, probit, gaussian): pf.feglm("y ~ x | fe1 + fe2", family="logit", ...)

That shared syntax makes it easy to move between linear, Poisson, and binary-response workflows while keeping your FE structure aligned.

Where to Go Next

Difference-in-Differences: richer DiD estimators and event studies.
Standard Errors & Inference: clustered inference, randomization inference, and wild bootstrap.
How-To: Variance Reduction in AB Tests: AB testing workflows with controls and panel structure.