Grokking-like effects in counterfactual inference

S Samothrakis, A Matran-Fernandez… - 2022 International Joint Conference on Neural Networks (IJCNN), 2022 - ieeexplore.ieee.org
We show that a typical neural network, which ignores any covariate/feature re-balancing, can be as effective as any explicit counterfactual method. We adopt the architecture of TARNet, a simple neural network with two heads (one for treatment, one for control), which is trained with a relatively high batch size. Combined with ensemble methods, this produces competitive results on four counterfactual inference benchmarks: IHDP, NEWS, JOBS, and TWINS. Our results indicate that relatively simple methods might be good enough for counterfactual prediction, with quality constraints coming from hyperparameter tuning. Our analysis suggests that the reason behind the observed phenomenon might be "grokking", a recently described training phenomenon.
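To make the two-headed architecture concrete, here is a minimal sketch of a TARNet-style forward pass: a shared representation layer feeds two separate output heads, one predicting the treated outcome and one the control outcome. All dimensions, layer counts, and names here are illustrative assumptions, not the paper's actual configuration (which also involves training details such as the high batch size and ensembling).

```python
import numpy as np

# Hypothetical minimal TARNet-style network: shared representation Phi(x),
# then separate linear heads for control (t=0) and treatment (t=1).
rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class TwoHeadNet:
    def __init__(self, d_in, d_rep):
        # Illustrative random initialisation; a real model would be trained.
        self.W_rep = rng.normal(0.0, 0.1, (d_in, d_rep))
        self.w0 = rng.normal(0.0, 0.1, (d_rep,))  # control head
        self.w1 = rng.normal(0.0, 0.1, (d_rep,))  # treatment head

    def forward(self, X):
        phi = relu(X @ self.W_rep)  # shared representation of covariates
        mu0 = phi @ self.w0         # predicted outcome under control
        mu1 = phi @ self.w1         # predicted outcome under treatment
        return mu0, mu1

net = TwoHeadNet(d_in=5, d_rep=8)
X = rng.normal(size=(4, 5))        # 4 units, 5 covariates each
mu0, mu1 = net.forward(X)
ite = mu1 - mu0                    # estimated individual treatment effect
print(ite.shape)
```

At inference time, both heads are evaluated for every unit regardless of its observed treatment, and their difference serves as the counterfactual (individual treatment effect) estimate.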