Cluster robust double machine learning #119
Merged
This PR adds functionality for cluster robust double machine learning. The main reference is:
Chiang, H. D., Kato, K., Ma, Y. and Sasaki, Y. (2021), Multiway Cluster Robust Double/Debiased Machine Learning, Journal of Business & Economic Statistics, https://doi.org/10.1080/07350015.2021.1895815.
- The DGP from the paper was added as the function `make_pliv_multiway_cluster_CKMS2021`.
- A new data-backend for cluster data named `DoubleMLClusterData` was added. It inherits from the `DoubleMLData` class and primarily adds support for specifying the cluster variables.
- Cluster robust cross-fitting with resampling of cluster variables is implemented in the abstract base class `DoubleML` and is used whenever a data-backend of class `DoubleMLClusterData` is passed as input. The implemented approach is described as Algorithm 1 in Chiang et al. (2021). For details, see the new notebook added to the example gallery (see doubleml-docs#40).
- Methods for estimating cluster-robust standard errors in double machine learning models have been added to the abstract base class `DoubleML`. They implement standard error estimation as described in Eq. (3.4)-(3.6) in Chiang et al. (2021). For details, see the new notebook added to the example gallery (see doubleml-docs#40).
- The current implementation is restricted to the one-way and two-way clustering cases. Extending it to the general multiway-clustering case should be straightforward.
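To illustrate the idea behind the cluster robust cross-fitting for the two-way case, here is a minimal, language-agnostic sketch in plain Python. It is not the code added in this PR. The function name `two_way_cluster_folds` and its arguments are hypothetical. The sketch assumes the sample-splitting scheme of Algorithm 1 in Chiang et al. (2021): both cluster dimensions are partitioned into K parts, the score for fold pair (k, l) is evaluated on observations whose clusters lie in I_k and J_l, and the nuisance functions are fitted only on observations that share neither cluster with that block.

```python
import random
from itertools import product


def two_way_cluster_folds(cluster_i, cluster_j, n_folds=3, seed=0):
    """Hypothetical helper: build K*K (train, test) index pairs for two-way clustering."""
    rng = random.Random(seed)

    def partition(labels):
        # Randomly assign each distinct cluster label to one of n_folds parts.
        unique = sorted(set(labels))
        rng.shuffle(unique)
        return {label: pos % n_folds for pos, label in enumerate(unique)}

    fold_i = partition(cluster_i)  # parts I_1, ..., I_K of the first cluster variable
    fold_j = partition(cluster_j)  # parts J_1, ..., J_K of the second cluster variable

    folds = []
    for k, l in product(range(n_folds), repeat=2):
        train, test = [], []
        for idx, (ci, cj) in enumerate(zip(cluster_i, cluster_j)):
            in_k = fold_i[ci] == k
            in_l = fold_j[cj] == l
            if in_k and in_l:
                test.append(idx)   # score is evaluated here
            elif not in_k and not in_l:
                train.append(idx)  # nuisance functions are fitted here
            # Observations sharing exactly one cluster with the test block
            # are used for neither purpose in this fold pair.
        folds.append((train, test))
    return folds


# Toy example: a fully crossed 6 x 6 design with one observation per cell.
N, M = 6, 6
cluster_i = [i for i in range(N) for _ in range(M)]
cluster_j = [j for _ in range(N) for j in range(M)]
folds = two_way_cluster_folds(cluster_i, cluster_j, n_folds=3)
```

Dropping the observations that share exactly one cluster with the test block is what keeps the training and evaluation samples free of common clusters; the price is that each nuisance fit uses only a fraction of the data.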
In the unit tests, we check against functional implementations for the one-way and two-way clustering cases. Additionally, standard unit tests for the newly added data-backend and functionalities have been added.
A notebook for the example gallery will be added in this PR: Cluster robust double machine learning doubleml-docs#40
The implementation for the Python package will be added in this PR: Cluster robust double machine learning doubleml-for-py#116
A comparison of the Python and R versions will be added in this PR: Cluster robust double machine learning doubleml-py-vs-r#8