fastLink: R package for probabilistic record linkage. fastLink merges two datasets under the Fellegi-Sunter model, estimating the model with the Expectation-Maximization algorithm. It implements the methods described in Enamorado, Fifield, and Imai (2019), "Using a Probabilistic Model to Assist Merging of Large-scale Administrative Records." In 2021, the Society for Political Methodology recognized the fastLink team with its award for statistical software that makes a significant research contribution.
Code for a deduplication example using fastLink can be accessed [HERE].
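For readers unfamiliar with the Fellegi-Sunter model, the following is a minimal Python sketch of the underlying idea, not fastLink's code or API. It assumes the simplest version of the model: each candidate record pair is summarized by a binary agreement pattern (1 if a field agrees, 0 otherwise), fields are conditionally independent given match status, and there is no missing data. The function name `fellegi_sunter_em`, its parameters, and the toy counts are all hypothetical and chosen only for illustration.

```python
def fellegi_sunter_em(patterns, n_iter=200, lam=0.1, m_init=0.9, u_init=0.1):
    """Fit a two-class mixture (match / non-match) to binary agreement patterns.

    patterns: list of 0/1 tuples, one per candidate record pair,
              one entry per compared field (1 = the field agrees).
    Returns (lam, m, u, posteriors), where
      lam   = estimated share of pairs that are matches,
      m[k]  = P(field k agrees | match),
      u[k]  = P(field k agrees | non-match),
      posteriors[i] = P(pair i is a match | its agreement pattern).
    """
    n_pairs = len(patterns)
    n_fields = len(patterns[0])
    m = [m_init] * n_fields
    u = [u_init] * n_fields
    eps = 1e-6  # keeps probabilities away from exactly 0 or 1

    def clip(p):
        return min(max(p, eps), 1.0 - eps)

    def pattern_prob(g, probs):
        # Probability of pattern g under independent Bernoulli fields.
        out = 1.0
        for g_k, p_k in zip(g, probs):
            out *= p_k if g_k else (1.0 - p_k)
        return out

    def e_step():
        post = []
        for g in patterns:
            a = lam * pattern_prob(g, m)          # match component
            b = (1.0 - lam) * pattern_prob(g, u)  # non-match component
            post.append(a / (a + b))
        return post

    for _ in range(n_iter):
        post = e_step()
        # M-step: posterior-weighted averages of the agreement indicators.
        total = sum(post)
        lam = clip(total / n_pairs)
        for k in range(n_fields):
            agree_match = sum(w * g[k] for w, g in zip(post, patterns))
            agree_non = sum((1.0 - w) * g[k] for w, g in zip(post, patterns))
            m[k] = clip(agree_match / total)
            u[k] = clip(agree_non / (n_pairs - total))

    return lam, m, u, e_step()


# Toy data (made up): three compared fields, 100 candidate pairs.
toy_patterns = (
    [(1, 1, 1)] * 8 + [(1, 1, 0)] * 2
    + [(0, 0, 0)] * 60 + [(1, 0, 0)] * 15
    + [(0, 1, 0)] * 10 + [(0, 0, 1)] * 5
)
lam_hat, m_hat, u_hat, post_hat = fellegi_sunter_em(toy_patterns)
```

Pairs whose posterior match probability exceeds a chosen threshold would then be declared matches. The sketch works on one table of candidate pairs, so it applies equally to merging two files or deduplicating one; the starting values place the match class on the high-agreement side, which is what keeps the two mixture components from swapping labels.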
activeText: R package that implements a probabilistic model for text classification with active learning, following the approach described in Bosley et al. (forthcoming), “Improving Probabilistic Models in Text Classification via Active Learning.”