Software and Toolboxes
RejSumm: a toolbox with rejection learning and inference models for hallucination reduction in text summarization.
EntFA: a toolbox for hallucinated entity detection in text generation.
FactCorr: a toolbox of post-processing scripts that directly edit text summarization outputs to reduce factual errors.
HipoRank: an unsupervised extractive long-document summarization model that uses hierarchical and positional information.
EditNTS: an edit-based method for text and sentence simplification.
BanditSum: an extractive summarization algorithm trained with reinforcement learning in a contextual multi-armed bandit setup.
HierClass: a deep hierarchical neural attention-based classifier for hierarchical taxonomy datasets.
Datasets
FacetSum: a dataset for structured scientific document summarization. It comprises 60,024 scientific articles from Emerald journals, each with a structured abstract that summarizes the purpose, method, findings, and value.
Multi-XScience: a large-scale dataset for extreme multi-document summarization of scientific articles.
Please send me an email if you find bugs or have comments!