(named) entity recognition
MultiCardioNER Task at BioASQ 2024 (CardioDis subtrack)
DrugTEMIST-2024
A collection of cardiology-specific clinical case reports annotated with drugs.
GenoVarDis
The corpus has two parts: (i) documents with tmVar3 annotations (Wei et al., 2022), including PubMed abstracts, which were translated and manually curated and to which associated diseases and symptoms were added; and (ii) PubMed abstracts in Spanish that were annotated manually.
MultiCoNER-ES
MultiCoNER is a large multilingual dataset for Named Entity Recognition that covers 3 domains (Wiki sentences, questions, and search queries) across 11 languages, along with multilingual and code-mixed subsets. It is designed to represent contemporary challenges in NER, including low-context scenarios (short and uncased text), syntactically complex entities such as movie titles, and long-tail entity distributions.
SocialDisNER
The goal of SocialDisNER is the automatic recognition of disease mentions in tweets.
LivingNER
DIANN-2018-ES
The corpus is a collection of 500 abstracts from Elsevier journal papers in the biomedical domain, collected between 2017 and 2018. It is divided into two disjoint parts: a training set (80%) and a test set (20%). It is annotated with disabilities, negations, and the scope of each negation.
MEDDOCAN
CAPITEL-NER
DisTEMIST