question answering
Spa-DataBench
The corpus consists of ten tabular datasets from surveys conducted by official organizations such as CIS, CEA, CRS, and 40dB. It includes a total of 200 question–answer pairs, organized as tuples (dataset, question, answer), a format that makes the corpus easy to extend and reuse.
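As a rough illustration of the (dataset, question, answer) tuple layout described above, the following Python sketch models one record and groups records by their source table. The class name `QAExample`, the helper `by_dataset`, and all sample values are hypothetical placeholders, not taken from the corpus itself.

```python
from typing import NamedTuple


class QAExample(NamedTuple):
    """One (dataset, question, answer) tuple; the field names are assumptions."""
    dataset: str   # identifier of the tabular dataset the question refers to
    question: str  # natural-language question about that table
    answer: str    # expected answer, kept as text for simplicity


# Invented placeholder records, for illustration only.
examples = [
    QAExample("survey_a", "How many respondents are over 65?", "412"),
    QAExample("survey_a", "Which age group is the largest?", "35-44"),
    QAExample("survey_b", "Which region has the most responses?", "Madrid"),
]


def by_dataset(records):
    """Group records by dataset so each table can be evaluated separately."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.dataset, []).append(record)
    return grouped


grouped = by_dataset(examples)
```

Because every record carries its own dataset identifier, extending the collection amounts to appending new tuples, with no change to the existing ones.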
SQUAD-SQAC 2024 ES
SQUAD/SQAC 2024 is an extension of two datasets: SQuAD v1.1 (Stanford Question Answering Dataset) (Rajpurkar et al., 2016) for English and SQAC (Spanish Question Answering Corpus) (Gutiérrez-Fandiño et al., 2021) for Spanish. It contains academic news from CSIC (Consejo Superior de Investigaciones Científicas) for Spanish and from Cambridge University for English, with questions and extractive answers.
SQAC
The Spanish Question Answering Corpus (SQAC) is an extractive QA dataset with no unanswerable questions. Its texts come from the Spanish Wikipedia, encyclopedic articles, newswire articles from Wikinews, and the Spanish section of the AnCora corpus, which mixes different newswire and literature sources. Annotators were commissioned to write 18,817 questions and mark their answer spans across 6,247 textual contexts.
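In an extractive QA dataset, each answer is a literal span of its context, usually stored as the answer text plus a character offset. The Python sketch below shows that idea under stated assumptions: the `answer_start` naming follows the SQuAD v1.1 convention, the assumption that SQAC records use the same layout is not confirmed by this page, and the context sentence and the helper `extract_span` are invented for illustration.

```python
def extract_span(context: str, answer_start: int, answer_text: str) -> str:
    """Return the answer span from the context and check that it matches."""
    answer_end = answer_start + len(answer_text)
    span = context[answer_start:answer_end]
    if span != answer_text:
        # A mismatch means the offset and the text do not describe the same span.
        raise ValueError("answer_start does not point at answer_text")
    return span


# Invented example record, not taken from the corpus.
context = "El corpus contiene preguntas con respuestas extractivas."
answer_text = "respuestas extractivas"
answer_start = context.find(answer_text)  # character offset of the answer

span = extract_span(context, answer_start, answer_text)
```

A check like this is a cheap way to validate annotations: since no question is unanswerable, every record should yield a non-empty span found in its context.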

