ChatGPT Evaluation Dataset v.2.0 (Q6696495)
From MaRDI portal
Dataset published at Zenodo repository.
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | ChatGPT Evaluation Dataset v.2.0 | Dataset published at Zenodo repository. | |
Statements
We tested ChatGPT on 25 tasks that cover common NLP problems and require analytical reasoning. The tasks include (1) relatively simple binary classification of texts, such as spam, humor, sarcasm, and aggression detection, or judging the grammatical correctness of a text; (2) more complex multiclass and multi-label classification of texts, such as sentiment analysis and emotion recognition; (3) reasoning with personal context, i.e., personalized versions of the problems that use additional information about how a given user perceives texts (examples from that user are provided to ChatGPT); (4) semantic annotation and acceptance of text, moving towards natural language understanding (NLU), such as word sense disambiguation (WSD); and (5) answering questions based on the input text. More information is available in the paper: https://www.sciencedirect.com/science/article/pii/S156625352300177X
2.0