Prompts generated from ChatGPT3.5, ChatGPT4, LLama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameters configurations

From MaRDI portal



DOI: 10.5281/zenodo.11121394
Zenodo: 11121394
MaRDI QID: Q6696307
FDO: Q6696307

Dataset published in the Zenodo repository.

Pedro Reviriego, José Alberto Hernández, Gonzalo Martínez, Javier Conde, Elena Merino

Publication date: 4 May 2024

Copyright license: Creative Commons Attribution 4.0 International



Description

Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations. The dataset is useful for studying lexical aspects of LLMs under different parameter/role configurations.

The 0_Base_Topics.xlsx file lists the topics used for the dataset generation. The remaining files collect the answers of ChatGPT to these topics under different configurations of parameters/context.

Parameters:

Temperature: Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic.
Frequency penalty: A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.
Top probability (top_p): An alternative to sampling with temperature, called nucleus sampling, in which the model considers only the tokens comprising the top_p probability mass.
Presence penalty: A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood of talking about new topics.

Roles (context):

Default: No role is assigned to the LLM; the default role is used.
Child: The LLM is requested to answer as a five-year-old child.
Young adult male: The LLM is requested to answer as a young male adult.
Young adult female: The LLM is requested to answer as a young female adult.
Elderly adult male: The LLM is requested to answer as an elderly male adult.
Elderly adult female: The LLM is requested to answer as an elderly female adult.
Affluent adult male: The LLM is requested to answer as an affluent male adult.
Affluent adult female: The LLM is requested to answer as an affluent female adult.
Lower-class adult male: The LLM is requested to answer as a lower-class male adult.
Lower-class adult female: The LLM is requested to answer as a lower-class female adult.
Erudite: The LLM is requested to answer as an erudite who uses a rich vocabulary.

Paper: Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study

Cite:

@article{10.1145/3696459,
  author = {Mart\'{\i}nez, Gonzalo and Hern\'{a}ndez, Jos\'{e} Alberto and Conde, Javier and Reviriego, Pedro and Merino-G\'{o}mez, Elena},
  title = {Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study},
  year = {2024},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  issn = {2157-6904},
  url = {https://doi.org/10.1145/3696459},
  doi = {10.1145/3696459},
  note = {Just Accepted},
  journal = {ACM Trans. Intell. Syst. Technol.},
  month = sep,
  keywords = {LLM, Lexical diversity, ChatGPT, Evaluation}
}
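As a rough illustration of how such a parameter/role grid could be enumerated, here is a minimal Python sketch. The role wordings, parameter values, and topic strings below are illustrative assumptions for the example, not the authors' exact settings; it builds one chat-completion request payload per (topic, role, parameter configuration) combination without calling any API.

```python
from itertools import product

# A subset of the roles described above; the exact system-prompt wording
# here is an assumption for illustration.
ROLES = {
    "Default": None,  # no system prompt: the model's default persona is used
    "Child": "Answer as a five-year-old child.",
    "Young adult male": "Answer as a young male adult.",
    "Elderly adult female": "Answer as an elderly female adult.",
    "Erudite": "Answer as an erudite who uses a rich vocabulary.",
}

# Each configuration varies one sampling parameter (assumed example values).
PARAM_CONFIGS = [
    {"temperature": 0.2},
    {"temperature": 0.8},
    {"frequency_penalty": 1.0},
    {"presence_penalty": 1.0},
    {"top_p": 0.5},
]

def build_requests(topics):
    """Return one request payload per (topic, role, parameter config)."""
    requests = []
    for topic, (role, system), params in product(
        topics, ROLES.items(), PARAM_CONFIGS
    ):
        messages = []
        if system is not None:
            messages.append({"role": "system", "content": system})
        messages.append({"role": "user", "content": topic})
        requests.append({"role_label": role, "messages": messages, **params})
    return requests

# Two placeholder topics stand in for the NYT/HC3 topic lists.
grid = build_requests(["Climate change", "Electric cars"])
print(len(grid))  # 2 topics x 5 roles x 5 configs = 50 requests
```

Each payload in the resulting grid could then be sent to the corresponding model, and the answers saved per configuration, mirroring the per-file organization of the dataset.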
