Prompt-Response Datasets
Last updated
Last updated
The Prompt-Response dataset type in UbiAI is designed to streamline the creation and management of datasets for training text generation models, such as Large Language Models. This comprehensive page of the documentation outlines how to work with Prompt-Response datasets.
To create your Prompt-Response dataset, UbiAI provides you with two primary options:
By clicking on the "Generate Dataset" button, you can use UbiAI’s dataset generator to create a synthetic dataset from scratch. Here are the steps:
click on the "Generate Dataset" option when configuring your prompt-response dataset.
Add your "Dataset Details" such as Name, Language and description.
Upload existing variables or add new ones to be used in the prompt to generate text based on their variants.
Set the parameters like the model you would like to use and its Temperature, and write the prompts to generate your text.
generate your dataset.
This is a quick and efficient way to produce a dataset without manually collecting data, offering flexibility and scalability for various use cases.
If you already have a dataset, you can upload it in CSV format. Ensure that your CSV file contains the
System Prompt: The initial instruction or context provided by the system.
User Prompt: The input or query from the user.
Input: Additional context or data relevant to generating a response.
Response: The output generated by the model based on the provided prompts.
After uploading your CSV file, UbiAI will guide you through a column mapping process. This step ensures that the platform understands how to interpret your dataset.
During mapping:
Assign each column in your CSV file to one of the four required fields: System Prompt, User Prompt, Input, and Response.
Once mapping is complete, click on the "Finish" button to finalize the process.
The platform will then start processing your data, transforming each row in the CSV file into an individual document in your dataset.
Once your dataset is processed, you can review and refine it as needed. UbiAI offers robust tools for dataset refinement. After processing, each document is accessible for review and modification:
To edit a document, click the "Edit" button in the top-right corner. This allows you to make adjustments to any field, ensuring data accuracy and consistency.
If needed, access the integrated Response Generator to create or modify responses dynamically without having to rely on an external source.
UbiAI’s integrated Response Generator is a powerful tool relies on LLMs to suggest new responses based on your dataset. Here’s how you can use it:
In the left-hand parameters menu, choose the model you want to use for response generation. UbiAI supports a variety of pre-trained models, enabling you to select one that fits your requirements.
The temperature setting controls the creativity and variability of the model’s output:
A low temperature (e.g., 0.2) results in more predictable and deterministic responses.
A high temperature (e.g., 0.8) generates more diverse and creative responses.
Adjust this setting based on your project’s needs.
After selecting the model and setting the temperature, click the "Generate" button in the top-right corner. The model will generate a new response for the selected prompt, which you can review and edit further if needed.
Validation ensures dataset quality and readiness for training. UbiAI provides two methods for validating documents:
Manually validate each document one at a time, reviewing the prompt-response pairs to ensure they meet your quality standards.
For larger datasets, you can validate multiple documents simultaneously:
Navigate to the Dataset Versions menu.
Click the drop-down menu next to the name feild and select "Select All".
Click "Validate" to approve all selected documents.
You can continue to expand your dataset even after initial creation. UbiAI provides two options:
Upload Additional CSV Files: Simply upload new CSV files, map their columns as before, and merge them into your existing dataset.
Generate New Responses: Click on the "Generate New Response" button to use UbiAI to create additional documents. The new documents will be based on the data you have already provided, ensuring consistency and relevance.
When your dataset is ready, you can either use it to finetune on the platform or export it for external use. Exporting your dataset is very simple:
Go to the Dataset Versions menu.
Click on the "Export" button.
The platform will generate a CSV file containing all validated data. This file is immediately downloadable and compatible with various tools and frameworks.
For easy integration with your applications, UbiAI offers an API that supports dataset management. With the API, you can programmatically upload datasets, retrieve validated data, and interact with projects. Below are examples of API usage:
You can select File Type (only CSV files are available for the Prompt-Response Datasets) and Upload using this API code:
Select Export Type (only CSV files are available for the Prompt-Response Datasets) and Export using this API code: