Fine-Tune Your Models
In UBIAI, fine-tuning your models is a simple process designed to help you achieve outstanding performance in various natural language processing tasks. This page will walk you through the process of fine-tuning models step-by-step, ensuring that you maximize the potential of your datasets and workflows.
Fine-tuning your model on UBIAI allows you to customize and optimize pre-trained models for your specific needs, enhancing their performance on your particular dataset. UBIAI provides a user-friendly, code-free environment that simplifies the fine-tuning process, making it accessible for both novice and experienced users.
On the Models page, you can manage and monitor all your models. You can find:
Trained Models: Models you’ve already fine-tuned, complete with performance metrics.
Untrained Models: Models that are created but not yet fine-tuned, ready for your data.
Model Details: Each model includes metadata such as task type, model type (e.g., spaCy, Llama), training status, and training history.
This central hub simplifies your model development process and keeps you organized across projects.
UBIAI supports the following tasks to address a broad range of NLP use cases:
Named Entity Recognition (NER): NER models are trained to identify specific entities in text, such as names, locations, dates, and custom domain-specific entities (e.g., product codes, medical terms).
Relation Extraction: Relation Extraction models identify and classify relationships between two entities, such as "employee of," "located in," or "caused by."
Span Categorization: This task involves labeling spans of text, such as sentences or paragraphs, with predefined categories.
Text Classification: Text classification models label entire documents or sections of text into categories such as sentiment (positive, negative, neutral), topic, or priority.
Large Language Models: LLMs can be trained for tasks like question answering and Text Generation.
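To make the NER task concrete, the sketch below shows one common way to represent an NER model's output: labeled character spans over the input text. The span schema here is a generic illustration, not UBIAI's actual export format.

```python
# Illustrative only: NER output represented as labeled character spans.
# This generic (start, end, label) schema is an assumption for the example,
# not UBIAI's actual export format.
text = "Acme Corp hired Jane Doe in Paris on 2023-05-01."

# Each entity is (start, end, label), where text[start:end] is the matched span.
entities = [
    (0, 9, "ORG"),
    (16, 24, "PERSON"),
    (28, 33, "LOC"),
    (37, 47, "DATE"),
]

for start, end, label in entities:
    print(f"{text[start:end]!r} -> {label}")
```

The same span representation underlies relation extraction, where a second layer of labels connects pairs of spans (e.g., "Jane Doe" is an "employee of" "Acme Corp").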
Navigate to the Models page and click Create New Model, then select the task you want to train your model for:
Named Entity Recognition (NER)
Relation Extraction
Text Classification
Text Generation
Then name your model and choose the model type to proceed with creating your custom model. You can pick spaCy or LLM depending on your task.
After you add your model details you can either assign an existing dataset or import a new one:
Assign Dataset: Link an existing dataset to your model. The dataset becomes exclusive to the model, ensuring it is not accidentally modified by other models.
Import Dataset: Combine documents from multiple datasets to create a new one. You can include text only or both text and annotations.
Before initiating training, make sure to:
Validate the dataset in the Dataset tab to ensure annotations are complete and accurate.
Select the model in the Training Configurations panel.
Select the training/validation split ratio.
Configure hyperparameters:
Epochs: The number of passes over the dataset.
Dropout: The fraction of network units randomly deactivated during training to reduce overfitting.
Batch Size: The number of samples processed together during training.
Learning Rate: The step size used when updating model weights during training.
Review your configurations and click Start Model Training.
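To clarify what these configurations control, here is a minimal, generic training-loop sketch. It is an illustration of the concepts (split ratio, epochs, batch size, learning rate, dropout), not UBIAI's internal training code.

```python
# Generic sketch of what the training configurations control.
# This is an illustration of the concepts only, not UBIAI's actual training code.
import random

random.seed(0)
dataset = list(range(100))  # stand-in for 100 annotated documents

# Training / validation ratio: e.g. 80% of documents for training, 20% held out.
split = int(0.8 * len(dataset))
random.shuffle(dataset)
train_set, val_set = dataset[:split], dataset[split:]

epochs = 5            # number of full passes over the training set
batch_size = 16       # samples processed together in each update
learning_rate = 1e-3  # step size for each weight update
dropout = 0.2         # fraction of units randomly deactivated per step

for epoch in range(epochs):
    for i in range(0, len(train_set), batch_size):
        batch = train_set[i:i + batch_size]
        # ... forward pass with dropout applied, backward pass,
        # and a weight update scaled by learning_rate ...
        pass

print(len(train_set), len(val_set))  # 80 20
```

In UBIAI itself, all of this is handled for you once the hyperparameters are set in the Training Configurations panel.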
Once training is complete, navigate to the Model Dashboard to view performance metrics: F1-Score, Precision, Recall.
LLM scores are not yet provided; we are working on including them in a future release.
In addition to the model and entity scores, you can visualize the loss, Precision, Recall, and F1 curves for a specific training run.
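For reference, the F1-score is the harmonic mean of precision and recall, which can be computed as:

```python
# F1 is the harmonic mean of precision and recall.
def f1_score(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# e.g., precision 0.8 and recall 0.6 give an F1 of about 0.686
print(round(f1_score(0.8, 0.6), 3))
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two values, so a model cannot score well by excelling at only one of precision or recall.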
UBIAI also gives you access to a confusion matrix, so you can easily visualize the number of correctly and incorrectly classified outputs.
A confusion matrix summarizes the performance of a classification model by showing the counts of true positives, true negatives, false positives, and false negatives. It is useful for understanding the types of errors a model makes, especially in classification tasks (e.g., sentiment analysis, topic classification).
The matrix presents four key values that represent the outcomes of a classification task:
True Positive (TP): The number of instances that were correctly predicted as positive.
False Positive (FP): The number of instances that were incorrectly predicted as positive.
True Negative (TN): The number of instances that were correctly predicted as negative.
False Negative (FN): The number of instances that were incorrectly predicted as negative.
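The four cells can be computed directly by comparing true and predicted labels. The tiny binary example below (with made-up "pos"/"neg" labels) shows how each count is tallied:

```python
# Counting confusion-matrix cells for a binary classifier.
# The labels and data below are made up for illustration.
y_true = ["pos", "pos", "neg", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg"]

tp = sum(t == "pos" and p == "pos" for t, p in zip(y_true, y_pred))  # correct positives
fp = sum(t == "neg" and p == "pos" for t, p in zip(y_true, y_pred))  # wrongly flagged positives
tn = sum(t == "neg" and p == "neg" for t, p in zip(y_true, y_pred))  # correct negatives
fn = sum(t == "pos" and p == "neg" for t, p in zip(y_true, y_pred))  # missed positives

print(tp, fp, tn, fn)  # 2 1 2 1
```

Precision (TP / (TP + FP)) and recall (TP / (TP + FN)) shown on the Model Dashboard are derived from exactly these counts.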
Programmatic model training for NER and relation extraction is available through our API. This feature is only available for the Growth and Business packages:
Within the model details dashboard, select the 'Train model with API' drop-down menu.
Copy/paste the generated code into your application.
Run the script to launch the training (you can input the hyperparameters of your choice).
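The generated code from the drop-down menu is the authoritative script to use. For orientation only, a training launch typically looks something like the sketch below; the endpoint URL, token placeholder, and payload field names here are assumptions, not UBIAI's real API:

```python
# Hypothetical sketch of launching a training run over HTTP.
# The endpoint URL, token placeholder, and payload fields are assumptions;
# use the exact code generated by the 'Train model with API' menu instead.
import json
from urllib import request

API_URL = "https://api.example.com/models/train"  # placeholder, not the real endpoint
API_TOKEN = "YOUR_API_TOKEN"                      # placeholder

payload = {
    "model_id": "my-ner-model",  # hypothetical identifier
    "epochs": 10,
    "dropout": 0.2,
    "batch_size": 16,
    "learning_rate": 1e-3,
}

def build_training_request(url: str, token: str, body: dict) -> request.Request:
    """Build the POST request; sending it (request.urlopen) is left to the caller."""
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_training_request(API_URL, API_TOKEN, payload)
print(req.get_method(), req.full_url)
```

The hyperparameters in the payload mirror the ones configured in the Training Configurations panel, so you can tune the same settings from code.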