Fine-Tuning with Your Dataset
The NuPIC Training Module lets you fine-tune a model for a specific task, or to improve its accuracy in a particular domain or use case.
This step is optional: depending on your use case, it is often appropriate to deploy models directly from the NuPIC Model Library to the Inference Server without any modifications. By default, NuPIC BERT models return embedding vectors, which are useful for a wide variety of tasks. For classification and question-answering, the Training Module adds the model heads needed to adapt BERT models to these tasks.
The following script fine-tunes a classification model on a financial sentiment dataset using the NuPIC Training Module, and saves the resulting model as a .tar.gz archive.
The script also prints the accuracy on the test set during the fine-tuning process. In your own application, you can choose a model from the NuPIC Model Library and use a dataset that is suitable for your use case.
Prerequisites
You should have a working NuPIC Training Module. Follow the instructions here to get it up and running.
Additionally, you will need to install our Python clients, which include the training client used to connect to the Training Module. Detailed instructions can be found here.
While it is possible to fine-tune using a CPU only, we currently recommend running the Training Module in a GPU-enabled environment so that fine-tuning can be completed in a timely manner. Please ensure that the necessary software is installed to enable GPU acceleration during fine-tuning.
Dataset Preparation
Let's start by navigating to the directory containing our fine-tuning example:
cd nupic/nupic.examples/examples/fine_tuned_sentiment_analysis
Next, we want to preview the dataset:
head -n 5 datasets/financial_sentiment.csv
Examining outputs from the command above, we see that the dataset has two columns: label and text. The label column contains the ground truth labels for each text, and can take on the values of positive, negative and neutral. This means we have a three-class classification task on our hands. The text column contains strings of financial news.
label,text
positive,"The GeoSolutions technology will leverage Benefon 's GPS solutions by providing Location Based Search Technology , a Communities Platform , location relevant multimedia content and a new and powerful commercial model ."
negative,"$ESI on lows, down $1.50 to $2.50 BK a real possibility"
positive,"For the last quarter of 2010 , Componenta 's net sales doubled to EUR131m from EUR76m for the same period a year earlier , while it moved to a zero pre-tax profit from a pre-tax loss of EUR7m ."
neutral,"According to the Finnish-Russian Chamber of Commerce , all the major construction companies of Finland are operating in Russia ."
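Before fine-tuning, it can help to confirm the header and see how the labels are distributed. The following is a minimal sketch using only the Python standard library. The function name summarize_labels is hypothetical and not part of NuPIC, and the two sample rows are copied from the preview above.

```python
import csv
import io
from collections import Counter

# Column names expected by the classification task, as described above.
REQUIRED_COLUMNS = ["label", "text"]

# Two rows copied from the dataset preview, used here as in-memory sample data.
SAMPLE = '''label,text
negative,"$ESI on lows, down $1.50 to $2.50 BK a real possibility"
neutral,"According to the Finnish-Russian Chamber of Commerce , all the major construction companies of Finland are operating in Russia ."
'''


def summarize_labels(handle):
    """Check the CSV header and count how many rows carry each label."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != REQUIRED_COLUMNS:
        raise ValueError(f"expected columns {REQUIRED_COLUMNS}, got {reader.fieldnames}")
    return Counter(row["label"] for row in reader)


counts = summarize_labels(io.StringIO(SAMPLE))
for label, count in counts.most_common():
    print(f"{label}: {count}")
```

To run the same check on the real file, open it with open("datasets/financial_sentiment.csv", newline="", encoding="utf-8") and pass the file handle to summarize_labels.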
Now we want to split the dataset into train and test subsets. The train subset is used for fine-tuning model weights, while the test subset is used for independently evaluating the accuracy of the fine-tuned model. Using the following Python script, you will be able to get the exact same splits so long as it is executed on the same machine. The script produces a train-test split of 80:20.
python divide_data.py
The splits are saved to CSV files:
datasets/
├── financial_sentiment.csv
├── financial_sentiment_test_dataset.csv
└── financial_sentiment_train_dataset.csv
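The contents of divide_data.py are not shown here, so the following is only a sketch of how a reproducible 80:20 split can be done with a fixed random seed. The function name, the seed value and the toy rows are assumptions for illustration, and the real script may split differently (for example, stratified by label).

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows with a fixed seed, then cut off the test share."""
    shuffled = list(rows)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy rows standing in for the real dataset.
rows = [{"label": "neutral", "text": f"example {i}"} for i in range(10)]
train, test = train_test_split(rows)
print(len(train), len(test))
```

Because the seed is fixed, calling the function again on the same rows returns the same two subsets.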
Fine-Tune a Model
To start fine-tuning, simply execute the following script:
chmod +x finetune.sh
./finetune.sh
If successful, you will see the following output after some time:
Connected to NuPIC Training Server.
Session key: ******************************
Parameters: {'model': 'nupic-sbert-1-v3', 'task_type': 'classification', 'batch_size': 16, 'epochs': 5, 'eval_metrics': ['accuracy'], 'epochs_per_test_run': 1, 'learning_rate': 1e-05, 'seed': 42, 'init_eval': False, 'cache_dir': '', 'overwrite_cache': False, 'key': 'CBADA2D3709C40A88259E55988BA84BA'}
Data uploaded.
Preprocessing complete.
Training started
Progress 20.0% (293/1465) | Epoch 1/5 | 399.0s -> 1596.0s | Evaluating on test set
Progress 20.0% (293/1465) | Epoch 1/5 | 400.0s -> 1602.0s | Evaluating on test set
Epoch 1 | Test eval_loss: 0.7789 | Test eval_accuracy: 0.6527
Epoch 2 | Test eval_loss: 0.6888 | Test eval_accuracy: 0.6869
Epoch 3 | Test eval_loss: 0.6513 | Test eval_accuracy: 0.7100
Epoch 4 | Test eval_loss: 0.6322 | Test eval_accuracy: 0.7160
Epoch 5 | Test eval_loss: 0.6286 | Test eval_accuracy: 0.7211
Progress 100.0% (1465/1465) | Epoch 5/5 | 2008.0s -> 0.0s | Training complete.
Process complete.
Downloading model
Model downloaded to ./nupic-sbert-1-v3-tuned-CBADA2D3709C40A88259E55988BA84BA.tar.gz
Results downloaded to ./results.csv
Let's break down what just happened. First, the Training Module verifies the incoming connection from the Python client and starts a training session with the defined training parameters (see the section below). Next, the data is uploaded from the client machine to the Training Module. The uploaded dataset is then tokenized, and the actual fine-tuning begins. Once fine-tuning is complete, the fine-tuned model and its predictions (results.csv) are downloaded to the client machine.
While the Training Module returns the accuracy metric by default, the downloaded results allow you to calculate more granular metrics and perform error analysis, so as to ensure that the model is performing as expected.
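As one example of a more granular metric, the sketch below computes per-class precision and recall from paired lists of true and predicted labels. The column layout of results.csv is not documented here, so the code works on plain lists. Reading the two lists from the file (for instance with csv.DictReader) and the column names to use are left as assumptions to check against your own download.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall)} computed from paired label lists."""
    true_positives, predicted, actual = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        actual[truth] += 1
        predicted[guess] += 1
        if truth == guess:
            true_positives[truth] += 1

    report = {}
    for label in sorted(set(actual) | set(predicted)):
        # Guard against division by zero for labels never predicted or never seen.
        precision = true_positives[label] / predicted[label] if predicted[label] else 0.0
        recall = true_positives[label] / actual[label] if actual[label] else 0.0
        report[label] = (precision, recall)
    return report


# Illustrative labels only, not taken from a real results.csv.
y_true = ["positive", "negative", "neutral", "neutral"]
y_pred = ["positive", "neutral", "neutral", "neutral"]
report = per_class_report(y_true, y_pred)
for label, (precision, recall) in report.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

A class with high overall accuracy but low recall, like the negative class in this toy data, is the kind of problem that the single accuracy number can hide.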
Python Client Configuration
Examining the contents of finetune.sh, you will see that it uses the NuPIC Python training client to communicate with the Training Module. The training client accepts command line arguments, each prefixed with --:
python -m nupic.client.nupic_train \
    --train_path "$train_dataset" \
    --test_path "$test_dataset" \
    --model nupic-sbert-1-v3 \
    --seed "$seed" \
    --url http://localhost:8321 \
    --batch_size 16 \
    --epochs 5
Here we explain a few of the more important arguments in greater detail, but you can run python -m nupic.client.nupic_train --help to see the full list.
Argument | Description |
---|---|
model | Base model from the NuPIC Model Library on which fine-tuning will be performed. The Training Module currently supports fine-tuning of non-generative models. |
task_type | Type of model task: classification or qa. |
url | URL or IP address of the Training Module, as well as the port through which it will be accessed. |
batch_size | The number of examples passed to the model in a single iteration. You may need to reduce this number to fit with your available GPU memory. |
wandb_api_key | Your API key to the Weights and Biases dashboard, for tracking fine-tuning progress. |
wandb_project_name | Your Weights and Biases project name. |
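If you prefer to launch the training client from Python rather than from a shell script, the same command can be assembled as an argument list. This is a sketch: build_train_command is a hypothetical helper, not part of the NuPIC client, and it uses only the flags and values that appear in finetune.sh and in the example output above.

```python
import shlex


def build_train_command(train_path, test_path, model="nupic-sbert-1-v3",
                        url="http://localhost:8321", seed=42,
                        batch_size=16, epochs=5):
    """Assemble the training client invocation as a list of arguments."""
    return [
        "python", "-m", "nupic.client.nupic_train",
        "--train_path", str(train_path),
        "--test_path", str(test_path),
        "--model", model,
        "--seed", str(seed),
        "--url", url,
        "--batch_size", str(batch_size),
        "--epochs", str(epochs),
    ]


command = build_train_command(
    "datasets/financial_sentiment_train_dataset.csv",
    "datasets/financial_sentiment_test_dataset.csv",
)
# Print the command in shell-quoted form for inspection; nothing is executed here.
print(shlex.join(command))
```

The list can then be passed to subprocess.run(command, check=True). Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.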
Task Types
Classification: In a classification task, the model assigns a piece of text to one of two or more categories. Examples of text classification tasks include sentiment analysis, spam detection, topic labeling and more. For such tasks, the Training Module strictly requires datasets to be structured following the example above; i.e., the columns must be named "label" and "text".
Question-answering: In question-answering (QA) tasks, the model is given a two-part input consisting of a question and a context. The context string is typically a short passage containing the answer. With this information, the model aims to predict the correct answer. Note that this is different from the chatbot use case typically associated with generative models.
For QA fine-tuning, NuPIC Training Module requires the dataset to follow the SQuAD format.
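For reference, the sketch below builds one record in the nested JSON layout used by the published SQuAD v1.1 files, where each answer stores its text and its character offset within the context. Whether the Training Module expects exactly this JSON layout or a flattened variant is an assumption to verify. The helper name, record ID, title and example text are all illustrative.

```python
import json


def make_squad_record(record_id, title, context, question, answer_text):
    """Build one SQuAD-style article with a single question and answer."""
    # SQuAD answers are spans of the context, located by character offset.
    answer_start = context.find(answer_text)
    if answer_start < 0:
        raise ValueError("answer_text must appear verbatim in context")
    return {
        "title": title,
        "paragraphs": [{
            "context": context,
            "qas": [{
                "id": record_id,
                "question": question,
                "answers": [{"text": answer_text, "answer_start": answer_start}],
            }],
        }],
    }


context = "Componenta's net sales doubled to EUR131m from EUR76m a year earlier."
record = make_squad_record(
    "example-0001", "Componenta", context,
    "What did Componenta's net sales double to?", "EUR131m",
)
dataset = {"version": "1.1", "data": [record]}
print(json.dumps(dataset, indent=2))
```

Computing answer_start from the context, rather than typing it by hand, keeps the offset consistent with the text if the passage is edited later.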
Deploying the Fine-Tuned Model for Inference
Now that we have a fine-tuned model, we want to make it available for use on the Inference Server. To do this, we want to transfer the fine-tuned model to the Inference Server, and extract it there. The commands below assume that the Python inference client and Inference Server are running on the same machine. If not, please modify the first two lines accordingly.
cp model_xxx.tar.gz <path_to_nupic>/nupic/inference/models
cd <path_to_nupic>/nupic/inference/models
tar -xzvf model_xxx.tar.gz
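The same copy-and-extract step can also be done from Python, which may be convenient at the end of a training script. This is a sketch using only the standard library. install_model is a hypothetical helper, and both paths must be supplied by you, as in the commands above.

```python
import shutil
import tarfile
from pathlib import Path


def install_model(archive, models_dir):
    """Copy a model archive into the models directory and extract it there."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    copied = Path(shutil.copy(archive, models_dir))
    # Only extract archives you trust, such as ones produced by your own training run.
    with tarfile.open(copied, "r:gz") as tar:
        tar.extractall(models_dir)
    return copied
```

Call it with the path to the downloaded archive and the path to the Inference Server's models directory.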
You've just added your fine-tuned model to your Model Library. Take note of the name of your newly fine-tuned model. Now you're ready to use the Python inference client to call it!