Sentiment Analysis
Sentiment Analysis allows you to interpret and classify emotions within text. It’s particularly useful for understanding customer feedback, social media monitoring, product reviews, and many other applications where understanding sentiment can provide valuable insights. This page shows how to perform inference for sentiment analysis on a financial dataset. The model predicts three possible sentiment labels, positive, negative, and neutral, as well as the associated probabilities for each prediction.
Quick Start
Before you start, make sure the NuPIC Inference Server and Training Module are up and running, and the Python environment is set up.
Fine-tuning required
A fine-tuned model is required for sentiment analysis. By default, NuPIC BERT models simply return an embedding vector, so the Training Module adds a model head (consisting of simple linear layers) for classification tasks like sentiment analysis.
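To make this concrete, here is a minimal, hypothetical sketch of what such a classification head looks like in PyTorch. The embedding size (768) and the single linear layer are illustrative assumptions, not NuPIC internals; the Training Module manages the actual head for you.

```python
# Illustrative sketch only -- NOT the Training Module's internal code.
# A linear classification head maps a fixed-size sentence embedding
# (as returned by a NuPIC BERT model) to logits over the three labels.
import torch
import torch.nn as nn

class SentimentHead(nn.Module):
    def __init__(self, embedding_dim: int = 768, num_labels: int = 3):
        super().__init__()
        self.classifier = nn.Linear(embedding_dim, num_labels)

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.classifier(embedding)  # raw logits, one per label

# Example: one embedding in, probabilities for negative/neutral/positive out.
head = SentimentHead()
embedding = torch.randn(1, 768)  # stand-in for a real embedding vector
probs = torch.softmax(head(embedding), dim=-1)
print(probs)
```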
Now navigate to the directory containing the sentiment analysis example:
```bash
cd nupic.examples/examples/fine_tuned_sentiment_analysis
```
Open `inference.py` in a text editor to check the configurations. Specifically, we want to make sure that the Python client is pointing to the correct Inference Server URL. The URL below works if the Inference Server is hosted on port 8000 on the same machine as the Python client; otherwise, adjust it accordingly.
```python
MODEL = args.model_name
URL = "localhost:8000"  # <-- point this at your Inference Server
BATCH_SIZE = 4
PROTOCOL = "http"
DATA = f"{script_dir}/datasets/financial_sentiment_test_dataset.csv"
CERTIFICATES = {}
```
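Before running the script, you can optionally confirm the server is reachable. The sketch below assumes the Inference Server exposes a Triton-compatible HTTP endpoint (suggested by the `http` protocol and port 8000 defaults above); it uses the generic `tritonclient` package and is not part of `inference.py`.

```python
# Optional connectivity check -- assumes a Triton-compatible HTTP endpoint.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
print("Server live: ", client.is_server_live())
print("Server ready:", client.is_server_ready())
```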
We'll now perform inference on the test dataset specified above. The argument below selects the sentiment analysis model that we fine-tuned earlier.
```bash
python inference.py nupic-sbert.base-v3-tuned-CBADA2D3709C40A88259E55988BA84BA
```
Since we are running on a test dataset with known ground-truth labels, the script can also evaluate the model by calculating a confusion matrix and an accuracy score:
```
Calculating logits for test set...
Confusion matrix:
[[ 68  64  40]
 [ 43 537  46]
 [ 32 101 238]]
Accuracy: 0.7211291702309667
```
The confusion matrix shows the number of predictions against the actual ground-truth labels in the following format:
|                    | Actual Negative | Actual Neutral | Actual Positive |
| ------------------ | --------------- | -------------- | --------------- |
| Predicted Negative |                 |                |                 |
| Predicted Neutral  |                 |                |                 |
| Predicted Positive |                 |                |                 |
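If you want to reproduce this kind of evaluation outside the script, a minimal sketch with scikit-learn follows. The logits and labels are illustrative placeholders, not values from `inference.py`. Note that scikit-learn's `confusion_matrix` puts actual labels on rows and predicted labels on columns, the transpose of the table above.

```python
# Minimal evaluation sketch with scikit-learn; placeholder data only.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Stand-in logits for three samples (columns: negative, neutral, positive).
logits = np.array([[ 2.1, 0.3, -1.0],
                   [ 0.2, 1.7,  0.1],
                   [-0.5, 0.1,  2.2]])
y_true = np.array([0, 1, 2])    # ground-truth labels
y_pred = logits.argmax(axis=1)  # predicted label = highest logit

print(confusion_matrix(y_true, y_pred))  # rows: actual, columns: predicted
print("Accuracy:", accuracy_score(y_true, y_pred))
```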