Determining Your Use Case

Before diving into our model library, pinpoint the exact task you want to accomplish. A task can appear to require a generative model when a non-generative model would achieve the same objective with less complexity and at a fraction of the cost.

Depending on your server capacity or endpoint device, you might prefer smaller models or models with lower computational requirements, even at the cost of some accuracy.

Understanding Model Types

The term "large language model" (LLM) is commonly associated with models capable of generating human-like responses. However, from a technical standpoint, LLMs come in both generative and non-generative variants.

Generative Models: These models generate new data samples from the same distribution as the training data. GPT-style models are a well-known example. They are ideal for tasks like text summarization or chatbots.
Non-Generative Models: These models make predictions or classify data without necessarily generating new data. They are sometimes called "embedding models". They are well suited to tasks like text classification, sentiment analysis, and entity recognition.

NuPIC offers both generative and non-generative models for a variety of NLP tasks. You can find the Model Library here.

Assessing Model Performance

Here are the primary metrics you should consider when choosing a model:

Throughput: Measures the number of tasks a model can handle in a given time period. This is important for applications that must process large amounts of data quickly.
Latency: Measures the time taken for a model to give an output after receiving an input. This is vital for real-time applications where rapid responses are necessary.
Accuracy: Refers to how often the model's predictions are correct. This metric is critical when the quality of the prediction is paramount.
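
As a rough illustration of how latency and throughput can be measured, here is a minimal Python sketch. It does not use any NuPIC API: `fake_model` is a hypothetical stand-in for a real inference call, and `benchmark` is a helper name chosen for this example. The sketch sends requests one at a time, so it shows the measurement idea rather than a realistic load test.

```python
import statistics
import time


def fake_model(text):
    """Hypothetical stand-in for a real inference call (an assumption)."""
    time.sleep(0.001)  # simulate a small amount of work
    return len(text) % 2  # dummy prediction


def benchmark(predict, inputs):
    """Time each call to `predict` and summarize latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for item in inputs:
        t0 = time.perf_counter()
        predict(item)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": statistics.mean(latencies),  # average time per request
        "max_latency_s": max(latencies),               # slowest single request
        "throughput_per_s": len(inputs) / elapsed,     # requests completed per second
    }


if __name__ == "__main__":
    print(benchmark(fake_model, ["sample text"] * 50))
```

To try this against a real deployment, replace `fake_model` with a function that calls your model and pass in inputs that resemble production traffic.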

Throughput and latency are collectively known as performance metrics. There is generally a trade-off between the two: settings that raise throughput tend to increase latency, and vice versa. The good news is that NuPIC allows you to configure this balance to suit your needs.
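
One common source of tension between throughput and latency is request batching. The sketch below uses a deliberately simple cost model to show the effect. The fixed overhead and per-item cost are made-up numbers, and nothing here describes how NuPIC itself schedules work.

```python
def batch_latency_ms(batch_size, overhead_ms=8.0, per_item_ms=2.0):
    """Time to process one batch under an assumed linear cost model."""
    return overhead_ms + per_item_ms * batch_size


def throughput_per_s(batch_size, overhead_ms=8.0, per_item_ms=2.0):
    """Items completed per second when batches run back to back."""
    latency = batch_latency_ms(batch_size, overhead_ms, per_item_ms)
    return batch_size * 1000.0 / latency


for batch_size in (1, 4, 16):
    print(batch_size, batch_latency_ms(batch_size), throughput_per_s(batch_size))
# prints:
# 1 10.0 100.0
# 4 16.0 250.0
# 16 40.0 400.0
```

Under these assumed costs, growing the batch from 1 to 16 quadruples throughput, but every request in the batch now waits 40 ms instead of 10 ms.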

Here, we use the term accuracy in the broad sense, which may encompass more granular metrics such as precision, recall, F1, and many more. It is important to select granular metrics that are well-aligned with both your business requirements and data characteristics (e.g., structure, class balance).
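
To see why class balance matters when choosing a metric, consider the following Python sketch. The function names and the toy labels are assumptions made for this example. It computes plain accuracy alongside precision, recall, and F1 for a binary task.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Report 0.0 when a denominator would be zero instead of raising.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy imbalanced data: only 2 of 10 examples are positive,
# and the "model" always predicts the negative class.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
y_pred = [0] * 10

print(accuracy(y_true, y_pred))             # 0.8
print(precision_recall_f1(y_true, y_pred))  # (0.0, 0.0, 0.0)
```

A model that never predicts the positive class still scores 80% accuracy here, while precision, recall, and F1 are all zero. On imbalanced data, plain accuracy can therefore hide a model that misses every case you care about.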

Before deploying a model in a production environment, it is good practice to run tests to confirm that the model meets your performance and accuracy expectations. You can run our benchmark example here.

If you are unsure about the best model for your task or have specific questions, our team is here to help. Contact us at [email protected].