Bring Your Own BERT Model
The Bring Your Own BERT Model (BYOM-BERT) feature allows you to import and package selected BERT family models for NuPIC. The imported models can be placed in the Inference Server to take advantage of NuPIC features such as running concurrent models and optimizing for throughput or latency. You can also perform further fine-tuning on these models using the Training Module.
Prerequisites
You should have a working NuPIC Inference Server and Training Module. Follow the instructions here to get them up and running.
Additionally, you will need to install our Python clients to connect to the Training Module. Detailed instructions can be found here.
Supported models
Currently BYOM-BERT works with the following models:
Name | Model Type | Description |
---|---|---|
bert-large-cased | Hugging Face | A larger version of BERT with more parameters (336M), for potentially greater accuracy. A general purpose English language encoder model. |
sentence-transformers/all-MiniLM-L6-v2 | Hugging Face | A smaller (22.7M parameters), faster encoder model suited for sentences and short paragraphs. |
sentence-transformers/multi-qa-mpnet-base-dot-v1 | Hugging Face | A large (109M parameters) encoder model suitable for sentences and paragraphs. Originally tuned on question-answer pairs, and intended for comparing input sentences/paragraphs for semantic search/retrieval. |
sentence-transformers/all-mpnet-base-v2 | Hugging Face | A large (109M parameters) encoder model based on the original mpnet-base, tuned for sentences and short paragraphs. |
sentence-transformers/all-roberta-large-v1 | Hugging Face | A large (355M parameters) encoder model based on the original roberta-large, tuned for sentences and short paragraphs. |
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Hugging Face | A large (118M parameters) encoder model with support for more than 50 languages. |
gerulata/slovakbert | Hugging Face | A large (125M parameters) encoder model pretrained on the Slovak language. This model is case-sensitive. It is advised to replace all “ and ” (direct quote marks) with a single " (double quote mark). |
UWB-AIR/Czert-B-base-cased | Hugging Face | A large (110M parameters) encoder model pretrained on the Czech language. |
google-bert/bert-base-german-cased | Hugging Face | A large (110M parameters) encoder model pretrained on the German language. |
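The quote-mark advice for gerulata/slovakbert can be applied as a small preprocessing step before text is sent to the model. The following is a minimal sketch, not part of NuPIC; the helper name normalize_quotes is hypothetical.

```python
def normalize_quotes(text: str) -> str:
    """Replace curly double quotes (U+201C and U+201D) with a straight double quote."""
    return text.replace("\u201c", '"').replace("\u201d", '"')


# Example input containing curly quotes around a single word.
cleaned = normalize_quotes("\u201cAhoj\u201d")
print(cleaned)
```

Text without curly quotes passes through unchanged, so the helper is safe to apply to every input.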
Importing A Model
The nupic.examples/examples/byom/ directory contains a shell script demonstrating how to import the bert-large-cased model. First, let's navigate to the directory and examine the example script by opening it in a text editor.
cd <your_nupic_dir>/nupic.examples/byom
Notice that the example script calls the import_model Python module from the NuPIC client:
python -m nupic.client.import_model \
--model_type $MODEL_TYPE \
--model_name $MODEL_NAME \
--sequence_length $SEQ_LEN \
--url $SERVER_URL \
--tokenizer \
--on_disk
Here's what the arguments mean:
Argument | Type | Description |
---|---|---|
model_type | str | Currently supports "huggingface" |
model_name | str | Name of model to import |
sequence_length | int | Length of input sequence |
url | str | URL of training module; defaults to "http://127.0.0.1:8321" |
tokenizer | bool | Whether to import the model's tokenizer |
on_disk | bool | Whether to save the model to disk (or store in memory) |
Note that the example assumes that the Training Module is running on the same machine as the example script itself. Otherwise, please adjust the SERVER_URL constant.
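If you prefer to drive the import from Python rather than from the shell script, the same command line can be assembled programmatically. This is a minimal sketch under some assumptions: the helper build_import_command is hypothetical, --tokenizer and --on_disk are treated as on/off switches (as they appear in the example script), and the sequence length of 128 is only an illustrative value.

```python
import shlex
import sys


def build_import_command(model_name, sequence_length, model_type="huggingface",
                         url="http://127.0.0.1:8321", tokenizer=True, on_disk=True):
    """Assemble the argument list for `python -m nupic.client.import_model`."""
    cmd = [
        sys.executable, "-m", "nupic.client.import_model",
        "--model_type", model_type,
        "--model_name", model_name,
        "--sequence_length", str(sequence_length),
        "--url", url,
    ]
    # Assumption: both options are switches that take no value.
    if tokenizer:
        cmd.append("--tokenizer")
    if on_disk:
        cmd.append("--on_disk")
    return cmd


# 128 is an illustrative sequence length, not a value taken from the example script.
cmd = build_import_command("bert-large-cased", sequence_length=128)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where the NuPIC Python client is installed.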
Now we will make the example script executable, and proceed to run it:
chmod +x byom_example.sh
./byom_example.sh
You should see this in your terminal:
Connected to NuPIC Training Server.
Session key: 437C61577B314D828128AAE6139C2B17
Importing model
Import complete.
Model stored on server: ./imported_models/import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17
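When scripting the import, it can be handy to pull the session key and the stored model path out of this output. The sketch below parses the sample output shown above; the function name parse_import_output is hypothetical, and it assumes the two labels appear exactly as printed here.

```python
import re

SAMPLE_OUTPUT = """\
Connected to NuPIC Training Server.
Session key: 437C61577B314D828128AAE6139C2B17
Importing model
Import complete.
Model stored on server: ./imported_models/import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17
"""


def parse_import_output(output):
    """Return (session_key, model_path); either is None if its line is absent."""
    key = re.search(r"^Session key:\s*(\S+)$", output, re.MULTILINE)
    path = re.search(r"^Model stored on server:\s*(\S+)$", output, re.MULTILINE)
    return (key.group(1) if key else None, path.group(1) if path else None)


session_key, model_path = parse_import_output(SAMPLE_OUTPUT)
print(session_key)
print(model_path)
```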
The imported_models/ directory contains the model you just imported, and is located one level above your nupic/ directory. Let's take a look inside this directory:
imported_models/
└── import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17
    ├── bert-large-cased-wotokenizer
    │   ├── 1
    │   │   └── model.pt
    │   └── config.pbtxt
    ├── bert-large-cased.tokenizer
    │   ├── 1
    │   │   ├── model.py
    │   │   ├── special_tokens_map.json
    │   │   ├── tokenizer_config.json
    │   │   ├── tokenizer.json
    │   │   └── vocab.txt
    │   └── config.pbtxt
    └── bert-large-cased
        ├── 1
        └── config.pbtxt
Looks familiar? That's because the imported model is packaged exactly like the default models in the Inference Server.
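Before deploying, you may want to confirm that an import produced the full layout shown above. The following is a minimal sketch that checks each of the three subdirectories for its config.pbtxt; the helper names are hypothetical, and the demo runs against a throwaway directory rather than a real import.

```python
import tempfile
from pathlib import Path


def expected_subdirs(model_name):
    """Names of the ensemble, tokenizer, and model directories for one import."""
    return [model_name, f"{model_name}.tokenizer", f"{model_name}-wotokenizer"]


def missing_configs(import_dir, model_name):
    """List the config.pbtxt files (relative paths) that are absent."""
    root = Path(import_dir)
    missing = []
    for sub in expected_subdirs(model_name):
        config = root / sub / "config.pbtxt"
        if not config.is_file():
            missing.append(config.relative_to(root).as_posix())
    return missing


# Demo on a temporary directory that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    for sub in expected_subdirs("bert-large-cased"):
        (Path(tmp) / sub / "1").mkdir(parents=True)
        (Path(tmp) / sub / "config.pbtxt").touch()
    complete = missing_configs(tmp, "bert-large-cased")
    # Remove one file to show what an incomplete import would report.
    (Path(tmp) / "bert-large-cased.tokenizer" / "config.pbtxt").unlink()
    incomplete = missing_configs(tmp, "bert-large-cased")

print(complete)
print(incomplete)
```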
Deploying the Imported Model
Before deploying the imported model, let's understand the three subdirectories shown above. bert-large-cased-wotokenizer/ contains the "actual" bert-large-cased model that we imported. The model weights are saved in model.pt, while config.pbtxt is where you can specify the number of cores to allocate to an instance of the model. bert-large-cased.tokenizer/ contains the tokenizer specific to the bert-large-cased model, which converts input texts into subunits suitable for processing. Finally, bert-large-cased/ is an ensemble of bert-large-cased.tokenizer/ and bert-large-cased-wotokenizer/, called sequentially.
Deploying the model is simply a matter of copying each of these folders into the models directory of the Inference Server. Assuming that the two directories are on the same machine:
cd <your_parent_dir>/imported_models/import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17
cp -r bert-large-cased <your_inference_server_dir>/models
cp -r bert-large-cased.tokenizer <your_inference_server_dir>/models
cp -r bert-large-cased-wotokenizer <your_inference_server_dir>/models
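The same copy step can be scripted in Python, which is convenient when deploying several imported models. This is a minimal sketch using only the standard library; deploy_imported_model is a hypothetical helper, and the demo uses temporary directories in place of your real import and Inference Server paths.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_imported_model(import_dir, models_dir):
    """Copy every subdirectory of import_dir into models_dir; return the names copied."""
    copied = []
    for src in sorted(Path(import_dir).iterdir(), key=lambda p: p.name):
        if src.is_dir():
            # dirs_exist_ok merges into an existing directory (Python 3.8+).
            shutil.copytree(src, Path(models_dir) / src.name, dirs_exist_ok=True)
            copied.append(src.name)
    return copied


# Demo with temporary stand-ins for the import and models directories.
with tempfile.TemporaryDirectory() as tmp:
    import_dir = Path(tmp) / "import-example"
    models_dir = Path(tmp) / "models"
    models_dir.mkdir()
    for name in ("bert-large-cased", "bert-large-cased.tokenizer",
                 "bert-large-cased-wotokenizer"):
        (import_dir / name / "1").mkdir(parents=True)
        (import_dir / name / "config.pbtxt").write_text("")
    copied = deploy_imported_model(import_dir, models_dir)
    deployed = sorted(p.name for p in models_dir.iterdir())

print(deployed)
```

Unlike cp -r, which nests the source inside a destination directory that already exists, this sketch merges into it, so rerunning the function overwrites files in place.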
Are you running the Training Module and Inference Server on different machines?
You can use the scp utility to transfer the imported model.
Now you can query the Inference Server to check if the imported models have been added to the Model Library by running curl -X POST your_inference_server_url:8000/v2/repository/index.
Expected output:
[{"name":"bert-large-cased"},{"name":"bert-large-cased.tokenizer"},{"name":"bert-large-cased-wotokenizer"}...
If you can't see the new models, you may need to restart the Inference Server. Otherwise, you can now proceed to use the imported model for inference like any other BERT model in the Model Library!
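The repository check can also be automated. The sketch below sends the same POST request as the curl command and reports which expected models are absent from the index. The function names are hypothetical, base_url is assumed to include the scheme (for example http://your_inference_server_url:8000), and the sample data contains only the three entries shown in the expected output above.

```python
import json
import urllib.request


def fetch_repository_index(base_url):
    """POST to <base_url>/v2/repository/index and return the parsed JSON list."""
    request = urllib.request.Request(
        f"{base_url}/v2/repository/index", data=b"", method="POST"
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def missing_models(index, expected):
    """Return the expected model names that do not appear in the index."""
    present = {entry.get("name") for entry in index}
    return [name for name in expected if name not in present]


# Offline demo using the three entries from the expected output.
sample_index = json.loads(
    '[{"name":"bert-large-cased"},{"name":"bert-large-cased.tokenizer"},'
    '{"name":"bert-large-cased-wotokenizer"}]'
)
expected = ["bert-large-cased", "bert-large-cased.tokenizer",
            "bert-large-cased-wotokenizer"]
print(missing_models(sample_index, expected))
```

Against a live server, you would call missing_models(fetch_repository_index(base_url), expected); a non-empty result suggests the Inference Server needs a restart.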