Bring Your Own BERT Model

The Bring Your Own BERT Model (BYOM-BERT) feature allows you to import and package selected BERT family models for NuPIC. The imported models can be placed in the Inference Server to take advantage of NuPIC features such as running concurrent models and optimizing for throughput or latency. You can also perform further fine-tuning on these models using the Training Module.

Prerequisites

You should have a working NuPIC Inference Server and Training Module. Follow the instructions here to get them up and running.

Additionally, you will need to install our Python client to connect to the Training Module. Detailed instructions can be found here.

Supported Models

Currently BYOM-BERT works with the following models:

| Name | Model Type | Description |
| --- | --- | --- |
| bert-large-cased | Hugging Face | A larger version of BERT with more parameters (336M), for potentially greater accuracy. A general-purpose English-language encoder model. |
| sentence-transformers/all-MiniLM-L6-v2 | Hugging Face | A smaller (22.7M parameters), faster encoder model suited for sentences and short paragraphs. |
| sentence-transformers/multi-qa-mpnet-base-dot-v1 | Hugging Face | A large (109M parameters) encoder model suitable for sentences and paragraphs. Originally tuned on question-answer pairs, and intended for comparing input sentences/paragraphs for semantic search/retrieval. |
| sentence-transformers/all-mpnet-base-v2 | Hugging Face | A large (109M parameters) encoder model based on the original mpnet-base, tuned for sentences and short paragraphs. |
| sentence-transformers/all-roberta-large-v1 | Hugging Face | A large (355M parameters) encoder model based on the original roberta-large, tuned for sentences and short paragraphs. |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Hugging Face | A large (118M parameters) encoder model, with support for more than 50 languages. |
| gerulata/slovakbert | Hugging Face | A large (125M parameters) encoder model pretrained on the Slovak language. This model is case-sensitive. It is advised to replace all “ and ” (typographic quote marks) with plain " (straight double quote marks). |
| UWB-AIR/Czert-B-base-cased | Hugging Face | A large (110M parameters) encoder model pretrained on the Czech language. |
| google-bert/bert-base-german-cased | Hugging Face | A large (110M parameters) encoder model pretrained on the German language. |

Importing a Model

The nupic.examples/examples/byom/ directory contains a shell script demonstrating how to import the bert-large-cased model. First, let's navigate to the directory and examine the example script by opening it in a text editor.

cd <your_nupic_dir>/nupic.examples/byom
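
For example, you can view the script with a pager (any text editor works too):

less byom_example.sh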

Notice that the example script calls the import_model Python module from the NuPIC client:

python -m nupic.client.import_model \
  --model_type $MODEL_TYPE \
  --model_name $MODEL_NAME \
  --sequence_length $SEQ_LEN \
  --url $SERVER_URL \
  --tokenizer \
  --on_disk

Here's what the arguments mean:

| Argument | Type | Description |
| --- | --- | --- |
| model_type | str | Currently supports "huggingface" |
| model_name | str | Name of the model to import |
| sequence_length | int | Length of the input sequence |
| url | str | URL of the Training Module; defaults to "http://127.0.0.1:8321" |
| tokenizer | bool | Whether to import the model's tokenizer |
| on_disk | bool | Whether to save the model to disk (rather than store it in memory) |

Note that the example assumes the Training Module is running on the same machine as the example script itself. Otherwise, adjust the SERVER_URL variable accordingly.
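
For reference, the variables used in the command above are defined near the top of byom_example.sh. A minimal sketch with illustrative values (the shipped script's actual values may differ):

#!/bin/bash
MODEL_TYPE="huggingface"            # currently the only supported model type
MODEL_NAME="bert-large-cased"       # any model from the supported-models table above
SEQ_LEN=512                         # maximum input sequence length (illustrative)
SERVER_URL="http://127.0.0.1:8321"  # Training Module URL; change if it runs on another machine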

Now, make the example script executable and run it:

chmod +x byom_example.sh
./byom_example.sh

You should see output similar to the following in your terminal (the session key will differ):

Connected to NuPIC Training Server.
Session key: 437C61577B314D828128AAE6139C2B17
Importing model               
Import complete.
Model stored on server: ./imported_models/import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17

The imported_models/ directory contains the model you just imported and is located one level above your nupic/ directory. Let's take a look inside this directory.
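
The listing below can be reproduced with the tree utility (assuming it is installed):

cd <your_parent_dir>
tree imported_models/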

imported_models/
└── import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17
    ├── bert-large-cased-wotokenizer
    │   ├── 1
    │   │   └── model.pt
    │   └── config.pbtxt
    ├── bert-large-cased.tokenizer
    │   ├── 1
    │   │   ├── model.py
    │   │   ├── special_tokens_map.json
    │   │   ├── tokenizer_config.json
    │   │   ├── tokenizer.json
    │   │   └── vocab.txt
    │   └── config.pbtxt
    └── bert-large-cased
        ├── 1
        └── config.pbtxt

Look familiar? That's because the imported model is packaged exactly like the default models in the Inference Server.

Deploying the Imported Model

Before deploying the imported model, let's understand the three subdirectories shown above. bert-large-cased-wotokenizer/ contains the "actual" bert-large-cased model that we imported. The model weights are saved in model.pt, while config.pbtxt is where you can specify the number of cores to allocate to an instance of the model.

bert-large-cased.tokenizer/ contains the tokenizer specific to the bert-large-cased model, which converts input texts into subunits suitable for processing. Finally, bert-large-cased/ is an ensemble that calls bert-large-cased.tokenizer/ and bert-large-cased-wotokenizer/ sequentially.
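
If you'd like to inspect how each component is configured, you can view the config.pbtxt files directly (paths relative to the imported model's directory):

cat bert-large-cased-wotokenizer/config.pbtxt   # model settings, e.g. cores per model instance
cat bert-large-cased.tokenizer/config.pbtxt     # tokenizer settings
cat bert-large-cased/config.pbtxt               # ensemble definition chaining tokenizer and model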

Deploying the model is simply a matter of copying each of these folders into the models directory of the Inference Server. Assuming that the two directories are on the same machine:

cd <your_parent_dir>/imported_models/import-bert-large-cased-X-v1-437C61577B314D828128AAE6139C2B17

cp -r bert-large-cased <your_inference_server_dir>/models
cp -r bert-large-cased.tokenizer <your_inference_server_dir>/models
cp -r bert-large-cased-wotokenizer <your_inference_server_dir>/models

📘 Are you running the Training Module and Inference Server on different machines?

You can use the scp utility to transfer the imported model.
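
For example, from the imported model's directory (the hostname and paths below are placeholders to adjust for your setup):

scp -r bert-large-cased bert-large-cased.tokenizer bert-large-cased-wotokenizer \
    <user>@<inference_server_host>:<your_inference_server_dir>/models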

Now you can query the Inference Server to check whether the imported models have been added to the Model Library:

curl -X POST your_inference_server_url:8000/v2/repository/index

Expected output:

[{"name":"bert-large-cased"},{"name":"bert-large-cased.tokenizer"},{"name":"bert-large-cased-wotokenizer"}...

If you can't see the new models, you may need to restart the Inference Server. Once they appear, you can use the imported model for inference like any other BERT model in the Model Library!
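
Assuming your Inference Server exposes the same v2-style endpoints as the repository index call above, a quick way to confirm that the imported ensemble is loaded and ready is:

curl -X GET your_inference_server_url:8000/v2/models/bert-large-cased/ready

A 200 response indicates the model is ready to serve requests.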