Using Both CPUs and GPUs
While NuPIC allows you to deploy your AI models on CPUs, that doesn't mean your existing GPUs will go to waste. In fact, NuPIC can help you maximize utilization of your existing CPU and GPU resources by running multiple inference and fine-tuning workloads at the same time. This way, your GPUs don't have to sit idle while you're running models on CPUs, and vice versa.
Inference on CPUs and GPUs
Suppose you want to run Model A on CPU and Model B on GPU. You can do so by configuring the instance_group.kind property in the respective config.pbtxt file for each model.
For instance, you might want to run 8 instances of Model A on CPU and a single instance of Model B on GPU. You'll edit the config.pbtxt files as follows:
nupic/inference/models/model_a/config.pbtxt:
instance_group [
  {
    count: 8
    kind: KIND_CPU
  }
]
nupic/inference/models/model_b/config.pbtxt:
instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
Models running slowly on GPUs?
GPUs have a limited amount of memory. If you try to load a model that exceeds the available GPU memory, part of the model may be offloaded to the CPU. When this happens, model weights have to be shuttled between CPU and GPU at inference time, resulting in slower performance.
To avoid this, ensure that your GPUs have sufficient memory for the model you intend to run. Better still, we recommend running larger models on CPUs, which typically have access to more memory.
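One quick way to check whether a model is likely to fit is to look at free GPU memory before loading it. If nvidia-smi is available on your host, a query like the following reports per-GPU memory usage (exact output fields may vary with your driver version):
# Report total, used, and free memory for each visible GPU
nvidia-smi --query-gpu=name,memory.total,memory.used,memory.free --format=csv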
Fine-tuning on CPUs and GPUs
The Training Module can be configured to run on CPU only by setting an invalid GPU index at launch time. For instance, you might run ./nupic_training.sh start --gpus 999 (assuming you don't actually have 1,000 GPUs!).
By launching multiple instances of the Training Module, with some on CPU and others on GPU, you can use as much or as little of your compute as you want for fine-tuning.
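As a rough sketch, and assuming the same --gpus flag also accepts a valid device index (not confirmed here), you might launch one instance pinned to GPU 0 and a second, CPU-only instance alongside it:
# Instance 1: fine-tune on GPU 0 (assumes --gpus accepts a valid GPU index)
./nupic_training.sh start --gpus 0
# Instance 2: CPU-only fine-tuning, using an invalid GPU index as described above
./nupic_training.sh start --gpus 999
How multiple instances coexist (for example, which ports or working directories they use) depends on your deployment, so adjust the launch commands to match your setup.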