
GPT Inference Parameters

Parameters for controlling GPT model behavior during inference. Pass them as a dictionary to the `inference_parameters` argument of the `.generate()` or `.infer()` method.

| Parameter | Type | Description |
| --- | --- | --- |
| `min_new_tokens` | int | Minimum number of tokens to generate. |
| `max_new_tokens` | int | Maximum number of tokens to generate. |
| `do_sample` | bool | If True, enables sampling-based decoding strategies. |
| `prompt_lookup_num_tokens` | int | Number of tokens to use for prompt lookup decoding, which can speed up inference. |
| `temperature` | float, 0.0–1.0 | Higher values produce more stochastic/creative outputs; lower values produce more deterministic/predictable outputs. |
| `top_k` | int | Limits sampling to the k most probable tokens. |
| `top_p` | float, 0.0–1.0 | Cumulative probability threshold for candidate tokens; only tokens within this probability mass are considered. |
| `repetition_penalty` | float | Higher values discourage repetitive outputs. |
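A minimal sketch of assembling these parameters and passing them to a model. The dictionary keys and the `inference_parameters` argument come from this guide; the model object, its loading call, and the prompt are placeholders, not part of the documented API.

```python
# Inference parameters from the table above; the specific values chosen
# here are illustrative, not recommended defaults.
inference_parameters = {
    "min_new_tokens": 1,
    "max_new_tokens": 256,
    "do_sample": True,        # enable sampling so temperature/top_k/top_p apply
    "temperature": 0.7,       # moderately creative output
    "top_k": 50,              # sample only from the 50 most probable tokens
    "top_p": 0.9,             # nucleus sampling within 90% probability mass
    "repetition_penalty": 1.1,
}

# Hypothetical usage (the model object and prompt are assumptions):
# output = model.generate(
#     prompt="Write a haiku about autumn.",
#     inference_parameters=inference_parameters,
# )
```

Note that `top_k`, `top_p`, and `temperature` only influence generation when `do_sample` is True; with greedy decoding they have no effect.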