max_tokens | integer | no | 65536 | Maximum number of output tokens to generate. · Range: 1 – 131072 |
temperature | number | no | 1 | Controls randomness. Lower values make responses more deterministic. · Range: 0 – 1 |
top_p | number | no | 0.95 | Nucleus sampling cutoff. · Range: 0.01 – 1 |
reasoning_effort | enum | no | "max" | GLM-5.2 reasoning effort. none disables thinking; minimal through max set how hard the model reasons before answering. max is recommended for complex coding. · Allowed: none, minimal, low, medium, high, xhigh, max |
enable_thinking | boolean | no | true | Allow the model to reason before answering. Turn off for the lowest-latency replies or strict structured output. |
do_sample | boolean | no | true | Enable sampling. Turn off for greedy, deterministic decoding; temperature and top_p are then ignored. |
tool_web_search | boolean | no | false | Enable built-in web search. Adds $0.033 per request when used. |
search_recency_filter | enum | no | "noLimit" | Limit web search results to a recency window. · Allowed: oneDay, oneWeek, oneMonth, oneYear, noLimit |
count | integer | no | 10 | Number of web search results to retrieve when web search is enabled. · Range: 1 – 50 |
search_domain_filter | string | no | - | Restrict web search to a specific domain. |
search_prompt | string | no | - | Optional prompt used to summarize retrieved web search results. |
search_result | boolean | no | true | Return web search result metadata in the response when web search is enabled. |
tool_stream | boolean | no | false | When the response is streamed, emit function-call arguments incrementally instead of all at once. |
tools | array | no | [] | OpenAI-compatible function calling tool definitions. |
tool_choice | object | no | - | OpenAI-compatible tool choice control. |
response_format | object | no | - | OpenAI-compatible JSON mode. Set enable_thinking to false for strict structured output. |
stop | array | no | - | Optional stop sequences (up to 4). |
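
The ranges and allowed values in the table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: `build_params` is a hypothetical helper, the table does not say how parameters are transmitted (a JSON body is assumed here), and the `{"type": "json_object"}` shape for `response_format` is assumed from the OpenAI convention that the table says it is compatible with.

```python
import json

# Numeric ranges copied from the parameter table.
RANGES = {
    "max_tokens": (1, 131072),
    "temperature": (0, 1),
    "top_p": (0.01, 1),
    "count": (1, 50),
}

# Allowed enum values copied from the parameter table.
ENUMS = {
    "reasoning_effort": {"none", "minimal", "low", "medium", "high", "xhigh", "max"},
    "search_recency_filter": {"oneDay", "oneWeek", "oneMonth", "oneYear", "noLimit"},
}

MAX_STOP_SEQUENCES = 4  # "up to 4" per the table


def build_params(**params):
    """Hypothetical helper: validate parameters against the table, return them as a dict."""
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            raise ValueError(f"{name} must be between {low} and {high}")
    for name, allowed in ENUMS.items():
        if name in params and params[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    if len(params.get("stop", [])) > MAX_STOP_SEQUENCES:
        raise ValueError(f"stop accepts at most {MAX_STOP_SEQUENCES} sequences")
    return params


# Structured output: thinking off and greedy decoding, as the table suggests.
structured = build_params(
    max_tokens=2048,
    enable_thinking=False,
    do_sample=False,
    response_format={"type": "json_object"},  # assumed OpenAI-style shape
)

# Web search limited to the past week, returning five results.
search = build_params(
    tool_web_search=True,
    search_recency_filter="oneWeek",
    count=5,
)

print(json.dumps(structured, indent=2))
print(json.dumps(search, indent=2))
```

Parameters left out of the call are simply omitted from the dict, on the assumption that the service then applies the defaults listed in the table.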