
langchain-xai

LangChain integration with xAI.

Classes

ChatXAI

Bases: BaseChatOpenAI

ChatXAI chat model.

Refer to xAI's documentation for more nuanced details on the API's behavior and supported parameters.

Setup

Install langchain-xai and set environment variable XAI_API_KEY.

```bash
pip install -U langchain-xai
export XAI_API_KEY="your-api-key"
```

Key init args — completion params:

  • model (str): Name of model to use.
  • temperature (float): Sampling temperature between 0 and 2. Higher values mean more random completions, while lower values (like 0.2) mean more focused and deterministic completions. (Default: 1.)
  • max_tokens (int | None): Max number of tokens to generate. Refer to your model's documentation for the maximum number of tokens it can generate.
  • logprobs (bool | None): Whether to return logprobs.

Key init args — client params:

  • timeout (Union[float, Tuple[float, float], Any, None]): Timeout for requests.
  • max_retries (int): Max number of retries.
  • api_key (str | None): xAI API key. If not passed in, it will be read from the env var XAI_API_KEY.

Instantiate
```python
from langchain_xai import ChatXAI

model = ChatXAI(
    model="grok-4",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",
    # other params...
)
```
Invoke
```python
messages = [
    (
        "system",
        "You are a helpful translator. Translate the user sentence to French.",
    ),
    ("human", "I love programming."),
]
model.invoke(messages)
```
```python
AIMessage(
    content="J'adore la programmation.",
    response_metadata={
        "token_usage": {
            "completion_tokens": 9,
            "prompt_tokens": 32,
            "total_tokens": 41,
        },
        "model_name": "grok-4",
        "system_fingerprint": None,
        "finish_reason": "stop",
        "logprobs": None,
    },
    id="run-168dceca-3b8b-4283-94e3-4c739dbc1525-0",
    usage_metadata={
        "input_tokens": 32,
        "output_tokens": 9,
        "total_tokens": 41,
    },
)
```
Stream
```python
for chunk in model.stream(messages):
    print(chunk.text, end="")
```
```python
content='J' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content="'" id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ad' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ore' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content=' la' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content=' programm' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='ation' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='.' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
content='' response_metadata={'finish_reason': 'stop', 'model_name': 'grok-4'} id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
```
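To accumulate the stream into a single message, the chunks can be summed as they arrive (a common pattern, relying on AIMessageChunk supporting `+`):

```python
# Sketch: accumulate streamed chunks into one AIMessageChunk.
# AIMessageChunk implements `+`, so partial chunks can be merged in order.
full = None
for chunk in model.stream(messages):
    full = chunk if full is None else full + chunk
full.content  # "J'adore la programmation."
```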
Async
```python
await model.ainvoke(messages)

# stream:
# async for chunk in model.astream(messages):
#     print(chunk.text, end="")

# batch:
# await model.abatch([messages])
```
```python
AIMessage(
    content="J'adore la programmation.",
    response_metadata={
        "token_usage": {
            "completion_tokens": 9,
            "prompt_tokens": 32,
            "total_tokens": 41,
        },
        "model_name": "grok-4",
        "system_fingerprint": None,
        "finish_reason": "stop",
        "logprobs": None,
    },
    id="run-09371a11-7f72-4c53-8e7c-9de5c238b34c-0",
    usage_metadata={
        "input_tokens": 32,
        "output_tokens": 9,
        "total_tokens": 41,
    },
)
```
Reasoning

Certain xAI models support reasoning, which allows the model to provide reasoning content along with the response.

If provided, reasoning content is returned under the additional_kwargs field of the AIMessage or AIMessageChunk.

If supported, reasoning effort can be specified in the model constructor's extra_body argument, which controls how much reasoning the model performs. The value can be one of 'low' or 'high'.

```python
model = ChatXAI(
    model="grok-3-mini",
    extra_body={"reasoning_effort": "high"},
)
```
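A minimal sketch of reading the reasoning back out, assuming the grok-3-mini instance above and the reasoning_content key described in the notes below:

```python
# Minimal sketch: read reasoning content from the response.
# Assumes the model returns reasoning under the `reasoning_content` key
# of `additional_kwargs` (see the notes below for model support).
response = model.invoke("What is 101 multiplied by 3?")
print(response.additional_kwargs.get("reasoning_content"))  # reasoning trace
print(response.content)  # final answer
```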

Note

As of 2025-07-10, reasoning_content is only returned in Grok 3 models, such as Grok 3 Mini.

Note

In Grok 4, as of 2025-07-10, reasoning is not exposed in reasoning_content (other than the initial 'Thinking...' text), reasoning cannot be disabled, and reasoning_effort cannot be specified.

Tool calling / function calling:

```python
from pydantic import BaseModel, Field

model = ChatXAI(model="grok-4")


class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    """Get the current population in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke("Which city is bigger: LA or NY?")
ai_msg.tool_calls
```

```python
[
    {
        "name": "GetPopulation",
        "args": {"location": "NY"},
        "id": "call_m5tstyn2004pre9bfuxvom8x",
        "type": "tool_call",
    },
    {
        "name": "GetPopulation",
        "args": {"location": "LA"},
        "id": "call_0vjgq455gq1av5sp9eb1pw6a",
        "type": "tool_call",
    },
]
```

Note

When streaming, the tool / function call is returned whole in a single chunk instead of being streamed across chunks.

Tool choice can be controlled by setting the `tool_choice` parameter in the model constructor's `extra_body` argument. For example, to disable tool / function calling:

```python
model = ChatXAI(model="grok-4", extra_body={"tool_choice": "none"})
```

To require that the model always calls a tool / function, set `tool_choice` to `'required'`:

```python
model = ChatXAI(model="grok-4", extra_body={"tool_choice": "required"})
```

To specify a tool / function to call, set `tool_choice` to the name of the tool / function:

```python
from pydantic import BaseModel, Field

model = ChatXAI(
    model="grok-4",
    extra_body={
        "tool_choice": {"type": "function", "function": {"name": "GetWeather"}}
    },
)


class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


class GetPopulation(BaseModel):
    """Get the current population in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


model_with_tools = model.bind_tools([GetWeather, GetPopulation])
ai_msg = model_with_tools.invoke(
    "Which city is bigger: LA or NY?",
)
ai_msg.tool_calls
```

The resulting tool call would be:

```python
[
    {
        "name": "GetWeather",
        "args": {"location": "Los Angeles, CA"},
        "id": "call_81668711",
        "type": "tool_call",
    }
]
```

Parallel tool calling / parallel function calling: By default, parallel tool / function calling is enabled, so you can process multiple function calls in one request/response cycle. When two or more tool calls are required, all of the tool call requests will be included in the response body.
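A sketch of handling such a response, assuming the GetWeather / GetPopulation tools above and hypothetical handler functions:

```python
# Sketch: dispatch each tool call from a parallel tool-calling response.
# `get_weather` and `get_population` are hypothetical stand-ins for real
# implementations; the tool-call dicts come from `ai_msg.tool_calls`.
def get_weather(location: str) -> str:
    return f"Sunny in {location}"  # placeholder implementation


def get_population(location: str) -> int:
    return 8_000_000  # placeholder implementation


handlers = {"GetWeather": get_weather, "GetPopulation": get_population}

for tool_call in ai_msg.tool_calls:
    result = handlers[tool_call["name"]](**tool_call["args"])
    print(tool_call["name"], "->", result)
```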

Structured output
```python
from pydantic import BaseModel, Field


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: int | None = Field(description="How funny the joke is, from 1 to 10")


structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
```
```python
Joke(
    setup="Why was the cat sitting on the computer?",
    punchline="To keep an eye on the mouse!",
    rating=7,
)
```
Token usage
```python
ai_msg = model.invoke(messages)
ai_msg.usage_metadata
```
{"input_tokens": 37, "output_tokens": 6, "total_tokens": 43} 
Logprobs
```python
logprobs_model = model.bind(logprobs=True)
messages = [("human", "Say Hello World! Do not return anything else.")]
ai_msg = logprobs_model.invoke(messages)
ai_msg.response_metadata["logprobs"]
```
```python
{
    "content": None,
    "token_ids": [22557, 3304, 28808, 2],
    "tokens": [" Hello", " World", "!", "</s>"],
    "token_logprobs": [-4.7683716e-06, -5.9604645e-07, 0, -0.057373047],
}
```

Response metadata

```python
ai_msg = model.invoke(messages)
ai_msg.response_metadata
```

```python
{
    "token_usage": {
        "completion_tokens": 4,
        "prompt_tokens": 19,
        "total_tokens": 23,
    },
    "model_name": "grok-4",
    "system_fingerprint": None,
    "finish_reason": "stop",
    "logprobs": None,
}
```
| Method | Description |
| --- | --- |
| `get_lc_namespace` | Get the namespace of the langchain object. |
| `is_lc_serializable` | Return whether this model can be serialized by LangChain. |
| `validate_environment` | Validate that the API key and Python package exist in the environment. |
| `with_structured_output` | Model wrapper that returns outputs formatted to match the given schema. |

Attributes

model_name class-attribute instance-attribute
```python
model_name: str = Field(default='grok-4', alias='model')
```

Model name to use.

xai_api_key class-attribute instance-attribute
```python
xai_api_key: SecretStr | None = Field(
    alias="api_key",
    default_factory=secret_from_env(
        "XAI_API_KEY", default=None
    ),
)
```

xAI API key.

Automatically read from env variable XAI_API_KEY if not provided.

xai_api_base class-attribute instance-attribute
```python
xai_api_base: str = Field(default='https://api.x.ai/v1/')
```

Base URL path for API requests.

search_parameters class-attribute instance-attribute
```python
search_parameters: dict[str, Any] | None = None
```

Parameters for search requests. Example: {"mode": "auto"}.
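For example, a sketch that lets the model decide when to search, assuming only the `{"mode": "auto"}` option shown above (see xAI's documentation for the full set of supported parameters):

```python
# Sketch: enable live search, letting the model decide when to search.
# Only the `{"mode": "auto"}` option from the example above is assumed here.
model = ChatXAI(
    model="grok-4",
    search_parameters={"mode": "auto"},
)
model.invoke("What is the latest news about xAI?")
```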

lc_secrets property
```python
lc_secrets: dict[str, str]
```

A map of constructor argument names to secret ids.

For example, {"xai_api_key": "XAI_API_KEY"}

lc_attributes property
```python
lc_attributes: dict[str, Any]
```

List of attribute names that should be included in the serialized kwargs.

These attributes must be accepted by the constructor.

Functions

get_lc_namespace classmethod
```python
get_lc_namespace() -> list[str]
```

Get the namespace of the langchain object.

Source code in .venv/lib/python3.13/site-packages/langchain_xai/chat_models.py
```python
@classmethod
def get_lc_namespace(cls) -> list[str]:
    """Get the namespace of the langchain object."""
    return ["langchain_xai", "chat_models"]
```
is_lc_serializable classmethod
```python
is_lc_serializable() -> bool
```

Return whether this model can be serialized by LangChain.

Source code in .venv/lib/python3.13/site-packages/langchain_xai/chat_models.py
```python
@classmethod
def is_lc_serializable(cls) -> bool:
    """Return whether this model can be serialized by LangChain."""
    return True
```
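Since the model is serializable, a round trip through LangChain's load utilities works along these lines (a sketch; secrets such as the API key are not serialized and are re-read from the environment on load):

```python
# Sketch: serialize and reconstruct the model with langchain_core's
# load utilities. `valid_namespaces` is passed because `langchain_xai`
# may not be in the default allowlist of trusted namespaces.
from langchain_core.load import dumpd, load

serialized = dumpd(model)  # plain-dict representation
restored = load(serialized, valid_namespaces=["langchain_xai"])
```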
validate_environment
```python
validate_environment() -> Self
```

Validate that the API key and Python package exist in the environment.

Source code in .venv/lib/python3.13/site-packages/langchain_xai/chat_models.py
@model_validator(mode="after") def validate_environment(self) -> Self:  """Validate that api key and python package exists in environment."""  if self.n is not None and self.n < 1:  msg = "n must be at least 1."  raise ValueError(msg)  if self.n is not None and self.n > 1 and self.streaming:  msg = "n must be 1 when streaming."  raise ValueError(msg)   client_params: dict = {  "api_key": (  self.xai_api_key.get_secret_value() if self.xai_api_key else None  ),  "base_url": self.xai_api_base,  "timeout": self.request_timeout,  "default_headers": self.default_headers,  "default_query": self.default_query,  }  if self.max_retries is not None:  client_params["max_retries"] = self.max_retries   if client_params["api_key"] is None:  msg = (  "xAI API key is not set. Please set it in the `xai_api_key` field or "  "in the `XAI_API_KEY` environment variable."  )  raise ValueError(msg)   if not (self.client or None):  sync_specific: dict = {"http_client": self.http_client}  self.client = openai.OpenAI(  **client_params, **sync_specific  ).chat.completions  self.root_client = openai.OpenAI(**client_params, **sync_specific)  if not (self.async_client or None):  async_specific: dict = {"http_client": self.http_async_client}  self.async_client = openai.AsyncOpenAI(  **client_params, **async_specific  ).chat.completions  self.root_async_client = openai.AsyncOpenAI(  **client_params,  **async_specific,  )  return self 
with_structured_output
```python
with_structured_output(
    schema: _DictOrPydanticClass | None = None,
    *,
    method: Literal[
        "function_calling", "json_mode", "json_schema"
    ] = "function_calling",
    include_raw: bool = False,
    strict: bool | None = None,
    **kwargs: Any,
) -> Runnable[LanguageModelInput, _DictOrPydantic]
```

Model wrapper that returns outputs formatted to match the given schema.

PARAMETER DESCRIPTION
schema

The output schema. Can be passed in as:

  • an OpenAI function/tool schema,
  • a JSON Schema,
  • a TypedDict class (support added in 0.1.20),
  • or a Pydantic class.

If schema is a Pydantic class then the model output will be a Pydantic instance of that class, and the model-generated fields will be validated by the Pydantic class. Otherwise the model output will be a dict and will not be validated. See langchain_core.utils.function_calling.convert_to_openai_tool for more on how to properly specify types and descriptions of schema fields when specifying a Pydantic or TypedDict class.

TYPE: _DictOrPydanticClass | None DEFAULT: None

method

The method for steering model generation, one of:

  • 'function_calling': uses xAI's tool-calling features,
  • 'json_schema': uses xAI's structured output feature,
  • 'json_mode': uses xAI's JSON mode feature.

TYPE: Literal['function_calling', 'json_mode', 'json_schema'] DEFAULT: 'function_calling'

include_raw

If False then only the parsed structured output is returned. If an error occurs during model output parsing it will be raised. If True then both the raw model response (a BaseMessage) and the parsed model response will be returned. If an error occurs during output parsing it will be caught and returned as well. The final output is always a dict with keys 'raw', 'parsed', and 'parsing_error'.

TYPE: bool DEFAULT: False

strict
  • True: Model output is guaranteed to exactly match the schema. The input schema will also be validated according to the supported schemas.
  • False: Input schema will not be validated and model output will not be validated.
  • None: strict argument will not be passed to the model.

TYPE: bool | None DEFAULT: None

kwargs

Additional keyword args aren't supported.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Runnable[LanguageModelInput, _DictOrPydantic]

A Runnable that takes the same inputs as a langchain_core.language_models.chat.BaseChatModel.

If include_raw is False and schema is a Pydantic class, the Runnable outputs an instance of schema (i.e., a Pydantic object). Otherwise, if include_raw is False, the Runnable outputs a dict.

If include_raw is True, the Runnable outputs a dict with keys:

  • 'raw': BaseMessage
  • 'parsed': None if there was a parsing error, otherwise the type depends on the schema as described above.
  • 'parsing_error': BaseException | None
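For illustration, a sketch of include_raw=True reusing the Joke schema from the structured output example above:

```python
# Sketch: structured output with include_raw=True, reusing the Joke schema.
structured_model = model.with_structured_output(Joke, include_raw=True)
result = structured_model.invoke("Tell me a joke about cats")

result["raw"]            # the full AIMessage returned by the model
result["parsed"]         # a Joke instance, or None on a parsing error
result["parsing_error"]  # None, or the BaseException that was raised
```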
Source code in .venv/lib/python3.13/site-packages/langchain_xai/chat_models.py
```python
def with_structured_output(
    self,
    schema: _DictOrPydanticClass | None = None,
    *,
    method: Literal[
        "function_calling", "json_mode", "json_schema"
    ] = "function_calling",
    include_raw: bool = False,
    strict: bool | None = None,
    **kwargs: Any,  # noqa: ANN401
) -> Runnable[LanguageModelInput, _DictOrPydantic]:
    """Model wrapper that returns outputs formatted to match the given schema.

    Args:
        schema: The output schema. Can be passed in as:

            - an OpenAI function/tool schema,
            - a JSON Schema,
            - a `TypedDict` class (support added in 0.1.20),
            - or a Pydantic class.

            If `schema` is a Pydantic class then the model output will be a
            Pydantic instance of that class, and the model-generated fields will be
            validated by the Pydantic class. Otherwise the model output will be a
            dict and will not be validated. See `langchain_core.utils.function_calling.convert_to_openai_tool`
            for more on how to properly specify types and descriptions of
            schema fields when specifying a Pydantic or `TypedDict` class.

        method: The method for steering model generation, one of:

            - `'function_calling'`:
                Uses xAI's [tool-calling features](https://docs.x.ai/docs/guides/function-calling).
            - `'json_schema'`:
                Uses xAI's [structured output feature](https://docs.x.ai/docs/guides/structured-outputs).
            - `'json_mode'`:
                Uses xAI's JSON mode feature.

        include_raw:
            If `False` then only the parsed structured output is returned. If
            an error occurs during model output parsing it will be raised. If `True`
            then both the raw model response (a BaseMessage) and the parsed model
            response will be returned. If an error occurs during output parsing it
            will be caught and returned as well. The final output is always a dict
            with keys `'raw'`, `'parsed'`, and `'parsing_error'`.

        strict:
            - `True`:
                Model output is guaranteed to exactly match the schema.
                The input schema will also be validated according to the [supported schemas](https://platform.openai.com/docs/guides/structured-outputs/supported-schemas?api-mode=responses#supported-schemas).
            - `False`:
                Input schema will not be validated and model output will not be
                validated.
            - `None`:
                `strict` argument will not be passed to the model.

        kwargs: Additional keyword args aren't supported.

    Returns:
        A Runnable that takes same inputs as a `langchain_core.language_models.chat.BaseChatModel`.

        If `include_raw` is `False` and `schema` is a Pydantic class, Runnable outputs an instance of `schema` (i.e., a Pydantic object). Otherwise, if `include_raw` is `False` then Runnable outputs a dict.

        If `include_raw` is `True`, then Runnable outputs a dict with keys:

        - `'raw'`: BaseMessage
        - `'parsed'`: None if there was a parsing error, otherwise the type depends on the `schema` as described above.
        - `'parsing_error'`: BaseException | None

    """  # noqa: E501
    # Some applications require that incompatible parameters (e.g., unsupported
    # methods) be handled.
    if method == "function_calling" and strict:
        strict = None
    return super().with_structured_output(
        schema, method=method, include_raw=include_raw, strict=strict, **kwargs
    )
```