vllm.model_executor.models.chatglm
Inference-only ChatGLM model compatible with THUDM weights.
GLMBlock
Bases: Module
A single transformer layer.
The layer takes an input of size [s, b, h] (sequence length, batch size, hidden size) and returns an output of the same size.
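The shape contract can be illustrated with a minimal pure-Python sketch. It is not the GLMBlock implementation: `toy_layer`, `layer_norm`, and `shape` are hypothetical helpers, and the per-vector residual update merely stands in for the real attention and MLP sublayers. The dimensions s, b, h are read here as sequence length, batch size, and hidden size.

```python
import math


def layer_norm(vec: list[float], eps: float = 1e-5) -> list[float]:
    """Normalize one hidden vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def toy_layer(hidden_states: list[list[list[float]]]) -> list[list[list[float]]]:
    """Hypothetical stand-in for a transformer layer.

    Adds a normalized copy of each [h]-sized vector back onto itself
    (a residual update), so the [s][b][h] layout is left unchanged.
    """
    return [
        [
            [x + n for x, n in zip(vec, layer_norm(vec))]
            for vec in batch_row
        ]
        for batch_row in hidden_states
    ]


def shape(tensor: list[list[list[float]]]) -> tuple[int, int, int]:
    """Return the (s, b, h) dimensions of a nested list."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


s, b, h = 3, 2, 4  # illustrative sizes only
hidden_states = [
    [[float(i + j + k) for k in range(h)] for j in range(b)]
    for i in range(s)
]
output = toy_layer(hidden_states)
print(shape(hidden_states), shape(output))  # (3, 2, 4) (3, 2, 4)
```

Because the output has the same size as the input, layers of this kind can be stacked one after another without any reshaping in between.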
Source code in vllm/model_executor/models/chatglm.py
GLMMLP
Bases: Module
MLP.
The MLP takes an input with hidden size h, projects it to a 4*h intermediate dimension, applies a nonlinear transformation, and projects the result back to hidden size h.
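The projection pattern described above can be sketched in plain Python for a single hidden vector. This is an illustration under stated assumptions, not the GLMMLP code: `toy_mlp`, `linear`, and `make_weight` are hypothetical helpers, GELU is chosen only as an example nonlinearity, and the weights are random. The real class may take its intermediate size and activation from the model configuration.

```python
import math
import random


def gelu(x: float) -> float:
    """Exact GELU, used here only as an example nonlinearity (assumption)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def make_weight(rows: int, cols: int, rng: random.Random) -> list[list[float]]:
    """Create a random weight matrix stored as [out_features][in_features]."""
    return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]


def linear(x: list[float], weight: list[list[float]]) -> list[float]:
    """Compute y = W x without a bias term."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def toy_mlp(
    x: list[float],
    w_up: list[list[float]],
    w_down: list[list[float]],
) -> list[float]:
    """Hypothetical MLP: project h -> 4*h, apply GELU, project 4*h -> h."""
    hidden = linear(x, w_up)            # h -> 4*h
    hidden = [gelu(v) for v in hidden]  # elementwise nonlinearity
    return linear(hidden, w_down)       # 4*h -> h


h = 8  # illustrative hidden size
rng = random.Random(0)
w_up = make_weight(4 * h, h, rng)
w_down = make_weight(h, 4 * h, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(h)]

y = toy_mlp(x, w_up, w_down)
print(len(x), len(linear(x, w_up)), len(y))  # 8 32 8
```

In a batched setting the same two projections are applied independently to every position, which is why the MLP does not change the leading dimensions of its input.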
Source code in vllm/model_executor/models/chatglm.py
GLMTransformer
Bases: Module
Transformer class.