vllm.model_executor.models.whisper ¶
WhisperAudioInputs ¶
Bases: TensorSchema
Dimensions
- b: Batch size
- nmb: Number of mel bins
- t: Time frames (M)
Source code in vllm/model_executor/models/whisper.py
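WhisperAudioInputs carries the log-mel spectrogram features fed to the encoder. A minimal sketch of a tensor matching the dimensions listed above, assuming the standard Whisper front end of 80 mel bins and 3000 frames per 30-second segment (those defaults are an assumption for illustration, not something this schema fixes):

```python
import torch

# Hypothetical example values: 80 mel bins and 3000 time frames are the
# usual Whisper front-end defaults, used here only for illustration.
batch_size, num_mel_bins, time_frames = 2, 80, 3000

# input_features laid out as (b, nmb, t), matching the dimensions above.
input_features = torch.randn(batch_size, num_mel_bins, time_frames)

assert input_features.shape == (batch_size, num_mel_bins, time_frames)
```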
WhisperEncoderAttention ¶
Bases: MMEncoderAttention
Multi-headed attention for Whisper encoder with 2D tensor support.
Source code in vllm/model_executor/models/whisper.py
forward ¶
Input shape: batch_size x seq_len x hidden_size, or seq_len x hidden_size.
Source code in vllm/model_executor/models/whisper.py
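A minimal sketch of the 2D/3D handling described above, using torch.nn.MultiheadAttention as a stand-in for the encoder attention; the unsqueeze/squeeze normalization is an assumption about how such support can be implemented, not vLLM's code:

```python
import torch
import torch.nn as nn


def encoder_attention_forward(
    attn: nn.MultiheadAttention,
    hidden_states: torch.Tensor,
) -> torch.Tensor:
    """Accept batch_size x seq_len x hidden_size or seq_len x hidden_size."""
    is_2d = hidden_states.dim() == 2
    if is_2d:
        # Promote seq_len x hidden_size to a batch of one.
        hidden_states = hidden_states.unsqueeze(0)

    out, _ = attn(hidden_states, hidden_states, hidden_states)

    # Return the same rank that was passed in.
    return out.squeeze(0) if is_2d else out


attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
print(encoder_attention_forward(attn, torch.randn(3, 10, 64)).shape)  # 3D in, 3D out
print(encoder_attention_forward(attn, torch.randn(10, 64)).shape)     # 2D in, 2D out
```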
WhisperProcessingInfo ¶
Bases: BaseProcessingInfo
Source code in vllm/model_executor/models/whisper.py
_create_fake_bias_for_k_proj ¶
_create_fake_bias_for_k_proj(
weights: Iterable[tuple[str, Tensor]],
fake_bias_key_name: str,
) -> Iterable[tuple[str, Tensor]]
Create an all-zeros bias for the k_proj weight in the self-attention and cross-attention layers, so that the bias for k_proj in qkv_proj can be initialized with zeros.
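A minimal sketch of the idea: walk the (name, tensor) weight stream, and whenever a weight matching the given k_proj key appears, also yield an all-zeros bias of the right size so the fused qkv_proj bias has something to load for the k_proj slice. The matching and naming details here are assumptions for illustration, not the actual vLLM implementation:

```python
from collections.abc import Iterable

import torch


def create_fake_bias_for_k_proj(
    weights: Iterable[tuple[str, torch.Tensor]],
    fake_bias_key_name: str = "k_proj.weight",  # assumed key suffix, for illustration
) -> Iterable[tuple[str, torch.Tensor]]:
    for name, tensor in weights:
        yield name, tensor
        if name.endswith(fake_bias_key_name):
            # Emit a zero bias alongside the k_proj weight so that the
            # fused qkv_proj bias can initialize k_proj's portion to zeros.
            bias_name = name.replace("weight", "bias")
            yield bias_name, torch.zeros(tensor.shape[0])
```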