Multimodal
An AI model that can work with more than one type of input or output, such as text, images, audio, and video.
Why it matters
Multimodal models can describe a photo, read a chart, or answer questions about a video, not just text.
Example
Uploading a screenshot and asking an assistant to explain it uses a multimodal model.
Related terms
Back to the full AI glossary.