
Multimodal AI can process and generate multiple types of data — text, images, audio, and video — in a single model.
GPT-4V (Vision) and GPT-4o brought these capabilities to ChatGPT users. Upload any image and ask questions about it.
Reference:
TaskLoco™ — The Sticky Note GOAT