
GPT-4o Vision lets you send images to the API alongside a text prompt and ask questions about them: receipts, screenshots, diagrams, photos, and documents.
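
    from openai import OpenAI

    client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

    # Send one user message that combines a text prompt with an image URL.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"}
                }
            ]
        }]
    )
    print(response.choices[0].message.content)

For local images, encode the file as base64 and pass it as a data URL: "url": "data:image/jpeg;base64,{base64_string}"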
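
The base64 route for local images can be sketched roughly as below. This is a minimal illustration, not an official helper: the function names image_to_data_url and build_image_message are hypothetical, and falling back to image/jpeg when the file extension is unrecognized is an assumption made for the example.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Read a local image file and return it as a base64 data URL."""
    # Guess the MIME type from the file extension.
    # Falling back to JPEG is an assumption made for this sketch.
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        mime_type = "image/jpeg"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_image_message(prompt: str, path: str) -> dict:
    """Build a user message pairing a text prompt with a local image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_to_data_url(path)}},
        ],
    }
```

The resulting dict would be passed as messages=[build_image_message("What's in this image?", "receipt.jpg")] in the same chat.completions.create call, where receipt.jpg stands in for any local file.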