Gemini 2.5: The First LLM That Understands PDF Layouts

by serjester on 4/21/25, 2:08 PM with 1 comment
by simonw on 4/21/25, 2:18 PM

This example uses bounding boxes, but it turns out Gemini 2.5 (both Pro and Flash) takes that a step further and can also return complex-shaped segmentation masks identifying objects: https://simonwillison.net/2025/Apr/18/gemini-image-segmentat...
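For anyone who wants to draw those boxes themselves, here is a rough sketch of the coordinate conversion. It assumes the model returns each box as [ymin, xmin, ymax, xmax] with values normalized to a 0-1000 range, which is my understanding of the convention in Google's docs; check that against the actual response before relying on it. The function name to_pixel_box is made up for illustration.

```python
def to_pixel_box(box_2d, image_width, image_height):
    """Convert a normalized box to pixel coordinates.

    Assumption: box_2d is [ymin, xmin, ymax, xmax], each value in 0-1000.
    Returns (left, top, right, bottom) in pixels, the order most drawing
    libraries expect.
    """
    ymin, xmin, ymax, xmax = box_2d
    # Integer arithmetic avoids float rounding surprises.
    left = xmin * image_width // 1000
    top = ymin * image_height // 1000
    right = xmax * image_width // 1000
    bottom = ymax * image_height // 1000
    return left, top, right, bottom


if __name__ == "__main__":
    # Hypothetical box on a 2000x1000 pixel page image.
    print(to_pixel_box([100, 200, 500, 600], 2000, 1000))
```

Note the y-before-x ordering in the assumed input: swapping the two is an easy mistake to make and gives boxes that look plausible but land in the wrong place.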