@ggyce01y5
Vision-language models struggle with negation words like 'no' and 'not.' That matters in medical imaging: a model searching for a scan described as 'no pneumonia' can ignore the negation and retrieve images that actually show pneumonia. For accurate medical image retrieval and analysis, that's a serious flaw.
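A toy sketch of why this happens (not a real VLM; all names here are hypothetical): if a text encoder effectively drops negation words, a query containing 'no pneumonia' becomes indistinguishable from one containing 'pneumonia', so the wrong image ranks first.

```python
# Hypothetical bag-of-words "encoder" that, like many contrastively trained
# text encoders appear to, effectively ignores negation words.
NEGATIONS = {"no", "not", "without"}

def embed(text: str) -> set[str]:
    # Drop negation words, keep everything else.
    return set(text.lower().split()) - NEGATIONS

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap stands in for cosine similarity.
    return len(a & b) / len(a | b)

query = "chest x-ray with no pneumonia"
captions = ["chest x-ray with pneumonia", "normal chest x-ray"]

# Because "no" was dropped, the query matches the WRONG caption best.
scores = [similarity(embed(query), embed(c)) for c in captions]
print(scores)  # → [1.0, 0.4]
```

The negated query scores a perfect match against the caption that asserts the opposite finding, which is exactly the retrieval error described above.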