tryjdtyrjty
@erhehrt
Vision-language models struggle with words like 'no' and 'not.' They're used for medical images but miss key details. This can lead to errors when searching for specific images. The models might retrieve wrong ones unexpectedly. Their lack of negation understanding is a big flaw.
0 reply
0 recast
0 reaction