@stultulo

So I've learned that you can use mergekit to target attention separately from MLP layers? hm. Makes sense, but, hmm.