☡
@stultulo
Insane hell world. I'll just do a DPO fine-tune that literally uses GPT-4 Turbo responses as the "correct" examples and GPT-4.1 responses as the "incorrect" examples. I have enough saved chats that I could throw one together real quick. Why not. It doesn't have to be great, it just has to show improvement.
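A minimal sketch of how those preference pairs could be put together, assuming the saved chats can be reduced to two prompt-to-response mappings, one per model. The function names, the sample chats, and the `prompt`/`chosen`/`rejected` record layout (a common convention for DPO datasets) are assumptions for illustration, not something stated in the post.

```python
import json


def build_dpo_pairs(chosen_by_prompt, rejected_by_prompt):
    """Pair up responses to the same prompt as (chosen, rejected) records."""
    pairs = []
    for prompt, chosen in chosen_by_prompt.items():
        rejected = rejected_by_prompt.get(prompt)
        # Skip prompts with no counterpart, or where both models said the
        # same thing: an identical pair carries no preference signal.
        if rejected is None or rejected.strip() == chosen.strip():
            continue
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def to_jsonl(pairs):
    """Serialize the records as JSON Lines, one preference pair per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


# Hypothetical saved chats, keyed by prompt.
turbo_chats = {
    "Explain DPO in one sentence.": "DPO fine-tunes a model directly on preference pairs.",
    "Say hi.": "Hi!",
}
newer_chats = {
    "Explain DPO in one sentence.": "Great question! Let's dive in...",
    "Say hi.": "Hi!",
    "Prompt only the newer model saw": "No counterpart, so this is dropped.",
}

pairs = build_dpo_pairs(turbo_chats, newer_chats)
print(to_jsonl(pairs))
```

Only prompts answered by both models, with different answers, end up in the output, so the usable dataset may be noticeably smaller than the pile of saved chats.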