agusti
@bleu.eth
training your own model hits diff fr fr
2 replies
0 recast
11 reactions

agusti
@bleu.eth
i dont even know if these numbers are good or bad rn, 25M params model, gpt2 base
4 replies
0 recast
6 reactions

J. Valeska 🦊🎩🫂
@jvaleska.eth
well, the idea is to get the loss close to 0. it is decreasing with every new iteration, that's good. at some point the step size will shrink and it will repeat similar numbers, then you are done. if it is close to 0 you may have found optimal weights (or it may be overtrained, meaning the llm learnt the dataset by heart but will be bad on other stuff). if it is not optimal, you may have found a local minimum and you need to modify some parameters and try again (a very short summary based on how it worked some years ago, maybe it has changed now)
1 reply
1 recast
1 reaction
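The dynamics described above (loss falling each iteration, steps shrinking near an optimum, stopping once successive losses repeat) can be sketched with a toy gradient-descent loop. This is an illustrative sketch on a 1-D quadratic loss, not agusti's actual 25M-param training setup; all names and values here are made up for the example.

```python
# Toy gradient descent: loss shrinks every step, the effective step
# shrinks as the gradient -> 0, and we stop when successive losses
# are nearly identical (the "repeat similar numbers" signal).

def train(lr=0.1, tol=1e-6, max_steps=1000):
    w = 5.0                       # start far from the optimum (w* = 2)
    prev_loss = float("inf")
    for step in range(max_steps):
        loss = (w - 2.0) ** 2     # toy loss, minimum 0 at w = 2
        grad = 2.0 * (w - 2.0)
        w -= lr * grad            # update shrinks as grad -> 0
        if abs(prev_loss - loss) < tol:  # loss has plateaued: done
            return w, loss, step
        prev_loss = loss
    return w, loss, max_steps

w, loss, steps = train()
```

In real training the same plateau check is usually run on a held-out validation loss rather than the training loss, since training loss alone can keep falling while the model overfits.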

agusti
@bleu.eth
ty so much for this ✍️ much appreciated
0 reply
0 recast
1 reaction