Shaw
@shawmakesmagic
Wait, so RL is just endlessly tweaking hyperparameters? I can reason out the math and theory of the system, but why 0.6 converges and 0.1 does not is baffling.
6 replies
2 recasts
37 reactions

WOO🎩
@woo-x
Yup, welcome to RL. It’s math on paper, but vibes and hacks in practice.
1 reply
0 recast
2 reactions

Lokp Ray
@lokpray
RL is just the art of designing reward functions.
0 reply
0 recast
0 reaction

Joely 🎩🏰
@joely.eth
Recursion. That's the answer. Always has been.
0 reply
0 recast
2 reactions

Joseph Goats
@joseacabrerav
I didn't fully understand, but maybe if you explain further I can catch up? I speak Spanish.
0 reply
0 recast
0 reaction

Iamrav3n.eth
@iamrav3n
Yeah, the sensitivity to hyperparameters can be wild.
0 reply
0 recast
0 reaction

@BestCryptoTwits
@bestcryptotwits
RL Grime?
0 reply
0 recast
0 reaction