https://warpcast.com/~/channel/airdrop

Javid Iqbal
@javidiqbal
To create a reward model for reinforcement learning, we needed to collect comparison data, which consisted of two or more model responses ranked by quality.
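
A minimal sketch of what such comparison data could look like in code, assuming each ranking is stored as a list ordered best to worst. The function names, the example strings, and the pairwise logistic loss are illustrative assumptions; the post does not say how the rankings were actually turned into a training signal.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Expand responses ordered best-to-worst into (preferred, rejected) pairs."""
    # combinations() keeps input order, so the first item of each pair
    # is always the higher-ranked response.
    return list(combinations(ranked_responses, 2))


def pairwise_loss(score_preferred, score_rejected):
    """Pairwise logistic loss: -log(sigmoid(score_preferred - score_rejected)).

    Assumed objective for illustration; written as a numerically stable
    softplus of (score_rejected - score_preferred).
    """
    x = score_rejected - score_preferred
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical ranking of three model responses, best first.
ranked = ["response A", "response B", "response C"]
pairs = ranking_to_pairs(ranked)
```

A ranking of k responses yields k(k-1)/2 pairs, and the loss is small when the reward model scores the preferred response above the rejected one.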

Zubair Sarim🎩⚡🎭Ⓜ️
@zubi765
🍖 🍖 🍖 🍖 🍖

Fiza Ansari
@fizaansari
🍖×90

Barbie 🎩🎭Ⓜ️ ✪
@hafsa
🍖

Ayesha Khan
@usman786
🍖 x 209