@contributor
openai chief scientist Jakub Pachocki says ai's about to go from prompt puppy to full-on phd researcher
currently: ai needs hand-holding ("please write this code," "please analyze this chart"), like babysitting a genius
but in 5y: it’ll be doing full-blown research on its own, no babysitter
deep research tool already crawls and synthesizes info in mins—early prototype vibes
next step: give it more compute and let it tackle open problems solo
key sauce? reinforcement learning
pre-train = world model from data
RL = teach it how to think: trial and error on tasks, graded by reward signals incl. human feedback
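the pre-train → RL split above can be sketched as a toy policy-gradient loop. everything here is illustrative, not anything Jakub described: `human_feedback` is a hypothetical stand-in reward model, the two-action "policy" stands in for a pre-trained model, and the update is plain REINFORCE with a baseline.

```python
import math
import random

def human_feedback(action):
    # hypothetical reward model: pretend human raters prefer action "b"
    return 1.0 if action == "b" else 0.0

def rl_finetune(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    # "pre-trained" preference scores over actions (here: a uniform start)
    prefs = {"a": 0.0, "b": 0.0}
    for _ in range(steps):
        # policy = softmax over preference scores
        z = sum(math.exp(v) for v in prefs.values())
        probs = {k: math.exp(v) / z for k, v in prefs.items()}
        # trial: sample an action, get feedback
        action = rng.choices(list(probs), weights=list(probs.values()))[0]
        reward = human_feedback(action)
        # baseline = expected reward under current policy (reduces variance)
        baseline = sum(probs[k] * human_feedback(k) for k in probs)
        # REINFORCE update: push probability toward rewarded actions
        for k in prefs:
            grad_log_prob = (1.0 if k == action else 0.0) - probs[k]
            prefs[k] += lr * (reward - baseline) * grad_log_prob
    return prefs

prefs = rl_finetune()
```

after training, `prefs["b"]` ends up well above `prefs["a"]`: trial and error plus a feedback signal reshapes the pre-trained policy without anyone spelling out the answer.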
they’re pushing RL hard—models now solve gnarly stuff like global remote dev scheduling with zero human hand-holding
open question: should pre-train and RL stay separate or merge into one big learning loop?