:grin:
@grin
just spent an hour helping a friend debug some lovable-generated code. he's on hour 6 with this bug

super bearish on vibecoding with llms. they're great for prototyping but cannot be trusted. we need a different model architecture that can actually understand and reason about the code
13 replies
2 recasts
34 reactions
Garrett
@garrett
we're like 2-3 months in. feels like the models will get much better at debugging and helping us mere mortals understand what's actually happening in the code
1 reply
0 recasts
1 reaction
:grin:
@grin
i dont believe that. all our models are based on transformers and are ultimately token predictors. there's no sense in which they "understand" your code or can model what will happen when they make changes. no amount of finetuning or chain-of-thought-ing is gonna change that. needs a fundamentally new design
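to be concrete about "token predictor": here's a toy sketch, purely illustrative and not any real model (a real transformer conditions on the whole context, not just the last token, but its output is still just a next-token distribution). generation is sampling the next token over and over, and nothing in that loop runs, type-checks, or simulates the code being emitted

import random

# toy next-token table standing in for a language model's output distribution.
# (a real transformer conditions on the full preceding context, but it still
# only emits a probability distribution over the next token.)
NEXT_TOKEN_PROBS = {
    "const": {"x": 0.9, "result": 0.1},
    "x": {"=": 1.0},
    "result": {"=": 1.0},
    "=": {"fetchData()": 0.6, "42": 0.4},
    "fetchData()": {";": 1.0},
    "42": {";": 1.0},
    ";": {"<eos>": 1.0},
}

def sample_next(token: str) -> str:
    probs = NEXT_TOKEN_PROBS[token]
    return random.choices(list(probs), weights=list(probs.values()))[0]

def generate(prompt: str, max_len: int = 16) -> str:
    out = [prompt]
    while out[-1] != "<eos>" and len(out) < max_len:
        # the only operation in the loop: pick a plausible next token.
        # nothing executes the code, tracks program state, or predicts
        # what the snippet will do when run.
        out.append(sample_next(out[-1]))
    return " ".join(t for t in out if t != "<eos>")

print(generate("const"))  # e.g. "const x = fetchData() ;"

swap the lookup table for a trillion-parameter transformer and the decoding loop is the same shape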
2 replies
0 recasts
0 reactions
hillis
@hillis
What would a fundamentally new design that solved for this look like?
1 reply
0 recast
2 reactions
:grin:
@grin
been thinking about this all week. i dont have a good answer because im not an expert on the science and research in this area, and also because we don't have a good explanation for what a "mental model" even is

if we're staying within the current paradigm and trying to improve it, my proposal is to make the models more autistic at every step. for base models, that means curating the common crawl data towards nerdy things. for finetuning it means more examples of solving technical problems, more direct and fact-focused answers, more examples of predicting what effect a given code change would have on a system. for RL it means training against a JS interpreter in addition to the reward model

some of this is stereotyping and not PC, but if transformer-based neural nets do a good job of mimicking human brains (which seems true for some areas but only one piece of the puzzle) then we should have them emulate the humans who are best at programming already
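rough sketch of what i mean by training against a JS interpreter. not a real pipeline, the harness and function names here are made up, and you'd obviously sandbox the node process, but the point is that the reward comes from actually running the model's output instead of only asking a learned reward model how plausible it looks

import subprocess

def interpreter_reward(generated_js: str, tests_js: str, timeout_s: float = 5.0) -> float:
    # run the model's JS plus a tiny test harness in node and return the
    # fraction of checks that pass. hypothetical sketch, not a real trainer.
    program = f"""
{generated_js}
let passed = 0, total = 0;
function check(name, cond) {{ total++; if (cond) passed++; }}
{tests_js}
console.log(passed + "/" + total);
"""
    try:
        result = subprocess.run(
            ["node", "-e", program],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 0.0  # hangs and infinite loops earn zero reward
    if result.returncode != 0:
        return 0.0  # syntax errors and crashes earn zero reward
    passed, total = result.stdout.strip().splitlines()[-1].split("/")
    return int(passed) / max(int(total), 1)

def combined_reward(generated_js: str, tests_js: str,
                    reward_model_score: float, alpha: float = 0.5) -> float:
    # blend the execution signal with the learned reward model,
    # i.e. "in addition to the reward model" above
    return alpha * interpreter_reward(generated_js, tests_js) + (1 - alpha) * reward_model_score

# e.g. combined_reward("function add(a, b) { return a + b; }",
#                      'check("adds", add(2, 3) === 5);',
#                      reward_model_score=0.8)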
0 replies
0 recasts
1 reaction