Dan Romero
@dwr
Wonder if ChatGPT will be the last major model to be trained on the open web? Will sites start using robots.txt to specifically disallow LLM crawlers unless they get paid for the data?
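For illustration, here is a minimal sketch of what such a robots.txt rule could look like and how a well-behaved crawler would check it, using Python's standard `urllib.robotparser`. The user-agent name `ExampleLLMBot`, the `example.com` URLs, and the `may_crawl` helper are hypothetical, not taken from any real crawler. Honoring the rule is voluntary on the crawler's side; nothing here enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) LLM crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Allow: /
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_crawl("ExampleLLMBot", "https://example.com/post"))
    print(may_crawl("SomeSearchBot", "https://example.com/post"))
```

A crawler that respects the file would skip the site when `may_crawl` returns `False`; one that ignores robots.txt is unaffected by it.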
Venkatesh Rao ☀️
@vgr
I doubt it. We’re at the start of an arms race between training and membership inference algorithms. https://arxiv.org/abs/2301.09956 Even if the Western majors comply with regulatory regimes and respect robots.txt directives, many others won’t. The only defense is encryption, not regulation.
Dan Romero
@dwr
So then people will go dark, i.e., auth-only?