shoni.eth
@alexpaden
Yesterday I ran LSH deduplication on sanitized thread data and, surprisingly, only 2-3% of threads landed in the near-duplicate bucket (images not considered). I expected 10-20% to be gms, though that doesn’t make much sense in retrospect.
1 reply
1 recast
3 reactions
shoni.eth
@alexpaden
https://mattilyra.github.io/2017/05/23/document-deduplication-with-lsh.html
0 reply
0 recast
1 reaction
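For readers who haven't seen the technique, here is a minimal sketch of MinHash plus LSH banding for near-duplicate text. It is not the pipeline used in the post above. The shingle size, `NUM_PERM`, `BANDS`, the 0.7 threshold, and all function names are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import hashlib
import re
from collections import defaultdict
from itertools import combinations

NUM_PERM = 64             # assumed signature length (number of hash functions)
BANDS = 16                # assumed number of LSH bands
ROWS = NUM_PERM // BANDS  # rows per band (4 here)


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short posts (e.g. "gm") become a single shingle.
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    """Seeded 64-bit hash; each seed stands in for one MinHash permutation."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash(tokens):
    """MinHash signature: the minimum hash of the token set under each seed."""
    return tuple(min(_hash(seed, t) for t in tokens) for seed in range(NUM_PERM))


def candidate_pairs(docs):
    """Bucket signatures band by band; docs sharing any bucket are candidates."""
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash(shingles(text))
        for b in range(BANDS):
            band = sig[b * ROWS:(b + 1) * ROWS]
            buckets[(b, band)].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.7):
    """Confirm LSH candidates with exact Jaccard similarity on their shingles."""
    sh = {doc_id: shingles(text) for doc_id, text in docs.items()}
    return {
        (a, b)
        for a, b in candidate_pairs(docs)
        if jaccard(sh[a], sh[b]) >= threshold
    }
```

Calling `near_duplicates` on a dict that maps thread IDs to sanitized text returns the ID pairs that pass both the banding step and the exact Jaccard check. Dividing the number of threads that appear in any pair by the total gives a duplicate rate like the 2-3% quoted above, though the exact figure depends heavily on the shingle size and threshold.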