shoni.eth
@alexpaden
Just released the largest open dataset of Farcaster threads with embeddings!
📊 24.3M high-quality threads
🔍 512-dim Voyager embeddings (f32)
✨ Spam-filtered & engagement-ranked
📅 Complete Farcaster history to May 2025
Perfect for semantic search, clustering, recommendation systems & social analysis
🤗 https://huggingface.co/datasets/shoni/farcaster
8 replies
9 recasts
41 reactions
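
For a rough sense of what the numbers in the announcement imply, here is a minimal sketch in plain Python. It estimates the raw size of 24.3M f32 vectors of width 512 and runs a brute-force cosine-similarity lookup. The thread does not state the dataset's column names or file layout, so the vectors below are synthetic, and `raw_embedding_bytes`, `normalize`, and `top_k` are hypothetical helper names, not part of the dataset or any library.

```python
import heapq
import math
import random

DIM = 512                  # embedding width stated in the post
BYTES_PER_F32 = 4          # an f32 component takes 4 bytes
NUM_THREADS = 24_300_000   # "24.3M" from the post, treated as exact for the estimate


def raw_embedding_bytes(rows, dim=DIM):
    """Uncompressed size in bytes of `rows` f32 vectors of width `dim`."""
    return rows * dim * BYTES_PER_F32


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query, corpus, k=3):
    """Return the k best (cosine_similarity, index) pairs, highest first."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(v))), i)
        for i, v in enumerate(corpus)
    )
    return heapq.nlargest(k, scored)


# Synthetic stand-in for real thread embeddings (assumption: random Gaussian vectors).
rng = random.Random(0)
corpus = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(100)]

# Raw embedding payload only, ignoring text columns and file compression.
print(f"{raw_embedding_bytes(NUM_THREADS) / 1e9:.1f} GB")  # prints "49.8 GB"

# Querying with a stored vector should return that same vector first.
print(top_k(corpus[7], corpus, k=1)[0][1])  # prints 7
```

At the full 24.3M rows a brute-force scan like this would be slow, so an approximate nearest-neighbor index is the usual choice for search at that scale. The sketch is only meant to show the arithmetic and the similarity measure.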

shoni.eth
@alexpaden
@ruminations @yesyes @jc4p
1 reply
0 recast
2 reactions

shoni.eth
@alexpaden
couple notes:
- v1 only includes 1 layer deep of quote recursion
- v1 does not include url/image/video descriptions in text
0 reply
0 recast
2 reactions

Kasra Rahjerdi
@jc4p
this is freaking awesome!!!!!!!!
1 reply
0 recast
1 reaction

Mo
@meb
This is very cool! Would you mind sharing more about the process?
- potential applications in mind
- how you ingested the data
- various embeddings and other data mangling done
- cloud vs local usage
- total cost to generate this
1 reply
0 recast
2 reactions

Timur Badretdinov
@destiner.eth
that's really cool, i really wish i had time to play with this
0 reply
0 recast
1 reaction

jtgi
@jtgi
cool
0 reply
0 recast
1 reaction

Y8t
@y8t
Dope
0 reply
0 recast
1 reaction

Colin Charles
@bytebot
Thank you for this!
0 reply
0 recast
0 reaction