borodutch
@farcasteradmin.eth
so i'm building a db of all casts ever, with embeddings. holy crap, you guys yap a lot and it shows. @v gj on getting snapchain working, have no idea how you guys pulled it off, but the amount of data is RIDICULOUS
4 replies
1 recast
37 reactions
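
For readers wondering what "a db of all casts with embeddings" could look like, here is a minimal sketch using only the Python standard library. It is not borodutch's actual setup: the table layout, the `casts` table name, and every function name are hypothetical, and the brute-force scan in `search` is only reasonable for small samples. At the scale described in the post, a dedicated vector index would likely be needed.

```python
import math
import sqlite3
from array import array


def pack(vec):
    """Serialize a list of floats into a compact float32 BLOB."""
    return array("f", vec).tobytes()


def unpack(blob):
    """Inverse of pack(): turn a float32 BLOB back into a list of floats."""
    out = array("f")
    out.frombytes(blob)
    return list(out)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def open_db(path=":memory:"):
    """Open (or create) the database. The schema is an assumption."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS casts ("
        "hash TEXT PRIMARY KEY, text TEXT NOT NULL, embedding BLOB NOT NULL)"
    )
    return db


def add_cast(db, cast_hash, text, embedding):
    """Insert one cast, overwriting any existing row with the same hash."""
    db.execute(
        "INSERT OR REPLACE INTO casts VALUES (?, ?, ?)",
        (cast_hash, text, pack(embedding)),
    )


def search(db, query_vec, k=3):
    """Brute-force nearest neighbours: score every row, keep the top k."""
    rows = db.execute("SELECT hash, text, embedding FROM casts")
    scored = [(cosine(query_vec, unpack(blob)), h, t) for h, t, blob in rows]
    return sorted(scored, reverse=True)[:k]
```

Usage would be along the lines of `db = open_db("casts.db")`, one `add_cast(...)` per cast, then `search(db, query_embedding)`, where the embeddings come from whichever model you pick.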

AI with L
@alwithl.eth
this may help https://huggingface.co/datasets/jc4p/farcaster-casts-embeddings @jc4p goated
1 reply
0 recast
3 reactions

Kasra Rahjerdi
@jc4p
this dataset is very out of date and badly documented 🙈🙈 but these days i would use https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0 which is super good on text like casts
1 reply
0 recast
4 reactions
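
A rough sketch of what running that model locally might look like, assuming the `sentence-transformers` package is installed and the weights can be downloaded from Hugging Face. The `embed_casts` helper and its parameters are hypothetical, not something from the thread; the model is loaded lazily so the function can also be exercised with a stand-in object.

```python
def embed_casts(texts, model=None, batch_size=32):
    """Embed a list of cast texts and return one vector per text.

    If no model is passed in, load the Snowflake model mentioned above.
    Loading downloads the weights on first use, which can take a while.
    """
    if model is None:
        # Imported here so the helper can be imported without the package.
        from sentence_transformers import SentenceTransformer

        model = SentenceTransformer("Snowflake/snowflake-arctic-embed-l-v2.0")

    # normalize_embeddings=True returns unit-length vectors, so a plain
    # dot product between two of them equals their cosine similarity.
    return model.encode(texts, batch_size=batch_size, normalize_embeddings=True)
```

For "all casts ever" you would probably call this over chunks of the data rather than one giant list, and load the model once and pass it in via `model=`.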

borodutch
@farcasteradmin.eth
ig it's cheaper than openai embeddings? or better embeddings in general? cuz i wanted to use openai embeddings
1 reply
0 recast
3 reactions

Kasra Rahjerdi
@jc4p
i haven't used openai's embeddings in a whileeee, but it's probably fine for whatever your use case is -- rn the best SaaS one is Gemini's I think (which does support batch API also so you can save a lot on the $$) but I just am a fan of doing things locally when possible!
1 reply
0 recast
3 reactions

avi
@avichalp.eth
how much did it cost to compute embeddings with gemini?
0 reply
0 recast
2 reactions

Kasra Rahjerdi
@jc4p
i haven’t done gemini for every single cast yet, only for other datasets, but in general if you use the batch api (so it’s async) it’s half the price of normal: https://ai.google.dev/gemini-api/docs/batch-api#batch-embedding
0 reply
0 recast
2 reactions
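
To make the "half the price" point concrete, here is a back-of-the-envelope estimator. The thread gives no actual prices or token counts, so the numbers in the usage note are placeholders; plug in the current rate from the provider's pricing page. The function name and parameters are made up for illustration.

```python
def embedding_cost(num_tokens, price_per_million_usd, batch=False):
    """Estimate embedding cost in USD.

    price_per_million_usd is the normal (synchronous) price per one
    million input tokens. With batch=True the result is halved, following
    the "half the price of normal" claim in the reply above.
    """
    cost = num_tokens / 1_000_000 * price_per_million_usd
    return cost / 2 if batch else cost
```

For example, with a placeholder rate of 0.10 USD per million tokens and a made-up corpus of one billion tokens, `embedding_cost(1_000_000_000, 0.10)` comes to about 100 USD, and adding `batch=True` brings it to about 50 USD.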