christopher
@christopher
Launching Trek, an open source web content extraction library built in Rust! A core part of our work is understanding the content behind any link on the Internet. That also means extracting metadata quickly so users can get context, e.g. in a feed. To do this, we're building on @kepano's work on Defuddle, and then some. Trek also compiles to WASM, so anyone can extract content in a clean, decluttered way in their TS/JS project. It leverages lol_html from Cloudflare to stream HTML in for content extraction, instead of building the entire page and "trekking" the DOM as a normal scraper would. This means it's really fast and, more importantly, memory efficient. Check out the playground here: https://officialunofficial.github.io/trek/ Docs: https://github.com/officialunofficial/trek
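For anyone curious what "streaming instead of building the DOM" looks like in practice, here is a rough, std-only Rust sketch of the idea. It is not Trek's code and it does not use lol_html; `TitleScanner`, `feed`, and `title` are hypothetical names made up for illustration. It assumes a lowercase, attribute-free `<title>` tag and only pulls out the page title, but it shows the key property: the scanner sees the page one chunk at a time and keeps only a few carry-over bytes between chunks, never the whole document.

```rust
/// Which part of the document the scanner is currently in.
enum State {
    Searching,
    InTitle,
    Done,
}

// Assumption for this sketch: lowercase tags with no attributes.
const OPEN: &[u8] = b"<title>";
const CLOSE: &[u8] = b"</title>";

/// Hypothetical streaming scanner (illustration only, not Trek's API).
struct TitleScanner {
    state: State,
    buf: Vec<u8>,   // unconsumed bytes carried over between chunks
    title: Vec<u8>, // title bytes collected so far
}

/// Byte offset of the first occurrence of `needle` in `haystack`.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

impl TitleScanner {
    fn new() -> Self {
        TitleScanner { state: State::Searching, buf: Vec::new(), title: Vec::new() }
    }

    /// Feed the next chunk of HTML. Memory stays bounded by the chunk size
    /// plus the title length, regardless of how large the page is.
    fn feed(&mut self, chunk: &[u8]) {
        if matches!(self.state, State::Done) {
            return; // title already found, ignore the rest of the page
        }
        self.buf.extend_from_slice(chunk);
        loop {
            match self.state {
                State::Searching => match find(&self.buf, OPEN) {
                    Some(i) => {
                        // Drop everything up to and including "<title>".
                        self.buf.drain(..i + OPEN.len());
                        self.state = State::InTitle;
                    }
                    None => {
                        // Keep only a tail that could be a partial "<title>".
                        let keep = self.buf.len().min(OPEN.len() - 1);
                        let cut = self.buf.len() - keep;
                        self.buf.drain(..cut);
                        return;
                    }
                },
                State::InTitle => match find(&self.buf, CLOSE) {
                    Some(i) => {
                        self.title.extend_from_slice(&self.buf[..i]);
                        self.buf.clear();
                        self.state = State::Done;
                        return;
                    }
                    None => {
                        // Move the safe prefix into the title and keep a tail
                        // that could be a partial "</title>".
                        let keep = self.buf.len().min(CLOSE.len() - 1);
                        let cut = self.buf.len() - keep;
                        self.title.extend(self.buf.drain(..cut));
                        return;
                    }
                },
                State::Done => return,
            }
        }
    }

    /// The extracted title, once the closing tag has been seen.
    fn title(&self) -> Option<String> {
        match self.state {
            State::Done => Some(String::from_utf8_lossy(&self.title).trim().to_string()),
            _ => None,
        }
    }
}

fn main() {
    // Chunks deliberately split the tags to mimic network reads.
    let chunks = ["<html><head><ti", "tle>Hello, ", "Trek</tit", "le></head><body>...</body>"];
    let mut scanner = TitleScanner::new();
    for chunk in chunks {
        scanner.feed(chunk.as_bytes());
    }
    println!("{:?}", scanner.title());
}
```

A real streaming rewriter handles attributes, casing, comments, and entities properly; the point here is only the chunk-at-a-time shape and why it is light on memory.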
christopher
@christopher
Playground is here, oops. https://officialunofficial.github.io/trek/playground/
pugson
@pugson
ok i need this
dylan
@dylsteck.eth
dang y’all are cooking with the open source libraries!! looks very cool
Ethan Daya
@ethandaya
helll yeah this is sick
agusti
@bleu.eth
oooh this is awesome ty
Darryl Yeo 🛠️
@darrylyeo
👀 👀 👀
Matthew Fox 🌐
@matthewfox
beast