christopher on Farcaster

Launching Trek, an open source web content extraction library built in Rust!

A core part of our work is understanding the content behind any link on the Internet. That also means extracting metadata quickly so users can get context, e.g. in a feed. To do this, we're building on @kepano's work on Defuddle, and then some.

Trek also compiles to WASM, so anyone can extract clean, decluttered content data in a TS/JS project. It leverages lol_html from Cloudflare to stream HTML in for content extraction, instead of building the entire page and "trekking" the DOM as a normal scraper would. This means it's really fast and, more importantly, memory efficient.

Check out the playground here: https://officialunofficial.github.io/trek/
Docs: https://github.com/officialunofficial/trek
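To make the streaming idea concrete, here is a toy, std-only sketch of pulling a page's `<title>` out of HTML that arrives in chunks, without ever holding the whole document or a DOM in memory. This is not Trek's API and it does not use lol_html; `TitleScanner` and `extract_title` are hypothetical names made up for illustration, and the scanner deliberately ignores comments, scripts, and attribute values containing `>`.

```rust
/// Toy streaming scanner (hypothetical, not Trek's or lol_html's API).
/// It keeps only the current tag name and the title text, never the page.
#[derive(Default)]
struct TitleScanner {
    tag: String,    // characters of the tag currently being read
    in_tag: bool,   // true between '<' and '>'
    in_title: bool, // true between <title> and </title>
    title: String,  // title text collected so far
    done: bool,     // true once </title> has been seen
}

impl TitleScanner {
    /// Feed the next chunk. A tag may be split across chunk boundaries.
    /// Simplification: chunks are &str, so each one is valid UTF-8.
    fn feed(&mut self, chunk: &str) {
        for c in chunk.chars() {
            if self.done {
                return; // title found, the rest of the stream is skipped
            }
            if self.in_tag {
                if c == '>' {
                    self.in_tag = false;
                    // First token of the tag, e.g. "title" or "/title".
                    let name = self
                        .tag
                        .split_whitespace()
                        .next()
                        .unwrap_or("")
                        .to_ascii_lowercase();
                    match name.as_str() {
                        "title" => self.in_title = true,
                        "/title" if self.in_title => {
                            self.in_title = false;
                            self.done = true;
                        }
                        _ => {}
                    }
                    self.tag.clear();
                } else {
                    self.tag.push(c);
                }
            } else if c == '<' {
                self.in_tag = true;
            } else if self.in_title {
                self.title.push(c);
            }
        }
    }

    /// The trimmed title, once the closing tag has been seen.
    fn title(&self) -> Option<&str> {
        if self.done {
            Some(self.title.trim())
        } else {
            None
        }
    }
}

/// Convenience wrapper: run the scanner over a sequence of chunks.
fn extract_title(chunks: &[&str]) -> Option<String> {
    let mut scanner = TitleScanner::default();
    for chunk in chunks {
        scanner.feed(chunk);
    }
    scanner.title().map(str::to_owned)
}
```

Calling `extract_title(&["<html><head><ti", "tle>Hello, ", "Trek</title>"])` walks the three chunks once and returns the title even though the opening tag is split across a chunk boundary. Memory use depends on the longest tag and the title length, not on the page size, which is the property the post attributes to the streaming approach.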

christopher

Playground is here, oops. https://officialunofficial.github.io/trek/playground/

kepano

cool! is there anything you learned building this that could be incorporated into defuddle?

pugson

1 reply

0 recast

6 reactions

dang y’all are cooking with the open source libraries!! looks very cool

Ethan Daya

helll yeah this is sick

agusti

oooh this is awesome ty
