Kasra Rahjerdi
@jc4p
hey if you're wondering where the public data dump for April is, I need help figuring out the implications of my decisions with the Snapchain setup -- would love any ideas. Short of it: Snapchain prefers a 24/7 running instance, can't afford to run a heavy box 24/7, stuck on filtering 90GB of data on 8GB of RAM
10 replies
3 recasts
21 reactions
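One way to approach "90GB of data on 8GB of RAM" is to stream the dump instead of loading it. The sketch below is only an illustration under assumptions not stated in the thread: it assumes the data is newline-delimited JSON and that each record has a `type` field. The function names and the field name are hypothetical. Only one line is held in memory at a time, so peak memory stays roughly constant regardless of file size.

```python
import json
from typing import Iterator


def iter_matching(path: str, wanted_type: str) -> Iterator[dict]:
    """Yield records whose (hypothetical) "type" field matches, one line at a time."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object is iterated lazily, line by line
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if record.get("type") == wanted_type:
                yield record


def write_filtered(src: str, dst: str, wanted_type: str) -> int:
    """Stream matching records from src into dst; return how many were written."""
    count = 0
    with open(dst, "w", encoding="utf-8") as out:
        for record in iter_matching(src, wanted_type):
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

If the dump is in another format (a database export, Parquet, and so on), the same idea applies, but the reader would need to be swapped for one that iterates in chunks.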
draftcode
@draftcode
have you thought about using Redis to store what you need to access, under a TTL that gets renewed by whatever criteria you pick?
1 reply
0 recast
1 reaction
Kasra Rahjerdi
@jc4p
@christopher has a great solution with this (where you can like, do extra filtering at the redis consumption layer) https://github.com/officialunofficial/waypoint -- my main issue rn stems from doing lifetime dumps instead of incremental dumps, if i had a queue of the last day's likes or something it would be a lot easier
2 replies
0 recast
2 reactions
christopher
@christopher
That's how Waypoint and the Redis Streams work. We build the last day's worth of messages, then you create a specific consumer group (likes-consumer-group) and then provision N number of consumers to burn it down pub-sub style.
2 replies
0 recast
1 reaction
christopher
@christopher
We have an embeds consumer group that only listens for embeds, for example, and multiple consumers to burn it down. That means if the hub goes down temporarily, we don't lose buffered workers with it.
0 reply
0 recast
1 reaction
Kasra Rahjerdi
@jc4p
yeahhh that's very smart. i just... don't need persistent synchronized data.. i just want a monthly dump, but maybe that's the best move
3 replies
0 recast
1 reaction