caz.eth
@caz.eth
The Warpcast team updated their dataset of user spam labels a few hours ago. They have been doing this weekly for about a month, and I have been going through the data on each release. This update was probably the biggest change so far in the distribution of spam labels. After the previous release, only ~9% of users with labels were labeled as unlikely to be spam and 42% as might engage in spam. The remaining 49% were labeled as likely to engage in spam (these are the three labels that exist; a user can also have no label). After this release, 18% of labeled users are labeled as unlikely, double the previous share, and only 35% as might be spam. So there has been a big increase in users labeled as unlikely to be spam, and it comes from a relabeling of existing users: about 19k users who were previously labeled as likely to be spam and ~29k users who were labeled as might be spam are now labeled as unlikely. ->
1 reply
2 recasts
5 reactions
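For anyone who wants to check the shift, here is a minimal sketch of the percentage-point change per label between the two releases. The shares are the rounded figures quoted above; the 47% "likely" share for the current release is an assumption derived as 100 - 18 - 35, and the dictionary keys and function name are made up for illustration, not taken from the dataset.

```python
# Share of labeled users per label, in percent, as quoted in the post.
# The current "likely" share is derived (100 - 18 - 35), not quoted directly.
previous = {"unlikely": 9, "might": 42, "likely": 49}
current = {"unlikely": 18, "might": 35, "likely": 47}


def share_shift(before, after):
    """Return the percentage-point change per label between two releases."""
    return {label: after[label] - before[label] for label in before}


shift = share_shift(previous, current)
print(shift)
```

Because the inputs are rounded percentages, the result is only approximate, but the changes should sum to zero if both releases cover all three labels.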
caz.eth
@caz.eth
On the other hand, among new users (in this context, users who didn't previously have a label but now have one; these seem to be largely new Farcaster users, though some are older users who hadn't had a label), only 10% are labeled unlikely and 17% as might be spam. The other 73% of users who got their first label in this release are labeled as likely to be spam. Users seem to be consistently labeled as more spammy when they get their first label than the overall average. In total, 47% of users with a label are labeled likely to be spam, which I believe are the users that Warpcast will remove from follow counts. The dataset contains ~500k users in total, so roughly 235k users will no longer be shown in follow counts.
1 reply
0 recast
1 reaction
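The 235k estimate is simple arithmetic on the rounded figures above. A quick sketch, assuming the ~500k total and the 47% share are both exact (they are approximations in the post), with a hypothetical helper name:

```python
def estimate_count(total_users, share_percent):
    """Estimate how many users fall under a label from its percentage share."""
    return round(total_users * share_percent / 100)


# Rounded figures from the post: ~500k users in the dataset, 47% labeled likely.
likely_spam_users = estimate_count(500_000, 47)
print(likely_spam_users)

# Sanity check on the first-time label shares quoted in the post.
new_user_shares = {"unlikely": 10, "might": 17, "likely": 73}
print(sum(new_user_shares.values()))
```

Whether these users actually drop out of follow counts depends on how Warpcast applies the labels, which the dataset itself does not say.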