Since everything done on behalf of your instance is logged, detecting whether you have a large number of bots or invalid users isn’t that challenging.

These queries can be run via docker exec -it, by remoting into the container, with PostgreSQL query tools, or through pgAdmin.

To list all comments made by local users of your instance (this includes comments they made on remote instances):

SELECT
	p.actor_id
	, p.name
	, c.content AS comment
FROM public.comment c
JOIN public.person p ON p.id = c.creator_id
WHERE
	p.local = true
	AND p.admin = false -- Exclude admins
;

To list all posts created by local users of your instance:

SELECT
	p.actor_id
	, po.name AS title
	, po.body AS body
FROM public.post po
JOIN public.person p ON p.id = po.creator_id
WHERE
	p.local = true
	AND p.admin = false -- Exclude admins
;

Lastly, here is a query to identify users who consistently upvote or downvote comments by the same user over and over.

SELECT
	p.id
	, p.name
	, p.actor_id
	, cr.name AS creator
	, count(1) AS votes
FROM public.comment_like l
JOIN public.comment c ON c.id = l.comment_id
JOIN public.person p ON p.id = l.person_id
JOIN public.person cr ON cr.id = c.creator_id
WHERE
	p.id != cr.id -- Ignore votes on your own comments
	AND p.local = true
	AND p.admin = false -- Exclude admins
GROUP BY p.id, p.name, p.actor_id, cr.name
ORDER BY count(1) DESC
;
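
As a rough variation on the vote query, the sketch below splits the count into upvotes and downvotes and hides pairs with only a few votes. It assumes comment_like has a score column holding 1 for an upvote and -1 for a downvote, so check your schema first. The threshold of 20 is an arbitrary placeholder, not a recommended value.

```sql
SELECT
	p.id
	, p.name
	, cr.name AS creator
	, count(*) FILTER (WHERE l.score = 1) AS upvotes     -- assumes score = 1 means upvote
	, count(*) FILTER (WHERE l.score = -1) AS downvotes  -- assumes score = -1 means downvote
FROM public.comment_like l
JOIN public.comment c ON c.id = l.comment_id
JOIN public.person p ON p.id = l.person_id
JOIN public.person cr ON cr.id = c.creator_id
WHERE
	p.id != cr.id
	AND p.local = true
	AND p.admin = false -- Exclude admins
GROUP BY p.id, p.name, cr.id, cr.name
HAVING count(*) >= 20 -- placeholder threshold, tune for your instance
ORDER BY count(*) DESC
;
```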

If anyone has ideas for other queries that could detect suspicious activity, please let me know.

Edit: added a WHERE clause to exclude admins. If your admins are spambots, you have bigger issues to worry about.

  • Deedasmi@lemmy.timdn.com
    1 year ago

    Well, my bots haven’t done anything apparently, but there are a couple thousand of them. I’m surprised considering my instance wasn’t even listed anywhere lol. Have a query for safely deleting all but two accounts by chance?

      • Deedasmi@lemmy.timdn.com
        1 year ago

        21,000 users with no comments, posts, or votes on an instance that’s never been advertised and isn’t on the community browser or thefederation… Yeah nah lol.
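
        A count like that could be reproduced with something along these lines. This is only a sketch: it assumes post votes live in a post_like table with a person_id column, mirroring comment_like, which may not match your Lemmy version.

        ```sql
        SELECT count(*) AS inactive_local_users
        FROM public.person p
        WHERE
        	p.local = true
        	AND p.admin = false -- Exclude admins
        	AND NOT EXISTS (SELECT 1 FROM public.comment c WHERE c.creator_id = p.id)
        	AND NOT EXISTS (SELECT 1 FROM public.post po WHERE po.creator_id = p.id)
        	AND NOT EXISTS (SELECT 1 FROM public.comment_like cl WHERE cl.person_id = p.id)
        	AND NOT EXISTS (SELECT 1 FROM public.post_like pl WHERE pl.person_id = p.id) -- post_like is an assumption
        ;
        ```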

        Can I just drop them from public.person and move on?

        • HTTP_404_NotFound@lemmyonline.com (OP)
          1 year ago

          I wouldn’t recommend that without knowing the schema/layout better.

          You can, however, ban them instead: UPDATE public.person SET banned = true WHERE -- criteria here
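
          As one possible way to fill in the criteria, the sketch below bans local, non-admin accounts that have never posted or commented. The choice of criteria is an assumption for illustration. It runs inside a transaction that rolls back by default, so you can inspect the returned rows before committing anything.

          ```sql
          BEGIN;

          UPDATE public.person p
          SET banned = true
          WHERE
          	p.local = true
          	AND p.admin = false -- Never touch admins
          	AND NOT EXISTS (SELECT 1 FROM public.comment c WHERE c.creator_id = p.id)
          	AND NOT EXISTS (SELECT 1 FROM public.post po WHERE po.creator_id = p.id)
          RETURNING p.id, p.name; -- Review which accounts would be banned

          ROLLBACK; -- Change to COMMIT once the list looks right
          ```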

        • freeskier@centennialstate.social
          1 year ago

          Yes, person table is top level. Delete from person table and it cascades down and deletes from other tables. User count also automatically updates. Just be careful because person table also contains federated users. There is a “local” column to determine if they are local users or not.

          I had about 6k bot accounts, but they were all unverified, so I just deleted all local unverified accounts from the person table.

          Just don’t go messing with the database without backups. My host supports snapshots so I did a quick snapshot before messing with anything.
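
          The delete described above might look roughly like the following. It is a sketch under assumptions: it presumes verification status is stored as email_verified on the local_user table, linked through local_user.person_id, which you should confirm against your own schema. If your instance never required email verification, every account may show as unverified, so check the returned rows carefully.

          ```sql
          BEGIN;

          DELETE FROM public.person p
          USING public.local_user lu -- assumed table holding local account details
          WHERE
          	lu.person_id = p.id
          	AND p.local = true  -- Leave federated users alone
          	AND p.admin = false -- Never touch admins
          	AND lu.email_verified = false -- assumed column name
          RETURNING p.id, p.name; -- Review which accounts would be deleted

          ROLLBACK; -- Change to COMMIT only after a backup and a review of the rows
          ```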

          • Deedasmi@lemmy.timdn.com
            1 year ago

            Ty. I have full disk, and literally one real user with one comment. Re-subbing would be the only annoying thing lol

            • HTTP_404_NotFound@lemmyonline.com (OP)
              1 year ago

              Remember, any federated content is also stored in your database. That is part of how this platform works.

              If you don’t have much disk space, you might consider joining a larger instance.

              (Also, you CAN clean up the activity table daily too.)

              Edit: I do have a Kubernetes CRD which automatically purges data older than a couple of days from the activity table.
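
              A manual purge could be as simple as the statement below. This is a sketch, not the job mentioned above: it assumes the activity table has a published timestamp column, and the two-day cutoff is just an example of "a couple of days".

              ```sql
              DELETE FROM public.activity
              WHERE published < now() - interval '2 days' -- assumed column name and example cutoff
              ;
              ```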