blog.thms.uk

Blocking Hashtags from your Mastodon instance

When running a small or single-user Mastodon instance, you will often use relays to bring content to your instance and increase the reach of your posts.

Sadly, relays also bring in more content from federated instances that you may not want to host or spread.

@stefan@social.stefanberger.net has written a really nice tool that allows you to delete such content, and I've extended it somewhat to make it more useful for me (and hopefully for other people).

You can see my version of it on GitHub at nanos/mastodon_block_hashtags.

How does this work?

This script fetches any posts that contain media attachments and are tagged with any of the tags you provide. If such a post isn't authored by anyone on your server, or by anyone followed by anyone on your server, its author will be suspended and all their content deleted from your server.
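
The rule described above can be sketched as a small predicate. This is only an illustration in Python, not the script's actual code: the `Post` type, its fields, and `should_suspend_author` are hypothetical names made up for this example, and comparing tags case-insensitively is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical, simplified view of a post (not Mastodon's real schema)."""
    author: str                                   # e.g. "someone@example.social"
    tags: set[str] = field(default_factory=set)   # hashtags without the leading '#'
    has_media: bool = False


def should_suspend_author(post, blocked_tags, local_accounts, followed_accounts):
    """Return True if the post's author would be suspended under the rule above."""
    # Only posts with media attachments are considered at all.
    if not post.has_media:
        return False

    # The post must carry at least one blocked tag.
    # Assumption: tags are compared case-insensitively.
    post_tags = {tag.lower() for tag in post.tags}
    blocked = {tag.lower() for tag in blocked_tags}
    if not post_tags & blocked:
        return False

    # Authors on your own server, or followed by anyone on it, are left alone.
    if post.author in local_accounts or post.author in followed_accounts:
        return False

    return True
```

With `blocked_tags={"nsfw"}`, a media post tagged `nsfw` from an unknown remote account matches, while the same post from a followed account does not.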

How do I use this?

The README has all the details, but the simplest way to run this is as a daily cron job through GitHub Actions.

The basic steps are:

  1. Fork nanos/mastodon_block_hashtags.
  2. Get an API token from your Mastodon instance, and provide it as a Secret.
  3. In config.sh, configure your Mastodon server and the tags you want to block. (My version comes with a fairly long list of NSFW tags built in, but you may very well have other needs.)
  4. Enable GitHub Actions for your fork.

Alternatively, you can run it as a standard cron job on your local machine or server; the README has full instructions.

How do I find block-worthy tags?

One way to get tags that concern a specific subject is to go into Postgres and execute a query like this:

SELECT t."name", count(*)
FROM media_attachments a
  JOIN statuses s
    ON s.id = a.status_id
  JOIN statuses_tags st
    ON st.status_id = s.id
  JOIN tags t
    ON t.id = st.tag_id
WHERE t."name" LIKE '%nsfw%' -- adjust this as desired, or remove this line altogether if you just want a count
GROUP BY t."name"
ORDER BY count(*) DESC;

You may (or may not 🙈) want to look at the timelines for those tags to check whether they are actually block-worthy. It's probably better not to go too broad, to reduce the risk of false positives.
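
If you have a list of candidate tags to review, a small helper can turn them into links you can open in the browser. This is a minimal sketch: `tag_timeline_urls` and the hostname `mastodon.example` are hypothetical, and it assumes your instance serves tag timelines under the usual `/tags/<name>` web path.

```python
from urllib.parse import quote


def tag_timeline_urls(server, tags):
    """Build one web timeline URL per candidate tag, for manual review."""
    base = server.rstrip("/")
    # Strip a leading '#' if present and percent-encode the tag name.
    return [f"https://{base}/tags/{quote(tag.lstrip('#'), safe='')}" for tag in tags]


# Hypothetical hostname and tags, purely for illustration.
for url in tag_timeline_urls("mastodon.example", ["nsfw", "#art"]):
    print(url)
```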

Gotchas