Full-text search in Mastodon
To say that search is viewed with great suspicion on Mastodon is an understatement. There is a feeling that search leads to abuse: that people will search for controversial posts or subjects they don't like, and just pile on.
Whilst I can understand that view, for me personally search is crucial to my enjoyment of social media: I like to know what others may have said about a subject before weighing in, so as not to repeat the same arguments over and over again, which is just tiring. I also like to see if anyone has already posted a cool article, to gauge consensus around it. Overall I love being able to find discussions other than by hashtags.
Unfortunately, default Mastodon search doesn't allow for any of that: even if full-text search is enabled, it only searches your own posts, mentions, favourites, and bookmarks, so it doesn't help you discover new stuff at all. But there are options that I'd like to discuss in this post.
Using Google
Using Google to search for Mastodon posts actually works pretty well. For example, if you want to find discussions on full-text search, you can search Google for site:mastodon.social full text search and get loads of results:
This works pretty well, but requires Google, and it works less well on less-known instances, which might not be indexed quite as well (e.g. searching with site:mstdn.thms.uk doesn't surface many useful results).
Extended Search on your own instance
Note: Since writing this post, VyrCossont has released a further patch that fully replaces the original and offers extended search of both accounts and posts. I've updated this part of the post throughout, to reference the new patch instead.
If you are running your own Mastodon instance, there is a great patch by VyrCossont that enables full-text search of all statuses and accounts on your instance. This pull request can be applied to your own code using standard git tools (it's specifically for glitch-soc, a fork of Mastodon that provides a host of really helpful additional features, but others have reported installing it on vanilla Mastodon with just minor adjustments).
I installed this on mstdn.thms.uk a short while ago, and absolutely love it! Have a look at the sort of search results I get on my instance:
Here is how you can set it up too, if you want to:
Set up Elasticsearch
Firstly, you need a server to run Elasticsearch on. I did experiment with running Elasticsearch on my Mastodon instance, but ultimately decided to create a separate server for it: Mastodon just runs far more reliably if it doesn't have to contend with Elasticsearch for server resources.
Currently, I'm running Elasticsearch on a small 1 vCPU, 2 GB DigitalOcean server for $14 per month. This seems to be plenty for my single-user instance, but depending on the size of your Mastodon instance, you may need a larger server.
Once you have your server, set up Elasticsearch. I simply followed Mastodon's instructions. Just make sure you set up your firewall to only allow access to port 9200 from your Mastodon instance!
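As a rough sketch of that firewall rule: the commands below assume the Elasticsearch server uses ufw (an assumption, any firewall that can filter by source address will do), with 203.0.113.10 standing in as a placeholder for your Mastodon server's IP. They are meant to be typed on the Elasticsearch server, not copied blindly.

```shell
# On the Elasticsearch server.
# 203.0.113.10 is a placeholder for the Mastodon server's IP address.
sudo ufw allow from 203.0.113.10 to any port 9200 proto tcp

# Review the resulting rules.
sudo ufw status numbered

# Then, from the Mastodon server, this should return a short JSON
# document describing the cluster (replace the placeholder with the
# Elasticsearch server's address):
#   curl http://<elasticsearch-ip>:9200
```

This only restricts access if ufw is enabled and denies other incoming traffic, which is its default policy. The same curl request from any other machine should then fail to connect.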
Apply and configure the Extended Search patch
You can use the standard Git tools to apply the PR. Once done, navigate to the mastodon directory on your server, and open .env.production. Add the following lines:
ES_ENABLED=true
ES_HOST=localhost # or the IP of your Elasticsearch instance
ES_PORT=9200
STATUS_SEARCH_SCOPE=discoverable # or public, or public_or_unlisted
ACCOUNT_SEARCH_SCOPE=discoverable # or all, or classic
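A typo in one of these values is easy to miss, so a small sanity check before restarting can help. This is only a sketch: env_value and check_search_env are hypothetical helper names, not part of Mastodon or the patch, and the accepted values are simply the ones listed in the comments above.

```shell
# Print the value of KEY from FILE, dropping any trailing "# comment".
# If the key appears more than once, the last occurrence wins.
env_value() {
  local key="$1" file="$2"
  sed -n "s/^${key}=\([^#[:space:]]*\).*/\1/p" "$file" | tail -n 1
}

# Check that both search scopes hold one of the values named above.
check_search_env() {
  local file="$1" status account
  status="$(env_value STATUS_SEARCH_SCOPE "$file")"
  account="$(env_value ACCOUNT_SEARCH_SCOPE "$file")"
  case "$status" in
    discoverable|public|public_or_unlisted) ;;
    *) echo "unexpected STATUS_SEARCH_SCOPE: '$status'" >&2; return 1 ;;
  esac
  case "$account" in
    discoverable|all|classic) ;;
    *) echo "unexpected ACCOUNT_SEARCH_SCOPE: '$account'" >&2; return 1 ;;
  esac
  echo "search scopes look valid: status=$status account=$account"
}

# Usage, from the mastodon directory:
#   check_search_env .env.production
```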
Now restart your services to apply the changes:
systemctl restart mastodon-web
systemctl restart mastodon-sidekiq
Finally, populate your Elasticsearch index:
RAILS_ENV=production bin/tootctl search deploy --only accounts
RAILS_ENV=production bin/tootctl search deploy --only statuses
The second command in particular can run for many hours (it ran for about 18 hours on my tiny, single-user instance), so I recommend using something like tmux to ensure it doesn't stop if you get disconnected, as well as running it during off-peak hours, if possible.
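If you haven't used tmux before, a typical session could look like the sketch below. The session name reindex is arbitrary, and the commands are meant to be typed interactively rather than run as a script.

```shell
# Start a named tmux session (the name "reindex" is arbitrary).
tmux new -s reindex

# Inside the session, from the mastodon directory:
RAILS_ENV=production bin/tootctl search deploy --only statuses

# Detach with Ctrl-b followed by d; the command keeps running.
# Later, after logging back in, re-attach to check on progress:
tmux attach -t reindex
```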
Using Extended Search
This patch also greatly extends Mastodon's existing advanced search syntax. Here are the details on the new operators. All can be inverted by prefixing them with -, except for before:, after:, scope:, and sort:.
Accounts
- #hashtag: find only accounts that are tagged with #hashtag
- domain:mastodon.social: find only accounts on mastodon.social
- is:
  - is:bot: account that created the post is a bot
  - is:group: account that created the post is a group
  - is:local: account is on this instance
  - is:memorial: account is a memorial
  - is:sensitive: account is marked as sensitive/🔞
- scope:following: restrict search to users that the searching user is following
Posts
- from:local_username or from:username@domain.tld: find posts from a specific user (This is actually a vanilla Mastodon feature, but is not documented anywhere.)
- #hashtag: find only posts that are tagged with #hashtag
- domain:mastodon.social: find only posts from mastodon.social
- lang:es: find posts in Spanish
- is:
  - is:bot: account that created the post is a bot
  - is:group: account that created the post is a group
  - is:local: post is on this instance
  - is:local_only: post is a local-only post (only applies to Glitch and Hometown, vanilla Mastodon doesn't have these)
  - is:reply: post is a reply to another post
  - is:sensitive: post is marked as sensitive/🔞
- has:
  - has:cw: post has a content warning
  - has:link: post has at least one link that is not a mention or hashtag, as determined by the parser used to fetch link preview cards
  - has:media: post has at least one media attachment
  - has:poll: post has a poll attachment
  - has:tag: post has at least one hashtag
- visibility: can be public, unlisted, private, direct, or limited
- before:, after: with a date: maps to an Elasticsearch date range search
- scope:classic: restrict search to current user's bookmarks, favs, boosts, own posts, etc. as in vanilla Mastodon
- sort:
  - sort:newest: display newest posts first (default)
  - sort:oldest: display oldest posts first
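These operators can be combined in a single query, for example elasticsearch has:link -is:reply. As a hedged sketch, the function below sends such a query to Mastodon's /api/v2/search endpoint with curl. I'm assuming here that a patched server accepts the same operators through the API as through the search box. INSTANCE and ACCESS_TOKEN are placeholders you need to supply (the token needs the read:search scope), and search_posts is a name made up for this example.

```shell
# Placeholders: your instance URL and a user access token.
INSTANCE="${INSTANCE:-https://mastodon.example}"
ACCESS_TOKEN="${ACCESS_TOKEN:-}"

# Search posts with the given query string.
# curl's --data-urlencode takes care of escaping spaces, ':' and '#'.
search_posts() {
  local query="$1"
  curl --silent --get "${INSTANCE}/api/v2/search" \
    --header "Authorization: Bearer ${ACCESS_TOKEN}" \
    --data-urlencode "q=${query}" \
    --data-urlencode "type=statuses"
}

# Usage:
#   search_posts 'elasticsearch has:link -is:reply'
#   search_posts 'from:username@domain.tld has:media sort:oldest'
```

The response is JSON, so piping it through a tool like jq makes it easier to read.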
If you previously had the original Extended search #2 patch applied
This new patch is a superset of the original patch. All features included in #2 are also included in #5. As such, migrating from the one to the other is very straightforward:
- Merge the VyrCossont/account-search branch.
- In your .env.production, replace SEARCH_SCOPE with STATUS_SEARCH_SCOPE, and add your choice for ACCOUNT_SEARCH_SCOPE
- Restart services:
sudo systemctl restart mastodon-sidekiq.service
sudo systemctl restart mastodon-web.service
- Reindex all accounts and posts:
RAILS_ENV=production bin/tootctl search deploy --only accounts
RAILS_ENV=production bin/tootctl search deploy --only statuses
The last command in particular can run for many hours (it ran for about 18 hours on my tiny, single-user instance), so I recommend using something like tmux to ensure it doesn't stop if you get disconnected, as well as running it during off-peak hours, if possible.
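The .env.production change from the second step can also be scripted. The following is a minimal sketch, not something shipped with the patch: migrate_search_env is a hypothetical helper name, and the default of discoverable for the account scope is just my pick for the example.

```shell
# Rename SEARCH_SCOPE to STATUS_SEARCH_SCOPE in FILE and add
# ACCOUNT_SEARCH_SCOPE (second argument, default "discoverable")
# if it is not set yet.
migrate_search_env() {
  local file="$1" account_scope="${2:-discoverable}"

  # Rename the old key; -i.bak leaves a backup copy at FILE.bak.
  sed -i.bak 's/^SEARCH_SCOPE=/STATUS_SEARCH_SCOPE=/' "$file"

  # Add the new account setting only if it is not already present.
  if ! grep -q '^ACCOUNT_SEARCH_SCOPE=' "$file"; then
    # Make sure the file ends with a newline before appending.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      echo >> "$file"
    fi
    printf 'ACCOUNT_SEARCH_SCOPE=%s\n' "$account_scope" >> "$file"
  fi
}

# Usage, from the mastodon directory:
#   migrate_search_env .env.production discoverable
```

Running it a second time is harmless: the old key is already renamed and the account setting is not added twice.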
Summary
If you want to find Mastodon posts using full-text search, Google may be a viable option. But if you are running your own instance, you can also install a patch to enable proper full-text search on it, which made me enjoy Mastodon even more.
As I'm running a single-user instance, the risk of harm from enabling extended full-text search on my instance is small: it's only me using it, so I only need to keep an eye on myself to prevent abuse.