
Fixing Mastodon avatars after migrating to S3


Long story short: I run a Mastodon server for my friends. Since it loves to hoard media like a digital raccoon, I quickly realized that a massive part of my storage requirements was just cache.

So I decided to migrate to S3-compatible storage. I followed the official guide to set it up, and pretty soon found out that I actually had to sync the existing files to the cloud myself.

I did not expect how much data there was.

First of all, I've done some clean-up:

bin/tootctl media remove --days=30
bin/tootctl preview_cards remove --days=30

This removed any media older than 30 days, which was nice, but I was still left with 200+ GB of avatars. I decided to just remove those, because they would be redownloaded, right?

Nope.

After moving whatever was left in the system folder

aws s3 sync public/system/ s3://bucket-name/ --endpoint-url=https://provider.endpoint/ --acl public-read

and updating .env.production

S3_ENABLED=true
S3_BUCKET=bucket-name
AWS_ACCESS_KEY_ID=XXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXX
S3_PROTOCOL=https
S3_HOSTNAME=cdn.mastodon.host
S3_ENDPOINT=https://provider.endpoint/

I opened the Mastodon UI and saw none of the avatars.

Lovely.

I then went to the Rails console and found out that I have... 75k remote accounts.

Account.left_outer_joins(:user).where(users: { id: nil }).count

This will take a while... will it?

After some google-fu I found that you can call reset_avatar! and reset_header! on an account to redownload those images - but doing that for 75k accounts would take ages.

Here's the thing - I only really care about the most recent accounts and the ones that my friends follow - so let's do just that.

Quick script to reset avatars and headers on followed accounts:

remote_accounts = Account.left_outer_joins(:user)
    .where(users: { id: nil }) # remote accounts only
    .where(id: Follow.where(account_id: Account.joins(:user).select(:id)).select(:target_account_id))

remote_accounts.find_each do |account|
  begin
    account.reset_avatar!
  rescue => e
    puts "  Avatar reset failed: #{e.class} - #{e.message}"
  end

  begin
    account.reset_header!
  rescue => e
    puts "  Header reset failed: #{e.class} - #{e.message}"
  end

  account.save
  sleep 0.1
end

The sleep 0.1 limits the loop to at most 10 accounts per second, so at most 20 image downloads (avatar plus header) per second - I don't want to bring someone else's instance down because of my own mistakes.

Anyway, this was much better - it cut the number of accounts to process by something like 100x.
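To put rough numbers on that: with a 0.1 s pause per account, the pauses alone for 75,000 accounts add up to 75,000 × 0.1 s = 7,500 s, which is about 125 minutes before a single download is counted. The sketch below only computes that lower bound. The method name minimum_runtime_seconds is made up for illustration, and the 750 figure is just "about a hundredth of 75k", not a measured count.

```ruby
# Hypothetical helper, not part of Mastodon.
# Lower bound on run time: counts only the pauses between accounts,
# not the time spent downloading avatars and headers.
def minimum_runtime_seconds(account_count, delay: 0.1)
  account_count * delay
end

all_remote    = minimum_runtime_seconds(75_000) # every remote account
followed_only = minimum_runtime_seconds(750)    # illustrative: ~1/100 of them

puts format('all remote accounts: about %d minutes of pauses', all_remote / 60)
puts format('followed accounts:   about %d seconds of pauses', followed_only)
```

The real run takes longer than this, since every download adds its own latency on top of the pause.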

Then I ran another one to update the latest 100 accounts - the most recent on the timeline, so those are the ones we really care about.

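# find_each always batches by primary key and ignores .order,
# so a plain each is used here to keep the newest-first ordering.
Account.left_outer_joins(:user)
       .where(users: { id: nil })
       .order(created_at: :desc)
       .limit(100)
       .each do |account|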
  begin
    account.reset_avatar!
  rescue => e
    puts "  Avatar reset failed: #{e.class} - #{e.message}"
  end

  begin
    account.reset_header!
  rescue => e
    puts "  Header reset failed: #{e.class} - #{e.message}"
  end

  account.save
  sleep 0.1
end

Good! It seems to be just fine!
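Both loops above share the same body, so it can be pulled into one small helper. This is only a sketch: reset_images and its delay: parameter are hypothetical names, not part of Mastodon, and the helper assumes every account responds to reset_avatar!, reset_header! and save exactly as in the scripts above.

```ruby
# Hypothetical helper (not part of Mastodon): resets avatar and header for
# each account and pauses between accounts to stay polite to remote servers.
# Returns the failures as [account, step, error] triples instead of printing.
def reset_images(accounts, delay: 0.1)
  failures = []

  accounts.each do |account|
    %i[reset_avatar! reset_header!].each do |step|
      begin
        account.public_send(step)
      rescue StandardError => e
        # One broken remote image should not abort the whole run.
        failures << [account, step, e]
      end
    end

    account.save
    sleep delay
  end

  failures
end
```

In the Rails console this could be called as reset_images(remote_accounts.find_each) for the followed accounts, or with the ordered and limited relation directly for the latest 100. Afterwards, failures.size shows how many resets did not go through.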

Then we can clean up some individual accounts:

a = Account.find_remote('nickname', 'mastodon.social')
a.reset_avatar!
a.reset_header!
a.save
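When the accounts to fix come as full handles copied from the UI, splitting them by hand gets tedious. Here is a minimal sketch of a splitter; split_handle is a made-up name, and it assumes handles look like @nickname@mastodon.social, with or without the leading @.

```ruby
# Hypothetical helper (not part of Mastodon): turns "@nickname@mastodon.social"
# or "nickname@mastodon.social" into ["nickname", "mastodon.social"].
def split_handle(handle)
  username, domain = handle.delete_prefix('@').split('@', 2)

  if username.to_s.empty? || domain.to_s.empty?
    raise ArgumentError, "expected user@domain, got #{handle.inspect}"
  end

  [username, domain]
end
```

In the Rails console the result can be splatted into the lookup: a = Account.find_remote(*split_handle('@nickname@mastodon.social')). It is worth checking a for nil before calling the reset methods, in case the server has never seen that account.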

And done!