Fixing Mastodon avatars after migrating to S3
Well, long story short - I run a Mastodon server for my friends. Since it loves to hoard media like a digital raccoon, I quickly realized that a massive part of my storage requirements is just cache.
So I decided to migrate to S3-compatible storage. I followed the official guide to set it up, and pretty soon found out that I actually have to sync the existing files to the cloud myself.
I did not expect how much data there would be.
First of all, I've done some clean-up:
bin/tootctl media remove --days=30
bin/tootctl preview_cards remove --days=30
This removed any media older than 30 days, which was nice, but I was still left with 200+ GB of avatars. I decided to just delete those, because they would be redownloaded, right?
Nope.
After moving whatever was left in the system folder
aws s3 sync public/system/ s3://bucket-name/ --endpoint-url=https://provider.endpoint/ --acl public-read
and updating .env.production
S3_ENABLED=true
S3_BUCKET=bucket-name
AWS_ACCESS_KEY_ID=XXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXX
S3_PROTOCOL=https
S3_HOSTNAME=cdn.mastodon.host
S3_ENDPOINT=https://provider.endpoint/
I opened the Mastodon UI to see none of the avatars.
Lovely.
I then went to the Rails console and found out that I have... 75k remote accounts.
Account.left_outer_joins(:user).where(users: { id: nil }).count
This will take a while... will it?
After some google-fu I found that you can use reset_avatar! and reset_header! on the account, which will redownload those - but doing that for 75k accounts will take ages.
Here's the thing - I only really care about the most recent accounts and the ones that my friends follow - so let's do just that.
Quick script to reset avatars and headers on followed accounts:
remote_accounts = Account.left_outer_joins(:user)
                         .where(users: { id: nil }) # remote accounts only
                         .where(id: Follow.where(account_id: Account.joins(:user).select(:id)).select(:target_account_id))
remote_accounts.find_each do |account|
  begin
    account.reset_avatar!
  rescue => e
    puts " Avatar reset failed: #{e.class} - #{e.message}"
  end
  begin
    account.reset_header!
  rescue => e
    puts " Header reset failed: #{e.class} - #{e.message}"
  end
  account.save
  sleep 0.1
end
The sleep 0.1 is there so that the loop handles at most 10 accounts per second (each account means up to two downloads, avatar and header) - I don't want to bring someone else's instance down because of my own mistakes.
Anyway, this was much better - it cut the number of accounts to process by roughly 100x.
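For a rough sense of scale, here is a back-of-the-envelope sketch of the minimum run time that the sleep alone imposes. The helper name min_runtime_seconds and the figure of 750 accounts (75k divided by roughly 100) are illustrative assumptions, and real runs take longer because every download adds time on top.

```ruby
# Lower bound on run time: counts only the pause between accounts,
# not the time spent downloading avatars and headers.
# min_runtime_seconds is a hypothetical helper, not part of Mastodon.
def min_runtime_seconds(accounts, per_second: 10)
  accounts / per_second # integer division, whole seconds
end

puts min_runtime_seconds(75_000) # all remote accounts => 7500
puts min_runtime_seconds(750)    # roughly 100x fewer  => 75
```

So the full 75k would spend at least 7,500 seconds (125 minutes) just sleeping, while the followed-accounts subset needs only about 75 seconds of sleep.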
Then I ran another one to update the latest 100 accounts - the most recent on the timeline, so those are the ones we really care about.
# Plain each instead of find_each: find_each batches by primary key and
# ignores the custom order, so it would not return the newest accounts.
Account.left_outer_joins(:user)
       .where(users: { id: nil })
       .order(created_at: :desc)
       .limit(100)
       .each do |account|
  begin
    account.reset_avatar!
  rescue => e
    puts " Avatar reset failed: #{e.class} - #{e.message}"
  end
  begin
    account.reset_header!
  rescue => e
    puts " Header reset failed: #{e.class} - #{e.message}"
  end
  account.save
  sleep 0.1
end
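Both scripts repeat the same reset, rescue, save and sleep steps, so they could be folded into one helper. What follows is only a sketch: refresh_media, its pause parameter and the FakeAccount stand-in are hypothetical names made up for illustration, so that the snippet runs outside the Rails console. On a real instance you would pass in actual Account records instead.

```ruby
# Hypothetical helper: reset avatar and header for each account, log and
# collect failures, save, then pause. `pause` is injectable so the delay
# can be stubbed out when trying this without real network traffic.
def refresh_media(accounts, delay: 0.1, pause: ->(seconds) { sleep(seconds) })
  failures = []
  accounts.each do |account|
    %i[reset_avatar! reset_header!].each do |reset|
      begin
        account.public_send(reset)
      rescue StandardError => e
        failures << [account, reset, e]
        puts " #{reset} failed: #{e.class} - #{e.message}"
      end
    end
    account.save # saved even after a failure, same as the scripts above
    pause.call(delay)
  end
  failures
end

# Stand-in for a Mastodon Account (assumption: only these three methods matter).
FakeAccount = Struct.new(:name, :broken, :saved) do
  def reset_avatar!
    raise IOError, "download failed" if broken
  end

  def reset_header!; end

  def save
    self.saved = true
  end
end

accounts = [FakeAccount.new("alice", false), FakeAccount.new("bob", true)]
failures = refresh_media(accounts, pause: ->(_seconds) {}) # no sleeping in the demo
puts failures.length
```

In the Rails console, something like refresh_media(remote_accounts.find_each) should work for the followed-accounts query, since find_each without a block returns an enumerator. The returned failures list makes it easy to retry only the accounts that errored.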
Good! It seems to be just fine!
Then we can clean up some individual accounts:
a = Account.find_remote('nickname', 'mastodon.social')
a.reset_avatar!
a.reset_header!
a.save
And done!