Why is the Database Size Smaller in Backups?

Most databases rely on indexes. Indexes make the database more responsive when running queries against large data sets. The Mastodon database needs many indexes so that Mastodon can retrieve data quickly. Each index takes up disk space of its own, so the more indexes there are, the more disk space is needed, even if the data in the database is the same.

When generating a backup, the indexes themselves are not included. The backup contains only the data in the database, plus the definitions needed to rebuild the indexes. When restoring the database from the backup, the data is restored first and the indexes are then generated. As a result, the restored database takes up more disk space than the backup archive.

Also, the database backup is a custom-format archive generated using pg_dump. This format is not only one of the most flexible output formats but is also compressed by default, which makes the archive smaller still. When the database is restored, the disk space needed will be similar to the total database disk space usage you see on the Masto.host interface. You can read more about pg_dump in the PostgreSQL documentation.
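
For illustration, here is a minimal sketch of what creating and restoring a custom-format archive could look like. It only builds and prints the command lines and does not run them. The database name mastodon_production and the file name backup.dump are assumptions made for this example, and the exact options Masto.host uses to generate backups are not stated here.

```python
import shlex
from typing import List


def build_dump_command(dbname: str, archive_path: str) -> List[str]:
    """Build a pg_dump call that writes a custom-format archive.

    The custom format is compressed by default and stores index
    definitions, not the index contents.
    """
    return ["pg_dump", "--format=custom", f"--file={archive_path}", dbname]


def build_restore_command(dbname: str, archive_path: str) -> List[str]:
    """Build a pg_restore call that loads the archive into an existing database.

    pg_restore loads the data and then rebuilds the indexes, which is why
    the restored database ends up larger than the archive.
    """
    return ["pg_restore", f"--dbname={dbname}", archive_path]


if __name__ == "__main__":
    # Hypothetical names, used only for this example.
    db = "mastodon_production"
    archive = "backup.dump"
    print(shlex.join(build_dump_command(db, archive)))
    print(shlex.join(build_restore_command(db, archive)))
```

Running the printed commands against a real database requires the PostgreSQL client tools and an existing, empty target database for the restore.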