Some social services let users export their data as an archive file. Instagram’s export, for example, provides an archive with JSON and media files that describe all the posts and stories the user has published on the service. It would be nice if this data could be imported into Mastodon. Such an import would help users migrate away from non-federated alternatives such as Facebook and Instagram to Mastodon, or at least help them consider Mastodon as a viable alternative.
I’m creating this topic as a way to gather (technical) feedback on this idea. I’d be working on this myself and would like to know what problems I can anticipate.
Some potential pitfalls I have thought about:
- Exporting the data archive would be a manual task for the user. Typically the service emails you a download link, and you then download the archive to your local storage. This would take a lot of effort to automate.
- As the exports contain media files (images, video), the resulting archives tend to be rather large. Think 200 MB to 1-2 GB.
- Exports contain privacy-sensitive data, so privacy-conscious users are unlikely to upload them to some shady third-party service that offers the import.
- To me, this means the import application should be run by a trusted party that has the resources to process potentially big archives: either the Mastodon instance itself or a computing device of your choosing (desktop, headless server, laptop, mobile device(?)).
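To get a feel for the size and privacy points above, an import tool could inspect the archive locally before anything leaves the machine. Below is a minimal sketch using only Python's standard library. It assumes the export is a zip file, and the list of media file extensions is my own guess, not something taken from Instagram's documentation. The function names are hypothetical.

```python
import json
import zipfile
from pathlib import PurePosixPath

# Assumption: which extensions count as media. Adjust to what the export really contains.
MEDIA_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".mp4"}


def summarize_export(archive_path):
    """Return (json_names, media_names, total_media_bytes) without extracting anything."""
    json_names, media_names, media_bytes = [], [], 0
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            suffix = PurePosixPath(info.filename).suffix.lower()
            if suffix == ".json":
                json_names.append(info.filename)
            elif suffix in MEDIA_SUFFIXES:
                media_names.append(info.filename)
                media_bytes += info.file_size  # uncompressed size in bytes
    return json_names, media_names, media_bytes


def load_json_member(archive_path, member_name):
    """Parse a single JSON file straight from the archive."""
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member_name) as handle:
            return json.load(handle)
```

Because members are read one at a time, a 1-2 GB archive never has to be unpacked in full, which matters if the tool runs on a laptop or a small server.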
When posting the exported contents to mastodon, I see some issues as well:
- If we want to keep the original dates of the imported statuses, we need to be able to post statuses with a date in the past. As far as I can tell from the documentation, the Mastodon REST API does not allow this. I don’t yet know a way around it without altering the REST API.
- Rate limiting: most Mastodon instances don’t allow you to submit hundreds of REST API requests per second.
- Some posts might violate the instance’s rules (there would then need to be moderation for posts in the past?)
- Some posts might violate the instance’s restrictions, e.g. more characters than allowed or more media attachments than allowed.
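The rate-limit and size-restriction points above could be handled on the client side before any request is sent. Here is a rough sketch under a few assumptions: 500 characters and 4 media attachments per status are Mastodon's defaults as far as I know, but an instance can configure different values, and `len()` only approximates how the server counts characters (links, for instance, are counted differently). All function names are hypothetical.

```python
MAX_CHARS = 500  # assumed default; check the target instance
MAX_MEDIA = 4    # assumed default; check the target instance


def split_text(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters, breaking on whitespace."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit has to be cut hard.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def group_media(media_paths, limit=MAX_MEDIA):
    """Group media files so that each status carries at most `limit` attachments."""
    return [media_paths[i:i + limit] for i in range(0, len(media_paths), limit)]


def min_delay_seconds(max_requests, window_seconds):
    """Fixed pause between requests that keeps the client within the rate limit."""
    return window_seconds / max_requests
```

For example, with a limit of 300 requests per 5 minutes (which I believe is the documented default, though instances may differ), `min_delay_seconds(300, 300)` gives a pause of one second between requests. Note that splitting on whitespace collapses line breaks in the original caption, so a real importer would probably want something smarter.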
Are there any other issues that I should expect? I want to have a working solution for myself (to import all the contents of a travel Instagram account to my Mastodon instance), but it would be nice if the result ends up being usable by others as well.