Migratory Accounts


#1

This is based on a conversation I had with
@jerry@infosec.exchange
regarding how to reduce/mitigate the impact on a Mastodon user whose instance plans to cease operations.

I propose a change to the Mastodon API such that a Mastodon user has the (limited) ability to transfer their account and associated data to another Mastodon server. This would allow users to find a new home before a known instance shutdown as well as provide a way for users to just “move to a new server”.

A quick example of what the process could look like for a user named Alice to transfer their account from mastohost.one to mastohost.two:

  1. Alice logs into their @alice@mastohost.one account via the Mastodon web interface on mastohost.one.

  2. Alice clicks the link to edit their profile.

  3. Alice clicks a new button marked “Unlock Account for Migration”.

  4. Alice is shown a modal explaining that once their account is unlocked for migration it is possible for their account to be transferred to another server. Alice is also informed that migration is permanent.

  5. Alice clicks a button to acknowledge the warning and unlock their account for migration.

  6. Alice clicks a new button marked “Migrate Account”.

  7. Alice is forced to confirm they want to migrate yet again because this should not be undertaken lightly.

  8. Alice is shown a dialog similar to the remote follow dialog asking for the address of the Mastodon server they want to migrate their account to.

  9. Alice types mastohost.two and clicks Migrate.

  10. Alice is taken through the signup process for mastohost.two without requiring a username or profile info.

  11. Before completing signup, mastohost.two has Alice confirm yet again that they want to transfer their account from mastohost.one to mastohost.two.

  12. Alice is shown a progress display as data is transferred between mastohost.one and mastohost.two.

  13. Alice is logged in to mastohost.two. All of Alice’s data remains intact at new mastohost.two URLs.

Behind the scenes, mastohost.one exports all of the pertinent information directly to mastohost.two. mastohost.two then assigns new locally-unique IDs to the individual items and inserts them. At that point, mastohost.two returns to mastohost.one a map from each original URL to its corresponding new URL. This allows mastohost.one to answer every old URL, for both the user’s profile and all their posts, with a 301 redirect pointing to the new one. Follow graphs could be maintained by implementing routines that treat a 301-redirected profile as a migration.
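To make the mechanics concrete, here’s a minimal sketch (in Python, with invented paths and IDs) of what mastohost.one’s behavior could look like once it holds that URL map: every old profile and post URL answers with a 301 pointing at its new home.

```python
# Minimal sketch only; the map format, paths, and IDs are illustrative.
from http.server import BaseHTTPRequestHandler, HTTPServer

URL_MAP = {
    "/@alice": "https://mastohost.two/@alice",
    "/@alice/100234": "https://mastohost.two/@alice/200567",
}

class MigrationRedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        new_url = URL_MAP.get(self.path)
        if new_url:
            # 301 (Moved Permanently) lets remote servers treat the
            # redirected profile as a migration, preserving follow graphs.
            self.send_response(301)
            self.send_header("Location", new_url)
        else:
            self.send_response(404)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), MigrationRedirectHandler).serve_forever()
```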

That’s a rough sketch of the concept. There’s plenty of detail to cover, obviously (how frequently migration is allowed, disabling migration to/from an instance, etc.), but I figured this was a start to a conversation about the possibilities.


#2

I really like this idea. Ultimately if we want users to have control over their data, they should be able to migrate it to any instance that will have them.

On the admin side, it’d be nice to have some settings for migration (a rough policy check is sketched after this list):

  • Allow anyone to migrate
  • Allow users from certain servers to migrate
  • Allow one specific remote account to migrate
  • Don’t allow anyone to migrate
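
For illustration, a policy check along those lines might look something like this (every name here is invented, not an actual Mastodon setting):

```python
from enum import Enum, auto

class MigrationPolicy(Enum):
    OPEN = auto()                 # allow anyone to migrate
    ALLOWED_SERVERS = auto()      # allow users from certain servers
    ALLOWED_ACCOUNTS = auto()     # allow one specific remote account
    CLOSED = auto()               # don't allow anyone to migrate

def migration_allowed(policy, allowed_servers, allowed_accounts, acct):
    """acct is a 'user@server' string, e.g. 'alice@mastohost.one'."""
    server = acct.rsplit("@", 1)[-1]
    if policy is MigrationPolicy.OPEN:
        return True
    if policy is MigrationPolicy.ALLOWED_SERVERS:
        return server in allowed_servers
    if policy is MigrationPolicy.ALLOWED_ACCOUNTS:
        return acct in allowed_accounts
    return False  # CLOSED

print(migration_allowed(MigrationPolicy.ALLOWED_SERVERS,
                        {"mastohost.one"}, set(), "alice@mastohost.one"))  # True
```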

#3

I was going to point out that a given instance may have a limit on the number of accounts it wants to support (and I see various numbers out there), so that would need to be factored in somehow.

But then @wolfteeth may have solved that issue by putting the burden on the admin (in settings) to say when migrations are open or not.

The proposed idea aside for a minute… Last time I checked, it was possible to download one’s follow/mute/block settings so as to reestablish them at a new instance. What was missing, I think, was a means to also download your own toot history.

So if we at least had the ability to download toot history (along with the other parameters) from a given instance, it would be relatively easy to set up shop on a new instance much as it existed before, assuming you could just as easily upload the info too.

To complete this slightly more manual but feasible migration, we’d also need the autonomous ability to delete the abandoned account. (I’m less in favor of just freezing an account, as that gives people worried about their personal brands an incentive to land-rush instances and lock up those usernames.)

But, setting that manual approach aside (which I’d be happy with at the very least), I do like the idea of a more integrated migration approach. /thumb up/


#4

Most definitely.

In addition, the “receiving” admin should be able to set a threshold for the maximum size of an account to accept for migration - a user with 100,000 posts but only 5 image attachments total has a very different resource use requirement than a user with 1,000 posts that each have a .webm attachment.
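
As a rough illustration of such a threshold, with invented limits:

```python
# Illustrative only: separate caps for post count and total media bytes,
# since they stress different resources. Limits are made up for the example.
MAX_POSTS = 250_000
MAX_MEDIA_BYTES = 2 * 1024**3  # 2 GiB

def accept_migration(post_count, media_bytes):
    return post_count <= MAX_POSTS and media_bytes <= MAX_MEDIA_BYTES

print(accept_migration(100_000, 5 * 2_000_000))    # True: many posts, little media
print(accept_migration(1_000, 1_000 * 8_000_000))  # False: 8 GB of .webm files
```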


#5

What about some kind of “character” check from the receiving admin’s perspective? In other words, if you care about your flock, you don’t want some nasty shit-flinger crawling in to start hating on everybody.

I don’t know how you could do this, except maybe by requiring a migration request (so not fully autonomous) that gives an admin a chance to preview the requester’s old history.

Or perhaps this suggests the development of a “reputation system” of some kind, too, but that would probably just be gamed.

I realize we like to have faith in our fellow bipeds as being good citizens, but as human history shows, that’s not always the case.


#6

I would imagine a “require approval for all migration requests” checkbox wouldn’t be too onerous to add for admins who would like an opportunity to review any migrating accounts before accepting their data. For the user trying to migrate, we could display a UI similar to the current “follow request pending” UI shown when a user follows a protected account.


#7

If someone else owns alice@mastohost.two, at what point is our Alice informed that they need to change their username?

For that matter, this could be used as a way to change usernames on the original instance:

  1. Alice goes through the process of migrating their account to mastohost.two.
  2. Alice creates a new, empty alice@mastohost.one account.
  3. Alice migrates back to mastohost.one, which informs Alice that there is an existing alice@mastohost.one and asks them to select a new username.
  4. Having chosen the username they want for their original account, Alice leaves alice@mastohost.one fallow or requests that it be deleted.

#8

That’s a good question - I would tend toward wanting the user to get that notification as early in the process as possible (so they can choose a different “new server” if they wish to retain their username), but that would require a mechanism to “reserve” the username for the duration of the migration process.
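
A toy sketch of what that reservation could look like, with an expiry so a failed or abandoned migration doesn’t squat the name (everything here is invented):

```python
import time

RESERVATION_TTL = 60 * 60  # one hour to complete the transfer

existing_usernames = {"bob", "carol"}   # stand-in for the accounts table
reservations = {}                       # username -> expiry timestamp

def reserve_username(username):
    now = time.time()
    if username in existing_usernames:
        return False                    # taken outright: warn Alice early
    expiry = reservations.get(username)
    if expiry is not None and expiry > now:
        return False                    # held by another in-flight migration
    reservations[username] = now + RESERVATION_TTL
    return True
```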

Renaming a user should be a lot easier than a double migration, I would think, but it could re-use a lot of the same functionality (redirects, follow-graph update notifications to followers/followings, etc.).


#9

Connecting to Forwarding · Issue #4 · swicg/general · GitHub, which is not Mastodon-specific.

I think as a user I might like the ability to have the “copy” operation separated from “move”. That is, I’d like to copy all my stuff from t0.example to t1.example, then look at it on t1.example for a while, before it’s no longer available at t0.example.

It seems like kind of a coin-flip whether to run the migration from the t0 UI or the t1 UI. Conceptually, are you going to your old place and asking for your stuff to be moved out, or going to your new place and asking for your stuff to be moved in?

Given that you’re probably moving because you’re somewhat unhappy with the old place, I think it would feel a little better to run the move from your new place. Under the hood, t1 would authenticate as a client at t0; that should be enough to do the copying. To turn on forwarding, either the client API could be extended, or the user could be sent back to the t0 site. This also means the old site needs very little specialized code to support migration.
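
As a rough sketch under those assumptions, t1 could pull the user’s history from t0 with nothing more than the client API t0 already exposes (the token placeholder stands in for t0’s OAuth flow; error handling and media download are omitted):

```python
import requests

# verify_credentials and accounts/:id/statuses are existing Mastodon
# client-API routes; everything else here is illustrative.
T0 = "https://t0.example"
TOKEN = "oauth-token-authorized-on-t0"  # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

me = requests.get(f"{T0}/api/v1/accounts/verify_credentials",
                  headers=HEADERS).json()

statuses, max_id = [], None
while True:
    params = {"limit": 40}
    if max_id:
        params["max_id"] = max_id       # page backwards through history
    page = requests.get(f"{T0}/api/v1/accounts/{me['id']}/statuses",
                        headers=HEADERS, params=params).json()
    if not page:
        break
    statuses.extend(page)
    max_id = page[-1]["id"]

print(f"fetched {len(statuses)} statuses ready to import into t1")
```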

Is copying or turning on forwarding really such a big deal that it needs this heavy confirmation? Especially if the data isn’t deleted: what would happen if you copied the data, turned on forwarding for an hour, then turned it off again? I guess it depends how confused all your followers would end up. It’d be nice if it could be a low-stakes operation until the moment you actually say “delete all my data”, which is optional.


#10

Hm, what happens if you copy your data from t0.example to t1.example, then copy your data from t1.example to t2.example, and then finally delete the copy at t1.example?

If there were only one way for an account to be migrated from t0.example to t1.example, it would be clear what the results of any particular chain of account migrations would be. It feels like adding more places to fork the process makes it more likely to end badly, but that’s just a feeling and I can certainly be talked out of it.


#11

Here’s a sketchy proposal. It has a few separable elements.

  1. Copying. Users can copy all their posts, with all the replies, likes, and boosts on each post, and their settings (including their subscriptions & blocklist), from t0 to t1. I lean towards this being done by t1 talking to t0’s API, but it could be done with t0 talking to t1’s API, or an app talking to the APIs of both t0 and t1.

  2. Forwarding. After doing a copy, users can instruct t0 to http-redirect all requests for their t0 URLs to corresponding t1 URLs. Whoever did the copying will need to provide a mapping database for use during this, since the IDs in the URLs will probably be different. I sketched out a fairly simple approach to this in Forwarding · Issue #4 · swicg/general · GitHub. It’s simplest if t1 does the copying and maintains the mapping, but if someone else does the copying they can tell t1 the mapping. I think it’s best if t0 not do the mapping long-term, because we want to minimize the complexity of t0 for afterlife mode. When forwarding is turned on, the user’s data on the site could still be accessed via the API, but not the public endpoints. Through those public endpoints, no one can tell whether the data has since been deleted.

  3. Being Forwarded. This is a deeper change, touching the database, API, and potentially every UI. (It should be done in a way that existing UIs don’t break, of course.) In all the server-to-server interactions, servers need to behave appropriately when they hit an HTTP redirect. I think that means they need to understand that a redirect may mean user@t0.example has moved, for now, to user@t1.example. I think it’s probably good to be explicit to users about this, like how MediaWiki pages tell you about redirects when a page is renamed. But if the UI doesn’t know this JSON field, then it’ll ignore it, fine. To handle multi-hop forwardings, each user identity needs to have a set of zero or more forwarded-from identities, I guess (see the sketch after this list).

  4. Permanent Change. After some period (30 days?) of being forwarded on every access, without ever once getting a proper response that isn’t a forward, servers should probably stop trying to access the old identity. I’m not sure how important this is; I think there are competing use cases. You don’t want to make the change permanent immediately, because user@t0 might have been briefly compromised. But if you never make it permanent, some operations will be slower, maybe much slower if you have to wait for a timeout.
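
Here’s the multi-hop sketch promised above: a toy resolver that follows 301s hop by hop and accumulates the forwarded-from set. A hop limit guards against loops (e.g. t0 -> t1 -> t0), and the Location header is assumed to be an absolute URL.

```python
import http.client
from urllib.parse import urlsplit

MAX_HOPS = 5

def resolve_identity(profile_url):
    """Return (final_url, forwarded_from) for a possibly-moved profile."""
    forwarded_from = []
    url = profile_url
    for _ in range(MAX_HOPS):
        parts = urlsplit(url)
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        location = resp.getheader("Location")
        conn.close()
        if resp.status == 301 and location:
            forwarded_from.append(url)  # record the old identity
            url = location
        else:
            return url, forwarded_from
    raise RuntimeError("too many redirects; possible forwarding loop")
```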

With that framing, is it clear enough what happens if a user moves from t0 to t1, then to t2, then maybe to t3? It would be good to specify and test this enough that it can even handle the user moving back to t0. (Basically, they just need to turn off the t0 forwarding before they forward to it, and all the other systems need to expect that as a possibility.)

In this sketch, I’ve been treating a user’s public endpoints as just simple web pages (like blog posts), but of course it’s more complicated. We’d want this to work with:

  • looking at someone’s profile
  • looking at their feed
  • looking at one of their posts
  • seeing the replies, boosts, likes on that post
  • seeing their replies, boosts, likes on other people’s posts
  • muting/blocking
  • … other things I’m forgetting

It’s all sounding a bit daunting at the moment. But I think the four parts can be done in order, each bringing its own value. There are probably steps 5, 6, etc., too, involving 3rd parties and crypto. :slight_smile:

(On blocking: I guess my server will have to dereference everyone I’m blocking every once in a while, to see if they’ve moved, so it can block their new identifier. Ohhhh, for that we want the new identity to list the old identities in a profile field. Otherwise A could damage B by getting blocked by a lot of people and then forwarding to B, so that B is blocked without having done anything.)
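
In code, that check might amount to something as small as this (the "formerly" profile field is an assumption, not an existing one):

```python
# Only carry a block over to a new identity if the new identity itself
# claims the old one; otherwise A could collect blocks and forward onto B.
def should_transfer_block(old_identity, new_identity, fetch_profile):
    profile = fetch_profile(new_identity)           # e.g. the actor document
    claimed_old = set(profile.get("formerly", []))  # assumed field name
    return old_identity in claimed_old
```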


#12

Hubzilla has provided a working nomadic mobility (“clone”) and backup model for the last 4-5 years. It certainly is not the only solution to the mobility problem; several projects have implemented, or partially implemented, export/import “move” migrations. Friendica and GNU social (actually StatusNet; I think the ability may have gotten lost) have both done this, and Diaspora is in the process of implementing it.

I think we tend to re-invent a lot of wheels in this space. While there are aspects of each solution that may not meet with general approval from this audience, studying prior art is always a good exercise. The StatusNet solution used AtomPub, for instance; Hubzilla’s implementation is based on a separation of identity and location, with live sync of changes to your data across all of your available clones. Changing links and existing data structures to preserve your experience across locations is something we’ve all had to deal with.


#13

I think process-wise we’re in pretty much the same place.

I’m not sure I see the use case for splitting your elements 1-3 into discrete, independent actions. While I can see reasons to want my content to be available from multiple endpoints, I can see just as many reasons why having N copies of ostensibly the same data, under the control of P administrators across the network, could be a disaster without many more data-integrity controls than current fediverse protocols specify.

In the short term, I’d like to see a simple solution that saves users some pain from inevitably disappearing Mastodon instances - we’ve already seen a non-trivial number of servers get set up, build userbases, and then close up shop. That’s always been a part of certain frontiers of our telecommunications world, but it would be great to be able to offer at least a majority of those affected by an imminent instance shutdown a way to get all the way into another boat before theirs sinks.

We should most definitely build a more robust solution into the underlying protocols so that the robust solution can work across various server platforms that implement said protocols, but I’m not sure that will see the light of day before a much larger number of Mastodon instances find themselves shutting down for various reasons. That’s my underlying motivation for bringing it up here, rather than a swicg or activitypub venue - a little battlefield triage and first aid before we see about preventing folks from getting shot in the first place. :slight_smile:


#14

I wonder if one way of proceeding would be for a user to be able to

  1. request a downloadable archive (.zip) of their content from server S1
  2. get an email from S1 saying their archive is ready for download
  3. download the archive from S1
  4. create a new account at S2
  5. import the archive to S2
  6. get an email from S2 saying the import is complete

Steps 1-3 are something that Twitter and Facebook already do to some degree; steps 4-6 are not. If S1 is turned off, the URLs pointing to S1 will no longer work, and I don’t think there’s a way around that without replacing DNS + URIs with something else.
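
For illustration, the import side (steps 5-6) could be as simple as the following sketch. The archive layout it assumes - an "outbox.json" with ActivityStreams-style orderedItems - is invented for the example, not a documented export format.

```python
import json
import zipfile

def import_archive(archive_path, create_post):
    """create_post is S2's routine that stores a post under a new local ID."""
    with zipfile.ZipFile(archive_path) as zf:
        outbox = json.loads(zf.read("outbox.json"))
    new_ids = []
    for item in outbox.get("orderedItems", []):
        obj = item["object"]
        new_ids.append(create_post(content=obj["content"],
                                   published=obj["published"]))
    return new_ids
```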


#15

I think this is a great idea. :slight_smile:

I do want to add – and I’m not sure if this diverges so much that I should ask it as a separate topic – but: what happens when one user has these two accounts, perhaps simply to cover different topics on one host than the other:

mastohost.one/@applebaum
mastohost.two/@applebaum

and mastohost.one decides it’s closing down but mastohost.two is not (or they decide they don’t need to separate those topics after all), and they want to consolidate the two accounts into one? The tool as described wouldn’t provide any way to do so.

I realize they could conceivably set up, say, mastohost.two/@applebaum-from-one, but it might be less messy to give users a way to effectively subsume the mastohost.one account into the mastohost.two account they already have. (Plus it seems very hard for users to notify all their followers and get them to re-follow – but if something like a server notice were sent to following users, that would have more visibility and reduce missed follows, perhaps.)

Of course this would require confirmation that they have that account, but I imagine that wouldn’t require a lot of extra work.

(edited to clarify, rephrase, etc., and to put a ‘user need’ item before a ‘user want’ item, to demonstrate that this functionality would solve a real problem as well as offer users a way to streamline their experience)


#17

A zip download would be very useful from a user’s point of view for backup; the ability to grab a copy of your data periodically has some appeal.
The migration itself might be better done another way, but if, as users, we’re unaware of an instance having issues and it suddenly goes down… well, an occasional personal zip backup would mitigate this a little.
Equally, it allows a user to leave an instance and have a period of time offline before returning.

(edit for spelling after using phone…)


#18

Why the email delay on the archive? Isn’t it simpler for the user to have the download just stream as slowly as the server needs it to? Unless the server needs to do a lot of work before it can even begin streaming the archive… but I wouldn’t think that would be the case. (Certainly tar can stream immediately; I haven’t played with zip much.)
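
For what it’s worth, here’s a small Python sketch showing tar streaming in practice: chunks are yielded as each entry is written, so the HTTP response could begin immediately. The (name, bytes) input is a stand-in for pulling posts and attachments from the database.

```python
import io
import tarfile

def stream_account_archive(files):
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w|")  # "w|" = sequential stream
    for name, data in files:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
        yield buf.getvalue()            # flush what's been written so far
        buf.seek(0)
        buf.truncate()
    tar.close()                         # writes the end-of-archive blocks
    yield buf.getvalue()

# e.g.: for chunk in stream_account_archive([("statuses/1.json", b"{}")]): ...
```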


#19

I once wrote a streaming, repeatable implementation of the ZIP directory; it can be made to work, especially if you do not compress the input files. (You need to know the metadata of the files to be included in advance, though.)


#20

It seems like the export function ought to be under the control of the individual user, but the import should allow for an admin to be the gatekeeper. If your instance blocks another instance, you wouldn’t want someone from that instance to import their entire history of shitposts into yours.


#21

That absolutely makes sense.