Account media backup - A little tool and an ethical dilemma


So, since I might go back into teaching high school students, I thought I might as well refresh my Python and scratch an itch.
I have posted a bunch of content onto Mastodon, but am really bad at remembering what I have posted and where, so taking a usable backup was on my mind.

Also, although I am happy with my admin/instance, the fact that all of that could just disappear if the server goes offline for reasons out of my control has been playing on my mind. For this reason I think that migratory accounts, or at least the ability to back up your own content, are rather important.
I selected an instance with very little research, and I’m sure I’m not alone in this. The fact that it has turned out fine is luck, rather than design.

Anyway, I have a little Python script that grabs an offline copy of all the media I have posted, storing it in dated directories and skipping previously retrieved stuff.
It occurs to me that it might be useful to others as there is no real user backup in Mastodon as yet.
(I’m working on toot content as well as just media, along with database storage using sqlite - that’s the next itch)
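
For anyone wondering what "dated directories and skipping previously retrieved stuff" might look like, here is a minimal sketch of that idea. It is not the actual script. The function names (`target_path`, `fetch_bytes`, `save_if_new`) and the one-folder-per-day layout are assumptions made up for illustration.

```python
from datetime import date
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def target_path(root, media_url, posted):
    """Return root/YYYY-MM-DD/<file name taken from the URL path>."""
    name = Path(urlparse(media_url).path).name
    return Path(root) / posted.isoformat() / name


def fetch_bytes(media_url):
    """Download one file over HTTP(S) and return its raw bytes."""
    with urlopen(media_url) as response:
        return response.read()


def save_if_new(root, media_url, posted, fetch=fetch_bytes):
    """Save the file unless it is already on disk; return True if downloaded."""
    path = target_path(root, media_url, posted)
    if path.exists():
        return False  # retrieved on an earlier run, so skip it
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(fetch(media_url))
    return True
```

Because the check is just "does the file already exist", running it again only downloads media posted since the last run. Passing the downloader in as `fetch` also makes it easy to try out without touching the network.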

HOWEVER, Mastodon is sort of more private and less searchable.

I am only using my profile URL, but it would work with anybody’s, so could be misused.

There is a possibility of somebody grabbing all of the images/media of a third party, just by using their profile URL. (Toots etc… if I later implement that and release it)

Now I know, people can browse the public stuff easily enough anyway, but the level of automation makes it feel a little different.

Legitimate concern, or not something to worry about?

Any thoughts or ideas welcome. Currently it’s my code on my system and not released.
I’d happily set it free for others to use/improve*, but do not want to let something that could be misused in some way out there.

I did Toot about this, but obviously have a limited following and scope of interaction on which to base any decision.

*(I am probably doing this a weird way for any programmers out there, so there is the fact that letting my oddball coding/methods loose would be a source of embarrassment when somebody replaces it with a few lines of tight code :slight_smile: )


I wouldn’t worry about it myself. Users cannot expect that publicly available content would remain unseen.

That you’re collecting it in bulk changes little.


OK, so the little script is here if anyone wants it.
As I tooted a minute ago:

Warning, I am not a programmer by trade, just playing. I am a generalist, not a specialist.
Trawls original posts of your profile and downloads any media you uploaded in your toots. This means only toots you have initiated. Not boosts or indeed at the moment replies…
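
As a rough illustration of the "only toots you have initiated" rule, the filter could look something like the sketch below. It assumes the statuses are available as dictionaries shaped like the Mastodon API's status JSON, with `reblog`, `in_reply_to_id` and `media_attachments` fields. The script itself starts from a profile URL and may well do this differently, and the function name `own_media_urls` is hypothetical.

```python
def own_media_urls(statuses):
    """Collect media URLs from original posts, skipping boosts and replies."""
    urls = []
    for status in statuses:
        if status.get("reblog") is not None:
            continue  # a boost of someone else's post
        if status.get("in_reply_to_id") is not None:
            continue  # a reply, which is not backed up at the moment
        for attachment in status.get("media_attachments", []):
            urls.append(attachment["url"])
    return urls
```

Dropping the `in_reply_to_id` check would be the obvious place to start if replies get included later.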