How Can I Keep (My Data) Healthy?

Google offers something I want. Yeah. I hate to say that, but it’s true. Google wants me to pay for it. Given my aversion to all things financial, I don’t particularly want to. Just right now, I’d rather sysadmin my own solution.

Here’s what they offer: automatic backups of my photos from my cell phone. Here’s what I have: multiple Linux desktops and servers, with terabytes of unused storage. Can I have that? I want to just kind of apt-get install it, turn it on, and forget about it.

On a related note: Amazon AWS offers something I want: transparent cloud storage. Here’s what I have: an old ailing desktop that has *all* of my files, and the laptop I’m using right now, which kind of has none of them. Can I just, like, you know, just sort of put them on the cloud, and just sort of use them? Without, umm, paying Google/Amazon to store the many terabytes of accumulated cruft that I have?

Which reminds me… I have a website. Two, actually (this, and gnucash.org). The Internet Archive Wayback Machine has a decade or two of backup copies (thank you!) but here’s what they don’t back up: graphics. Large PNGs, GIFs, photos. I’ve got photos of my great-grandparents. It would be kind of nice if these photos could sort-of-ish get vaguely replicated in a manner where they are not lost. Perhaps I desire this because I’ve inherited some sort of familial DNA that promotes archival urges. I want the family photo album to not be lost. Hmm. Maybe I should give it to the Church of the LDS. They’re kind of into saving stuff like this. Some day, when Elon Musk gets to Mars, I expect the Church of the LDS to be right behind him, with a dozen rocketfuls of USB sticks containing archived genealogy data.

What shall I do? For the first part: I’ll watch some youtube on how to download photos off my phone. Check. Later on, I will have to manually delete some of those photos from google storage, to free up some space so that they stop nagging me. This is remarkably difficult to do: they would rather nag me to pay them, than allow me to easily manage storage. Capitalism hard at work, doing the opposite of what I actually want. Funny how that is.

For the second part, it’s grim. Here are the choices:

XtreemFS — seems like it would do almost exactly what I want. Hasn’t been updated since 2014. Dead project.

Coda — runner up. Hasn’t been updated since 2020. What the heck. Why? At any rate, I cannot apt-get install any coda tools. Why? Because they’re not in the Debian repos.
Ceph — Cool! Actively maintained! Alive & well! Doesn’t quite do what I want. It’s aimed at supercomputer big data people. I’d have to create some big pool of raw disk space, spread over multiple computers, and uhh, dump all of my home directories into it? I dunno. That’s not really an improvement over my current state of affairs.

What is it that I want again? Well, mostly to use my laptop and desktop more or less the same way that I always do. Just to, kind of, you know, make one or the other visible to both. And, uh, yeah, OK, for this, I guess either NFS or Samba/CIFS will do. Or even sshfs. But none of these are actually fault tolerant. So, like, if the wifi is bonkers or there’s some power outage, I want to mostly kind of continue working, and kind of automatically sync everything back up when the power/network is back. Apparently, this is … well … not really possible for ordinary people (even like me).
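To make the "keep working, sync back up later" wish a bit more concrete, here is a minimal sketch of the crude approximation: poll until the other machine is reachable, then push changes with rsync over ssh. It is not the transparent, fault-tolerant filesystem described above; it is a one-way mirror with no conflict handling. The names `host_reachable`, `rsync_command` and `sync_when_possible`, the five-minute polling interval, and the choice of port 22 as the reachability check are all assumptions made for illustration.

```python
import socket
import subprocess
import time


def host_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def rsync_command(src: str, host: str, dest: str) -> list[str]:
    """Build a one-way rsync-over-ssh command (hypothetical helper)."""
    # -a preserves permissions and timestamps; --partial keeps half-sent
    # files so an interrupted transfer can resume instead of starting over.
    return ["rsync", "-a", "--partial", src, f"{host}:{dest}"]


def sync_when_possible(src, host, dest, interval=300.0, max_attempts=None,
                       reachable=host_reachable, run=subprocess.run,
                       sleep=time.sleep):
    """Retry until the host is reachable and rsync exits cleanly.

    Returns True on a successful sync, False if max_attempts is exhausted.
    With max_attempts=None it keeps retrying indefinitely.
    """
    attempt = 0
    while True:
        attempt += 1
        if reachable(host) and run(rsync_command(src, host, dest)).returncode == 0:
            return True
        if max_attempts is not None and attempt >= max_attempts:
            return False
        # Network down, host off, or rsync failed: wait, then try again.
        sleep(interval)


# Example call (assumed paths and hostname):
# sync_when_possible("/home/me/", "desktop.local", "/srv/mirror/me/")
```

Run from cron or a systemd timer on the laptop, something like this gets the "sync when the network is back" half. The "both machines see the same files while offline, and edits on either side get merged" half is exactly the part that none of NFS, Samba or sshfs provide, and this sketch does not provide it either.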

I kind of want my own self-hosted cloud, local to my house, except for when I travel to other places, where I want to get at it anyway. And I want it to be fault-tolerant, replicated. I don’t mind buying an extra disk or two. I do mind complicated sysadmin, and sleepless nights worrying about data loss. I don’t want to “backup my data”. I want it to back itself up.

What else we got? Let’s see. There’s:

Storj — Very actively developed! Hoorrah! 99.8% compatible with AWS … that’s got to be a good thing. Does it do what I say I want, above? Not quite, not really. The market segment does not include Joe Blow who sits on a sofa all day long and putters around with computers.
GnuNet — The Wikipedia article is a virtuoso list of technologies, with nary a breath of practical application. Installing gnunet-fs-gtk reveals that it is just like SleepyCat, or Gnutella, or even BitTorrent of two, almost three decades ago. But with an infinitely crappier GUI. So this is like a teenager’s wet dream of secretly privately sharing … whatever, but that teenager has no clue about GUIs. This is, uhh, not what I want, either.
IPFS — another knock-your-socks-off technology, that doesn’t actually do what I want.

Let’s take a closer look at this last one. Can I use it to create my local fault-tolerant network-area-storage? No, ’cause it’s interplanetary. I don’t really need my local desktop replicated everywhere, and … if it was, I’d rather make it private, so others can’t see it.

Can I use it to provide a fault-tolerant mirror of my websites? Well, not really. Or not yet. Or something. So, like a decade ago, or two, I could go to any one of hundreds of hosting providers — you know the ones, those with the really annoying Super Bowl ads — and create a website, and be pretty sure that it just works and is up 99.999% of the time. As long as I pay them every month. And if I don’t pay? Sorry kiddo.

So I kind of want that, but, I dunno, as some kind of voluntary, communal, resource-pooled way. Like I said, I’ve got multiple servers and lots of hard disks to throw at the problem. I’m just fine with hosting other people’s content, if they host mine. Can IPFS do this? Well … uhhh… we got a little problem with URL’s …

Does filecoin solve any problem, here? I dunno. Maybe if I spent 50 years accumulating filecoin, with the hope that, as a result, I could spend that filecoin for the next 50 or 100 or 200 years, long after I’m dead, to make sure that my content lives on. Can I do that? No.

Who can? Well, GitHub has this thing called “Arctic Code Vault” and some of my creations are there. They’ll outlive me for at least 100 years. And what about all the stuff that the Internet Archive/Wayback Machine has captured? I figure 100 years, and maybe much much longer. Does it include my family photos? No. Would IPFS plus Filecoin solve this problem? No.

At the meta level, this blog post is about the stability and availability of my data, to me, and to those I share it with. Abstractly, this is kind of like the stability and availability of my personal memories. Unless I have severe head trauma, I’m pretty sure I’ll wake up tomorrow morning with my memories intact. Well, maybe no memory of what I ate for breakfast last month, but who cares. Well, OK, with old age, things like Alzheimer’s, dementia, Creutzfeldt-Jakob Bovine Wasting Disease, prions, and general downward decay of health become scary bedfellows.

But look, all I’m asking for is a little bit of certainty in an uncertain world. A little bit of reliability and general health. For me, for my data. Google offers me that, in exchange for money. I’d like to see GNU, or the open source world more broadly, offer a viable alternative. I’ve rattled off some technologies. They almost get there. Just not quite. The youthful enthusiasm to create and maintain this is not focused here.

Why am I writing this? By writing, I was hoping to clarify things in my own mind. Decades of experience indicate that kind of no one actually reads what I write. Perhaps what I write is too disconnected from the daily affairs of others. Perhaps it’s too abstract. Perhaps it’s just plain boring and unoriginal. I never wanted to be a youtube influencer (I guess I just wanted it to happen by accident? Naturally, without any effort of my own, beyond the actual effort of, you know, writing shit down?) So I’ve no right to complain. Perhaps some day some large language model will ingest this blog entry and incorporate it into its Weltanschauung. I will have to content myself with that. If you are a human reader, I apologize. Better luck next time!

Oh, I almost forgot: there’s more.

This blog. It’s visible to you as HTML pages. But behind the scenes, it’s a bunch of data in a MariaDB database. Can I please please fault-tolerant backup auto-replicate that database? Without spending three days doing basic compsci research first?
The OpenCog Wiki. I’ve spent years editing content there, but I don’t have a reliable backup, and whenever Ben forgets to pay the bill, it goes off-line. Worst-case scenario is that the Wayback Machine has a copy, but it has a copy of *the html*. I want a (fault-tolerant, distributed) copy of the raw MediaWiki data!
My memories of my uncle, my youth on his farm in Wausaukee, Wisconsin and … well, if I don’t write my own autobiography, those memories will be lost forever. And if I do write my own autobiography, you still won’t know “what it was like”. My brain won’t show up in a jar at the Computer History Museum. Best I can hope for is that it shows up in a quantum vat of 2D AdS-CFT entangled bits on the surface of some Black Hole, in some variant of Frank Tipler’s Omega Point. Unfortunately, my life is more likely like a raindrop in a Nick Bostromian weather simulation. The simulation ran. On some supercomputer. The result was printed up on a glossy sheet of paper, handed to the President, who then used a black sharpie to render my life meaningless. Perhaps like it always was. And if, one day, a trillion years hence, I find myself sitting at my desktop computer backing up my vital desktop data onto some fault tolerant, highly available network storage, that would be … pretty weird, eh? In a trillion years? Yeah, it would be. For the next few months, I will have to content myself with RAID-1 and paying the electric bill. And writing crazy autobiographical blog posts.

Comments

One response to “How Can I Keep (My Data) Healthy?”

arkdae

May 21, 2025

Well, I read your blog post at least.

For automatic backups of your photos from your phone, you can use NextCloud. No, I don’t think you can just apt-get install and be done. You do need to configure it.

But that’s not enough for sure. I did try to use NextCloud as kind of a remote file server to a desktop computer but it’s just too slow, and if the network is severed, so is your access to your files.

What you want doesn’t exist. Probably something like rsync could be used to get close to what you want.
