{"id":285,"date":"2024-11-18T05:33:00","date_gmt":"2024-11-18T05:33:00","guid":{"rendered":"https:\/\/linas.org\/blog\/?p=285"},"modified":"2024-11-18T05:33:31","modified_gmt":"2024-11-18T05:33:31","slug":"how-can-i-keep-my-data-healthy","status":"publish","type":"post","link":"https:\/\/linas.org\/blog\/2024\/11\/how-can-i-keep-my-data-healthy\/","title":{"rendered":"How Can I Keep (My Data) Healthy?"},"content":{"rendered":"\n<p>Google offers something I want. Yeah. I hate to say that, but it&#8217;s true. Google wants me to pay for it. Given my aversion to all things financial, I don&#8217;t particularly want to. Just right now, I&#8217;d rather sysadmin my own solution.<\/p>\n\n\n\n<p>Here&#8217;s what they offer: automatic backups of my photos from my cell phone. Here&#8217;s what I have: multiple Linux desktops and servers, with terabytes of unused storage. Can I have that?  I want to just kind of apt-get install it, turn it on, and forget about it.<\/p>\n\n\n\n<p>On a related note: Amazon AWS offers something I want: transparent cloud storage.  Here&#8217;s what I have: an old ailing desktop that has *all* of my files, and the laptop I&#8217;m using right now, which kind of has none of them.  Can I just, like, you know, just sort of put them on the cloud, and just sort of use them? Without, umm, paying Google\/Amazon to store the many terabytes of accumulated cruft that I have? <\/p>\n\n\n\n<p>Which reminds me&#8230; I have a website. Two, actually (this, and <a href=\"https:\/\/gnucash.org\/\" data-type=\"link\" data-id=\"https:\/\/gnucash.org\/\">gnucash.org<\/a>). The Internet Archive Wayback Machine has a decade or two of backup copies (thank you!) but here&#8217;s what they don&#8217;t back up: graphics. Large PNGs, GIFs, photos. I&#8217;ve got photos of my great-grandparents. It would be kind of nice if these photos could sort-of-ish get vaguely replicated in a manner where they are not lost. 
Perhaps I desire this because I&#8217;ve inherited some sort of familial DNA that promotes archival urges. I want the family photo album to not be lost. Hmm. Maybe I should give it to the Church of the LDS. They&#8217;re kind of into saving stuff like this. Some day, when Elon Musk gets to Mars, I expect the Church of the LDS to be right behind him, with a dozen rocketfuls of USB sticks containing archived genealogy data.<\/p>\n\n\n\n<p>What shall I do? For the first part: I&#8217;ll watch some YouTube on how to download photos off my phone. Check. Later on, I will have to manually delete some of those photos from Google storage, to free up some space so that they stop nagging me. This is remarkably difficult to do: they would rather nag me to pay them, than allow me to easily manage storage. Capitalism hard at work, doing the opposite of what I actually want. Funny how that is.<\/p>\n\n\n\n<p>For the second part, it&#8217;s grim. Here are the choices:<\/p>\n\n\n\n<ul>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/XtreemFS\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/XtreemFS\">XtreemFS<\/a> &#8212; seems like it would do almost exactly what I want. Hasn&#8217;t been updated since 2014. Dead project.<\/li>\n<\/ul>\n\n\n\n<ul>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Coda_(file_system)\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/Coda_(file_system)\">Coda<\/a> &#8212; runner-up. Hasn&#8217;t been updated since 2020. What the heck. Why? At any rate, I cannot apt-get install any Coda tools. Why? Because they&#8217;re not in the Debian repos.<\/li>\n\n\n\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Ceph_(file_system)\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/Ceph_(file_system)\">Ceph<\/a> &#8212; Cool! Actively maintained! Alive &amp; well! Doesn&#8217;t quite do what I want. It&#8217;s aimed at supercomputer big data people. 
I&#8217;d have to create some big pool of raw disk space, spread over multiple computers, and uhh, dump all of my home directories into it? I dunno. That&#8217;s not really an improvement over my current state of affairs.<\/li>\n<\/ul>\n\n\n\n<p>What is it that I want again? Well, mostly to use my laptop and desktop more or less the same way that I always do. Just to, kind of, you know, make one or the other visible to both. And, uh, yeah, OK, for this, I guess either NFS or Samba\/CIFS will do. Or even sshfs. But none of these are actually fault tolerant. So, like, if the wifi is bonkers or there&#8217;s some power outage, I want to mostly kind of continue working, and kind of automatically sync everything back up when the power\/network is back. Apparently, this is &#8230; well &#8230; not really possible for ordinary people (even like me).<\/p>\n\n\n\n<p>I kind of want my own self-hosted cloud, local to my house, except for when I travel to other places, where I want to get at it anyway. And I want it to be fault-tolerant, replicated. I don&#8217;t mind buying an extra disk or two. I do mind complicated sysadmin, and sleepless nights worrying about data loss. I don&#8217;t want to &#8220;backup my data&#8221;. I want it to back itself up. <\/p>\n\n\n\n<p>What else we got? Let&#8217;s see. There&#8217;s: <\/p>\n\n\n\n<ul>\n<li><a href=\"https:\/\/github.com\/storj\" data-type=\"link\" data-id=\"https:\/\/github.com\/storj\">Storj<\/a> &#8212; Very actively developed! Hoorrah! 99.8% compatible with AWS &#8230; that&#8217;s got to be a good thing. Does it do what I say I want, above? Not quite, not really. The market segment does not include Joe Blow who sits on a sofa all day long and putters around with computers.<\/li>\n\n\n\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/GNUnet\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/GNUnet\">GNUnet<\/a> &#8212; The Wikipedia article is a virtuoso list of technologies, with nary a breath of practical application. 
Installing gnunet-fs-gtk reveals that it is just like SleepyCat, or Gnutella, or even BitTorrent of two, almost three decades ago. But with an infinitely crappier GUI. So this is like a teenager&#8217;s wet dream of secretly privately sharing &#8230; whatever, but that teenager has no clue about GUIs.  This is, uhh, not what I want, either.<\/li>\n\n\n\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/InterPlanetary_File_System\" data-type=\"link\" data-id=\"https:\/\/en.wikipedia.org\/wiki\/InterPlanetary_File_System\">IPFS<\/a> &#8212; another knock-your-socks-off technology, that doesn&#8217;t actually do what I want.<\/li>\n<\/ul>\n\n\n\n<p>Let&#8217;s take a closer look at this last one. Can I use it to create my local fault-tolerant network-area-storage? No, &#8217;cause it&#8217;s interplanetary. I don&#8217;t really need my local desktop replicated everywhere, and &#8230; if it was, I&#8217;d rather make it private, so others can&#8217;t see it.<\/p>\n\n\n\n<p>Can I use it to provide a fault-tolerant mirror of my websites? Well, not really. Or not yet. Or something. So, like a decade ago, or two, I could go to any one of hundreds of hosting providers &#8212; you know the ones, those with the really annoying Super Bowl ads, and create a website, and be pretty sure that it just works and is up 99.999% of the time. As long as I pay them every month. And if I don&#8217;t pay? Sorry kiddo.<\/p>\n\n\n\n<p>So I kind of want that, but, I dunno, in some kind of voluntary, communal, resource-pooled way. Like I said, I&#8217;ve got multiple servers and lots of hard disks to throw at the problem. I&#8217;m just fine with hosting other people&#8217;s content, if they host mine. Can IPFS do this? Well &#8230; uhhh&#8230; we got a little problem with URLs &#8230;<\/p>\n\n\n\n<p>Does Filecoin solve any problem here? I dunno. 
Maybe if I spent 50 years accumulating Filecoin, with the hope that, as a result, I could spend that Filecoin for the next 50 or 100 or 200 years, long after I&#8217;m dead, to make sure that my content lives on. Can I do that? No.<\/p>\n\n\n\n<p>Who can? Well, GitHub has this thing called &#8220;Arctic Code Vault&#8221; and some of my creations are there. They&#8217;ll outlive me for at least 100 years. And what about all the stuff that the Internet Archive\/Wayback Machine has captured? I figure 100 years, and maybe much much longer. Does it include my family photos? No. Would IPFS plus Filecoin solve this problem? No.<\/p>\n\n\n\n<p>At the meta level, this blog post is about the stability and availability of my data, to me, and to those I share it with. Abstractly, this is kind of like the stability and availability of my personal memories. Unless I have severe head trauma, I&#8217;m pretty sure I&#8217;ll wake up tomorrow morning with my memories intact. Well, maybe no memory of what I ate for breakfast last month, but who cares. Well, OK, with old age, things like Alzheimer&#8217;s, dementia, Creutzfeldt-Jakob Bovine Wasting Disease, prions, and general downward decay of health become scary bedfellows.<\/p>\n\n\n\n<p>But look, all I&#8217;m asking for is a little bit of certainty in an uncertain world. A little bit of reliability and general health. For me, for my data. Google offers me that, in exchange for money. I&#8217;d like to see GNU, or the open source world more broadly, offer a viable alternative. I&#8217;ve rattled off some technologies. They almost get there. Just not quite. The youthful enthusiasm to create and maintain this is not focused here.<\/p>\n\n\n\n<p>Why am I writing this? By writing, I was hoping to clarify things in my own mind. Decades of experience indicate that kind of no one actually reads what I write. Perhaps what I write is too disconnected from the daily affairs of others. Perhaps it&#8217;s too abstract. 
Perhaps it&#8217;s just plain boring and unoriginal. I never wanted to be a YouTube influencer (I guess I just wanted it to happen by accident? Naturally, without any effort of my own, beyond the actual effort of, you know, writing shit down?) So I&#8217;ve no right to complain. Perhaps some day some large language model will ingest this blog entry and incorporate it into its Weltanschauung. I will have to content myself with that. If you are a human reader, I apologize. Better luck next time!<\/p>\n\n\n\n<p>Oh, I almost forgot: there&#8217;s more.<\/p>\n\n\n\n<ul>\n<li>This blog. It&#8217;s visible to you as HTML pages. But behind the scenes, it&#8217;s a bunch of data in a MariaDB database. Can I please please fault-tolerant backup auto-replicate that database? Without spending three days doing basic compsci research first?<\/li>\n\n\n\n<li>The <a href=\"https:\/\/opencog.org\" data-type=\"link\" data-id=\"https:\/\/opencog.org\">OpenCog Wiki<\/a>. I&#8217;ve spent years editing content there, but I don&#8217;t have a reliable backup, and whenever Ben forgets to pay the bill, it goes off-line. Worst-case scenario is that the Wayback Machine has a copy, but it has a copy of *the html*. I want a (fault-tolerant, distributed) copy of the raw MediaWiki data!<\/li>\n\n\n\n<li>My memories of my uncle, my youth on his farm in Wausaukee, Wisconsin and &#8230; well, if I don&#8217;t write my own autobiography, those memories will be lost forever. And if I do write my own autobiography, you still won&#8217;t know &#8220;what it was like&#8221;. My brain won&#8217;t show up in a jar at the Computer History Museum. Best I can hope for is that it shows up in a quantum vat of 2D AdS-CFT entangled bits on the surface of some Black Hole, in some variant of Frank Tipler&#8217;s Omega Point. Unfortunately, my life is more likely like a raindrop in a Nick Bostromian weather simulation. The simulation ran. On some supercomputer. 
The result was printed up on a glossy sheet of paper, handed to the President, who then used a black sharpie to render my life meaningless. Perhaps like it always was. And if, one day, a trillion years hence, I find myself sitting at my desktop computer backing up my vital desktop data onto some fault tolerant, highly available network storage, that would be &#8230; pretty weird, eh? In a trillion years? Yeah, it would be. For the next few months, I will have to content myself with RAID-1 and paying the electric bill. And writing crazy autobiographical blog posts.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Google offers something I want. Yeah. I hate to say that, but its true. Google wants me to pay for it. Given my aversion to all things financial, I don&#8217;t particularly want to. Just right now, I&#8217;d rather sysadmin my own solution. Here&#8217;s what they offer: automatic backups of my photos from my cell phone. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/posts\/285"}],"collection":[{"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/comments?post=285"}],"version-history":[{"count":10,"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/posts\/285\/revisions"}],"predecessor-version":[{"id":296,"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/posts\/285\/revisions\/296"}],"wp:attachment":[{"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/media?parent=285"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/linas.org\/blog
\/wp-json\/wp\/v2\/categories?post=285"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/linas.org\/blog\/wp-json\/wp\/v2\/tags?post=285"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}