srfi-194, Zipf, git, blockchain, AtomSpace

So I recently implemented the Zipfian random distribution for Scheme Request for Implementation (SRFI) 194. The code is available on git. Discussions turned to the merits of using git and github, and so I had the opportunity to opine about the meaning of life in a long email. Without further ado, the meaning of life:

Both bitcoin and git showed up at about the same time, and both were built on the same idea of append-only logs. git explicitly allowed multiple forks; bitcoin explicitly forbade them. Both made design mistakes, as they were early pioneers. Git wasn’t sufficiently file-system-like-ish (which is plan9’s strong point). A shame, since file-backup, restore, corruption-protection from viruses, accidental deletion, etc. are a “thing”. For example, keeping the unix /etc in git is a life-saver for system admins. No matter; it seems that guix and nix might have a superior solution, anyway, from what I can tell.
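The shared append-only-log idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the names `commit`, `history` and `store` are made up for illustration, and neither git's nor bitcoin's real object format looks like this. Each entry is identified by a hash of its parent's id plus its payload, so old entries can't be changed without changing every id after them.

```python
import hashlib
import json


def commit(store, parent_id, payload):
    """Append an entry whose id is the hash of its parent id and payload."""
    body = json.dumps({"parent": parent_id, "payload": payload}, sort_keys=True)
    entry_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    store[entry_id] = {"parent": parent_id, "payload": payload}
    return entry_id


def history(store, entry_id):
    """Walk parent links back to the root, returning payloads oldest first."""
    payloads = []
    while entry_id is not None:
        entry = store[entry_id]
        payloads.append(entry["payload"])
        entry_id = entry["parent"]
    return list(reversed(payloads))


store = {}
root = commit(store, None, "initial import")
left = commit(store, root, "edit on branch A")
right = commit(store, root, "edit on branch B")  # a fork: same parent, different child

print(history(store, left))
print(history(store, right))
```

In this toy, a fork is nothing more than two entries naming the same parent. Allowing that is the git-style choice; a bitcoin-style log would need an extra rule for picking a single winning child.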

Bitcoin made two mistakes: not being file-system-ish enough, and not defining a sufficiently generic compute platform (solved by ethereum. Basically, bitcoin knows how to add and subtract; that’s all that’s needed for accounting. Ethereum knows how to multiply and divide and branch and tail-recurse). The not-being-a-filesystem choice is forced by anti-forking properties of bitcoin/ethereum, since the blockchain grows rapidly and so all commits must be tiny. By contrast, git allows not only large commits, but any kind of code, c++, scheme, whatever. But never defines an execution context for that code. In git, it’s up to the user to download, compile, install. It’s not automated like in ethereum.

Git fails to provide a fully-automated network filesystem. You have to manually git-push, git-pull to get network updates. There’s no peer-discovery protocol (which is what started this email chain), and resolving file conflicts is problematic during `git merge`. Also, git fails to provide a directory search mechanism: if you don’t know where the content is, you can’t search for it (as also complained about in this email chain). Github sort-of-ish solves this, but it’s proprietary. Compare this to IPFS, which is full-network-automated, and does provide content search. Unfortunately, IPFS doesn’t have the append-only log structure of git. It also uses DHTs, which are problematic, as they completely fail to deal with locality. Enter scuttlebutt .. which provides a “true” decentralized append-only log. The scuttlebutt core is fully generic (which is why git-on-scuttlebutt is possible). However, 90% of scuttlebutt development focus is on social media, not filesystems. Think of it as a kind-of git-for-social-media posts. Or maybe a web-browser-displaying-git-content-in-a-pretty-way. (BTW, the scuttlebutt people are really nice people. You should go hang out with them.)

The lack of proper indexes in git is severe, as is the lack of content-based search. Once you get into search, you wander down the rabbit-hole of query languages, pattern matching and pattern-mining. So scheme (and most functional programming languages) have pattern-matchers. For example, case-lambda but also define-syntax and syntax-case, or racket’s racket/match … but in a certain sense, these are pathetically weak and limited in ability and performance. Compare to, for example, SQL: it blows the doors off syntax-case in usability and power. Never mind pattern-mining. And then we have the issue of term-rewriting. So, for example, scheme’s hygienic macros do term re-writing, but they do it only once, when you start your scheme code up for the first time. There is no ability to perform runtime expansion of hygienic macros. Macro languages are not run-time – again, compare/contrast to SQL select/update.

Well, but SQL is obviously deficient: it’s record-based, and if you compare it to syntax-case, define-syntax, you will note that trees aka s-expressions are what we really wanted. Pulling on that thread gives you graph query languages, e.g. GraphQL for javascript… which is nice cause json is kind-of-ish like typed s-expressions. Yes, scheme is explicitly untyped, but don’t knock types, they’re really nice. The racket people are onto something, something that ain’t javascript or CamL or haskell.

So, although graph query languages are vast improvements over plain-jane pattern matchers, one can do better still… which brings me to what I work on… the AtomSpace and Atomese (sorry, I hate camel-case, but that’s history now.) It’s a graph database: you can store arbitrary typed s-expressions. It’s a pattern matcher, but far more sophisticated than GraphQL. It’s a programming language, but is more like assembly or byte-code, or an intermediate-language: low-level, not for humans, but for other algos. It could’ve/should’ve been more javascript-like, but that’s a historical mistake. Maybe still fixable. It’s vaguely prolog-like, and so you could say minikanren-like, but it has a stronger runtime, and generalizes truth values beyond true/false, so e.g. for Bayesian probability or neural-nets. It’s .. well, it’s a graph database, it’s weakly distributed; there’s some ongoing work there, but, as mentioned, DHTs are terrible at ensuring data locality. So if I have, e.g., (case-lambda (foo)(..) (bar baz) (...)) I would rather that (foo) and (bar baz) be on the same network node, rather than opposite sides of the planet. But Kademlia puts them on opposite sides of the planet (because their hashes are completely different), and I haven’t been able to crack that nut.
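To make the locality complaint concrete, here is a small Python sketch. The assumptions are mine: keys are hashed with SHA-1 to get 160-bit ids (the id width of the original Kademlia design), and closeness is Kademlia's XOR metric. The helper names `node_id`, `xor_distance` and `shared_prefix_bits` are hypothetical and not taken from any DHT library.

```python
import hashlib


def node_id(key):
    """160-bit integer id obtained by hashing the key with SHA-1."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def xor_distance(a, b):
    """Kademlia-style distance: bitwise XOR of the two ids, read as an integer."""
    return node_id(a) ^ node_id(b)


def shared_prefix_bits(a, b, width=160):
    """Number of leading id bits that the two keys have in common."""
    return width - xor_distance(a, b).bit_length()


# Two clauses of the same case-lambda: related content, unrelated hashes.
print(shared_prefix_bits("(foo)", "(bar baz)"))
# Identical keys share all 160 bits.
print(shared_prefix_bits("(foo)", "(foo)"))
```

Because a cryptographic hash scrambles its input, two related keys will typically share only a handful of leading bits. Under the XOR metric that places them in unrelated regions of the id space, no matter how closely the underlying expressions are related.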

What does this have to do with zipf? Well, all patterns have a grammar. This is Chomsky’s theorem from the 1960s or something. If you know the grammar, you can parse text to see if it validly obeys the grammar. Alternately, you can generate syntactically-valid random text from the grammar. But what if you don’t know the grammar? Well, that’s what machine-learning and deep-learning are supposed to be for, except that machine-learning could only learn very simple grammars (e.g. decision-tree-forests) until it hit the wall. So deep-learning overcame that wall, but it bypasses syntax by claiming it’s not important, or that it is a black box, or whatever the pundits claim these days. So I’m trying to do pattern mining correctly.
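Generating syntactically-valid random text from a known grammar is easy to sketch. The toy context-free grammar below is invented purely for illustration (it is not from the syntax-learning project), and `generate` is a hypothetical helper name.

```python
import random

# A toy context-free grammar, invented for illustration.
# Keys are nonterminals; each maps to a list of alternative expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "N": [["cat"], ["dog"], ["bird"]],
    "V": [["sees"], ["chases"], ["sleeps"]],
}


def generate(symbol, rng):
    """Expand a symbol recursively; anything not in GRAMMAR is a terminal word."""
    if symbol not in GRAMMAR:
        return [symbol]
    expansion = rng.choice(GRAMMAR[symbol])
    words = []
    for part in expansion:
        words.extend(generate(part, rng))
    return words


rng = random.Random(42)
for _ in range(3):
    print(" ".join(generate("S", rng)))
```

Every sentence this produces obeys the grammar by construction. Parsing is the reverse check. The hard problem is recovering something like `GRAMMAR` when all you are given is the generated text.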

But to do that, I need to validate that the syntax that is learned matches the syntax used to generate the sample corpus. So I have to generate a random syntax, generate a random text corpus from it, pattern match it, deduce the syntax, and check precision/recall accuracy. Most networks in nature (natural language, genomics, proteomics) seem to be Zipf-distributed. And so I need a Zipf random generator.  So here we are…
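For a feel of what a Zipf random generator does, here is a naive Python sketch. To be clear about the assumptions: this is a plain table-lookup (inverse-CDF) sampler over a finite range of ranks, it is not the algorithm in the SRFI 194 code, and the name `make_zipf_sampler` and its parameters are made up for this example. It costs O(n) memory, so it only suits modest n.

```python
import bisect
import itertools
import random


def make_zipf_sampler(n, s=1.0, rng=None):
    """Return a function drawing ranks 1..n with P(k) proportional to 1 / k**s."""
    rng = rng or random.Random()
    weights = [1.0 / k ** s for k in range(1, n + 1)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def sample():
        # Inverse-CDF lookup: find the first rank whose cumulative weight exceeds u.
        u = rng.random() * total
        index = bisect.bisect_right(cumulative, u)
        # Clamp guards against floating-point round-off at the top end.
        return min(index, n - 1) + 1

    return sample


sample = make_zipf_sampler(1000, s=1.0, rng=random.Random(1))
draws = [sample() for _ in range(10000)]
print(draws[:10])
```

Passing `s=0.5` gives the flatter distribution with exponent 1/2; everything else stays the same.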

FWIW genomes seem to be zipfian with exponent of 1/2 .. I have been unable to find any explanation why it’s 1/2 and not 1. It’s not just genomes, it’s also text. (Although I was banned from Wikipedia for pointing that out. Hash-tag time: #defund-wikipedia-admins) Anyone have a clue, here? Anyone help me out? I mean, with the exponent=1/2 part?

Or help me out with Atomese, or with the syntax-learning project? Or anything at all? A lifeline? Donate some bitcoin? 1MjcVE8A4sKDqbbxSf8rej9uVPZfuJQHnz

Hello world!

End Game – Charles Rash

Welcome to WordPress. This is my first post. Edit or delete it, then start writing! Let’s go for edit. But first, a legal disclaimer:
By reading or using this site, you agree that I (the author and writer, Linas Vepstas) am not very likeable, and am prone to saying things that you will incorrectly construe to be offensive. Therefore, if you are the kind of person who doesn’t like other people, or are psychologically weak, or if you try to protect your own mental well-being by shying away from anything offensive, then you are advised to avoid this blog. As will become patently obvious, I do not write offensive things, unless, of course, you are the kind of person who gets offended. If you are easily offended, then you can go #$%^&* yourself. And stay offa my site.

Why is a legal disclaimer necessary? Because, in these modern times, people seem to have problems interacting with other people, and some people think that legal boiler-plate somehow improves the situation (We’ll talk about why, a bit further down below).

Why am I not likeable? Well, probably because you’re an a#$%^ that has difficulty controlling your own mental and emotional state. Not a big deal, all humans are like that (including me). To be likeable, I would have to say things that make you feel good about yourself, and frankly, I’m sort of busy, and I do that only for people I like. Actually, I’m kind of drawn to people who are a little psychologically damaged; I like them because I can be kind to them.    But only the honest ones. Some people are so fu’ed, they seem beyond repair, and I stay away from them. I don’t like them… and they don’t like me.  Since many people are fu’ed beyond repair, ergo, I’m not likeable (by them). Law of averages kind of thing. In a democracy, the majority wins, right?

Why am I talking about likeability? Well, because this is my first WordPress post. Why is that? Because I’ve been kicked off of facebook. Why have I been kicked off of facebook? I’m not sure, but I can guess. Based on what my sister said, they have a button that is labelled “objectionable content” that you can press. She said she’s pressed it many times. I reminded her that this was very Stalinesque – rat out your neighbor, and they get deported to Siberia. So, WordPress is the Siberia of the social-media world. Welcome to Siberia!

Apparently, I’ve been banished forever. The only way to contest facebook’s ban is for them to send a text to some phone number I no longer control. The only way to change my phone number is for them to send a text to the phone number I no longer control… so.. forever. I’m going to use small fonts for unimportant comments.

So, why would anyone find my facebook page objectionable? Well, let’s see. My posts included:

Mother Nature forging a baby
  • A medieval tempera painting of Mother Nature forging babies. Like, in a black-smith’s shop. There’s a forge in the background, glowing orange-hot; a pile of coal, and Mother Nature swinging a ten-pound hammer over her head, bringing it down on a baby. And baby parts all over the floor – arms, legs, heads. I mean, where did you think babies come from? A factory? And how do you think the factory makes them? Pretty offensive, if you’ve never thought about making babies.

Timoclea
  • A 17th century action-figure painting, depicting some woman throwing some guy down a well. I think she’s supposed to be some Ancient Greek figure, and the guy is some Ancient Greek a*#$%^& who deserved it. Except that they are both dressed like Roman gladiators, in those leather skirts, you know. They’re both muscular. The girl is pretty, and it’s very much an action-pose, from before the invention of stop-motion cameras, when artists had not yet discovered that physical wrestling doesn’t look like two people striking poses. So this is a kind of #me-too painting from three centuries earlier. Brutal. Is it offensive, the kind of thing to get one kicked from facebook? Sure, if you’re one of those white-lives-matter snowflakes who thinks they don’t deserve to be thrown down a well.
  • I told you I’m not likeable. If you are a right-wing snowflake, you’ll hate my guts. Sucks to be you, doesn’t it? I’m not your friend.

Greta Thunberg as Hitler Proof Sheet
  • What else? Well, there’s the photo proof-sheet of Greta Thunberg making angry facial expressions. Someone had inked in a little Hitler mustache under her lip. Someone else told me that she suffers from mercury poisoning – a modern industrial pollutant – well known for shaping an angry disposition. I feel for her; I get angry too. And I struggle to control it. So #$%^ you too. None-the-less, she has an excellent point – global warming – and it is much more important than what I write here, so why are you wasting your time reading this? Go do something to stop global warming, already! So, sure, someone found my Greta-as-Hitler post offensive. That same someone should ask me to make some angry faces, and ink in a Hitler mustache on my lips.
The Trump Bomber Eagle
  • Oh, I know! It was my post of a painting showing Trump riding an American Bald Eagle, red-tie waving in the wind, missiles bolted under the eagle’s wings, a McDonald’s Happy Meal in his lap, and Mt. Rushmore in the background. And in the distant background, Pence in a WWI Red Baron biplane. But of course, Trump and Pence are evil, and so making fun of them does not do them justice. Chomsky is correct, Trump might just be the most evil human, ever.

So .. which one of these violated facebook’s “acceptable use policy”? Or “community guidelines”, or whatever? Any of them? All of them? Was I inciting hatred and violence? Yes, of course I was. Trump deserves to hang from a lamp-post, where we can throw rocks at him. Drawn and quartered. And medically put back together, so we can do it again.


I should mention that I’ve been kicked off of Wikipedia, as well. For editing math articles. Apparently, this got under the skin of some WP admin. So this is like police violence: when you give some people absolute power over other people, they will abuse that power. So, dual-hash-tag time: #defund-the-police and #defund-wikipedia-admins. I know that the admins are unpaid volunteers aka “vigilantes”. All the more reason to get rid of them.

I told you I’m not likeable. I violate acceptable use policies.

So, what’s a blog post without some philosophy? Let’s look at “Acceptable Use Policy”. Or “Community Guidelines”. Well, blow me down.  @#$%^& that. You are a moron if you think it is necessary to regulate other individual human beings with a “policy”. Of course, this raises interesting philosophical questions: how do you regulate other human beings?

We live in an era where the power of the algorithm, and the algorithmic nature of reality has been recognized (flexibility and reasonableness have not been, because flexibility and reasonableness do not yet have a mathematical description.)

When a child mis-behaves, an adult can say “go to your room”, and enforce that. Adults do not use written “acceptable use policies” in the interaction with children. When an adult misbehaves, a pastor or psychologist can minister to their faults.  Both pastors and psychologists have some training, or at least a high IQ offering them shamanistic insight. When the misbehavior becomes criminal, we get courts and judges, and written legal codes. The codes help ensure the uniformity of judgement, minimizing the outrageous miscarriage of justice.

Are “codes of conduct” really required to administer cooperative communities?  I don’t think so. I think it just leads to wiki-lawyering (the technique used to banish me from wikipedia) or to algorithmic lynch mobs (the technique that facebook applied to get rid of me).  We are in the middle of an experiment of social-media policing, and are getting lots of things wrong.


Whatever. Enough for now. I’ve tired of writing, you’ve tired of reading. In future posts, I will try to resurrect the content that was lost to the black-hole of facebook. For now, let’s try for finding those offensive pictures, again. Sexist tripe.

The washing machine as a labor-saving device.
Unruly demonstrators being subdued by police.
More people in prison than died of Covid-19
Bride tossing her cat to the Bridesmaids
Make America Great Again
Copyright Law, Today
Not Salvador Dali
Uncontacted Tribes
The Miniskirt comes to Soviet Vilnius, 1964
Four perfectly round circles. No, really.
Is written on the wall.
Maidens Putin and Obama in the Square