[Webfunds-devel] Cooking Hashes

Ian Grigg iang@systemics.com
Thu, 31 Aug 2000 20:24:22 -0400


"Jeroen C. van Gelderen" wrote:

> Unless I'm missing something...
> 
> I would suggest that the hash used is exactly the same
> hash that is signed by PGP. I.e. follow the OpenPGP rules
> exactly.

Well, sure, but we want to hash the PGP signature as
well, in the sense that we want that signature, not
any other.  Our sense of deep suspicion means that if
even a single byte changes, we want to know why, which
in Ricardian terms gets bounced up to the user as
"the contract isn't the same!"

So, we can't use the PGP hash, as we need to hash the
signed text and the following sig as well.
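
A minimal sketch of that point in Python, assuming SHA-1 (as in the
rules quoted at the end of this message).  The function name and the
placeholder byte strings are made up for illustration and are not real
PGP output:

```python
import hashlib

def hash_with_sig(signed_text: bytes, sig_block: bytes) -> str:
    """Hash the signed text together with the signature that follows it."""
    return hashlib.sha1(signed_text + sig_block).hexdigest()

# Placeholder data only: these are not real contract or signature bytes.
text = b"contract text\r\n"
sig_a = b"signature A\r\n"
sig_b = b"signature B\r\n"

# Same text, different signature: the combined hash differs, whereas a
# hash over the signed text alone could not tell the two apart.
same_combined = hash_with_sig(text, sig_a) == hash_with_sig(text, sig_b)
same_text_only = hashlib.sha1(text).hexdigest() == hashlib.sha1(text).hexdigest()
print(same_combined, same_text_only)
```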

> If we don't we'll end up with a contract hash that does not
> represent what the Issuer signed for. I.e. the OpenPGP
> signature is calculated over the contract without trailing
> CR/LF whereas our hash is calculated over the contract
> with CR/LF.
> 
> Additionally this would cause a problem where two contract
> hashes are considered to be different (i.e. different
> contracts) but OpenPGP would consider them to be the same
> and both valid.
> 
> If someone were to mutilate an existing contract by adding
> in the spurious newline, it would still look like a valid
> contract but the hash wouldn't be correct. I can feel a
> support headache coming up already ;-)

Yes, and no!  Yes, it would be a support headache, and no,
because this is exactly the support headache that we want
to generate, as we don't want even the slightest change to
slip through.

> I'm not sure why you wouldn't strip off the trailing CR/LF.
> Enlighten me please! In any case, should we 'upgrade' to
> an XML format it wouldn't make a difference as the
> </contract> tag would denote the end of a contract
> unambiguously.

Yes, this is a different issue.  In deciding to regularise
the CR/LF back to "every line has one" rather than the *PGP
"optimisation," I had to think long and hard.  In the end, I
considered these things:

   * the dropping of the last CR/LF was a coding nuisance,
     resulting in bugs / support issues / knowledge issues.
   * PGP code gives us no support for any of this, it
     simply does the sig according to its rules, and returns
     the results - the hash is an internal issue only.
   * so much so that there is no external concept of
     calculating the PGP sig hash; the exported concept
     is one of calculating the sig, and how it does it is
     a PGP issue.
   * so we have to write the code anyway...
   * so, it is only programmers who are well versed in *both*
     code bodies that will ever know there is a difference.
   * meanwhile, there is an externally exported concept of
     calculating the Ricardian Hash - I want a FreeBSD program
     to do it, I'd like the same in DOS, I'd like all the
     Perl to do it as easily as the Java as easily as all
     the other languages (and I want that *more* than checking
     the signatures, which can be farmed out to a PGP code
     base).
   * and, it is actually harder to write a simple script
     in Perl / shell / etc that treats the last line as
     distinct to the others with something as tricky as
     a newline difference (recall that these scripting
     languages are line based...).
   * and, it is what we've always used, and it has always
     worked well.
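
To illustrate the line-based point, here is a rough Python sketch (the
function names are hypothetical, and SHA-1 plus space/tab trimming are
assumed).  With a uniform terminator each line can be fed to the hash as
it is read; leaving the terminator off the last line means the loop has
to hold a line back until it knows whether another one follows:

```python
import hashlib

def hash_uniform(lines):
    """Every line gets CR LF, so each one is hashed as soon as it is read."""
    h = hashlib.sha1()
    for line in lines:
        # Drop the original line ending and trailing spaces/tabs.
        h.update(line.rstrip(b" \t\r\n") + b"\r\n")
    return h.hexdigest()

def hash_last_line_bare(lines):
    """As above, but the final line gets no terminator (needs lookahead)."""
    h = hashlib.sha1()
    held = None  # the previous line, not yet hashed
    for line in lines:
        if held is not None:
            # Another line followed, so the held one was not the last.
            h.update(held + b"\r\n")
        held = line.rstrip(b" \t\r\n")
    if held is not None:
        h.update(held)  # the last line goes in without CR LF
    return h.hexdigest()

lines = [b"one\n", b"two  \n"]
print(hash_uniform(lines))
print(hash_last_line_bare(lines))
```

The two results differ only because of the final CR LF, which is the
whole of the difference under discussion.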

> I basically think that details like this are probably not
> what you want to try and explain to a judge and that's
> why I think following the OpenPGP signature rules could
> -despite some braindamage- be a better solution than not
> doing so...

I agree that a judge might be confused as to why we use a
different method from OpenPGP.  But I don't see it as much
of an issue, because the whole point of using this format is
that we do in fact surface any contract changes.  So whilst he
may say that we are confusing the issue by using different
formats, he cannot say that this confusion will result in
any difficulties for users, because all the code will work
with this method, and will all agree - or not as the case
may be.

It's a bit different to the contract clauses - there, we
need a clear mapping between technical processes (programs)
and visible processes (lawyers).

With the hash, there is no legal concept of interpretation,
so we can choose any way we like, and in particular the way
that best suits our needs.

> Which negotiation bit did I miss here? ;-/

The last 16?

:)

Iang:
> > BTW, from memory, the canonical hash rules we have established are:
> >
> >    * from BEGIN to END lines, inclusive, excluding other lines
> >    * space-trimmed at end (same as cleartext sig)
> >    * _all_ lines terminated with \r\n (almost the same as sig)
> >    * hash algorithm is undefined by this format, but can be
> >      tweaked in local file, or assumed by application.  We
> >      assume SHA1.
> >
> > The salient difference with the signature rules is that we
> > do not drop the last \r\n.  That's a zimmerism that I wasn't
> > prepared to punish any programmer with ;)
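
The rules quoted above could be sketched in Python roughly as follows.
This is an illustration under stated assumptions, not the WebFunds
code: the BEGIN and END marker strings are assumed to be the usual
OpenPGP cleartext armour lines (the rules only say "BEGIN" and "END"),
"space-trimmed" is taken to mean trailing spaces and tabs, and the
function names are hypothetical.

```python
import hashlib

# Assumed markers; the rules above only say "BEGIN" and "END" lines.
BEGIN_MARK = b"-----BEGIN PGP SIGNED MESSAGE-----"
END_MARK = b"-----END PGP SIGNATURE-----"

def canonical_bytes(data: bytes) -> bytes:
    """Return the BEGIN..END block with every line ending in CR LF."""
    kept = []
    inside = False
    for raw in data.splitlines():      # accepts \n, \r\n or \r endings
        line = raw.rstrip(b" \t")      # trim trailing spaces and tabs
        if not inside and line != BEGIN_MARK:
            continue                   # exclude lines before BEGIN
        inside = True
        kept.append(line)
        if line == END_MARK:
            break                      # exclude lines after END
    else:
        raise ValueError("no complete BEGIN ... END block found")
    # All lines are terminated, including the last one.
    return b"".join(line + b"\r\n" for line in kept)

def ricardian_hash(data: bytes) -> str:
    """SHA-1 (the assumed default) over the canonical form, as hex."""
    return hashlib.sha1(canonical_bytes(data)).hexdigest()
```

Under these assumptions a copy of the file with Unix line endings, or
with stray text outside the BEGIN and END lines, hashes the same as the
original, while any change inside the block, signature included, gives
a different hash.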

-- 
iang