Opened 2 years ago

Last modified 2 years ago

#63885 new defect

Replace rmd160 use in MacPorts with something else

Reported by: ryandesign (Ryan Carsten Schmidt)
Owned by:
Priority: Normal
Milestone:
Component: base
Version:
Keywords:
Cc: mascguy (Christopher Nielsen), Schamschula (Marius Schamschula), i0ntempest, cjones051073 (Chris Jones)
Port:

Description

The default checksum types used by MacPorts in portfiles are sha256, rmd160, and size. rmd160 is also used for each archive on the packages server.

OpenSSL 3 has designated ripemd160 (rmd160) as a legacy algorithm.

https://www.openssl.org/docs/man3.0/man7/EVP_MD-RIPEMD160.html

I guess that means we should phase out the use of rmd160 in MacPorts, replacing it with something more modern.
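
For illustration, the three default checksum values can be computed with Python's hashlib. This is a minimal sketch, not what MacPorts base does internally; the function name `portfile_checksums` is hypothetical. On Python builds linked against OpenSSL 3 without the legacy provider loaded, RIPEMD-160 may be unavailable, which the sketch reports as `None`.

```python
import hashlib


def portfile_checksums(data: bytes) -> dict:
    """Return sha256, rmd160 (if available) and size for a distfile's bytes."""
    result = {
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }
    try:
        # RIPEMD-160 is provided by OpenSSL; it may be missing when the
        # legacy provider is not loaded.
        result["rmd160"] = hashlib.new("ripemd160", data).hexdigest()
    except ValueError:
        result["rmd160"] = None
    return result


if __name__ == "__main__":
    for name, value in portfile_checksums(b"example distfile contents").items():
        print(f"{name:8} {value}")
```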

Change History (19)

comment:1 Changed 2 years ago by mascguy (Christopher Nielsen)

Cc: mascguy added

comment:2 Changed 2 years ago by Schamschula (Marius Schamschula)

Cc: Schamschula added

comment:3 Changed 2 years ago by i0ntempest

Cc: i0ntempest added

comment:4 Changed 2 years ago by cjones051073 (Chris Jones)

Cc: cjones051073 added

comment:5 Changed 2 years ago by pmetzger (Perry E. Metzger)

So I'm not sure there's a security reason to use two algorithms at once; SHA256 is enough for our purposes. We could just deprecate using two checksums at once.

Alternatively, if we decide we really need two, I'd recommend SHA-3, which is Keccak-based, uses a quite different construction from SHA-2, and is a national standard. Using a different construction makes it less likely that both SHA-2 and SHA-3 would have security issues at once. If tastes run against SHA-3, I'd suggest BLAKE2 or BLAKE3, which are based on very heavily studied primitives.

In no case should a hash as short as 128 bits be used; a birthday attack on such a hash needs only about 2^64 hash evaluations, which is feasible.

However, again, my own recommendation would be to just drop RMD160 and not replace it with anything.
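
To make the digest-length point concrete, here is a small sketch that lists the output size of the candidate hashes available in Python's hashlib and the generic birthday bound (about 2^(n/2) evaluations for an n-bit digest). The helper names are hypothetical, and BLAKE3 is omitted because it is not part of hashlib.

```python
import hashlib


def collision_work_bits(digest_bits: int) -> int:
    """Generic birthday bound: roughly 2**(n/2) evaluations for an n-bit hash."""
    return digest_bits // 2


# Candidates discussed in this ticket that hashlib always provides.
CANDIDATES = {
    "sha256": hashlib.sha256,
    "sha3_256": hashlib.sha3_256,
    "blake2b": hashlib.blake2b,  # 512-bit output by default
}


def describe(data: bytes) -> dict:
    """Map algorithm name to (digest bits, collision work in bits, hex digest)."""
    out = {}
    for name, ctor in CANDIDATES.items():
        h = ctor(data)
        bits = h.digest_size * 8
        out[name] = (bits, collision_work_bits(bits), h.hexdigest())
    return out


if __name__ == "__main__":
    for name, (bits, work, digest) in describe(b"example").items():
        print(f"{name:9} {bits:3} bits, collision work ~2^{work}: {digest[:16]}...")
```

By the same bound, a 160-bit digest such as rmd160 gives about 2^80 collision work and a 128-bit digest only about 2^64.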

comment:6 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

We use two algorithms so that a compromise of one algorithm does not compromise the integrity of the files.
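
The "all checksums must match" policy can be sketched as follows. This is an illustrative Python version, not the code in MacPorts base; `verify_all` and the layout of the `expected` dictionary are assumptions made for the example.

```python
import hashlib
import hmac


def verify_all(data: bytes, expected: dict) -> bool:
    """Accept the file only if every listed checksum (and size) matches."""
    if not expected:
        return False  # having nothing to check against counts as a failure
    for algo, want in expected.items():
        if algo == "size":
            ok = len(data) == int(want)
        else:
            got = hashlib.new(algo, data).hexdigest()
            ok = hmac.compare_digest(got, str(want).lower())
        if not ok:
            return False
    return True


if __name__ == "__main__":
    payload = b"example distfile contents"
    sums = {
        "sha256": hashlib.sha256(payload).hexdigest(),
        "sha3_256": hashlib.sha3_256(payload).hexdigest(),
        "size": len(payload),
    }
    print(verify_all(payload, sums))
    print(verify_all(payload + b"tampered", sums))
```

With this policy an attacker has to produce a single file that matches every listed digest at once, which is the property the two-algorithm rule relies on.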

comment:7 Changed 2 years ago by pmetzger (Perry E. Metzger)

As a side note:

$ find . -name Portfile -print | xargs egrep -w '(sha1|md5)' | wc -l
    3954

comment:8 in reply to:  6 ; Changed 2 years ago by cjones051073 (Chris Jones)

Replying to ryandesign:

> We use two algorithms so that a compromise of one algorithm does not compromise the integrity of the files.

But then, we only use rmd160 to validate the binary tarballs, no?

comment:9 in reply to:  6 ; Changed 2 years ago by pmetzger (Perry E. Metzger)

Replying to ryandesign:

> We use two algorithms so that a compromise of one algorithm does not compromise the integrity of the files.

I think the probability of a high-quality exploit appearing without prior warning against any of the modern hash algorithms is quite low. That said, SHA-3 or BLAKE2/BLAKE3 are good options, as I mentioned. I'd personally pick SHA-3.

We should also systematically get rid of reliance on MD5 (people with inexpensive machines can fake that at this point) and SHA1 (people with expensive machines can fake that at this point).

Last edited 2 years ago by pmetzger (Perry E. Metzger)

comment:10 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

> As a side note:
>
> $ find . -name Portfile -print | xargs egrep -w '(sha1|md5)' | wc -l
>     3954

Yes, of course.

Many ports are old and have not been touched. There are probably still many that list only an md5 sum. They should be changed to list two newer sums and size.

For many years, our default set of checksum types was md5, sha1 and sha256. Even though md5 and sha1 are considered insecure, the fact that both are used to secure a single file means that the file's integrity is still assured.

Last edited 2 years ago by ryandesign (Ryan Carsten Schmidt)

comment:11 in reply to:  8 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to cjones051073:

> Replying to ryandesign:
>
> > We use two algorithms so that a compromise of one algorithm does not compromise the integrity of the files.
>
> But then, we only use rmd160 to validate the binary tarballs, no?

Yes, but the rmd160 used for the binary archives is not merely a checksum; it is also somehow validating a signature with our public key. I have not attempted to understand exactly how that works. If it is a problem that we only use one algorithm there, we could use more than one.

comment:12 in reply to:  9 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to pmetzger:

> I think the probability of a high quality exploit that occurs without prior warning against any of the modern hash algorithms is quite low.

It's obviously not about prior warning. It's about the fact that ports often do not get touched for years, so we want security in case an algorithm is discovered to be insecure and the portfile is then not updated for years after that.

> We should also systematically get rid of reliance on MD5 (people with inexpensive machines can fake that at this point) and SHA1 (people with expensive machines can fake that at this point).

Not when two different checksum types protect one file.

comment:13 Changed 2 years ago by cjones051073 (Chris Jones)

What constraints are there on which crypto algorithms we could consider using? That is, how does base generate them? I presume via whatever SSL library the OS ships with, in which case are we limited at all here?

comment:14 Changed 2 years ago by pmetzger (Perry E. Metzger)

BTW, just to mention: Apple Silicon has acceleration for SHA-3. BLAKE variants aren't (to my knowledge) accelerated by any architecture, but they are very fast in software. Typical unaccelerated SHA-3 implementations run at around 12 cycles/byte, while BLAKE3 is about half a cycle per byte (yes, 24x faster). Given our support for older hardware, BLAKE3 might arguably be friendlier. It is also a 256-bit hash function, FWIW.
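
Relative speed on a given machine can be checked with a rough micro-benchmark. The sketch below uses only hashes that ship with Python's hashlib, so BLAKE2b stands in for the BLAKE family (BLAKE3 is not in hashlib); the function name and buffer sizes are arbitrary choices, and the numbers depend heavily on hardware and on how hashlib was built.

```python
import hashlib
import time


def throughput_mb_s(name: str, size_mb: int = 8, repeats: int = 3) -> float:
    """Best-of-N hashing throughput in MiB/s for an in-memory buffer."""
    data = bytes(size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return size_mb / best


if __name__ == "__main__":
    for algo in ("sha256", "sha3_256", "blake2b"):
        print(f"{algo:10} {throughput_mb_s(algo):8.1f} MiB/s")
```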

comment:15 in reply to:  10 ; Changed 2 years ago by pmetzger (Perry E. Metzger)

Replying to ryandesign:

> Many ports are old and have not been touched. There are probably still many that list only an md5 sum. They should be changed to list two newer sums and size.

I could write a script to do that if I had access to the distfiles cache (or someone else could).

comment:16 Changed 2 years ago by mouse07410 (Mouse)

When a hash is used for non-cryptographic purposes (just to produce a unique identifier for a package), it does not really matter whether it's cryptographically broken or not. Thus, I wouldn't bend over backwards to eradicate all usage of MD5, SHA1, RIPEMD160, etc.

> . . . rmd160 used for the binary archives is not merely a checksum; it is also somehow validating a signature with our public key

This, ideally, should be replaced. Anything SHA-2 would be good, or SHA-3. I personally like SHA-3 and some of the other candidates (like BLAKE2). I didn't pay attention to BLAKE3 (and probably won't; there are enough post-quantum things on my plate to occupy my time)...

Given how port validation and signature verification are done, I don't think we need to worry about performance of the hash.

In short: if you can move to SHA-2 or (better yet) SHA-3, it would be great from several points of view. Especially if Perry can write a script to speed up the update.

comment:17 Changed 2 years ago by neverpanic (Clemens Lang)

The best alternative for the .rmd160 signatures we use for packages is probably signify(1).

See https://www.openbsd.org/papers/bsdcan-signify.html and https://man.openbsd.org/signify.1.

https://github.com/macports/macports-base/pull/184 would already add a copy of signify to MacPorts base. This PR doesn't change the archive signature mechanism to use signify, but it was my intention to eventually do this as well. When doing this, we should also switch from one globally configured key to a key per source, so that other sources can provide their own key with their own binaries.

comment:18 in reply to:  17 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to neverpanic:

> the .rmd160 signatures we use for packages

Any change to this will require changing how MacPorts base fetches and verifies archives, since it currently expects to be able to fetch a file whose name is the archive name with ".rmd160" appended (and for it to contain data in an rmd160-based format). This would be a good opportunity to implement a suggestion I mentioned previously, where instead of a ".rmd160" file, MacPorts fetches an information file, whose name could be the archive name minus the archive/compression format suffix plus a suffix like ".info.json" (e.g. "zlib-1.2.11_0.darwin_21.arm64.info.json"). This file would contain fields that specify the archive/compression suffix (e.g. "tbz2") and the signature and its format. This would give us the ability to modify, over time, on an archive by archive basis, what archive/compression and signature format we use.
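
The naming scheme proposed above could look like the following sketch. The JSON field names (`archive_suffix`, `signature_format`, `signature`), the list of known suffixes, and the function names are all hypothetical; nothing here is an existing MacPorts format.

```python
import json

# Assumed set of archive/compression suffixes, for illustration only.
ARCHIVE_SUFFIXES = (".tbz2", ".tgz", ".txz", ".tar", ".zip")


def split_archive_name(archive_name: str) -> tuple:
    """Split 'name.tbz2' into ('name', 'tbz2')."""
    for suffix in ARCHIVE_SUFFIXES:
        if archive_name.endswith(suffix):
            return archive_name[: -len(suffix)], suffix[1:]
    raise ValueError(f"unknown archive suffix: {archive_name}")


def legacy_signature_name(archive_name: str) -> str:
    """Current scheme: the archive name with '.rmd160' appended."""
    return archive_name + ".rmd160"


def info_file_name(archive_name: str) -> str:
    """Proposed scheme: drop the archive suffix, then append '.info.json'."""
    stem, _ = split_archive_name(archive_name)
    return stem + ".info.json"


def make_info(archive_name: str, signature_format: str, signature: str) -> str:
    """Build the (hypothetical) contents of the info file."""
    _, suffix = split_archive_name(archive_name)
    return json.dumps(
        {
            "archive_suffix": suffix,
            "signature_format": signature_format,
            "signature": signature,
        },
        indent=2,
    )


if __name__ == "__main__":
    name = "zlib-1.2.11_0.darwin_21.arm64.tbz2"
    print(legacy_signature_name(name))
    print(info_file_name(name))
    print(make_info(name, "signify", "<base64 signature placeholder>"))
```

Because the client would fetch the info file first and learn the archive suffix from it, the archive and signature formats can change per archive without changing the name the client asks for.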

comment:19 in reply to:  15 Changed 2 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to pmetzger:

> I could write a script to do that if I had access to the distfiles cache (or someone else could).

Everyone has access to it at http://distfiles.macports.org and rsync://rsync.macports.org/macports/distfiles/. But you don't need the full set of distfiles, only those distfiles mentioned in portfiles that don't use the right checksum types. Once you've identified which portfiles those are, you can use sudo port fetch to fetch the distfiles, sudo port checksum to verify that the deficient checksums in the portfile match, and then replace the checksums with modern ones.
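
The first step, identifying portfiles that still mention md5 or sha1, could be sketched in Python as below. Like the egrep one-liner earlier in this ticket, it is a whole-word heuristic over the entire file rather than a real Portfile parser, and the function names are hypothetical. Fetching and verifying would still be done with `sudo port fetch` and `sudo port checksum` as described above.

```python
import re
from pathlib import Path

WEAK = {"md5", "sha1"}
# Whole-word match, comparable to `egrep -w` in the shell one-liner.
CHECKSUM_WORD = re.compile(r"\b(md5|sha1|rmd160|sha256|size)\b")


def checksum_types(portfile_text: str) -> set:
    """Return the checksum type keywords that appear anywhere in the text."""
    return set(CHECKSUM_WORD.findall(portfile_text))


def needs_update(portfile_text: str) -> bool:
    """True if the Portfile mentions a weak checksum type."""
    return bool(checksum_types(portfile_text) & WEAK)


def find_outdated(tree: Path):
    """Yield Portfile paths under `tree` that mention md5 or sha1."""
    for path in sorted(tree.rglob("Portfile")):
        if needs_update(path.read_text(errors="replace")):
            yield path


if __name__ == "__main__":
    for portfile in find_outdated(Path(".")):
        print(portfile)
```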
