Rant: Why does POSIX not provide any Crypto Utils

Background

Readers of my code and/or my blog will probably know that I always try to follow the POSIX specification, especially when writing shell scripts, to be as compatible with other systems as possible. This is because I use multiple platforms, particularly many Linux distributions and FreeBSD, along with some embedded systems. It is rather convenient when my automations run on each and every platform I use. Today, I was writing a tool for storing and distributing patches, PatchVault, when I ran into the issue of distributing FreeBSD-like distinfos.

The problem

The POSIX standard does not provide any way of creating cryptographically safe hashes. The only utility which comes close to helping one distribute distinfos is cksum(1). Please note that the specification does not define any flags for it at all!

Now, some *BSD or Linux users reading this have probably used one of the members of the shasum(1) utility family. Conveniently, it provides the option to generate a file containing the file paths and their hashes. Amazing tool! You can quickly generate SHA256 hashes for a tree and then just check against them using the -c/--check flag, which gives you functionality similar to mtree(8).
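The generate-then-check workflow described above could look roughly like the sketch below. It assumes that either sha256sum or shasum is installed (neither is POSIX); the function names make_checkfile and verify_checkfile are made up for this example.

```shell
#!/bin/sh
# Pick whichever member of the shasum family is installed.
# Assumption: sha256sum or shasum exists; neither is guaranteed by POSIX.
if command -v sha256sum >/dev/null 2>&1; then
    SHA256_CMD='sha256sum'
elif command -v shasum >/dev/null 2>&1; then
    SHA256_CMD='shasum -a 256'
else
    SHA256_CMD=''
fi

# make_checkfile DIR OUT (hypothetical helper)
# Hash every regular file below DIR and write "hash  path" lines to OUT.
# Keep OUT outside of DIR, otherwise it ends up hashing itself.
make_checkfile() {
    [ -n "$SHA256_CMD" ] || return 127
    # $SHA256_CMD is unquoted on purpose so "shasum -a 256" splits into words.
    ( cd "$1" && find . -type f -exec $SHA256_CMD {} + ) > "$2"
}

# verify_checkfile DIR CHECKFILE (hypothetical helper)
# Re-hash the files listed in CHECKFILE relative to DIR.
# Returns non-zero if any file is missing or differs.
verify_checkfile() {
    [ -n "$SHA256_CMD" ] || return 127
    # Make the checkfile path absolute before changing directory.
    checkfile=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
    ( cd "$1" && $SHA256_CMD -c "$checkfile" )
}
```

Usage would be something like make_checkfile ./patches /tmp/patches.sha256 on the sending side and verify_checkfile ./patches /tmp/patches.sha256 on the receiving side.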

Let's take a look at a few (and by far not all) popular systems and see if they provide a unified checksum/hashsum checkfile interface!

FreeBSD

A similar (albeit incompatible) format is used for distinfo files in the FreeBSD ports tree, namely the one produced by sha256. Running sha256 <file> generates output like SHA256 (.sh_history) = 3071da2af8b82b4aeb9a5a40c497ceb5ef608f35034cf72987fad994c4ee2727. Technically, Linux does not have any tool (by default) that produces the same output. Using shasum(1) you can get a similar output file, but most definitely not in the same format: Linux shasum(1) does not note the algorithm used.

Linux

Now, do you still remember cksum(1)? Well, technically, cksum -a sha256 <file> on Linux (with a recent GNU coreutils) does produce the 1:1 output from above. Great! Two systems, two tools, one output. Neither of the above is a POSIX-compatible way of implementing what I want, of course. But I am not a purist. If systems unanimously changed a tool to do something beyond POSIX, I would be happy to use that. So let's see what other systems do:

QNX

Of course, as a POSIX certified OS, QNX does have cksum(1). It even provides some extensions, like the -v flag... No way to check against an output file, so that is a no-go. Ergo, QNX does not provide a way to check against a file of sha256 sums by default.

AIX

Well, let's see what AIX did. Of course, we have cksum(1) again. Well, no SHA256 extension here either. That's a bummer. No check either.

z/OS

"Funnily" enough, z/OS provides a different cksum(1) implementation compared to AIX. Still, no check, no SHA256.

Solaris

Similarly to AIX, cksum(1) on Solaris is very close to the original POSIX specification without any extensions, except some largefiles(7) differences. However, Solaris provides digest(1), which can be invoked as digest -v -a sha256 <file> to produce the same output format as sha256 <file> on FreeBSD: sha256 (/etc/motd) = e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855. Another operating system, another utility!

SmartOS

Unsurprisingly, same as Solaris.

Solutions?

Well, it is about 2AM here, so it is about time I wrap this up and finish what I wanted in my initial project, lol! Well, turns out, we have two separate ways of solving this problem (neither of which I like much).

1st: OpenSSL

Albeit slightly differently formatted (whitespace does not matter when checking, so beyond visual differences there aren't any), OpenSSL does give you the same format as FreeBSD's distinfo files! You have to do something along the lines of openssl dgst -sha256 <file> to see it! Nice! The above systems should support OpenSSL, so this is a success! (Well, technically, I do not believe OpenSSL would be recommended on z/OS due to their in-house cryptography solutions.)

Of course, for a simple patch-applying script, why do I need a third-party (and rather heavy) utility/library?
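Since the OpenSSL and FreeBSD outputs differ only in whitespace, a checker could normalize both before comparing them. Below is a minimal sketch using only sed(1); normalize_distinfo and same_distinfo are names invented for this example, and the algorithm label is passed through untouched, so tools that label the same algorithm differently would still need extra handling.

```shell
#!/bin/sh
# normalize_distinfo (hypothetical helper)
# Read tagged digest lines on stdin, e.g. "SHA256(file)= hash" or
# "SHA256 (file) = hash", and print them as "ALGO (file) = hash".
# Lines that do not match the pattern are printed unchanged.
normalize_distinfo() {
    sed -e 's/^\([A-Za-z0-9-]*\) *(\(.*\)) *= *\([0-9A-Fa-f]*\) *$/\1 (\2) = \3/'
}

# same_distinfo FILE_A FILE_B (hypothetical helper)
# Succeeds if both files hold the same entries once whitespace is normalized.
same_distinfo() {
    [ "$(normalize_distinfo < "$1")" = "$(normalize_distinfo < "$2")" ]
}
```

For example, printf 'SHA256(foo.tar)= abc123\n' | normalize_distinfo prints SHA256 (foo.tar) = abc123.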

2nd: Good old POSIX...

Well, you can always count on cmp(1)! Simply generate a CRC32 file using cksum(1) and compare a locally created checksum file against the remotely fetched one. Of course, I would recommend running both through sort(1) beforehand! (Just to avoid strangeness in file ordering.)
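The cksum(1) + sort(1) + cmp(1) approach can be sketched with POSIX tools only, as below. The helper names make_crc_list and same_tree are made up for this example, and how the remote list gets fetched is left out entirely.

```shell
#!/bin/sh
# make_crc_list DIR OUT (hypothetical helper)
# Write one "crc size path" line per regular file below DIR to OUT,
# sorted by path in the C locale so that the order is the same everywhere.
# Keep OUT outside of DIR, otherwise it ends up in its own list.
make_crc_list() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort -k 3 ) > "$2"
}

# same_tree LOCAL_LIST REMOTE_LIST (hypothetical helper)
# Succeeds only if both checksum lists are byte-for-byte identical.
same_tree() {
    cmp -s "$1" "$2"
}
```

A script would then run make_crc_list on the local tree and call same_tree with the local list and the fetched one, bailing out if it fails.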

So what?

Well, the question is rather "Why?". Why does POSIX not give a way of generating SHA256 (or any other cryptographically safe hash) out of the box? What is holding them back from making shasum(1) standard?

Clearly, they are reluctant to "standardize" cryptography. This is evident from crypt(3).

The algorithm is implementation-defined.

Also, they don't quite force this on you, as you have the ENOSYS escape hatch:

[ENOSYS] The functionality is not supported on this implementation.

So what is the reason behind this?

Well, the Arch Linux manpage for crypt(3) states:

Due to historical restrictions on the export of cryptographic software from the USA, crypt is an optional POSIX component.

Aha! Politics. Should have guessed it :D It is unfortunate that, because of this, we do not have a unified cryptographic interface, something rather important in my opinion.

Until IEEE POSIX decides to clean up this archaic state, I am stuck with modifying the cksum(1) output into something like:

CRC32 (<file>) <hash>

so that it goes well with SHA256:

SHA256 (<file>) <hash>
CRC32  (<file>) <hash>

This ensures that systems that have neither the SHA256 distinfo-looking output nor OpenSSL can just use cksum(1) instead of failing. Of course, it doesn't solve the problem. At least there is some error detection. Meh...
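A rough sketch of that fallback chain follows. The function names crc32_line and distinfo_line are invented for this example, the order in which tools are tried is my assumption, and the output of the SHA256 tools is passed through exactly as each tool prints it.

```shell
#!/bin/sh
# crc32_line FILE (hypothetical helper)
# Turn the "crc size path" output of cksum(1) into "CRC32 (path) crc".
crc32_line() {
    cksum "$1" | awk '{
        crc = $1
        sub(/^[0-9]+ [0-9]+ /, "")   # drop crc and size, keep the path as-is
        printf "CRC32 (%s) %s\n", $0, crc
    }'
}

# distinfo_line FILE (hypothetical helper)
# Use the first SHA256-capable tool that is found, else fall back to CRC32.
distinfo_line() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 "$1"                    # FreeBSD
    elif cksum -a sha256 /dev/null >/dev/null 2>&1; then
        cksum -a sha256 "$1"           # cksum with the -a extension
    elif command -v digest >/dev/null 2>&1; then
        digest -v -a sha256 "$1"       # Solaris, SmartOS
    elif command -v openssl >/dev/null 2>&1; then
        openssl dgst -sha256 "$1"
    else
        crc32_line "$1"                # POSIX-only fallback
    fi
}
```

Whoever checks the file then has to look at the label at the start of each line to know which tool to verify it with.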

Books and Resources

This post should be a constantly updated one enumerating all the books I've read and enjoyed or learned from. I will most likely provide links to such books. I do NOT encourage people to skip buying these books and just read them online. In fact, I bought most, if not all, of them.

Books

Modern Operating Systems 4th Edition --- Andrew Tanenbaum

A must-read book for low-level enthusiasts. It explains operating systems from zero and connects them to their history, so that you see the exact evolution. The book itself is quite modern and still applies today conceptually.

Compilers: Principles, Techniques, and Tools

To be fully read. An amazing book to get introduced to what LLVM accomplishes.

Design Patterns --- Elements of Reusable Object-Oriented Software

Another must-read for any OOP user. Describes high-level strategies and patterns that make you better at planning and implementing complex OOP applications.

Practical Vim

Great for minimalist Vim users. I prefer to keep plugins at a minimum and some of these tricks help me drop a plugin or two from my workflow. It is definitely a more beginner friendly book.

Computer Architecture — A Quantitative Approach

To be fully read. The Bible for computer architecture. Not much else to say here. If you want to understand how CPUs work, this is the book you HAVE to read. It does a great job of giving you a full overview of architectures without going too in-depth on specific topics. There are other books for that, like Modern Processor Design. It is a long and exhausting read, but overall definitely worth it.

Modern Processor Design: Fundamentals of Superscalar Processors

To be fully read. A great follow-up to Hennessy and Patterson's book. I have not read much of it, but so far I like it. It is more focused on superscalar, speculative, and out-of-order CPUs. AMD microarchitecture is covered in great detail.

Memory Systems: Cache, DRAM, Disk

To be fully read. The recommended Bible for memory subsystem engineers. Latency, bandwidth, prefetching, and DRAM timing are covered in detail. It is a good read for anyone, since our job does not stop at understanding CPU design. It also seems to show some of the physics and chemistry behind it all.

Parallel Computer Architecture

To be fully read. If you ever wondered how to scale performance beyond a single core, this book is for you. It has a lot of pseudocode and does not seem like an easy read. The authors recommend it to graduate students and engineers working in system architecture design.

Websites

Refactoring Guru

A great quick reference for design patterns.

Linux Inside

An amazing guide from 0xAX (as always) on Linux internals. Well structured and goes quite in-depth into the internals. I would recommend it for any OS developer in general. 0xAX's blog on NASM is also spot on and discusses many of the "secrets" in NASM. So unless you want to read the NASM manual again and again (which eventually you WILL do), their guide is a good place to "steal" tricks from.

Atlassian's Advanced Git tutorials

Learning the basics of Git is quite straightforward and the internet offers MANY guides on it. However, once you reach a certain level, your knowledge will stagnate. Most of my Git knowledge comes from actually collaborating on real projects, and as such I am missing subtle but important details of Git. Things like amending or rebasing come up often, but have you heard of gitk or Git hooks? Or git gc? If not, Atlassian's Advanced Git guide might help you get to know Git better.

Movies

Erlang: The Movie

A movie about the Erlang language. Even if you don't plan to use the language itself, the Actor-Model paradigm can be implemented/used in any language. Some people hail it as the replacement of OOP.