diff -u

What's New in Kernel Development

By Zack Brown

Removing Profanity from the Source Tree

Warning: this article contains profanity.

Linus Torvalds recently stepped away from kernel development temporarily to think about how to be less harsh with developers in certain situations. His departure coincided with a patch introducing a new Code of Conduct into the kernel source tree. The effects of this are beginning to be felt.

Jarkko Sakkinen recently posted a patch to change a kernel comment containing the word "fuck" to use the word "hug" instead. So the code comment, "Wirzenius wrote this portably, Torvalds fucked it up" would become "Wirzenius wrote this portably, Torvalds hugged it up".

Steven Rostedt replied to this, saying that the code in question had changed so much that the original comment was out of date, and it should just be removed entirely. He said, "that will be an accurate change with or without CoC."

Jonathan Corbet remarked, "I'd much rather see either deletion or a rewrite over bleeping out words that somebody might not like." And Jiri Kosina agreed, saying, "turning comments into something that often doesn't make sense to anybody at all is hardly productive."

Sergey Senozhatsky pointed out that Linus was the author of the original self-deprecating comment. He asked, "Linus has made a comment, in his own words, about his own code. Why would anyone be offended by this?"

And Tobin C. Harding remarked of the original code comment, "This is my favourite comment to date in the kernel source tree. Surely there are still some people working on the kernel that do so for fun. I actually laughed out loud when I first stumbled upon this file."

In a different thread, Kees Cook said he agreed with removing "fuck" from the source tree, but felt that the word "hug" was not a good replacement, since it didn't maintain the original meaning. He said:

"This API is hugged" doesn't make any sense to me. "This API is hecked" is better, or at least funnier (to me). "Hug this interface" similarly makes no sense, but "Heck this interface" seems better. "Don't touch my hecking code", "What the heck were they thinking?" etc...."hug" is odd.

He added, "Better yet, since it's only 17 files, how about doing context-specific changes? 'This API is terrible', 'Hateful interface', 'Don't touch my freakin' code', 'What in the world were they thinking?' etc.?"

Geert Uytterhoeven replied to Kees, saying, "As a non-native speaker, I find both replacements ['hug' and 'heck'] difficult to understand. While many of the original comments are easy to grasp for +7 year olds who were never taught English, but are exposed to modern global ways of communication." And Matthias Brugger also said, "I don't think that the word 'fuck' is something we have to ban from the source code, but I don't care too much. Anyway, please don't change it to something like heck as it might be difficult for non-English speakers to understand."

Some developers just shook their heads in bewilderment. Davidlohr Bueso remarked of Jarkko's original patch, "I hope this is some kind of joke. How would anyone get offended by reading technical comments? This is all beyond me."

John Paul Adrian Glaubitz added, "We're all grown up and don't freak out when a piece of text contains the word 'fuck'. I still don't understand why people think that the word 'fuck' is what would keep certain groups from contributing to the Linux kernel. In all seriousness, it doesn't."

And Jens Axboe said, "Agree, this is insanity."

David Miller also said, "Whether or not it is a joke, it is censorship. And because of that, I have no intention to apply any patches like this to any code I am in charge of."

At one point Jarkko pointed to the part of the Code of Conduct he relied on when posting his original patch: "Harassment includes the use of abusive, offensive or degrading language, intimidation, stalking, harassing photography or recording, inappropriate physical contact, sexual imagery and unwelcome sexual advances or requests for sexual favors."

He felt that the word "fuck" clearly fell into the category of offensive language.

James Bottomley replied, "No, because use of what some people consider to be bad language isn't necessarily abusive, offensive or degrading. Our most heavily censored medium is TV and 'fuck' is now considered acceptable in certain contexts on most channels in the UK and EU."

Taking another tack, James also pointed out that the Documentation/process/code-of-conduct-interpretation.rst file said specifically, "contributions submitted for the kernel should use appropriate language. Content that already exists predating the Code of Conduct will not be addressed now as a violation." Which, James said, "definitely means there should be no hunting down of existing comments in kernel code."

Jarkko replied, "Ugh, was not aware that there two documents."

The discussion petered out shortly thereafter, but this is the sort of discussion we can expect to see again and again on the linux-kernel mailing list, as long as the Code of Conduct retains its current form.

The interesting thing for me is that the original issue had to do specifically with Linus' rough statements toward developers in specific situations. If he felt that someone should know better regarding a given issue, and already had been told how a given patch or feature should be done, but still persisted in trying to get a rejected patch or feature into the kernel, Linus might yell at them.

Somehow this has morphed into removing banned sets of "unacceptable" words from code comments. And it did so in a matter of a few weeks. I wonder what else is in store.

Fun Little Tidbits in a Howling Storm (Re: Intel Security Holes)

Some kernel developers have been trying to work around the massive, horrifying, long-term security holes recently discovered in Intel hardware. In the course of doing so, they made some interesting comments about coding practices.

Christoph Hellwig and Jesper Dangaard Brouer were working on mitigating some of the giant speed sacrifices needed to avoid Intel's gaping security holes. Christoph said that one such patch would increase the networking throughput from 7.5 million packets per second to 9.5 million, a speedup of more than 25%.

To do this, the patch would check, on the kernel's "fast path", whether a device was using dma_direct_ops and, if so, replace the indirect call through the ops table with a simple direct call.

Linus Torvalds liked the code, but he noticed that it sometimes performed certain tests before checking for the fast path. If the kernel actually were taking the fast path, those tests would not be needed. Linus said, "you made the fast case unnecessarily slow."

He suggested that switching the order of the tests would fix it right up. He added:

In fact, as a further micro-optimization, it might be a good idea to just specify that the dma_is_direct() ops is a special pointer (perhaps even just say that "NULL means it's direct"), because that then makes the fast-case test much simpler (avoids a whole nasty constant load, and testing for NULL in particular is often much better).

But that further micro-optimization absolutely *requires* that the ops pointer test comes first. So making that ordering change is not only "better code generation for the fast case to avoid extra cache accesses", it also allows future optimizations.

Regarding Linus' micro-optimization, Christoph explained:

I wanted to do the NULL case, and it would be much nicer. But the arm folks went to great lengths to make sure they don't have a default set of dma ops and require it to be explicitly set on every device to catch cases where people don't set things up properly, and I didn't want to piss them off....But maybe I should just go for it and see who screams, as the benefit is pretty obvious.
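To make the ordering argument concrete, here is a minimal user-space sketch of the idea, not the actual kernel patch. Every name ending in _sketch, the fake IOMMU callback and the "NULL means direct" convention are assumptions made purely for illustration. The point it shows is that once NULL stands for the direct case, the ops pointer test has to come first, because any earlier test that looked inside the ops table would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's table of DMA callbacks. */
struct dma_map_ops_sketch {
    uint64_t (*map_page)(uint64_t phys);
};

/* Hypothetical device: ops == NULL is taken to mean "direct mapping". */
struct device_sketch {
    const struct dma_map_ops_sketch *ops;
    uint64_t dma_offset;
};

/* Fast-case test: a single comparison against NULL, no constant to load. */
static inline int dma_is_direct_sketch(const struct dma_map_ops_sketch *ops)
{
    return ops == NULL;
}

/* Direct case: plain arithmetic, reached through a direct call. */
static uint64_t direct_map_page(const struct device_sketch *dev, uint64_t phys)
{
    return phys + dev->dma_offset;
}

/* Made-up non-direct implementation, standing in for an IOMMU driver. */
static uint64_t fake_iommu_map_page(uint64_t phys)
{
    return phys ^ 0xffff000000000000ULL;
}

const struct dma_map_ops_sketch fake_iommu_ops = {
    .map_page = fake_iommu_map_page,
};

uint64_t dma_map_page_sketch(const struct device_sketch *dev, uint64_t phys)
{
    /* The ops pointer test comes first, so the fast case does no other work. */
    if (dma_is_direct_sketch(dev->ops))
        return direct_map_page(dev, phys);

    /* Slow case only: extra validation, then the indirect call. */
    assert(dev->ops->map_page != NULL);
    return dev->ops->map_page(phys);
}
```

A device with ops set to NULL never touches the callback table at all, while a device pointing at fake_iommu_ops pays for the validation and the indirect call. Christoph's concern about the arm code applies to exactly this convention: a forgotten ops assignment would silently look like a direct mapping.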

Linus also suggested that for Christoph's and Jesper's tests, the dma_is_direct() function should be sure to use the likely() macro. This was interesting because likely() tells the compiler that a condition is expected to be true most of the time, so that it can lay out the code to favor that case. Christoph wasn't sure the expectation held. He said, "Yes, for the common case, it is likely. But if you run a setup where you say always have an iommu, it is not, in fact, it is never called in that case, but we only know that at runtime."

So Christoph was concerned about misleading the compiler and generating worse code. But Linus explained:

Note that "likely()" doesn't have any really huge overhead—it just makes the compiler move the unlikely case out-of-line.

Compared to the overhead of the indirect branch, it's simply not a huge deal, it's more a mispredict and cache layout issue.

So marking something "likely()" when it isn't doesn't really penalize things too much. It's not like an exception or anything like that, it's really just a marker for better code layout.
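For readers who haven't run into likely() before, here is a small user-space sketch of what Linus is describing. The two macros mirror the way the kernel builds them on the GCC/Clang __builtin_expect builtin; everything else (ops_sketch, offset_ops, translate_sketch) is a hypothetical example and not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hints built on a GCC/Clang builtin, in the style the kernel uses. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical callback table; a NULL table is taken to mean "direct". */
struct ops_sketch {
    long (*translate)(long addr);
};

static long add_page_offset(long addr)
{
    return addr + 4096;
}

const struct ops_sketch offset_ops = { .translate = add_page_offset };

long translate_sketch(const struct ops_sketch *ops, long addr)
{
    /*
     * Hint: expect the direct case. The compiler may keep this branch on
     * the straight-line path and move the other one out of line.
     */
    if (likely(ops == NULL))
        return addr;

    /* Still fully correct when the hint is wrong, just laid out less favorably. */
    assert(ops->translate != NULL);
    return ops->translate(addr);
}
```

The hint changes nothing about what the function returns. Both branches give the right answer whichever one was marked likely(), which is why a wrong guess costs only some code layout and branch prediction efficiency rather than anything like an exception.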

And that was it. Helpful hints in a time of desperate sorrow. These Intel hardware security holes are almost beyond belief. And we keep hearing about new batches of them being discovered all the time, or new exploits that require different workarounds from the ones already in place.

I'm sure Intel is working like mad to address all of this in future generations of its hardware. But the thing about security holes is that they are, by definition, hard to discover. Hardware manufacturers can poke and prod their products all they please and still miss the thing that a lone actor out in the world discovers one day by mistake. This time, it was Intel; next time, it'll be something else. Kudos to Intel for working with the OS people in spite of the public embarrassment to find good workarounds for these problems.

Disk Encryption for Low-End Hardware

Eric Biggers and Paul Crowley were unhappy with the disk encryption options available for Android on low-end phones and watches. For them, it was an ethical issue. Eric said:

We believe encryption is for everyone, not just those who can afford it. And while it's unknown how long CPUs without AES support will be around, there will likely always be a "low end"; and in any case, it's immensely valuable to provide a software-optimized cipher that doesn't depend on hardware support. Lack of hardware support should not be an excuse for no encryption.

Unfortunately, they were not able to find any existing encryption algorithm that was both fast and secure, and that would work with existing Linux kernel infrastructure. They therefore designed the Adiantum encryption mode, which they described in a light, easy-to-read and completely non-mathematical way.

Essentially, Adiantum is not a new form of encryption; it relies on the ChaCha stream cipher developed by D. J. Bernstein in 2008. As Eric put it, "Adiantum is a construction, not a primitive. Its security is reducible to that of XChaCha12 and AES-256, subject to a security bound; the proof is in Section 5 of our paper. Therefore, one need not 'trust' Adiantum; they only need trust XChaCha12 and AES-256."
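To give a feel for why a ChaCha-based design suits CPUs without AES instructions, here is a sketch of the ChaCha quarter round, the basic building block of the cipher. It uses only 32-bit additions, XORs and rotations. This is not Adiantum, and not even a full ChaCha block function: XChaCha12 applies this step over 12 rounds with an extended nonce, and Adiantum adds AES-256 and hashing on top, none of which is shown. The function names are my own, and the self-test values are the quarter-round test vector from RFC 8439, section 2.1.1.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n is always between 1 and 31 here). */
static inline uint32_t rotl32(uint32_t v, unsigned n)
{
    return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round: add, XOR, rotate, repeated four times. */
void chacha_quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Checks the quarter round against the RFC 8439 section 2.1.1 test vector. */
void chacha_quarter_round_selftest(void)
{
    uint32_t a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567;

    chacha_quarter_round(&a, &b, &c, &d);

    assert(a == 0xea2a92f4);
    assert(b == 0xcb1cf8ce);
    assert(c == 0x4581472e);
    assert(d == 0x5881c4bb);
}
```

Nothing here needs table lookups or special instructions, which is the property that makes this kind of primitive attractive as a software-only cipher on low-end hardware.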

Eric reported that Adiantum offered a 20% speed improvement over his and Paul's earlier HPolyC encryption mode, and it offered a very slight improvement in actual security.

Eric posted some patches, adding Adiantum to the Linux kernel's crypto API. He remarked, "Some of these patches conflict with the new 'Zinc' crypto library. But I don't know when Zinc will be merged, so for now, I've continued to base this patchset on the current 'cryptodev'."

Jason A. Donenfeld's Zinc ("Zinc Is Not crypto/") is a front-runner to replace the existing kernel crypto API, and it's simpler and lower-level than that API, offering a less terrifying coding experience.

Jason replied to Eric's initial announcement. He was very happy to see such a good disk encryption alternative for low-end hardware, but he asked Eric and Paul to hold off on trying to merge their patches until they could rework them to use the new Zinc security infrastructure. He said, "In fact, if you already want to build it on top of Zinc, I'm happy to work with you on that in a shared repo or similar."

He also suggested that Eric and Paul send their paper through various academic circles to catch any unanticipated problems with their encryption system.

But Paul replied:

Unlike a new primitive whose strength can only be known through attempts at cryptanalysis, Adiantum is a construction based on well-understood and trusted primitives; it is secure if the proof accompanying it is correct. Given that (outside competitions or standardization efforts) no-one ever issues public statements that they think algorithms or proofs are good, what I'm expecting from academia is silence :) The most we could hope for would be getting the paper accepted at a conference, and we're pursuing that but there's a good chance that won't happen simply because it's not very novel. It basically takes existing ideas and applies them using a stream cipher instead of a block cipher, and a faster hashing mode; it's also a small update from HPolyC. I've had some private feedback that the proof seems correct, and that's all I'm expecting to get.

Eric also replied, regarding Zinc integration:

For now I'm hesitant to completely abandon the current approach and bet the farm on Zinc. Zinc has a large scope and various controversies that haven't yet been fully resolved to everyone's satisfaction, including unclear licenses on some of the essential assembly files. It's not appropriate to grind kernel crypto development to a halt while everyone waits for Zinc.

He added that if Zinc is ready, he'd be happy to use it. He just wasn't sure whether it was.

However, in spite of the uncertainty, Eric later said, "I started a branch based on Zinc: https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git, branch 'adiantum-zinc'."

He listed the work he'd done so far and the work that remained to be done. But regarding Zinc's remaining non-technical issues, he said:

Both myself and others have expressed concerns about these issues previously too, yet they remain unaddressed nor is there a documentation file explaining things. So please understand that until it's clear that Zinc is ready, I still have to have Adiantum ready to go without Zinc, just in case.

Jason was happy to see the Zinc-based repository and promised to look it over. He also promised to add a documentation file covering many of Eric's concerns before posting another series of Zinc patches. And as far as Eric and Paul being ready to go without Zinc integration, he added, "I do really appreciate you taking the time, though, to try this out with Zinc as well. Thanks for that."

Meanwhile, Herbert Xu accepted Eric and Paul's original patch-set, so there may be a bit of friendly shuffling as both Zinc and Adiantum progress.

It's nice to see this sort of attention being given to low-end hardware, but it's nothing new. The entire Linux kernel is supposed to be able to run on absolutely everything, or at least everything that's still in use in the world. I don't think there are too many actual 386 systems in use anymore, but for real hardware in the real world, pretty much all of it should be able to run a fully featured Linux OS.

Note: if you're mentioned in this article and want to send a response, please send a message with your response text to ljeditor@linuxjournal.com and we'll run it in the next Letters section and post it on the website as an addendum to the original article.

About the Author

Zack Brown is a tech journalist at Linux Journal and Linux Magazine, and is a former author of the "Kernel Traffic" weekly newsletter and the "Learn Plover" stenographic typing tutorials. He first installed Slackware Linux in 1993 on his 386 with 8 megs of RAM and had his mind permanently blown by the Open Source community. He is the inventor of the Crumble pure strategy board game, which you can make yourself with a few pieces of cardboard. He also enjoys writing fiction, attempting animation, reforming Labanotation, designing and sewing his own clothes, learning French and spending time with friends'n'family.
