diff -u

What's New in Kernel Development. By Zack Brown

KUnit and Assertions

KUnit has been seeing a lot of use and development recently. It's the kernel's new unit test system, introduced late last year by Brendan Higgins. Its goal is to enable maintainers and other developers to test discrete portions of kernel code in a reliable and reproducible way. This is distinct from various forms of testing that rely on the behavior of the system as a whole and, thus, don't always produce identical results.

Lately, Brendan has submitted patches to make KUnit work conveniently with "assertions". Assertions are like conditionals, but they express conditions that should always hold true. It shouldn't be possible for an assertion to be false, and if one is, the assertion triggers some kind of handler that the developer then uses to help debug the reasons behind the failure.

Unit tests and assertions are to some extent in opposition to each other—a unit test could trigger an assertion when the intention was to exercise the code being tested. Likewise, if a unit test does trigger an assertion, it could mean that the underlying assumptions made by the unit test can't be relied on, and so the test itself may not be valid.

In light of this, Brendan submitted code for KUnit to break out of a given test if it triggered an assertion. The idea behind this was that the assertion rendered the test invalid, so KUnit should waste no time on it and proceed to the next test in the queue.
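To make the distinction concrete, here is a minimal sketch of a KUnit test; the test and suite names are invented, but KUNIT_ASSERT_NOT_ERR_OR_NULL() and KUNIT_EXPECT_EQ() are real KUnit macros. An EXPECT failure is recorded and the test carries on, while an ASSERT failure aborts the test on the spot, which is the behavior Brendan was wiring up.

    #include <kunit/test.h>

    /* Hypothetical test; the names are invented for illustration. */
    static void example_widget_test(struct kunit *test)
    {
            /* Test-managed allocation: KUnit frees it when the test ends. */
            char *buf = kunit_kzalloc(test, 32, GFP_KERNEL);

            /*
             * ASSERT: if the allocation failed, nothing below makes sense,
             * so KUnit bails out of this test and moves on to the next one.
             */
            KUNIT_ASSERT_NOT_ERR_OR_NULL(test, buf);

            /*
             * EXPECT: a failure here is logged, but the test keeps running
             * so that later checks can still report useful information.
             */
            KUNIT_EXPECT_EQ(test, buf[0], '\0');
    }

    static struct kunit_case example_test_cases[] = {
            KUNIT_CASE(example_widget_test),
            {}
    };

    static struct kunit_suite example_test_suite = {
            .name = "example",
            .test_cases = example_test_cases,
    };
    kunit_test_suite(example_test_suite);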

There was nothing particularly controversial in this plan. The controversial part came when Frank Rowand noticed that Brendan had included a call to BUG(), in the event that the unit test failed to abort when instructed to do so. That particular situation never should happen, so Brendan figured it didn't make much difference whether there was a call to BUG() in there or not.

But Frank said, "You will just annoy Linus if you submit this." He pointed out that the BUG() was a means to produce a kernel panic and hang the entire system. In Linux, this was virtually never an acceptable solution to any problem.

At first, Brendan just shrugged, since, as he saw it, KUnit was part of the kernel's testing infrastructure and, thus, never would be used on a production system. It was strictly for developers only. And in that case, he reasoned, what difference would it make to have a BUG() here and there between friends? Not to mention the fact that, as he put it, the condition producing the call to BUG() never should arise.

But, Frank said this wasn't good enough. He said that whether you felt that KUnit belonged or didn't belong in production systems, it almost certainly would find its way into production systems in the real world. That's just how these things go. People do what isn't recommended. But even if that were not the case, said Frank, non-production systems likewise should avoid calling BUG(), unless crashing the system were the only way to avoid actual data corruption.

Brendan had no serious objection to ditching the call to BUG(); he had just been posing questions, because it seemed odd that there would be any problem with it in the first place.
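For context, the conventional kernel idiom for a "this should never happen" condition that doesn't threaten data integrity is WARN_ON(), which logs a stack trace and lets the machine keep running. The article doesn't say what KUnit ended up doing instead, but the contrast looks roughly like this hypothetical sketch:

    #include <linux/bug.h>
    #include <linux/errno.h>
    #include <linux/types.h>

    /* Hypothetical handler for a condition that "can't happen". */
    static int check_test_aborted(bool test_aborted)
    {
            /*
             * BUG() would panic or hang the machine outright, which is
             * what Frank objected to:
             *
             *         if (!test_aborted)
             *                 BUG();
             *
             * WARN_ON() prints a loud stack trace instead and lets the
             * kernel keep running, so the caller can fail gracefully.
             */
            if (WARN_ON(!test_aborted))
                    return -EINVAL;

            return 0;
    }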

So the feature remains, while the error handling will change. An interesting thing about this particular debate is that it underscores the variety of conflicts that can emerge among the kernel's many debugging and error-handling facilities, including all sorts of unexpected interactions and race conditions.

For example, a developer might write a new driver and want to test how it behaves under heavy load. So they'll run a memory-intensive process while using their driver, only to discover that the kernel's out-of-memory (OOM) killer kills the process generating the load, before the key test situation can be triggered within the driver.

It's amazing to consider the sheer quantity of testing and debugging features that have encrusted themselves on every aspect of the Linux kernel development process. Even git, the revision control system Linus Torvalds created specifically to host kernel development, is itself a debugging tool, making it possible to identify and, if necessary, revert changes that turn out to cause problems. In addition to everything else, there is also a wide array of automated systems running within a variety of private enterprises. Some of those load up running systems with particular workloads; some read the source code directly, looking for patterns. It's impossible to know the full variety and extent of testing that the Linux kernel receives on a daily basis.

Crazy Compiler Optimizations

Kernel development is always strange. Andrea Parri recently posted a patch to change the order of memory reads during multithreaded operation, such that if a second read depended on the result of a first, the second could not actually occur before the first.

The problem with this was that the bug never could actually occur, and the fix made the kernel's behavior less intuitive for developers. Peter Zijlstra, in particular, voted nay to this patch, saying it was impossible to construct a physical system capable of triggering the bug in question.

And although Andrea agreed with this, he still felt the bug was worth fixing, if only for its theoretical value. Andrea figured, a bug is a bug is a bug, and they should be fixed. But Peter objected to having the kernel do extra work to handle conditions that could never arise. He said, "what I do object to is a model that's weaker than any possible sane hardware."

Will Deacon sided with Peter on this point, saying that the underlying hardware behaved a certain way, and the kernel's current behavior mirrored that way. He remarked, "the majority of developers are writing code with the underlying hardware in mind and so allowing behaviours in the memory model which are counter to how a real machine operates is likely to make things more confusing, rather than simplifying them!"
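To make the pattern at issue concrete, here is a generic illustration of dependent reads (an "address dependency"); the structure and variable names are invented, and this is not the specific case debated in the thread:

    #include <linux/compiler.h>

    struct foo {
            int data;
    };

    /* Hypothetical pointer assumed to be published by a writer thread. */
    struct foo *shared;

    static int reader(void)
    {
            /*
             * The address used by the second load comes from the value
             * returned by the first load, so no real CPU can satisfy the
             * second load before the first completes. The debate was over
             * whether the kernel's formal memory model should promise
             * orderings like this one, given that no plausible hardware
             * ever violates them.
             */
            struct foo *p = READ_ONCE(shared);

            return READ_ONCE(p->data);
    }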

Still, there were some developers who supported Andrea's patch. Alan Stern, in particular, felt that it made sense to fix bugs when they were found, but that it also made sense to include a comment in the code, explaining the default behavior and the rationale behind the fix, even while acknowledging the bug never could be triggered.

But, Andrea wasn't interested in forcing his patch through the outstretched hands of objecting developers. He was happy enough to back down, having made his point.

It was actually Paul McKenney, who had initially favored Andrea's patch and had considered sending it up to Linus Torvalds for inclusion in the kernel, who identified some of the deeper and more disturbing issues surrounding this whole debate. Apparently, it cuts to the core of the way kernel code is actually compiled into machine language. Paul said:

We had some debates about this sort of thing at the C++ Standards Committee meeting last week.

Pointer provenance and concurrent algorithms, though for once not affecting RCU! We might actually be on the road to a fix that preserves the relevant optimizations while still allowing most (if not all) existing concurrent C/C++ code to continue working correctly. (The current thought is that loads and stores involving inline assembly, C/C++ atomics, or volatile get their provenance stripped. There may need to be some other mechanisms for plain C-language loads and stores in some cases as well.)

But if you know of any code in the Linux kernel that needs to compare pointers, one of which might be in the process of being freed, please do point me at it. I thought that the smp_call_function() code fit, but it avoids the problem because only the sending CPU is allowed to push onto the stack of pending smp_call_function() invocations.

That same concurrent linked stack pattern using cmpxchg() to atomically push and xchg() to atomically pop the full list -would- have this problem. The old pointer passed to cmpxchg() might reference an object that was freed between the time that the old pointer was loaded and the time that the cmpxchg() executed. One way to avoid this is to do the push operation in an RCU read-side critical section and use kfree_rcu() instead of kfree(). Of course, code in the idle loop or that might run on offline CPUs cannot use RCU, plus some data structures are not happy with kfree_rcu() delays, so...
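Paul's scenario is concrete enough to sketch. Here is a minimal illustration, not code from the thread, of a lock-free stack that pushes with cmpxchg() and pops the whole list with xchg(), together with the RCU-based fix he suggests; the node layout and function names are invented:

    #include <linux/atomic.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct node {
            struct node *next;
            struct rcu_head rcu;
    };

    static struct node *top;        /* head of the lock-free stack */

    /*
     * Lock-free push: retry the compare-and-swap of the stack head until
     * it succeeds. The hazard Paul describes is that "old" may name a node
     * a concurrent consumer has already popped and handed to the allocator.
     * Doing the push inside an RCU read-side critical section, with the
     * consumer freeing via kfree_rcu(), ensures the node's memory isn't
     * actually recycled until this critical section ends, so the (failing)
     * cmpxchg() comparison stays well defined.
     */
    static void push(struct node *n)
    {
            struct node *old;

            rcu_read_lock();
            do {
                    old = READ_ONCE(top);
                    n->next = old;
            } while (cmpxchg(&top, old, n) != old);
            rcu_read_unlock();
    }

    /* Atomically take the whole list, as in Paul's example. */
    static struct node *pop_all(void)
    {
            return xchg(&top, NULL);
    }

    /* Consumer frees nodes with kfree_rcu() instead of kfree(). */
    static void consume_all(void)
    {
            struct node *n = pop_all();

            while (n) {
                    struct node *next = n->next;

                    kfree_rcu(n, rcu);
                    n = next;
            }
    }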

In other words, the issue of how the C compiler should treat pointers depends to some extent on whether they are pointers at all. There's nothing about a pointer that distinguishes it from any other number, except that the compiler knows it's a pointer and can therefore do certain things with it that wouldn't make sense in other contexts. It's this issue of the origin of a number—that is, its provenance—that the standards committee was trying to resolve. The reason any of this is useful and relevant is that the compiler can only optimize code to be faster and more efficient if it can understand what's happening and what's going to happen.
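And here's a toy user-space illustration, not kernel code, of why provenance matters: two pointers can compare equal as numbers while the compiler still treats them as pointing at unrelated objects, because it tracks where each value came from. (Whether the comparison below even means anything after the free() is itself one of the contested questions.)

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
            int *p = malloc(sizeof *p);

            free(p);                        /* p's provenance is now dead */

            int *q = malloc(sizeof *q);     /* may land at the same address */
            *q = 42;

            /*
             * Even if p and q happen to be numerically equal, the compiler
             * may reason that p cannot point into q's object, because p's
             * provenance ended at free(). Pinning down when that reasoning
             * is legitimate is what the committee is wrestling with.
             */
            if (p == q)
                    printf("same address, different provenance\n");

            free(q);
            return 0;
    }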

Peter poked around online until he found a paper describing the situation in detail.

It horrified him. His conclusion was, "that's all massive bong-hits. That's utterly insane. Even the proposed semantics are crazy."

Paul did not dissent from that view, though obviously more efficient code is better than less efficient code, and the compiler should go to whatever extremes it can manage to achieve that efficiency.

Paul said that none of this was new. In fact, it all dated back 20 years and more to the relatively early days of multithreaded operation. There were, Paul said, a variety of approaches, and he said he hoped to be able to show the kernel folks some of what the GCC folks were thinking on the matter to get feedback and suggestions.

Peter still was a bit freaked out by the situation. In particular, he was concerned about whether the compiler could produce reliable code at all. He remarked, "at the very least we should get this fixed and compile a kernel with the fixed compiler to see what (if anything) changes in the generated code and analyze the changes (if any) to make sure we were ok (or not)."

The GNU C compiler is definitely filled with insanity. The whole question of how to convert C code into the best possible machine code is one that can never fully be answered—and in fact, the question continually changes as new CPUs come out on the market. Not to mention that the compiler also has to work around processor-specific security flaws like the ones plaguing Intel chips in recent years.

Add to this the fact that GCC needs to produce good code not just for the Linux kernel, but for any coding project that someone might dream up. So GCC has to remain both highly specialized and highly generalized at the same time. It makes perfect sense that its dark innards would be dark and innardly.

CGroup Interactions

CGroups are under constant development, partly because they form the core of many commercial services these days. An amazing thing about this is that they remain an unfinished project. Isolating and apportioning system elements is an ongoing effort, with many pieces still to do. And because of security concerns, it may never be possible to present a virtual system as a fully independent one. There may always be compromises that have to be made.

Recently, Andrey Ryabinin tried to fix what he felt was a problem with how CGroups dealt with low-memory situations. In the current kernel, low-memory situations cause Linux to reclaim memory from all CGroups equally. But instead of being fair, this penalizes any CGroup that uses memory efficiently and rewards those CGroups that allocate more memory than they need.

Andrey's solution to this was to have Linux reclaim unused memory from CGroups that had it, before reclaiming any from those that were in heavy use. This would seem to be even less fair than the original behavior, because only certain CGroups would be targeted and not others.

Andrey's idea garnered support from folks like Rik van Riel. But not everyone was so enthralled. Roman Gushchin, for example, pointed out that the distinction between active and unused memory was not as clear as Andrey made it out to be. The two of them debated this point quite a bit, because the whole question of fair treatment hangs in the balance. If Andrey's whole point is to prevent CGroups from "gaming the system" to ensure more memory for themselves, then the proper approach to low-memory conditions depends on being able to clearly identify which CGroups should be targeted for reclamation and which should be left alone.

At the same time, the situation could be seen as a security concern, with an absolute need to protect independent CGroups from each other. If so, something like Andrey's patch would be necessary, and many more security-minded developers would start to take an interest in getting the precise details exactly right.

Note: if you're mentioned in this article and want to send a response, please send a message with your response text to ljeditor@linuxjournal.com and we'll run it in the next Letters section and post it on the website as an addendum to the original article.

About the Author

Zack Brown is a tech journalist at Linux Journal and Linux Magazine, and is a former author of the "Kernel Traffic" weekly newsletter and the "Learn Plover" stenographic typing tutorials. He first installed Slackware Linux in 1993 on his 386 with 8 megs of RAM and had his mind permanently blown by the Open Source community. He is the inventor of the Crumble pure strategy board game, which you can make yourself with a few pieces of cardboard. He also enjoys writing fiction, attempting animation, reforming Labanotation, designing and sewing his own clothes, learning French and spending time with friends'n'family.

Zack Brown