Rachel Kroll

Reader feedback on autoconf and bugginess in general

It's time for some responses to reader feedback.

One person mentions that running git log with "--full-history" should show the change that's buried within the commit that also was a merge. Unfortunately, that's not it. I tried that and a bunch of other things before landing on "git show" the other night. They did mention that it might be some other flag.

That plus help from a friend who's clearly been down this road before turned up the right answer: you probably want to add -c or --cc. Now, if you go looking for those in the actual manual for git log, you'll trip over the usual problem that searching for "-c" matches some other term that starts with c, in this case, clear-decorations. So, if you then search for " -c" (note: leading space only), then you'll find this first:

Note that unless one of --diff-merges variants (including short -m, -c, and --cc options) is explicitly given, merge commits will not show a diff, even if a diff format like --patch is selected, nor will they match search options like -S. The exception is when --first-parent is in use, in which case first-parent is the default format.

... and now you're getting somewhere.

Why this isn't the default, I have no idea. You're asking it to show a patch, so show a patch, already!
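
If you want to see this for yourself, here's a rough sketch that builds a throwaway repo with a change living only inside a merge commit, then runs git log both ways. Everything in it (the temp directory, the file names, the "side" branch, the commit messages) is made up for illustration, and it assumes a stock git config where nothing like log.diffMerges has been set to change the default.

```shell
#!/bin/sh
# Sketch only: throwaway repo, hypothetical file and branch names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo"
git config commit.gpgsign false

echo base > file.txt
git add file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

git checkout -q -b side
echo side > side.txt
git add side.txt
git commit -q -m "side work"

git checkout -q "$main"
echo other > other.txt
git add other.txt
git commit -q -m "main work"

# Stop the merge before it commits, then slip in an edit that
# exists on neither parent.
git merge -q --no-ff --no-commit side
echo buried >> file.txt
git add file.txt
git commit -q -m "merge side with an extra edit"

# Plain -p: by default the merge commit gets no diff at all.
plain=$(git log -p -1 --format=%s)
# With --cc: the combined diff shows lines that differ from both parents.
combined=$(git log -p --cc -1 --format=%s)

echo "--- git log -p:"
printf '%s\n' "$plain"
echo "--- git log -p --cc:"
printf '%s\n' "$combined"
```

The first command should print just the subject line, while the second should also show a combined diff where the extra line is marked with "++", meaning it was added relative to both parents. Swapping --cc for -c should show it as well.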

...

I saw some other comments saying that I should report some of the wacky brokenness that I tripped over, and I need to respond to that.

It's been made very clear to me that plenty of people don't give a shit about what I report as "broken" in their projects. This isn't necessarily about the world of free software/open source, either. It's happened plenty of times at actual jobs when I was there specifically to find badness.

Case in point: someone apparently had a conversation along the lines of "hey, it would be bad if the <redacted spooky people> got their hands on this stuff and it didn't work, so let's have someone who isn't one of the devs test this out first". Then they got a hold of me and asked me to Do some Stuff with their products.

I got my hands on the hardware and then the software, and in so doing, found all kinds of crazy problems in it, almost like nobody had actually tried to use the full extent of the hardware with their driver software.

I started asking questions of the engineers, trying to figure out what they had done. At some point they went radio-silent and I started getting responses from their boss. At that point I thought "it's good to be a contractor" and referred it to MY boss - the person who hired me for the gig. It was clearly an internal matter and not anything for me to deal with. (One of the rare perks of being a contractor, gotta say.)

So, back to what happened the other night. Did I find some badness in pkg-config or pkgconf or whatever variant they're using? I probably did! Could it be exploited? Maybe? Probably? Do I care that much? Not really. Why? Because *far too few people* actually give a shit about this kind of correctness.

This is fairly common when you set me loose in an ecosystem: I find all kinds of dumb "mosquito bite" things that make things less than pleasant for someone who isn't expecting it. They're the things that experienced users know how to route around because they start accepting it as "normal".

If my "trouble reports" tend to be seen as more of a burden than a value, why would I keep generating them?

For those few people out there who truly care about this sort of thing, obviously I love and cherish you and people who do what you do. Please don't think I'm painting you with the same brush. Just realize that you are very much in the minority, and I have no idea if I'm going to encounter one of you (unlikely) or one of the others (very likely) when I file a report.

Another thing: I'm not paying any of these people, so why should they listen to me? There are exactly zero consequences for them if my stuff goes ignored. They don't work for me, and it's not like I could fire them or somehow use some other leverage in order to make things happen.

...

I believe that really good products have at least one person behind them who's spotting all of the goofiness and is having some success in getting those things ironed out. It's not enough to just notice them, and it's not enough to do the kind of gunky eng work it takes to find out why something happened or how it broke.

Whoever does this work has to actually have traction in the organization so that these problems will actually be fixed. Otherwise, the first two parts have no real value, and only serve to burn out whoever's making the attempts.

I'd love to do this kind of work for a place that truly cares about it, to the point of providing air cover to those who are tasked with finding the problems. If the company doesn't actually care, that's "fine". But if there's a hypocrisy gap in your engineering culture, then don't hire people who are tasked with trying to improve those things.

Also, I get that companies change, and what's valued today is not what will be valued tomorrow. It seems like the least a good manager would do is *explicitly* say as much when that point arrives, so you can cut and run instead of trying to push back an unstoppable tide. When the company shifts from engineering, reliability and delighting the customers to growth at all costs and exploiting the customers, it's probably time to go.

Is anyone actually doing this now? I'd love to hear about it if so.

...

Finally, to the handful of people who say that the results of ./configure "will change all the time" because of the kernel or libc version changing... that's just wrong. You're making that up based on a technicality that does not actually track with real-world use of our systems.

I'm going to guess that most of the people who are saying this are going based on the fact that you CAN change the C library or kernel arbitrarily, but probably never have. Meanwhile, I was there way back in the day when we first got ELF kernels in the 1.3 world and had to install a new compiler in order to link it in the first place, among plenty of other godawful things we had to do back in the 90s. I've been there, and I'm telling you, this is not the way of the world right now. Systems just don't change that often. When they do change to that extent, it tends to be associated with a major version bump (unless someone's mighty clueless).

The whole point of having things set up for you by the OS builder is that it takes the kernel, C library, and other localizations into account. If someone is changing those things to where syscalls and library calls are shifting materially, they are essentially signing up to maintain their own OS variant, and should update the "how you do things on this OS" data at the same time if it affects such things.

This *already* happens. Companies take a base OS (like CentOS), and then they run a far newer kernel and/or glibc atop it for their own reasons. This forces a whole raft of other changes to take advantage of the new stuff: new versions of perf, strace, ethtool, mcelog or whatever else. I know this because I did this to help out as a certain employer went from CentOS 5 to 6 to 7 over the span of several years.

Their build systems already know how to build stuff on this hybrid environment, particularly to use the overlay glibc instead of the system one. So, someone is maintaining this, and they track whatever changes when it actually matters.