GregKH, writing on the future of enterprise kernels, suggests one possible future scenario:
On every major update, the kernel is updated to the latest kernel.org release, much
like the consumer products are (Fedora, openSUSE, Ubuntu, Mandriva, etc.)
This will ensure that any upstream update for drivers and new features will be automatically included.
* Pro: All of the latest kernel drivers and features will be automatically supported and
included by the distro, enabling the Partners to focus on upstream kernel.org development
and not worry about backporting things to older kernel versions. All bugfixes and security
updates that the vendor has not included in their minor updates are also pulled in at this
time (and there are a lot of them.)
* Con: Partners whose code is not present in kernel.org releases for whatever reason
(do not want it, incompatible licenses, etc.) will have to do a bit more work in tracking
the new releases, although this should only be slightly more than the current amount of
development and testing that they currently do.
I don't think this is realistic.
The big problem with this scenario is that it ignores the fact that kernel.org kernels are on the whole significantly less stable these days than they used to be. With the unified development/stable model, we introduce a lot of half-baked untested code into the trees, and this typically doesn't get stabilised until after a distro rebases to that kernel for their next release, and uncovers all the nasty problems with it whilst it's in beta.
As well as pulling 'all bugfixes and security updates', a rebase pulls in all sorts of unknown new problems.
It isn't just new code that is problematic either. With each upstream point revision, we fix x regressions, and introduce y new ones. This isn't going to make enterprise customers paying lots of $ each year very happy.
Greg points out earlier in his write-up that some customers decide to stay on earlier revisions because they fear regressions in production systems that much. I fear such a move would only make that scenario more common.
Next, some upstream kernel releases are real stinkers. Sometimes the timing is right, the planets line up, and you manage to base a product on a solid base. 2.6.18 was pretty well-rounded for example, and I'm glad we moved up to it for RHEL5 rather than our original plan of 2.6.17 plus an extra 3 months of in-house stabilisation. On the flip-side, basing Fedora 7 on 2.6.21 may not have been such a great idea judging from the fallout in bugzilla.
In an ideal world, the upstream -stable process would be working to such an extent that any 'must haves' that fix up most of the damage would get backported from 2.6.22-rc, but due to (perfectly valid) acceptance criteria in -stable like 'must not be too big', not everything that is needed makes it. End result -- I'm holding out for 2.6.22 to 'fix the world'.
But of course, when we rebase, there will be a host of new problems to deal with.
When we stop rebasing the kernel for a new RHEL release, we typically spend 3 months doing nothing but shaking out the bugs, and a lot of these bugs aren't fixed at the time in the next upstream version, so it's more than just backporting we're talking about here. Sustained bug-fixing without introducing new problems along the way isn't easy, and it sure as hell isn't fun, which probably explains why so few people actually enjoy this work. The upstream kernel model as-is isn't designed for this kind of activity. (See, for example, the negative reaction to the recurring suggestion of a 'bugfix only' release.)
Finally, the 'ABI' issue. Rebasing a kernel decimates any semblance of an ABI.
Functions disappear, change prototypes, change semantics etc. Someone :-) wrote a nice 'stable API nonsense' document for the kernel Documentation dir explaining just the reasons why. "Get your code upstream" isn't the universal answer. As much as I agree with Greg's stance on binary kernel modules, I don't see nvidia, vmware & co opening their code overnight. Fedora users know only too well how often these modules don't build/work when we rebase. This kind of breakage in an update isn't acceptable for the people paying for those expensive support contracts. Our existing ABI promises aren't perfect (sometimes screw-ups still occur, and by god do we hear about it when they do), but the tools here are getting better, and for the most part, a binary module built against RHEL5-GA will work on RHEL5-U1 and beyond.
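To make the breakage concrete, here is a rough sketch of the kind of check involved. It is not the actual RHEL tooling. It assumes a Module.symvers-style listing (CRC, symbol name, owning module, export type, one export per line) for the old and the new kernel, and reports which of the symbols a module needs have vanished or changed CRC. The symbol names, CRC values and the `abi_breakage` helper are all made up for illustration.

```python
def parse_symvers(text):
    """Map each exported symbol name to its CRC string.

    Assumes whitespace-separated fields with the CRC first and the
    symbol name second, as in a Module.symvers-style listing.
    """
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        crc, name = fields[0], fields[1]
        symbols[name] = crc
    return symbols


def abi_breakage(old_text, new_text, needed):
    """Return (missing, changed) lists for the symbols a module needs."""
    old = parse_symvers(old_text)
    new = parse_symvers(new_text)
    # Symbols the module imports that the new kernel no longer exports.
    missing = sorted(s for s in needed if s not in new)
    # Symbols still exported, but whose CRC (i.e. prototype) differs.
    changed = sorted(
        s for s in needed if s in old and s in new and old[s] != new[s]
    )
    return missing, changed


# Hypothetical data: three exports before the rebase, two after,
# one of which has a different CRC.
OLD = """\
0x11111111\tfoo_register\tvmlinux\tEXPORT_SYMBOL
0x22222222\tfoo_submit\tvmlinux\tEXPORT_SYMBOL
0x33333333\tfoo_legacy_ioctl\tvmlinux\tEXPORT_SYMBOL
"""

NEW = """\
0x11111111\tfoo_register\tvmlinux\tEXPORT_SYMBOL
0x2222beef\tfoo_submit\tvmlinux\tEXPORT_SYMBOL
"""

NEEDED = ["foo_register", "foo_submit", "foo_legacy_ioctl"]

if __name__ == "__main__":
    missing, changed = abi_breakage(OLD, NEW, NEEDED)
    print("missing:", missing)
    print("changed:", changed)
```

A binary module only keeps loading if both lists come back empty for every symbol it imports, which is roughly the promise being made between GA and the update releases, and exactly what a rebase throws away.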
So, what is the ideal answer to Greg's dilemma? I'm not sure, but I don't think the status quo is going to change much any time soon. The current situation isn't perfect, but as Greg summarised, it's for the most part well understood, even if it does involve more work for everyone.