-
gitomat
[illumos-gate] 5190 nvlist_lookup_nvpair should return ENOENT if there is no target NVP, but it returns EINVAL -- Toomas Soome <tsoome⊙mc>
-
tsoome
eh, this issue was created in 2014 :)
-
gitomat
[illumos-gate] 16435 util-tests setup fixes -- Bill Sommerfeld <sommerfeld⊙ho>
-
wiedi
Not that I understand much of it, but in freebsd they added a bunch of SIMD optimized string functions in libc recently and also have SSE2 as a baseline:
fuz.su/~fuz/freebsd/2023-04-05_libc-proposal.txt and
github.com/freebsd/freebsd-src/commits?author=clausecker
-
sommerfeld
I was under the impression that any CPU built that supports x86-64 also supports SSE2
-
sommerfeld
but I don't have exhaustive knowledge - perhaps there's some weirdo system out there that boots illumos but somehow lacks SSE2 in 32-bit mode.
-
sommerfeld
given the added diversity on the 64-bit side it would make sense to use the v1/v2/v3/v4/... thing.
-
sommerfeld
I'm guessing that the Intel APX thing (more integer registers, 3-operand integer ops) will get designated as x86-64 v5 or v6 at some point
-
rmustacc
I think there are two disjoint things to think about.
-
rmustacc
So if you just want a faster version of x in libc, we should really just use object capabilities.
-
rmustacc
Which will cause rtld to pick the right thing.
-
rmustacc
Most of what we want is just optimized bcopy/mem/str* functions.
-
rmustacc
The v1/etc. things are there, but also they're very Intel-centric and most folks don't want to build 4 copies of a library.
-
rmustacc
For example, libmd uses this to enable the sha optimized instructions and not play lofs games.
-
sommerfeld
and the other?
-
rmustacc
sommerfeld: And the other what?
-
alanc
my understanding was that the separate full copies of libc were to avoid having to have hundreds of object capability functions to handle the different system call instructions on AMD vs Intel
-
sommerfeld
the two disjoint things - one being "faster version of x selected via object capabilities"
-
rmustacc
Yeah, I can get not wanting that for the specifics of libc system calls.
-
rmustacc
Less certain it's relevant for say amd64 and string / memory functions.
-
rmustacc
sommerfeld: Sorry for being incomplete there. There are two kinds of different things I see here. One is I want to use an optimized version of function X, which is often a performance-sensitive thing (memory, string functions). The other is I want the compiler to build with -march=XXX so the baseline compiler is taking advantage of newer things.
-
alanc
right - using hw capabilities for individual function optimizations makes sense
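To illustrate the first of the two approaches (an optimized variant of one hot function, chosen per machine), here is a minimal C sketch. It is not how symbol capabilities work: there the runtime linker picks among variants of a symbol and the caller contains no selection code. The sketch only imitates the effect with a function pointer. The names `copy_generic`, `copy_wide`, `select_copy` and the `have_fast_copy` flag are hypothetical, and the "wide" variant is plain C standing in for a hand-tuned SIMD routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void *(*copy_fn)(void *dst, const void *src, size_t len);

/* Baseline variant: byte-at-a-time copy that works on any CPU. */
static void *
copy_generic(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len-- > 0)
		*d++ = *s++;
	return (dst);
}

/*
 * Stand-in for a tuned variant: moves 8 bytes at a time, then the tail.
 * A real one would be assembly using wider vector instructions.
 */
static void *
copy_wide(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len >= sizeof (uint64_t)) {
		uint64_t w;

		memcpy(&w, s, sizeof (w));
		memcpy(d, &w, sizeof (w));
		d += sizeof (w);
		s += sizeof (w);
		len -= sizeof (w);
	}
	while (len-- > 0)
		*d++ = *s++;
	return (dst);
}

/*
 * Hypothetical selector: have_fast_copy is an assumed flag that would
 * come from whatever capability check the platform provides.
 */
static copy_fn
select_copy(int have_fast_copy)
{
	return (have_fast_copy ? copy_wide : copy_generic);
}
```

Both variants have the same signature and the same result, so callers never need to know which one was chosen. That is the property that makes per-function selection cheaper than building whole libraries several times.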
-
rmustacc
I don't think we necessarily need a libsmbios.so.1 that's built at the different modern psABI definitions.
-
rmustacc
But do we want a more modern bcopy that handles small and large copies well using faster instructions? Probably.
-
rmustacc
(With the caveat of having to be careful with AVX-512 for small copies on early Intel AVX-512 systems).
-
rmustacc
Does that make more sense?
-
sommerfeld
yes. It occurs to me that rather than lofs games another approach would be to use pkg facets. A distribution could decide how much is worth recompiling.
-
rmustacc
You only need lofs for libc because it's somewhat special.
-
rmustacc
rtld already deals with the idea of checking arch-specific things.
-
rmustacc
But again, I don't think that's infra that's worth it in illumos for most things.
-
rmustacc
At least, from a headache to value perspective.
-
rmustacc
IIRC, rtld understands the idea of isalist(7) to do all this and even isaexec works this way.
-
rmustacc
There are multiple ways to skin the cat, but I think what's actually valuable is the hard part: high-perf asm functions.
-
sommerfeld
For raising the baseline it's also an engineering time vs. recompile time tradeoff. Per-binary analysis to see where it's worthwhile vs. just recompile & repackage the world N different ways (for small N).
-
sommerfeld
there can definitely be big payoffs from tweaking specific leaf functions to go fast using modern extensions.
-
sommerfeld
(and one of these days I'm going to get myself a faster build machine)
-
richlowe
lofs for libc happens because of "somewhat special" in general, but also because the syscall path changing means that an awful lot of libc gets sucked into the capabilities bits if you do it that way
-
richlowe
long ago (and I don't know if I have it still) I had a symcaps-y libc, it wasn't great to achieve
-
richlowe
(and if you already have 3 libcs, why symcap the rest inside them, when it will always swing the same way)
-
richlowe
there is also probably a way to implement it sparc-ily, if you were willing to reintroduce libc_psr
-
richlowe
rmustacc: if I remember, the high-perf asm for common stuff is something you could just ask the CPU vendor for
-
richlowe
I remember AMD's implementation breaking shit for a while in the opensolaris days
-
richlowe
(probably the same memcpy v. memmove thing that we hit on arm)
-
sommerfeld
So I'm attempting to wrap up 16192 and I realized that there's a potential ABI gotcha: some compilers actually believe you when you declare a function as __NORETURN, and will sometimes generate code that will exhibit undefined behavior (return into things that aren't code) if the NORETURN function actually returns.
-
sommerfeld
and the assfail() and assfail3() in libfakekernel will return if aok is set nonzero.
-
sommerfeld
so marking a function __NORETURN is a point-of-no-return in the ABI in two ways - both on the surface level, and in terms of being a change that can't be undone easily.
-
richlowe
does that imply the noreturn function means it doesn't push a (valid?) return address?
-
tsoome
um, test.c:8:1: warning: 'noreturn' function does return -- this should ring some alarms, no?
-
richlowe
tsoome: interesting that `int __attribute__((noreturn)) foo(void);` is fine
-
richlowe
(as long as it doesn't actually return that int)
-
sommerfeld
The caller of the noreturn function uses a call, not a jmp, to get there.
-
richlowe
ah, so we just stumble out of the bottom?
-
sommerfeld
yep.
-
sommerfeld
clang will happily generate code that would return to a string constant:
godbolt.org/z/v7he1nnG8
-
tsoome
ouch
-
sommerfeld
tsoome: right, if the caller and callee are compiled the same way with the same prototypes in scope, it will catch the error
-
sommerfeld
but if they're not (if something built with a noreturn assfail in scope is linked with the not-noreturn libfakekernel implementation) then setting aok in libfakekernel sets you up for unhappiness.
-
sommerfeld
the reverse mix is of course compatible (nobody says a function not declared noreturn ever has to return)
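The mismatch described above can be sketched in C. This is an illustration with hypothetical names (`my_aok`, `my_assfail`, `MY_ASSERT`, `checked_div`), not the libfakekernel source. It shows the compatible combination: the implementation may return, and the caller sees a declaration without noreturn. The comment marks where a noreturn declaration in the caller's scope would make the behavior undefined.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for aok: nonzero means "report and carry on". */
static int my_aok = 0;
static int my_assfail_count = 0;

/*
 * Declared WITHOUT noreturn, because it returns when my_aok is set.
 * If a caller were instead compiled with a declaration such as
 *     void my_assfail(const char *, const char *, int)
 *         __attribute__((noreturn));
 * the compiler could omit everything after the call, and a return
 * from here would continue into whatever bytes follow the call site.
 */
static int
my_assfail(const char *expr, const char *file, int line)
{
	my_assfail_count++;
	(void) fprintf(stderr, "assertion failed: %s, file: %s, line: %d\n",
	    expr, file, line);
	if (!my_aok)
		abort();
	return (0);
}

#define	MY_ASSERT(ex) \
	((void)((ex) || my_assfail(#ex, __FILE__, __LINE__)))

static int
checked_div(int num, int den)
{
	MY_ASSERT(den != 0);
	/* Reached after a failed assertion only when my_aok is set. */
	return (den != 0 ? num / den : 0);
}
```

With `my_aok` left at zero the failed assertion aborts, so the function behaves as if it were noreturn. Only the "carry on" setting exposes the difference, which is why the problem stays hidden until someone sets the tunable.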
-
tsoome
I figured I have seen aok around recently:)
-
sommerfeld
tsoome: yes, that's what libfakekernel is copying.
-
tsoome
mkay, makes sense.
-
sommerfeld
richlowe: gcc doc says "It does not make sense for a noreturn function to have a return type other than void." but gcc doesn't enforce it or even warn about it.
-
jclulow
good lord, aok
-
jclulow
also, good lord, aask
-
jclulow
also TIL some memory doesn't get zeroed on reboot on SPARC
-
jclulow
(panicbuf, etc)
-
sommerfeld
I have this vague recollection that setting aok managed to be useful in recovering data from a badly scrambled zfs pool.
-
jclulow
sommerfeld: Are you thinking perhaps of zfs_panic_recover() and zfs_recover
-
jclulow
While I think the idea is dubious in general, that one at least is definitely not a noreturn function haha
-
jclulow
I feel like the utility of being able to properly annotate assfail*() as being noreturn is higher than the utility of aok generally
-
sommerfeld
jclulow: (re: zfs_recover). hmm. perhaps, but this might have been something either (a) before the addition of zfs_panic_recover or (b) something that required both.