07:20:37 [illumos-gate] 5190 nvlist_lookup_nvpair should return ENOENT if there is no target NVP, but it returns EINVAL -- Toomas Soome
07:29:21 eh, this issue was created 2014:)
14:16:16 [illumos-gate] 16435 util-tests setup fixes -- Bill Sommerfeld
17:02:33 Not that I understand much of it, but in freebsd they added a bunch of SIMD optimized string functions in libc recently and also have SSE2 as a baseline: http://fuz.su/~fuz/freebsd/2023-04-05_libc-proposal.txt and https://github.com/freebsd/freebsd-src/commits?author=clausecker
17:11:12 I was under the impression that any CPU built that supports x86-64 also supports SSE2
17:11:44 but I don't have exhaustive knowledge - perhaps there's some weirdo system out there that boots illumos but somehow lacks SSE2 in 32-bit mode.
17:12:52 given the added diversity on the 64-bit side it would make sense to use the v1/v2/v3/v4/... thing.
17:18:43 (I'm guessing that the Intel APX thing (more integer registers, 3-operand integer ops) will get designated as x86-64 v5 or v6 at some point)
17:19:24 I think there are two disjoint things to think about.
17:19:37 So if you just want a faster version of x in libc, we should really just use object capabilities.
17:19:42 Which will cause rtld to pick the right thing.
17:20:04 Most of what we want is just optimized bcopy/mem/str* functions.
17:20:24 The v1/etc. things are there, but also they're very Intel-centric and most folks don't want to build 4 copies of a library.
17:20:49 For example, libmd uses this to enable the sha optimized instructions and not play lofs games.
17:33:39 and the other?
18:35:29 sommerfeld: And the other what?
18:35:59 my understanding was that the separate full copies of libc were to avoid having to have hundreds of object capability functions to handle the different system call instructions on AMD vs Intel
18:36:09 the two disjoint things - one being "faster version of x selected via object capabilities"
18:36:36 Yeah, I can get not wanting that for the specifics of libc system calls.
18:36:49 Less certain it's relevant for say amd64 and string / memory functions.
18:37:52 sommerfeld: Sorry for being incomplete there. There are two kind of different things I see here. One is I want to use an optimized version of function X which is often a performance sensitive thing (memory, string functions). The other is I want the compiler to build with -march=XXX so the baseline compiler is taking advantage of newer things.
18:37:53 right - using hw capabilities for individual function optimizations makes sense
18:38:48 I don't think we necessarily need a libsmbios.so.1 that's built at the different modern psABI definitions.
18:39:27 But do we probably want a more modern bcopy that handles small and large copies well using faster instructions? Probably.
18:39:51 (With the caveat of having to be careful with AVX-512 for small copies on early Intel AVX-512 systems).
18:41:18 Does that make more sense?
18:42:10 yes. It occurs to me that rather than lofs games another approach would be to use pkg facets. A distribution could decide how much is worth recompiling.
18:42:52 You only need lofs for libc because it's somewhat special.
18:43:00 rtld already deals with the idea of checking arch-specific things.
18:43:11 But again, I don't think that's infra that's worth it in illumos for most things.
18:43:25 At least, from a headache to value perspective.
18:45:01 IIRC, rtld understands the idea of isalist(7) to do all this and even isaexec works this way.
18:45:34 But there are multiple ways to skin the cat, but I think what's actually valuable is the hard part: high-perf asm functions.
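(Editor's sketch, not part of the log.) The idea above of picking an optimized version of one function based on hardware capabilities can be illustrated in plain C. Note that this swaps in a different technique: it is a run-time function-pointer dispatch, not the load-time selection that rtld performs for object or symbol capabilities. The capability bit `SKETCH_CAP_SSE2`, the `copy_fn` type, and all function names are hypothetical, and `copy_fast` merely calls `memcpy` as a stand-in for a hand-tuned assembly routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bit; a real system would query the CPU. */
#define	SKETCH_CAP_SSE2	0x1u

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Baseline implementation: works everywhere, one byte at a time. */
static void *
copy_generic(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n-- > 0)
		*d++ = *s++;
	return (dst);
}

/* Stand-in for a tuned routine; here it simply defers to memcpy(). */
static void *
copy_fast(void *dst, const void *src, size_t n)
{
	return (memcpy(dst, src, n));
}

/* Choose an implementation once, based on the capability mask. */
copy_fn
sketch_select_copy(unsigned int caps)
{
	return ((caps & SKETCH_CAP_SSE2) != 0 ? copy_fast : copy_generic);
}
```

A caller would resolve the pointer once at startup and call through it afterwards. With capabilities handled by rtld, as described in the log, that choice is made when the object is loaded, so callers need no explicit dispatch at all.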
18:47:52 For raising the baseline it's also an engineering time vs. recompile time tradeoff. Per binary analysis to see where it's worthwhile vs. just recompile & repackage the world N different ways (for small N).
18:50:41 there can definitely be big payoffs from tweaking specific leaf functions to go fast using modern extensions.
19:42:54 (and one of these days I'm going to get myself a faster build machine)
19:49:00 lofs for libc happens because of "somewhat special" in general, but also because the syscall path changing means that an awful lot of libc gets sucked into the capabilities bits if you do it that way
19:49:24 long ago (and I don't know if I have it still) I had a symcaps-y libc, it wasn't great to achieve
19:49:49 (and if you already have 3 libcs, why symcap the rest inside them, when it will always swing the same way)
19:53:19 there is also probably a way to implement it sparc-ily, if you were willing to reintroduce libc_psr
19:56:13 rmustacc: if I remember, the high-perf asm for common stuff is something you could just ask the CPU vendor for
19:56:32 I remember AMD's implementation breaking shit for a while in the opensolaris days
19:56:47 (probably the same memcpy v. memmove thing that we hit on arm)
20:57:31 So I'm attempting to wrap up 16192 and I realized that there's a potential ABI gotcha: some compilers actually believe you when you declare a function as __NORETURN, and will sometimes generate code that will exhibit undefined behavior (return into things that aren't code) if the NORETURN function actually returns..
20:58:16 and the assfail() and assfail3() in libfakekernel will return if aok is set nonzero.
20:59:39 so marking a function __NORETURN is a point-of-no-return in the ABI in two ways - both on the surface level, and in terms of being a change that can't be undone easily.
21:12:10 does that imply the noreturn function means it doesn't push a (valid?) return address?
21:14:10 um, test.c:8:1: warning: 'noreturn' function does return -- this should ring some alarms, no?
21:16:01 tsoome: interesting that `int __attribute__((noreturn)) foo(void);` is fine
21:16:16 (as long as it doesn't actually return that int)
21:17:17 The caller of the noreturn function uses a call, not a jmp, to get there.
21:17:33 ah, so we just stumble out of the bottom?
21:17:38 yep.
21:19:24 https://godbolt.org/z/1MjfW5GWv
21:20:47 clang will happily generate code that would return to a string constant: https://godbolt.org/z/v7he1nnG8
21:21:32 ouch
21:22:51 tsoome: right, if the caller and callee are compiled the same way with the same prototypes in scope, it will catch the error
21:24:44 but if they're not (if something built with a noreturn assfail in scope is linked with the not-noreturn libfakekernel implementation) then setting aok in libfakekernel sets you up for unhappiness.
21:27:46 the reverse mix is of course compatible (nobody says a function not declared noreturn ever has to return)
21:29:00 this as well: https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/os/printf.c?r=eb633035#316
21:29:24 I figured I have seen aok around recently:)
21:29:38 tsoome: yes, that's what libfakekernel is copying.
21:31:50 mkay, makes sense.
21:44:07 richlowe: gcc doc says "It does not make sense for a noreturn function to have a return type other than void." but gcc doesn't enforce it or even warn about it.
22:35:08 good lord, aok
22:36:09 also, good lord, aask
22:39:41 also TIL some memory doesn't get zeroed on reboot on SPARC
22:39:43 (panicbuf, etc)
22:41:32 I have this vague recollection that setting aok managed to be useful in recovering data from a badly scrambled zfs pool.
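(Editor's sketch, not part of the log.) The constraint discussed above is that a function may only be declared noreturn if no path through it can return. The following is a minimal illustration under that assumption, not the libfakekernel source: `sketch_aok`, `sketch_assfail` and `sketch_assfail_strict` are hypothetical names, and the message format is invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the aok tunable discussed in the log. */
int sketch_aok = 0;

/*
 * This variant returns when sketch_aok is nonzero, so it must NOT be
 * declared noreturn: a caller compiled against a noreturn prototype
 * may emit nothing valid after the call instruction.
 */
int
sketch_assfail(const char *expr, const char *file, int line)
{
	(void) fprintf(stderr, "assertion failed: %s, file: %s, line: %d\n",
	    expr, file, line);
	if (sketch_aok != 0)
		return (0);
	abort();
}

/* This variant aborts on every path, so the annotation is safe. */
void sketch_assfail_strict(const char *, const char *, int)
    __attribute__((noreturn));

void
sketch_assfail_strict(const char *expr, const char *file, int line)
{
	(void) fprintf(stderr, "assertion failed: %s, file: %s, line: %d\n",
	    expr, file, line);
	abort();
}
```

Mixing the two is only safe in one direction, as noted in the log: code built against the unannotated prototype can call an implementation that never returns, but code built against a noreturn prototype must never be linked with an implementation that can return.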
22:48:34 sommerfeld: Are you thinking perhaps of zfs_panic_recover() and zfs_recover
22:49:52 While I think the idea is dubious in general, that one at least is definitely not a noreturn function haha
22:50:38 I feel like the utility of being able to properly annotate assfail*() as being noreturn is higher than the utility of aok generally
22:54:36 jclulow: (re: zfs_recover). hmm. perhaps, but this might have been something either (a) before the addition of zfs_panic_recover or (b) something that required both..