-
rmustacc
danmcd: No, you shouldn't have to introduct a new value per se. There isn't another save mechanism really.
-
rmustacc
There are definitely challenges if you need amx, but we shouldn't be enabling that.
-
danmcd
Oh I don't need amx, but for some reason, cpuid_get_xsave_size() is returning 0x2b00 (11008) and it's landing as _plt_save_size .
-
danmcd
Node's large use of threads, memory management, etc. appear to put us into a situation where we don't have enough thread-stack to make that happen.
-
danmcd
ld.so.1`_plt_save_size: 2b00
-
danmcd
THat's on Sapphire Rapids (AMX available...)
-
danmcd
Alder Lake is smaller:
-
danmcd
ld.so.1`_plt_save_size: a88
-
danmcd
Ooops, not alder lake, that's Crystal Lake (11th Gen core)
-
danmcd
Same on alder lake.
-
danmcd
This explains my segv.
-
danmcd
cpuid_get_xsave_size() returns MAX of our xsave_state structure size OR what the CPU reports
-
danmcd
`cpuid_info0.cpi_xsave.xsav_max_size` ==> this gets huge on AMX-equipped machines, and the block comment section about coping with AMX in the future lists the long set of problems.
-
danmcd
We don't support AMX yet, so maybe just maybe we should make sure cpuid_get_xsave_size() is a bit more conservative?
-
Woodstock
shouldn't it return MIN instead?
-
danmcd
@Woodstock if we're making it static, possibly? I'd like a 3rd opinion, and this coming week is Thanksgiving in the US so some of those opinion-holders may be travelling.
-
danmcd
There's big commentary about coping with AMX in $UTS/intel/os/fpu.c. (99.99% sure it's a genuine rmustacc Epic Block Comment(TM).)
-
Woodstock
danmcd: i haven't looked at the code in quite a while, it just seemed odd that we'd be using a value _larger_ than our structure
-
danmcd
Intuitively it makes sense. It was introduce in -gate with fenix illumos#8534 (and friends).
-
fenix
FEATURE 8534: Want AVX-512 Support (Closed)
-
fenix
-
danmcd
I know getting MAX/MIN wrong is not uncommon, but IF it was gotten wrong, we had enough thread-stack-size to cope until AMX-machines showed up.
-
danmcd
Also, it's quite possible the Node {,ab}usage we have in Triton might be doing things like creating threads with smaller stack sizes.
-
danmcd
I'm gonna try a dirty hotpatch trick right now in my kernel...
-
danmcd
Yeah it stopped the bleeding.
-
danmcd
mdb -ke "cpuid_info0::print -athd cpi_xsave.xsav_max_size" ==> How much is YOUR CPU reporting it needs at most for XSAVE
-
danmcd
My reports above from _plt_save_size come right from the kernel's saved CPU state, which is set per-process at exec() time.
-
danmcd
My hotpatch was to shrink Sapphire Rapids xsav_max_size to Alder Lake sizes (2696 vs. 11008). The other possible hotpatch is to literally fix cpuid_get_xsave_size() to put MIN in instead of MAX.
-
danmcd
(cmovq.ae becomes cmovq.be)
-
danmcd
If @Woodstock is correct about cpuid_get_xsave_size() (s/MAX/MIN/), we should file a bug and fix this. If he's not, we need to know why.
-
danmcd
Pardon the paste: A quick what-I-Can-see CPU survey (HDC is Zen2 so not included as there's a Zen2 already)
-
danmcd
note, sizeof (struct xsave_state) == 832
-
danmcd
VMware emulated kneecap ==> 832
-
danmcd
Kebecloud nodes: Xeon E[35] - Gen3 (Haswell/Haswell-E) ==> 832
-
danmcd
mango: EPYC 7302P (Zen2) ==> 896
-
danmcd
My big NUC: Core Gen11 Mobile (Tiger Lake) ==> 2696
-
danmcd
My small NUC: N100 (Alder Lake) ==> 2696
-
danmcd
Shiny new Box: Xeon Scalable Gen4 ==> 11008
-
sommerfeld
danmcd: would not be surprised if MAX of what-hardware-reports and our struct size is intentional (on the theory that hardware may write up to N bytes so we need that much space reserved)
-
sommerfeld
(if all bells and whistles are enabled, that is).
-
sommerfeld
-
fenix
→ FEATURE 8534: Want AVX-512 Support (Closed)
-
sommerfeld
-
fenix
→ FEATURE 8536: xsave area should size dynamically, based on CPU features (Closed)
-
sommerfeld
AVX512 adds almost 2k to the xsave area in each LWP. Intel clearly intends for the xsave area to be dynamically sized based on the features available in the current CPU. We should do that instead of blindly just increasing the size as we did for AVX.
-
sommerfeld
(that was a quote from 8536)
-
sommerfeld
(are their BIOS settings to disable AMX and does that shrink the reported max save size?)
-
gitomat
[illumos-gate] 17748 installboot: potential null pointer dereference -- Toomas Soome <tsoome⊙mc>