-
rmustacc
sjorge: What cpu are you advertising?
-
rmustacc
While I know it's some kind of Intel, the specific one's important.
-
richlowe
you know when you see a bug that makes your brain itch?
-
richlowe
-
fenix
→
BUG 16662: illumos should not be dishonest about radixes (New)
-
richlowe
alanc: since you enjoy these, or seem to
-
jbk
richlowe: those make me channel Chief Inspector Dreyfus, especially when I see a message with a value like '12' that is actually hex without the 0x
-
sjorge
rmustacc Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz CPU
-
andyf
sjorge - 0x140 on Intel is MSR_MISC_FEATURE_ENABLES, so it seems like the guest is trying to disable all of the "misc features". bhyve doesn't support/emulate that MSR so it ends in that trace. I'd start by trying to find where it does it in the Linux source, and when/why it was introduced, then we can decide if we should tweak bhyve to do something here (like accept writing a 0 as a NOP).
-
andyf
It might also be best to leave it as-is so the guest knows it is not supported, but I am not a fan of boot messages containing backtraces when running under bhyve if we can do something better.
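
[Editor's sketch] The two options weighed above (accept a write of 0 as a NOP, or keep rejecting the write so the guest knows the MSR is unsupported) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not bhyve's code: the function name sketch_wrmsr and the wrmsr_result codes are hypothetical, and only the MSR number 0x140 and its name come from the discussion.

```c
#include <stdint.h>

/* MSR number and name as given in the chat. */
#define	MSR_MISC_FEATURE_ENABLES	0x140u

/* Hypothetical result codes for illustration; not bhyve's actual API. */
enum wrmsr_result {
	WRMSR_HANDLED,		/* write accepted, guest continues */
	WRMSR_INJECT_GP		/* write rejected, guest sees a #GP fault */
};

/*
 * Hypothetical handler for a guest WRMSR exit. Accepts a write of 0 to
 * MSR_MISC_FEATURE_ENABLES as a no-op and rejects everything else.
 */
static enum wrmsr_result
sketch_wrmsr(uint32_t msr, uint64_t val)
{
	switch (msr) {
	case MSR_MISC_FEATURE_ENABLES:
		/* Writing 0 enables nothing, so there is nothing to emulate. */
		if (val == 0)
			return (WRMSR_HANDLED);
		/* Any set bit asks for a feature that is not emulated. */
		return (WRMSR_INJECT_GP);
	default:
		/* MSRs this sketch does not know about are rejected. */
		return (WRMSR_INJECT_GP);
	}
}
```

With this shape, a guest that only clears the register boots without a trace, while a guest that tries to turn a feature on still gets a fault and can tell the feature is unavailable.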
-
richlowe
-
richlowe
-
richlowe
sorry
-
richlowe
it seems the cpuid faulting stuff is it?
-
richlowe
-
richlowe
obvious searching doesn't find any other writes
-
richlowe
-
richlowe
but that's hardly recent
-
richlowe
-
Meths
Is there a doc for the correct process when replacing NVMe drives? Just managed to wedge a SmartOS server swapping disks. Simply pulled the old disk and inserted the new. Then 'nvmeadm list' hung. After a forced reboot things look okay and new disk is showing in the list output.
-
neuroserve
-
rmustacc
Meths: Would have expected it to work, assuming a hot-pluggable slot. If you get that kind of hang again, a crash dump would help.
-
rmustacc
If we didn't get notified about the removal, there will be limitations to what we can do.
-
Meths
Thanks neuroserve, interesting video. May try using the cfgadm commands next time.
-
Meths
rmustacc: Will do on the dump if we duplicate it. The logs seemed to show things being notified and picking up on the activity. We seemed to get notice of the pci-level and disk device level removal. On insertion it only logged activity at the pci layer.
-
rmustacc
Hmm, what bits were you using?
-
rmustacc
As there was a bunch of work to make sure the device was auto-onlined, but either way that shouldn't have led to an nvmeadm list hang.
-
Meths
The drives are Samsung 8TB, AMD EPYCs and it was running 20220310, now following the reboot we're on 20240627.
-
Meths
SmartOS that is.
-
rmustacc
Let me double check but I think a bunch of the fixes and auto-online stuff postdates that smartos release.
-
rmustacc
Yeah, illumos#15942 for example.
-
rmustacc
And likely illumos#15689.
-
sjorge
andyf I'm still on Ubuntu LTS 22.04 (as you can't upgrade to 24.04 yet) so I was really surprised to see it, it does seem 'harmless' but yeah the trace is just nasty
-
sjorge
Nice digging from richlowe, is that something we already pulled in?
-
sjorge
I think from the trace it's the one in intel.c?