00:35:23 sjorge: What cpu are you advertising?
00:35:44 While I know it's some kind of Intel, the specific one's important.
03:32:45 you know when you see a bug that makes your brain itch?
03:32:47 https://www.illumos.org/issues/16662
03:32:48 → BUG 16662: illumos should not be dishonest about radixes (New)
03:33:42 alanc: since you enjoy these, or seem to
03:51:40 richlowe: those make me channel chief inspector dreyfus.. especially when i see a message with a value like '12' but is actually hex w/o the 0x
06:21:49 rmustacc Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz CPU
09:02:24 sjorge - 0x140 on Intel is MSR_MISC_FEATURE_ENABLES, so it seems like the guest is trying to disable all of the "misc features". bhyve doesn't support/emulate that MSR so it ends in that trace. I'd start by trying to find where it does it in the Linux source, and when/why it was introduced, then we can decide if we should tweak bhyve to do something here (like accept writing a 0 as a NOP).
09:05:12 It might also be best to leave it as-is so the guest knows it is not supported, but I am not a fan of boot messages containing backtraces when running under bhyve if we can do something better.
09:10:56 there's also https://lists.freebsd.org/archives/dev-commits-src-branches/2023-January/009071.html
09:13:22 uh, https://lore.kernel.org/kvm/20190329135422.15046-3-xiaoyao.li⊙lic/T/
09:13:25 sorry
09:17:25 it seems the cpuid faulting stuff is it?
09:20:00 https://github.com/torvalds/linux/blob/0c3836482481200ead7b416ca80c68a29cfdaabd/arch/x86/kernel/process.c#L341
09:20:26 obvious searching doesn't find any other writes
09:27:01 and https://github.com/torvalds/linux/blob/0c3836482481200ead7b416ca80c68a29cfdaabd/arch/x86/kernel/cpu/intel.c#L587
09:27:49 but that's hardly recent
09:28:33 https://github.com/torvalds/linux/commit/90218ac77d0582eaf2d0872d8d900cbd5bf1f205 apparently
13:20:00 Is there a doc for the correct process when replacing NVMe drives? Just managed to wedge a SmartOS server swapping disks. Simply pulled the old disk and inserted the new. Then 'nvmeadm list' hung. After a forced reboot things look okay and new disk is showing in the list output.
13:46:13 Meths : I can only remember https://www.youtube.com/watch?v=UICLVrtHOUc&list=PLfHkpKdowDoi6pWLkwdSpCQs2obgRX4VZ&index=16 regarding NVMEs
14:53:29 Meths: Would have expected it to work, assuming a hotplugable slot. If you get that kind of hang again, a crash dump would help.
14:54:07 If we didn't get notified about the removal, there will be limitations to what we can do.
14:58:46 Thanks neuroserve, interesting video. May try using the cfgadm commands next time.
15:00:22 rmustacc: Will do on the dump if we duplicate it. The logs seemed to show things being notified and picking up on the activity. We seemed to get notice of the pci-level and disk device level removal. On insertion it only logged activity at the pci layer.
15:20:38 Hmm, what bits were you using?
15:21:06 As there was a bunch to make sure the device was auto-onlined, but either way that shouldn't have led to an nvmeadm list hang.
16:38:42 The drives are Samsung 8TB, AMD EPYCs and it was running 20220310, now following the reboot we're on 20240627.
16:40:46 SmartOS that is.
16:53:02 Let me double check but I think a bunch of the fixes and auto-online stuff postdates that smartos release.
16:53:31 Yeah, illumos#15942 for example.
16:54:02 And likely illumos#15689.
18:29:36 andyf I'm still on Ubuntu LTS 22.04 (as you can't upgrade to 24.04 yet) so I was really surprised to see it, it does seem 'harmless' but yeah the trace is just nasty
18:31:12 Nice digging from richlowe, is that something we already pulled in?
18:31:45 I thank from the trace its the one in intel.c ?
18:31:47 *think
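
[editor's note] The 09:02:24 suggestion (accept a guest write of 0 to MSR 0x140 as a NOP, and keep rejecting anything else) could be sketched roughly as below. This is only an illustration of the idea, not bhyve code: the function name `sketch_handle_wrmsr` and the `wrmsr_result` enum are hypothetical, and the only detail taken from the log is the MSR number 0x140 and its name.

```c
#include <stdint.h>

/* MSR number and name as quoted in the log for Intel CPUs. */
#define MSR_MISC_FEATURE_ENABLES 0x140u

/* Hypothetical result codes for this sketch; not bhyve's real interface. */
enum wrmsr_result {
    WRMSR_HANDLED,   /* write accepted, nothing further to do */
    WRMSR_INJECT_GP  /* unsupported write: raise #GP in the guest */
};

/*
 * Hypothetical handler for a guest WRMSR exit.
 * Assumption: none of the "misc features" are emulated, so a write of 0
 * ("disable everything") changes nothing and can be treated as a NOP.
 */
static enum wrmsr_result
sketch_handle_wrmsr(uint32_t msr, uint64_t val)
{
    switch (msr) {
    case MSR_MISC_FEATURE_ENABLES:
        if (val == 0)
            return (WRMSR_HANDLED);
        /* Trying to enable a feature we do not emulate is still rejected. */
        return (WRMSR_INJECT_GP);
    default:
        /* Every other MSR is out of scope for this sketch. */
        return (WRMSR_INJECT_GP);
    }
}
```

With this shape the guest's "disable all misc features" write at boot would no longer fault, while a guest that tries to turn a feature on would still learn that it is unsupported, which is the concern raised at 09:05:12.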