00:09:45 nice.. hopefully some illumos people can make it
13:44:03 hrm... this is a potentially interesting state on this machine
13:50:46 it appears to have deadlocked while starting up the cpus...
13:51:14 I think while issuing the cross call to start a cpu
15:28:27 also, ::cpustack on a particular CPU is failing with EINVAL which seems odd
16:40:15 The places this ipv6 fastpath/smartos issue has taken me are fascinating
16:40:16 https://src.illumos.org/source/xref/illumos-gate/usr/src/lib/libdlpi/common/libdlpi.c?r=bbf21555&mo=12251&fi=437#453-459
16:40:58 (that's not the problem, just gave me a laugh)
16:45:57 Given that token ring lost, and we have no way to test it even if it didn't, I imagine we could remove any vestiges of support. I don't even know how TR works or if we even have drivers; and linux pulled their support for it in 2012.
16:46:25 the key thing is to make sure it doesn't fall off the end of the wire :)
16:46:41 it being the token
16:48:34 in theory it's supposed to behave better in that you don't really get collisions with token ring, but I believe in the end between switches becoming a thing (vs. hubs) and moar bandwidth, those advantages lost out vs. the complexity (plus wasn't it an IBM thing meaning they charged more for it?)
16:52:05 From what wikipedia says it lost to fast/gig ethernet.
16:52:06 https://en.wikipedia.org/wiki/Token_Ring#1985_IBM_launch
16:52:29 I'd say that's squarely in the dead column.
16:56:15 jbk: some aspects of various token rings had some advantage over original CSMA/CD broadcast 10Mb/s ethernet. But ethernet switching (now ubiquitous) firmly beats relaying packets through half the nodes on the network on average..
16:58:36 (there were many token rings. Proteon had one. Apollo had another (with 4K+headers with header/data split which made zero-copy page-flipping efficient). IBM's was big for a while. FDDI finally got clobbered by gig ethernet.)
17:16:41 ahh fddi.. i remember briefly dealing with that back in the early 00's
17:16:54 sprint was using it for their backup network at the time
17:20:42 so the latest bits on this HPE server are making me wonder if we're deadlocking the APICs somehow...
17:49:32 i still feel like i'm fumbling a bit in the dark, but at least now i think i'm maybe seeing the outlines of some things :)
17:51:10 * nomad hands jbk an Emulex trade-show swag flashlight
17:51:35 i'd rather have a patch that fixes this :)
18:36:41 does ::switch in kmdb take the cpuid, or the address of a cpu_t ?
19:13:57 I think the addr. ::help text seems to indicate that.
19:14:14 `::walk cpu` *should* get you appropriate addresses.
19:15:02 Ahhh so does `::cpuinfo` , @jbk
19:15:12 yeah, didn't help.. still get Errno 22 trying to get at those CPUs' registers...
19:15:38 Ouch ouch ouch.
19:16:07 I don't suppose use of `::stacks` can tell you if something is in a CPU-initialization function?
19:16:36 i've been trying to gather any data that seems potentially interesting in the ticket (not sure if you've seen it), though my unfamiliarity with the interrupt stuff at this low level means I'm not always sure what is or is not interesting
19:17:59 i see one cpu ONPROC stuck in disp_lock_exit
19:18:17 "stuck" how?
19:18:43 disp locks could cause you trouble
19:19:17 it's been sitting there for 1+ hour
19:19:23 in a call to lock_clear_splx
19:20:05 at least doing a ::findstack on the value of that CPU's cpu_thread value
19:20:31 (which I don't know if that's interesting, or if there might be something else more useful)
19:20:36 this is from kmdb
19:21:38 one thing I'm wondering is that writing to an apic register in apic mode is apparently a serializing instruction, in x2apic mode it is explicitly not
19:22:01 and so now have we maybe tacitly been relying on the apic behavior and have just gotten lucky?
19:23:13 i mean, people have been using x2apic mode for a while, so i kinda discount that... but also given this is a dual CPU (64 core/ea) system... maaaaaaaaybe (a very big maybe) we haven't done much with x2apics and multi-cpus and that could explain why it's worked in the past???
19:25:42 jbk: disp locks can really ruin your day here
19:26:41 (and all of this code is unmodified from upstream)
19:27:00 I can't remember how to tell post-mortem if you have taken a disp lock regularly, or while high
19:28:31 i guess as a bit of a shot in the dark i can try inserting mfence; lfence before any x2apic msr writes just as a test...
19:39:16 we do it for cross calls (at least atomic_or_ulong() is used for that... i'm assuming that's ok.. but it's also using the address on the stack)
19:39:25 (x2apic_send_ipi)
19:39:29 but not the other writes
19:44:16 richlowe: we've not made any changes here from stock illumos-gate...
19:52:14 and i've never dealt with disp locks before, so that'd be a whole new thing to dig into...
20:03:37 I have seen the ticket updates @jbk.
20:04:33 * danmcd wonders if X2APIC stuff like this is yet-another-reason why Oxide built their own HW ?
20:05:11 you would have to build way more of your own hardware than they did to avoid this
20:05:19 Oh damn.
20:08:04 yeah, it's built into the CPU itself IIUC
20:08:53 according to intel's docs, there's only a small number of differences between the apic and x2apic
20:09:14 the big one being using msrs instead of MMIO as well as supporting 32-bit apic ids
20:09:17 instead of 8-bit
20:09:59 (so there's one register access that's a bit different between the two because apic you need to shift & mask to get the id)
20:10:09 there's also one bit in a register that's no longer relevant
20:11:06 so it _shouldn't_ work that different from the apic code
20:18:33 (I guess I should say each core gets its own apic)
20:27:02 each core or each thread? (haven't looked at apics since before hyperthreading...)
20:35:47 hrm.. the intel docs aren't very clear on that.. it could be every thread
20:36:07 but either way, it's not a discrete component from the cpu package
21:01:58 I don't have access to the PCI or PCIe spec -- but in the apic manual, it describes the MSI message as having an 8-bit destination id... for apic ids < 256, these values match
21:02:34 for the apic values > 256, it seems to be truncating the value...(and maybe relying on the IO-APIC???)
21:05:36 this now sounds familiar to me in another context
21:05:41 and I think you might be absolutely screwed
21:05:46 gimme a sec
21:07:19 when I set ddidebug to 0x401, it hangs pretty early starting the CPUs, with the added lfence and sfence instructions to apix_regops.c, it now gets back to where it was at least with mlxcx
21:22:39 it appears to be using a 32-bit 'address' in the msi-x table for the device..
21:22:51 but i feel like i'm maybe missing a step in here somewhere
21:34:51 [illumos-gate] 15972 ZFS getattr could avoid many kidmap_cache_lookup_X calls -- Gordon Ross
21:34:51 [illumos-gate] 17677 SMB server on ZFS can avoid many kidmap_getXbyY calls -- Gordon Ross
21:34:51 [illumos-gate] 17678 zfs could avoid kidmap when there are no subgroups -- Matt Barden
21:35:12 could it be related to interrupt remapping/virtualization?
21:44:07 do we need to maybe limit interrupts for a device to cores within a single physical CPU
21:44:10 ?
21:49:55 as a side note, I'd really like us to prefer using C99-style initialization for the 'ops' structs (where we collect a bunch of function pointers that are meant to be overridden)
21:50:09 would make it a lot easier to see where they're getting assigned (or at least a lot less of a pain)
21:50:36 since we seem to override specific ones..
21:57:19 https://stackoverflow.com/questions/67028147/how-does-msi-x-triggers-interrupt-handlers-is-there-a-need-to-poll-the-chosen-m has text which suggests that there is something in the iommu which will map the 8-bit cpu id in MSI into a 32-bit x2apic processor id.
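[editor's sketch, not from the channel] The truncation described at 21:01:58 and 21:02:34 can be illustrated with the xAPIC-compatible MSI address layout, where the fixed 0xFEE prefix sits in bits 31:20 and the destination ID occupies the 8 bits at 19:12. The macro and function names below are hypothetical and are not taken from illumos-gate; this is only a rough illustration of why an APIC ID above 255 cannot survive the encoding.

```c
#include <stdint.h>

/* Hypothetical names; the layout is the xAPIC-compatible MSI address. */
#define	MSI_ADDR_BASE	0xFEE00000u	/* bits 31:20 are fixed at 0xFEE */
#define	MSI_DEST_SHIFT	12		/* destination ID lives in bits 19:12 */
#define	MSI_DEST_MASK	0xFFu		/* only 8 bits of destination fit */

/* Build the low 32 bits of an MSI address for the given APIC ID. */
static uint32_t
msi_addr_lo(uint32_t apic_id)
{
	return (MSI_ADDR_BASE | ((apic_id & MSI_DEST_MASK) << MSI_DEST_SHIFT));
}

/* Recover the destination ID that a receiver would see in that address. */
static uint32_t
msi_addr_dest(uint32_t addr)
{
	return ((addr >> MSI_DEST_SHIFT) & MSI_DEST_MASK);
}
```

With this layout, msi_addr_dest(msi_addr_lo(255)) still yields 255, but APIC ID 300 comes back as 44 (300 & 0xFF), i.e. the interrupt would be aimed at a different CPU unless something else (such as interrupt remapping) supplies the missing bits.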
21:58:29 rzezeski: I give you permission to sunset token ring :P
22:00:18 jclulow: thank you, it's on my list
22:02:49 sommerfeld: that matches what I'm trying to remember too, but the people involved aren't available at the moment
23:03:45 hrm.. the intel vt for directed i/o document does suggest that in x2apic mode, the upper 24 bits of the apic id go in the upper 40 bits of the address register...
23:04:19 worth a shot, though it does lead to 'is the support of this on the device tied to any specific capabilities?'
23:05:07 (i.e. is it possible to have devices that don't support all 64-bits of the msi address register?)
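[editor's sketch, not from the channel] One way to picture the idea floated at 23:03:45, under a loudly stated assumption: that the low 8 bits of the APIC ID stay in bits 19:12 of the low address word, and that ID bits 31:8 are placed in bits 31:8 of the upper address word (address bits 63:40). That placement is only a reading of the chat's description and has not been checked against the VT-d document or any device; the type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the two 32-bit halves of an MSI address. */
typedef struct msi_addr64 {
	uint32_t lo;	/* address bits 31:0 */
	uint32_t hi;	/* address bits 63:32 */
} msi_addr64_t;

/*
 * ASSUMPTION: ID bits 7:0 go in lo bits 19:12, ID bits 31:8 go in
 * hi bits 31:8. Not verified against the spec.
 */
static msi_addr64_t
msi_x2apic_encode(uint32_t apic_id)
{
	msi_addr64_t a;

	a.lo = 0xFEE00000u | ((apic_id & 0xFFu) << 12);
	a.hi = apic_id & 0xFFFFFF00u;
	return (a);
}

/* Reassemble the 32-bit destination ID from both halves. */
static uint32_t
msi_x2apic_decode(msi_addr64_t a)
{
	return ((a.hi & 0xFFFFFF00u) | ((a.lo >> 12) & 0xFFu));
}

/* Nonzero when the ID cannot be expressed without the upper word. */
static int
msi_needs_upper(uint32_t apic_id)
{
	return (apic_id > 0xFFu);
}
```

Under that assumption, APIC ID 300 encodes as lo = 0xFEE2C000 and hi = 0x100, and any ID for which msi_needs_upper() is true would require the device to implement the upper half of the address register, which is exactly the open question at 23:05:07.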