00:28:40 i don't suppose anyone has a system with the intel iommu enabled that could peek at some dev_info stuff w/ mdb?
00:29:52 i can't figure out what the actual intent was or what the correct state should be...
00:30:58 the system panics a few calls deep from immu_dvma_device_setup()
00:31:04 it's trying to sort out the domains
00:31:35 and when it hits the ISA bridge
00:32:10 it's trying to find out the iommu for the parent
00:32:29 which is the root complex
00:32:50 so immu_dmar_get_immu fails which panics
00:35:43 in a sense (being a bit loose here) the root complex is the iommu
00:36:21 so for devices, we stash a struct in the device's dev_info node that (among other things) points at the iommu to use
00:36:36 that seems to all be correct
00:37:16 but what's not clear is if the root complex should also be pointing at its iommu
01:31:13 richlowe: interesting, because I had 32 test names that did not have matching files and test runner said nothing
01:31:53 but I'll reverify that tomorrow to make sure I didn't miss something
02:30:18 Does anybody have experiences with Lenovo X280? I'd like to get a semi-modern ultraportable computer with NVMe + supported GPU + WLAN. According to the HCL Intel UHD620 should be the latest gen Intel GPU, so it means Intel Core CPU 7-8th gen.
02:30:18 X280 has an Intel 8th gen CPU, so that should be OK. Wifi can be replaced, Intel 6205 is the latest supported wlan chip, which exists in NGFF / M.2 format. So the chances are good for a well supported system, but no idea about audio.
02:40:12 I have a same-gen T480s at hand, I'll test OpenIndiana with it to get a better picture.
13:48:40 is redmine not responding for anyone else?
13:50:22 oh there it is... i wonder if the AI bots are out in force today...
14:02:36 in some time AI bots can help you to find bugs in your list like google helps you to find data on your home storages
14:03:55 i'd prefer if they'd just umm.. fornicate with themselves and stop DDoSing infrastructure that people want to use...
14:05:52 well, it was a joke (or sarcasm)
14:07:01 yeah, it kinda reminds me of an old joke about andersen consulting (now known as accenture)
14:07:54 indeed, the site is very slow
14:11:05 sometimes it's the software itself (redmine), but other times, some AI bot decides it needs to pummel the system into the ground... usually we end up blocking some IPs for a while until they go away...
14:12:24 +1 for white list
14:32:00 jbk: re the IOMMU and parents.... So as I understand it, each segment is connected to a "root port" on a root complex; an RC may have multiple root ports, but each segment/port is going to, generally speaking, have its own IOMMU.
14:34:02 What makes the "root complex" is kind of ad hoc and ill-defined; so for your ISA<->PCI bridge hanging directly off of the RC, I think that may not be a totally accurate picture of the configuration: presumably the bridge is really on some port on the RC. But it's honestly all a little bit fuzzy.
14:44:58 looking more at the code last night... what's happening is (I think the isa bridge might just be the 'first' victim).. it's trying to look up the domain for the device... it does that by traversing up the device tree to find the topmost device in the domain... the traversal code stops at the root nexus, but I _think_ it probably needs to stop if it reaches the root complex
14:47:11 since I think bridges show up with a different name in the device tree, so at least for testing, i'm just making the traversal stop when it reaches a device with a binding name of 'pciex_root_complex' that's a child of the root nexus (the latter check might be redundant, but that can be refined if it makes things work)
14:49:48 my concern was 'how was this working before'
14:52:05 but i guess this is all a very lightly trodden area in illumos
14:52:14 btw, are you still stuck in `MUTEX_SYNC` with AP startup?
14:52:38 that seemed to only happen when I would enable some extra ddidebug flags
14:52:44 so I think that might be a separate problem
14:53:31 i was setting ddidebug to 0x401 so I could see the msi-x assignments (among other things)
14:53:40 But you still have the problem of NMI only working one time?
14:53:44 and that seemed to trigger it pretty reliably
14:54:30 without that, the system panics early on because of the domain issue I just described, so I don't know yet :)
14:55:21 so hopefully I can maybe thread the needle and then swing back to dig into both of those
14:56:33 I have a hypothesis about the MUTEX_SYNC thing; still reading the code, though.
14:56:52 i should be able to easily recreate it though given that
14:58:31 so given the (current) oneshot restriction with kmdb, I could poke around and give you the info if that's something you'd be interested in...
15:09:44 I wouldn't mind getting some information, but don't want to waste your time if it's not convenient.
15:24:16 i've got a meeting w/ a customer right now, but i'm free the rest of the day if you have some time a bit later... I'd love to at least understand what's going on, even if the fix turned out to be messy or particularly hairy..
15:29:59 turns out i have an extra week w/ this box, so i've got a little more time than I thought to try to get things booting...
15:34:45 Yeah, I've got some time later. Happy to chat when you're free; also happy to do a live call if that's more efficient. I'd like to (at least) understand what's up with NMIs and AP startup.
15:52:41 ok
18:17:39 grr.. pet peeve of mine .. if you encounter an unexpected/unknown value, log what the value actually is...
18:18:45 btw, I have some time now to chat about this stuff if you're free.
18:19:01 sure
18:19:08 Cool
18:19:16 i could even set up a zoom if you want...
19:19:14 jclulow: Please ban https://www.illumos.org/users/14175 I just deleted two link SPAM posts pointing to some 3dChess site
19:19:49 josh is real busy rn
19:20:04 I think anyone with admin privs might be able to put them in a role that can't do things
19:20:11 but I don't recall who that is
19:20:19 I think danmcd and cross have done the role thing for people though
19:21:05 Huh what? I don't have sufficient privs on illumos.org's redmine unless someone gave it to me?
19:23:44 danmcd: you can't change people's role to "developer" and things?
19:24:07 danmcd: 'cos if you can, you could take the roles away that let spammers file/comment...
19:26:28 I cannot.
19:26:43 I can only see a user profile on the page toasterson cited above.
19:28:33 drat
19:39:22 I only know josh as admin
19:40:14 the role to do that part used to be more spread around I thought.
19:40:42 but I guess I was wrong about where it spread to, based on dan's feedback
21:17:39 Not I; I am definitely not on the cool kids list for things like that. :-)
21:25:43 jbk, Woodstock: I think what I missed in iommu conversations is that intel maybe worked ok because you (Hans) fixed a bunch of stuff about 8 years ago
21:40:00 richlowe: yes, i did
21:40:53 but i stopped working on the intel iommu stuff since it wasn't useful for bhyve and pci passthru, and it would actually conflict with it
22:02:04 Woodstock: do you by any chance recall if the hardware had more than one iommu?
22:04:08 this one has 20.. and the current thing I'm trying to sort out is it's calling add_avintr() for each iommu, but all but the first one fail... I think we're ignoring some failures somewhere in the mix, and end up causing a gpf
22:04:50 which I guess means I get to dig into the av interrupt stuff too... :)
22:08:56 err no the first 10 succeed, then the no available IPI messages start
22:14:10 hrm..
22:18:25 ahh.. so it's trying to allocate a vector for each iommu
22:18:33 and it's basically running out
22:19:15 i guess what I need to figure out is if it really needs to do that, or just use the same vector, and just add a handler for each iommu
22:20:45 or I guess allocate 1 vector, and add handlers for the iommus to that vector
22:24:49 or i guess an ipi vector to be specific since those seem to be handled separately by the code
23:09:06 ok.. i _think_ we probably just want one vector... though immu_intr_register() suggests that all of the interrupts for all the iommus will go to (os) cpu 0... which seems suboptimal (but functional).. i would think you'd want to spread them out across the available cpus
23:20:42 ... though I suppose you probably don't want to be doing much 'i/o paging' (for lack of a better term.. e.g. page faults on dma memory to create mappings)
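
For anyone reading along later, the "allocate 1 vector, and add handlers for the iommus to that vector" idea can be pictured with a small userland C model. This is only a sketch under assumptions: `shared_vector_t`, `shared_vector_add()`, `shared_vector_dispatch()`, `fake_iommu_t` and the table size are hypothetical names invented for illustration. None of them are illumos interfaces, and the model says nothing about how add_avintr() or immu_intr_register() actually behave.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in illumos. */
#define	MAX_HANDLERS	32

/* A handler returns nonzero if it claimed the interrupt. */
typedef int (*intr_handler_t)(void *arg);

typedef struct shared_vector {
	int		sv_vector;	/* the one vector that was allocated */
	size_t		sv_nhandlers;	/* handlers chained on that vector */
	intr_handler_t	sv_handler[MAX_HANDLERS];
	void		*sv_arg[MAX_HANDLERS];
} shared_vector_t;

static void
shared_vector_init(shared_vector_t *sv, int vector)
{
	sv->sv_vector = vector;
	sv->sv_nhandlers = 0;
}

/* Chain one more handler onto the vector; 0 on success, -1 if full. */
static int
shared_vector_add(shared_vector_t *sv, intr_handler_t fn, void *arg)
{
	assert(fn != NULL);
	if (sv->sv_nhandlers == MAX_HANDLERS)
		return (-1);
	sv->sv_handler[sv->sv_nhandlers] = fn;
	sv->sv_arg[sv->sv_nhandlers] = arg;
	sv->sv_nhandlers++;
	return (0);
}

/* The vector fired: poll every handler, count how many claimed it. */
static size_t
shared_vector_dispatch(const shared_vector_t *sv)
{
	size_t claimed = 0;

	for (size_t i = 0; i < sv->sv_nhandlers; i++) {
		if (sv->sv_handler[i](sv->sv_arg[i]) != 0)
			claimed++;
	}
	return (claimed);
}

/* Hypothetical per-IOMMU state: just a pending-fault flag and a counter. */
typedef struct fake_iommu {
	int	fi_fault_pending;
	int	fi_faults_handled;
} fake_iommu_t;

static int
fake_iommu_intr(void *arg)
{
	fake_iommu_t *fi = arg;

	if (!fi->fi_fault_pending)
		return (0);		/* not ours, let the next handler look */
	fi->fi_fault_pending = 0;
	fi->fi_faults_handled++;
	return (1);
}
```

In this model, registering 20 `fake_iommu_t` instances consumes one vector rather than 20, at the cost of every handler being polled on each interrupt. Whether that trade-off is acceptable on the real hardware is exactly the open question in the discussion above.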