00:09:14 fwiw, I got OmniOS booting OKish on linode (nano 1g vms, paravirt mode)
00:09:57 shutdown -i6 -y just halts but the watchdog from linode brings it back up
02:13:34 a bit of an into the weeds question, but is there typically anything wired up on an LPC bus where it'd make sense to add acc checking w/ fault management?
10:42:07 sjorge: runlevel 6 is reboot ;)
10:42:47 sjorge: or you meant that the expected reboot did not work and instead you got the kick from watchdog
11:34:29 tsoome: yes, I expect it to reboot but it doesn't
15:08:11 jbk: I wouldn't be doing much of anything to change the LPC bus, especially since it's gone on most new platforms.
15:08:47 jbk: Also, what mechanism in the protocol would you use? Just things that are checking for the ranges you send there or something else?
15:48:52 rmustacc: Not sure if you answered this already, if so my apologies. FM is offlining a NVME device, which is resulting in me not being able to get the Serial Number / Location of the device that is offline now. I saw in your IPD to cache those values I think.... is it possible to get that info today, or do we need additional work?
15:56:35 i mean for a device on the LPC bus -- some drivers will include the DDI_FLAGERR_ACC in ddi_device_acc_attr.devacc_attr_access and then will call ddi_fm_acc_err_get presumably after one or more calls to ddi_{get,put}XX()
15:57:47 so maybe stating it differently, for a device that sits on an LPC bus (or behind a PCI-LPC bridge) will ddi_fm_acc_err_get actually be able to do anything?
15:59:31 since the isa module itself I don't think has any FM support, presumably it needs to be involved for any children that would want to use ddi_fm_acc_err_get()
16:02:37 [illumos-gate] 16085 use modern libxml2 API -- Andy Fiddaman
16:59:47 jbk: So in this case you would expect that the nexus hierarchy would not set any fm capable flags in ddi_fm_init() and therefore the driver shouldn't actually be setting DDI_FLAGERR_ACC in anything or at least it won't actually happen.
16:59:58 But I'm not exactly sure.
17:00:20 The ddi_fm_init() request from children of the isa module should go to it, and it shouldn't be setting flags, I imagine.
17:00:45 Smithx10: I didn't really do anything about caching.
17:01:39 Sorry maybe I didn't understand this right
17:01:39 Providing interfaces that make it easy to snapshot information and then consume it when the device is no longer present. For example, the smbios -w or pcieadm save-cfgspace commands make it so we can capture data on a target system in a way that it can be sliced and diced on an entirely different system later.
17:02:00 There was an ipd about adding similar logic to nvmeadm.
17:02:05 ahhh ok
17:02:17 But in this case, the question would be does your system have a topo map that reflects the locations?
17:02:49 If so, I'd expect that the FMRI was there.
17:02:53 I've noticed diskinfo -P doesn't show locations and grepping for serial numbers in fmtopo doesn't result in finding the device
17:03:25 Then that means your systems don't have a topo map that has those mappings together.
17:03:41 And likely therefore fma doesn't have that either.
17:05:00 I think i recall that it was because that info was nested and I don't think we walked down to get that info.
17:05:11 Probably gonna just have the guy stand next to the array and run a scrub and find the non blinking light lol
17:07:06 Do you have another system with an identical layout?
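As a rough illustration of the ddi_fm_init() exchange above, here is a minimal Python model (not kernel code) of the idea that a child only ends up with access checking if its parent nexus advertises it. The DevNode class, the fm_init() helper, and the numeric flag values are assumptions made for this sketch; it simply masks the child's request with the parent's flags, which is a simplification of whatever ddi_fm_init(9F) actually does, and the discussion itself left the real behavior unconfirmed.

```python
from dataclasses import dataclass
from typing import Optional

# Flag names mirror the DDI FM capability flags; the numeric values are
# assumptions for this sketch, not copied from a header.
DDI_FM_NOT_CAPABLE = 0x0
DDI_FM_EREPORT_CAPABLE = 0x1
DDI_FM_ACCCHK_CAPABLE = 0x2
DDI_FM_DMACHK_CAPABLE = 0x4
DDI_FM_ERRCB_CAPABLE = 0x8
ALL_CAPS = (DDI_FM_EREPORT_CAPABLE | DDI_FM_ACCCHK_CAPABLE |
            DDI_FM_DMACHK_CAPABLE | DDI_FM_ERRCB_CAPABLE)


@dataclass
class DevNode:
    """Hypothetical stand-in for a device node in the nexus hierarchy."""
    name: str
    parent: Optional["DevNode"] = None
    fm_caps: int = DDI_FM_NOT_CAPABLE


def fm_init(node: DevNode, requested: int) -> int:
    """Simplified negotiation: grant only what the parent also advertises."""
    parent_caps = node.parent.fm_caps if node.parent else requested
    node.fm_caps = requested & parent_caps
    return node.fm_caps


def acc_check_enabled(node: DevNode) -> bool:
    """True if access-handle error checking survived the negotiation."""
    return bool(node.fm_caps & DDI_FM_ACCCHK_CAPABLE)


root = DevNode("rootnex", fm_caps=ALL_CAPS)

# A nexus with no FM support never negotiates, so it stays NOT_CAPABLE.
isa = DevNode("isa", parent=root)
lpc_child = DevNode("lpc-device", parent=isa)
fm_init(lpc_child, DDI_FM_EREPORT_CAPABLE | DDI_FM_ACCCHK_CAPABLE)

# A nexus that does negotiate passes the capability down to its children.
pci = DevNode("pci-nexus", parent=root)
fm_init(pci, ALL_CAPS)
pci_child = DevNode("pci-device", parent=pci)
fm_init(pci_child, DDI_FM_EREPORT_CAPABLE | DDI_FM_ACCCHK_CAPABLE)

print(lpc_child.name, acc_check_enabled(lpc_child))
print(pci_child.name, acc_check_enabled(pci_child))
```

In this model the child under isa comes back without the access-check bit, which matches the expectation voiced above that setting DDI_FLAGERR_ACC there would not have any effect.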
17:07:40 we have a few of the same SKUs yea
17:07:58 so far all the different SKUs we run have no luck getting info
17:14:41 OK, well you should be able to map that bridge to a slot.
17:14:51 Who makes the system?
17:34:28 HP, and SM
17:36:19 https://gist.github.com/Smithx10/141fe77316876782b17c4f22dcd632b9
18:15:57 OK, so what we should be able to do is, based on the /devices path that is retired, figure out what slot that is.
18:16:03 Because it'll correspond to a particular bridge.
18:22:44 On the 3 servers, what should I find || grep for in those paths to find it?
19:14:26 [illumos-gate] 16098 properly escape backslashes per mdoc -- Robert Mustacchi
19:15:19 Smithx10: I guess you probably want to start with finding the /devices path and go from there.
19:19:01 @rmustacc like this ? https://gist.github.com/Smithx10/d5c5ca1b76a5d68fe0f98c03f3fe3818
19:19:21 the nvme device doesn't show up in /dev/rdsk since fm took it out
19:19:29 Right, but you'll have the parent.
19:19:43 And you should have the retired note in prtconf and realted.
19:19:45 *related
19:20:38 I get this from fmadm faulty https://gist.github.com/Smithx10/0b4c4572fdb05978d37d16376e4861cc
19:23:22 @rmustacc https://gist.github.com/Smithx10/b45e5275fa2c4d958f61c13ce4d2b015
19:24:33 Sorry, I probably can't quite hand hold you through this right now. But the basic thing is each bridge has a slot and the same bridge should be in the same slot on both systems.
19:24:36 So you can get it that way.
19:25:21 ok, np
19:25:34 thanks for the point in the direction
19:57:42 Booted into linux and was able to do https://gist.github.com/Smithx10/cb34d4256b3d1653be7772ed25a90f3d
20:04:35 i wonder where it's getting 'physical slot: 11' from
20:05:23 jbk: I believe that is the PCI slot number, not sure if it maps to the physical drive bay number
20:06:03 It doesn't per se, but it will be consistent.
20:06:57 the parent device /pci@c1,0/pci1022,1483@3,4 appears to be a bridge
20:07:17 When the DC hand gets there I'll power this box down and he can go through the 6 drives that are off and find the match
20:15:23 i know i've asked about it, but at some point I need to dig in and see if it's possible to override those labels based on the HW path so you can get something more useful than 'MB' if there's an actual label on the MB for the device (it does assume a system firmware update won't muck up the path, but that's at least something that can be checked and dealt with)
20:20:52 Depends on the enumerator.
20:21:00 But with a map you should be able to.
20:25:15 speaking of that, should dimm enumeration include the serial#, manuf info in the fm topo? (I see on mine it just shows the rank size, etc)
20:25:47 (wondering if something's broken and i need to dig in or not)
20:25:57 smbios shows it, so the info exists
20:28:05 one thing i was going to do for work was a small program that walks the fm topology to produce a HW inventory (sort of like fmtopo, but a bit more focused instead of the giant dump of everything)
20:28:59 though supplement with some additional info (e.g. we use lots of aggrs, so it'd be nice to include the aggr config as well as IP config of the interfaces)
20:29:49 lol, this stuff seems rather complicated....
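The "each bridge has a slot" approach from earlier can be sketched roughly as below: strip the last component off the retired /devices path to get the parent bridge, then look that bridge up in a table recorded once on an identical, healthy system. The bridge path is the one quoted above; the child component (nvme@0), the slot value in the table, and the helper names are hypothetical and only there to make the sketch runnable. It also assumes the path is a plain node path with no minor-name suffix.

```python
import posixpath
from typing import Dict, Optional


def parent_bridge(devices_path: str) -> str:
    """Return the parent node of a /devices path by dropping the last component."""
    path = devices_path.rstrip("/")
    parent = posixpath.dirname(path)
    if parent in ("", "/"):
        raise ValueError(f"{devices_path!r} has no parent below the root")
    return parent


# Hypothetical table, filled in by hand from a system with the same layout.
# The pairing of this bridge with slot 11 is an assumption for illustration.
BRIDGE_TO_SLOT: Dict[str, int] = {
    "/pci@c1,0/pci1022,1483@3,4": 11,
}


def slot_for(devices_path: str,
             table: Dict[str, int] = BRIDGE_TO_SLOT) -> Optional[int]:
    """Look up the slot for a retired device via its parent bridge, if known."""
    return table.get(parent_bridge(devices_path))


# Hypothetical retired device node under the bridge mentioned above.
RETIRED = "/pci@c1,0/pci1022,1483@3,4/nvme@0"

print(parent_bridge(RETIRED))
print(slot_for(RETIRED))
```

As noted above, the slot number will not necessarily equal the drive bay number, but it should be consistent across systems of the same SKU, so the table only has to be built once per layout.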
20:30:32 you'd think mapping the drive bay # to the disk inside it wouldn't be so brutal
20:31:24 you'd think
20:31:42 but then if it's a SCSI disk, usually there's a completely separate device you have to talk to
20:31:53 and then map back what it reports back to a disk
20:32:03 yea, its pretty brutal lol
20:32:42 and the standard for that is roughly 'we'll just kinda document what everyone does, but no guarantee some manufacturer won't do something different that'll break things'
20:32:48 one day i'll type diskinfo -P and it will just work :)
20:33:37 also from SKU to SKU nvme information will be in my oob or not
20:33:38 that also goes for the indicator lights
20:34:31 then for SATA it's different
20:34:40 and I haven't looked, but I'm guessing NVMe is also different
20:35:24 SATA at least, the indicator lights (if supported) are (more or less) associated with the disk (you talk to the same 'thing' to turn the lights on/off as you do to read/write to the disk, unlike scsi)
20:35:44 but AFAIK, there's nothing in there to enumerate a disk to some physical location
20:35:56 you have to basically just manually say 'port 0 corresponds to slot X', etc.
20:36:30 yea, I tried checking the Manual for that SM and it has nothing documented "_"
20:37:10 Next time a SKU comes in, we probably gonna make this a part of the process during first install
20:37:54 unfortunately, there's no easy way to add a topo map after the fact in smartos -- it more or less needs to be added to the repo
21:46:17 jbk: So the problem is that the memory controller doesn't know that information or provide you usually with a way to get it via i2c. So you need a way to map things together to get both the memory controller and smbios info.
21:47:54 is that something that could be done via a topo map? the one i'm using does reference the smbios enumerator, but no idea really if i'm specifying things correctly
21:52:37 You'd need additional code to glue it, but if you said you knew for certain that for example the location tag is guaranteed to map to a given location in the memory controller, probably.
21:52:56 A lot of these gotchas are why I've tried to experiment with designing the tree differently at Oxide, the PCIe tree we've prototyped, etc.
22:01:06 i guess i need to see what info i can get from the mc
22:01:27 the smbios info at least has enough to map a DIMM back to both a physical location and physical address
22:06:18 oh hrm.. this might be part of the problem.. for some reason.. x86gentopo_legacy is getting set to 1.. guess I need to figure out why
22:24:14 I'd expect that for everything?
22:24:24 Unless you had some random specific sun platform.
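The "glue" mentioned above for DIMMs might look something like this in spirit: join the identity data smbios has (serial, manufacturer) with what the memory controller reports (rank size and so on), but only for locations where a known-good mapping exists. Everything here is hypothetical: the record layouts, the location tags, the (channel, dimm) keys, and the sample values are made up for illustration and do not reflect any real enumerator or topo map format.

```python
from typing import Dict, List, Tuple

McSlot = Tuple[int, int]  # (channel, dimm) as seen by the memory controller

# Hypothetical smbios-side records, keyed by location tag.
SMBIOS_DIMMS: Dict[str, dict] = {
    "DIMM A1": {"serial": "SN-EXAMPLE-1", "manufacturer": "ExampleCo"},
    "DIMM B1": {"serial": "SN-EXAMPLE-2", "manufacturer": "ExampleCo"},
    "DIMM C1": {"serial": "SN-EXAMPLE-3", "manufacturer": "ExampleCo"},
}

# Hypothetical memory-controller-side records.
MC_DIMMS: Dict[McSlot, dict] = {
    (0, 0): {"rank_size_gib": 16, "ranks": 2},
    (1, 0): {"rank_size_gib": 16, "ranks": 2},
}

# Hand-maintained mapping; only valid if the location tag is guaranteed
# to correspond to that memory controller position on this platform.
LOCATION_TO_MC: Dict[str, McSlot] = {
    "DIMM A1": (0, 0),
    "DIMM B1": (1, 0),
}


def merge_dimm_info(smbios: Dict[str, dict],
                    mc: Dict[McSlot, dict],
                    loc_map: Dict[str, McSlot]) -> List[dict]:
    """Combine smbios and memory controller data for mapped DIMMs only."""
    merged = []
    for location, ident in sorted(smbios.items()):
        slot = loc_map.get(location)
        if slot is None or slot not in mc:
            continue  # no trustworthy mapping, so do not guess
        channel, dimm = slot
        merged.append({"location": location, "channel": channel,
                       "dimm": dimm, **ident, **mc[slot]})
    return merged


inventory = merge_dimm_info(SMBIOS_DIMMS, MC_DIMMS, LOCATION_TO_MC)
for entry in inventory:
    print(entry)
```

Skipping unmapped DIMMs rather than guessing keeps the output honest on platforms where the location tag cannot be trusted to line up with the memory controller's view.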