17:10:07 [illumos-gate] 16241 lmrc shouldn't panic on unknown MFI completion status -- Hans Rosenfeld
18:26:45 [illumos-gate] 16265 sh: parameter defaults to 'int' -- Toomas Soome
18:29:59 [illumos-gate] 16255 lpadmin: type of 'sig' defaults to 'int' -- Toomas Soome
18:35:00 [illumos-gate] 16262 pmadm: 'oflag' defaults to 'int' -- Toomas Soome
20:19:01 hmm.. it might be nice to be able to control the debug/log level of syseventd from smf (right now you have to either modify the method script or run it by hand)
20:42:45 it appears for a multi-pathed disk, if all of the paths of a disk go away, we never actually generate an EC_DEV_REMOVE event, instead just path offline events (that don't appear to be consumed by anything in illumos-gate)... anyone happen to know the rationale for that?
21:03:43 i'm looking at a scenario with a pool w/ redundancy (mirrors or raidz[2] -- shouldn't matter as long as there's something) where a disk is pulled in error and re-inserted
21:04:24 for a multi-path disk, it appears the dev_remove and dev_add sysevents are never generated.. just the LU online/offline ones
21:05:36 it'd be nice (perhaps if autoreplace is enabled on the pool) if the vdev were automatically onlined (assuming the state was removed and not faulted)
21:12:47 jbk: I've observed an alternate problem -- single-pathed disk that's a member of a ZFS mirror goes away temporarily, and it goes into state REMOVED rather than OFFLINE, which makes resilver take a lot longer (no DTL).
21:13:09 So the question is how did it go away?
21:14:37 in my scenario, it was physically pulled (imagine this was done by mistake because an enclosure's indicator lights aren't the most well thought out)
21:16:08 vs. a disk failing
21:16:43 sommerfeld: IIRC, i think that's the ldi callback in vdev_disk.c that's making that specific state change
21:17:04 in my case I haven't isolated what actually happened in detail - but generally the disk came back after a system power cycle.
21:18:22 jbk: Yeah, was thinking more so about sommerfeld's case.
21:18:57 So I think the thing we'd want to figure out there, sommerfeld, is how to distinguish presence from electrical connectivity.
21:19:05 Which depending on the HBA / device may not be possible.
21:20:52 in my case SATA hardware connected to an LSI SAS controller.
21:22:17 but don't let my sketchy hobbyist hardware derail jbk's issue
21:25:16 well, the first question is, do we need to distinguish all paths offline versus device removed cases?
21:27:36 In general it is helpful to distinguish the two where you have out of band presence.
21:27:56 For example, consider a single-path connection to a PCIe device. The device being offline can happen due to power control, link state, etc.
21:28:06 However, removed is only triggered when you actually have physically removed it.
21:29:26 yes, well, that's assuming that you can either detect the removal event or removal is initiated via command - that is, you let the system know that dis device will be removed.
21:29:45 s/dis/this/
21:29:47 Sure, it's interface specific ultimately.
21:34:43 sommerfeld: if you're using smartos, there is a sysevent utility (it might be worth upstreaming that) that can be handy to monitor what events are firing -- that might help narrow down what's driving the state changes (though obviously won't tell you why, but might at least make it easier to zero in)
21:36:18 my inclination with this is (at least for my purposes) to treat the LU online event like an EC_DEV_ADD event...
21:36:26 like this: https://pastebin.com/hsKr7WEW ? :)
21:39:27 All sysevents that fired are recorded in the fmdump info log, fwiw.
21:39:34 jbk: so this is something that has only happened maybe once or twice a year (and not recently, so I'm probably jinxing myself..)
21:39:43 So no need to actively monitor them.
21:46:48 rmustacc: i didn't know that, but now that brings up another question... running syseventd -d, it was reporting ESC_SUN_MP_LU_{ADD,REMOVE} events, but those aren't in the fmd info log
21:47:12 Dunno, just most things I looked for and have used are there.
21:47:16 You'll have to dig, sorry.
21:47:27 but I see EC_dev_remove.disk and EC_dev_add.disk events in the fmd info log, but did not see those from syseventd
21:47:42 yeah that's what i expected :)
21:47:45 (to dig)
22:21:56 ... i do wonder if maybe the multipath bits are causing those to be somehow suppressed in a way that doesn't impact fmd
22:54:52 side note: usr/src/uts/common/os/log_sysevent.c looks like it could probably use id_space_t instead of vmem_alloc() and the like directly
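
For reference, the policy floated at 21:05:36 and 21:36:18 (treat an LU online event like a device-add event, and online the vdev only if it was REMOVED rather than FAULTED, perhaps gated on autoreplace) can be sketched as a small decision function. This is only an illustrative model, not code from illumos-gate or from the pastebin above: the `Vdev` class, the `should_auto_online` function, and the event label strings are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from enum import Enum


class VdevState(Enum):
    ONLINE = "ONLINE"
    OFFLINE = "OFFLINE"
    REMOVED = "REMOVED"
    FAULTED = "FAULTED"


@dataclass
class Vdev:
    # Hypothetical stand-in for a pool member; not a real ZFS structure.
    name: str
    state: VdevState
    pool_autoreplace: bool


# Placeholder labels for the two event kinds discussed in the log.
# These are NOT the real sysevent class/subclass strings.
DEV_ADD = "dev_add"
LU_ONLINE = "lu_online"


def should_auto_online(event: str, vdev: Vdev) -> bool:
    """Decide whether an incoming event should online the vdev."""
    # Assumption: an LU online event is handled exactly like a device add.
    if event not in (DEV_ADD, LU_ONLINE):
        return False
    # Only a vdev that was removed (not faulted) is a candidate.
    if vdev.state is not VdevState.REMOVED:
        return False
    # Assumption: gate the behaviour on the pool's autoreplace setting.
    return vdev.pool_autoreplace
```

With this model a REMOVED vdev in a pool with autoreplace enabled would be onlined on either event kind, while a FAULTED vdev would be left alone.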