05:02:08 hey there, omnios folks -- i've got an installation that has been hanging on me, and I'm trying to troubleshoot it. It is omnios-r151052-dbe4644ba92, running under ESXi-7.0U3q. It has been hanging every couple weeks, and I've made the changes necessary to get it to drop into kernel debugger via nmi so I can get a proper dump.
05:03:02 I think it might be hardware related - there's a LSI hba passed through: 0b:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3224 PCI-Express Fusion-MPT SAS-3 (rev 01)
05:03:45 but i was wondering if anyone would be able to take a look at the dump file (or whatever you would like me to export from it) and maybe point me in the right direction? thanks!
05:06:02 (somewhat related, it'd be great to be able to read the temperature off of the hba - I've been looking around for a way to do this in OmniOS and cannot find anything.
05:06:21 I did find that someone wrote some FreeBSD code to do it though: https://www.truenas.com/community/threads/broadcom-avago-lsi-hba-card-monitoring-ioc-temperature.94933/ -- but I'm not smart enough to port it over :) )
05:11:00 A starting point if you don't want to make the dump available would be ::stacks -m mpt_sas
05:11:47 For the sensor, I'd need to track down some slightly better docs, but we could probably add a ksensor.
05:11:56 Which would expose it in topo and related.
05:13:55 I'd be happy to make the dump available -- but it is 1.5G, is there a way to pare it down?
05:15:25 Maybe throw it through zstd compression?
05:27:02 you know I think the dump didn't complete properly. dumpadm -e says I need ~11.5G for a complete dump and rpool/dump doesn't have anywhere near that amount of space
05:27:27 You have a file with the name vmdump.X where X is a number?
05:27:54 The dump estimation logic looks at active memory to make that call, so it changes.
05:28:25 If you have that file you can run 'savecore -f /path/to/vmdump.X .' which should save the dump in its different parts that can easily be used by mdb.
05:33:01 sure - it runs until 99% and then: 0:22 99% done savecore: stream tag 518 not in range 1..1
05:33:01 savecore: bad summary magic 2ccdaac4
17:38:09 (i've expanded rpool and increased the dump size, so hopefully next time there's a complete dump?!)
17:39:01 It's also worth checking what content you have selected for the dump. You probably want to see "Dump content: kernel pages" in the output of `dumpadm`
17:39:23 At least initially it is probably not worth capturing more than that and it will just inflate the dump size
17:39:38 yup, it is set for just "kernel pages"
17:40:51 sorry, when I said "dump size" i should have been more accurate and said that i increased the size of rpool/dump using zfs set
18:15:40 dumpadm -e will give you size estimate
19:21:00 That estimate is not great, though, especially if you run it early in boot. It'll grow depending on what your machine is doing.
19:28:03 yep, for sure - when I ran it last night it told me I needed a bit more than 11G, when I ran it after boot it was saying 1.9G. I made my rpool/dump 12G, hoping that's enough for the next time it hangs on me.
19:29:48 btw, i briefly looked at the ioctls that illumos has for mpt_sas vs the ones in the FreeBSD tree -- it might be that getting the temperature off of LSI cards might require adding some ioctls to the driver?
19:30:33 (here's freebsd for easy reference: https://github.com/freebsd/freebsd-src/blob/releng/14.2/sys/dev/mps/mps_ioctl.h)
20:08:12 swinokur: We would just phrase it as a ksensor rather than ioctls that same way.
20:08:37 The same way we expose all the other temp sensors and related bits from ASICs.
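
For reference, the dump steps discussed above could be collected into a small shell sketch along these lines. The commands are the ones named in the chat (`dumpadm`, `dumpadm -e`, `zfs`, `savecore -f`, `mdb` with `::stacks -m mpt_sas`). The function names, the 2x headroom factor, and the assumption that `savecore` writes `unix.N` and `vmcore.N` into the current directory are illustrative, not something anyone in the channel prescribed.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here runs until a function is called.

# Return 0 if a dump zvol of $2 bytes leaves 2x headroom over an estimate of
# $1 bytes. The 2x factor is an arbitrary assumption: the chat notes that the
# `dumpadm -e` estimate grew from about 1.9G after boot to over 11G later.
dump_fits() {
    estimate=$1
    volsize=$2
    [ $((estimate * 2)) -le "$volsize" ]
}

# Print the dump configuration (look for "Dump content: kernel pages"),
# the current size estimate, and the dump zvol size in bytes.
show_dump_config() {
    dumpadm
    dumpadm -e
    zfs get -Hp -o value volsize rpool/dump
}

# Expand a compressed vmdump file and print the mpt_sas stacks.
# $1 = path to vmdump.X
# $2 = numeric suffix of the unix.N / vmcore.N files savecore wrote
#      (check the directory listing; assuming it equals X may be wrong).
inspect_dump() {
    savecore -f "$1" . &&
        echo '::stacks -m mpt_sas' | mdb "unix.$2" "vmcore.$2"
}
```

Resizing the dump device as described at 17:40:51 would then be something like `zfs set volsize=12G rpool/dump`, followed by `show_dump_config` to confirm the new size against the estimate.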