#illumos

13:01

aquamo4k

jbk and others: your talk about talking to mainframes reminds me, back in the early '90s we built a system that used a faux printer queue to get data from SunOS to mainframes :-)
13:02

aquamo4k

lpd (or LPRng) was the translator :-)
13:29

jbk

i had another 'fun' mainframe experience..
13:30

jbk

they were doing a data migration which involved downloading whatever the mainframe equivalent of exported tables was, transferring them over a wan link and importing them into the new database (running on Solaris 10)
13:31

jbk

the group had originally decided that they should just mount that dataset over the WAN with NFS4
13:31

jbk

72 hours into their 48 hour transfer window (rehearsal) they realized they had a problem
13:32

jbk

also.. for some reason, the mainframe 'spontaneously' stopped accepting nfs4 mounts and they had to resort to nfs3
13:33

jbk

because at the time ssh still had fixed-size windows which limited its throughput, I had to concoct this rube goldbergesque setup
13:34

jbk

where another system in the same building as the mainframe would instead use ftp, pipe it through gzip (default level), then pipe it through a connection across the wan to the destination system that'd uncompress and save the file
13:35

jbk

it took 4 hours
13:35

jbk

when the group was told 'ok, you can import everything now' they thought we were joking
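
A minimal sketch of the compress-near-the-source, decompress-at-the-destination idea jbk describes above. The real setup used ftp, gzip and a network connection; this only models the streaming gzip stage in Python. The function names and chunk size are hypothetical, and level 6 is assumed because it is the gzip command's default.

```python
import zlib

CHUNK = 64 * 1024  # hypothetical chunk size for the illustration


def read_chunks(data, size=CHUNK):
    """Stand-in for the ftp download: yield the source data in pieces."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def compress_stream(chunks, level=6):
    """Compress a stream chunk by chunk, producing gzip-framed output."""
    # wbits=31 selects the gzip container (header and trailer).
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:
            yield out
    yield comp.flush()


def decompress_stream(chunks):
    """Destination side: inflate the gzip stream as it arrives."""
    decomp = zlib.decompressobj(31)
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail


if __name__ == "__main__":
    data = b"id,name,balance\n" * 100_000
    wire = list(compress_stream(read_chunks(data)))
    restored = b"".join(decompress_stream(wire))
    print(len(data), sum(len(c) for c in wire), restored == data)
```

Only the compressed chunks cross the slow link, which is where repetitive table exports gain the most.
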
17:01

jbk

hrm.. there's no way to display all longest prefix matching routes for a given IP, is there?
17:01

jbk

route get IP just picks one (when there's more than one)
17:05

sommerfeld

jbk: yeah, I think I've noticed that.
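
A sketch of what "all longest prefix matching routes" means, using a hypothetical in-memory route table of (destination, gateway) pairs. This is not how route(8) or the kernel looks routes up; it only illustrates returning every route that ties for the longest matching prefix instead of picking one.

```python
import ipaddress


def longest_prefix_matches(routes, addr):
    """Return every (destination, gateway) that ties for the longest match."""
    ip = ipaddress.ip_address(addr)
    matching = []
    for dest, gateway in routes:
        net = ipaddress.ip_network(dest)
        # Skip routes from the other address family and non-matching prefixes.
        if net.version == ip.version and ip in net:
            matching.append((net, gateway))
    if not matching:
        return []
    best = max(net.prefixlen for net, _ in matching)
    return [(str(net), gw) for net, gw in matching if net.prefixlen == best]


if __name__ == "__main__":
    # Hypothetical table with two equally specific routes for 192.168.1.0/24.
    table = [
        ("0.0.0.0/0", "10.0.0.1"),
        ("192.168.0.0/16", "10.0.0.4"),
        ("192.168.1.0/24", "10.0.0.2"),
        ("192.168.1.0/24", "10.0.0.3"),
    ]
    for route in longest_prefix_matches(table, "192.168.1.7"):
        print(route)
```
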
20:53

rzezeski

I had a zfs deadman panic today on an otherwise idle system running DEBUG bits of my latest cxgbe work. This is my first time ever seeing such a panic. zpool scrub showed no issues. Looking at the stack I see that intrd had an ioctl in progress and was sitting in apix_wait_till_seen(). I'm wondering if IRM and/or DEBUG bits can induce this failure mode?
20:56

richlowe

I ran debug bits for a while, and things were not fun, but they did not crash
20:56

richlowe

rzezeski: apix was on the stack above ZFS, like we'd pinned an i/o thread for interrupt?
20:57

richlowe

or at the time it died interrupt management things were happening (but in another thread)
20:57

richlowe

(I can't tell which you mean)
20:58

rzezeski

gist.github.com/rzezeski/175396b3d8423dff66db9b2bfec28515
21:01

jbk

i'm actually impressed intrd was doing something...
21:03

richlowe

oh, it's actually r'ing ints
21:03

richlowe

I have never seen that
21:03

richlowe

rzezeski: I guess I would look at IRM first
21:04

richlowe

though the combination of IRM|DEBUG can only be more uncommon
21:07

rzezeski

Yea, I have a lot of stuff on my plate, so I don't necessarily have time to learn how all this works. That's why I thought I'd get a spot check in here and see if IRM could put one in this position given that I have no reason to suspect an actual hardware/software issue on this host. I guess if I could figure out what interrupt it was trying to remove/remap, and if that interrupt was related to one of my nvme devices...
21:08

richlowe

I think that's not that complicated, but someone like rm would be the one to try as far as remembering the x86 interrupt support well
21:08

richlowe

maybe not well enough for intrd though...
21:08

rzezeski

And I know DEBUG loves to throw random curveballs like trying to unload modules randomly, I figured this could be another thing like that. But I guess it could also point to a legit bug that's just a lot harder to hit in non-debug
21:08

richlowe

if I were wanting to make forward progress, I'd disable intrd etc, and keep DEBUG
21:09

rzezeski

I don't remember ever enabling it in the first place. I honestly know almost nothing about it.
21:10

rzezeski

okay, yea looks like it's enabled on my stock omnios bits
21:17

richlowe

yeah, it's enabled by default, but usually not particularly active
21:17

richlowe

or at all active
21:17

rzezeski

richlowe: I have a way of activating things, the computer gods hate me
21:18

rzezeski

fucking yak farm over here
21:24

rzezeski

okay, mapped the apix_vector_t argument back to the device info, looks like it's for ixgbe, which makes sense since it participates in IRM
21:30

richlowe

rmustacc: do you remember why `CTASSERT` doesn't stringify `x`?
21:32

rzezeski

So with this deadman panic there should be an outstanding I/O, right? I wonder if that would give me any clues?
21:33

jbk

yeah
21:33

jbk

well mostly
21:33

richlowe

Yes there is outstanding I/O, maybe it would give you clues
21:33

jbk

technically, it could get stuck in the zfs i/o scheduler without ever being issued to the disk
21:34

jbk

though it'd be unlikely to happen on an idle system
21:34

richlowe

I think if we are in the mood to remap interrupts the system isn't, to our view, idle
21:34

jbk

since the deadman basically looks at when it gets sent to the scheduler and then when it completes
21:35

rzezeski

richlowe: of the 16 CPUs, 14 were idle, two were in sched
21:36

jbk

(and if a disk is busy or slow enough, you can get into a situation where zios to higher LBAs get starved because zios to lower numbered LBAs keep cutting in front of them)
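
A toy model of the starvation pattern jbk describes, combined with a deadman-style age check (time since the request was queued, not since it reached the disk). This is not the ZFS I/O scheduler; the queue, tick counts and LBA values are all made up for illustration.

```python
import heapq


def simulate(ticks, deadline):
    """Serve the lowest LBA first while low-LBA requests keep arriving.

    Returns (lba, age) for every request still queued after `ticks`
    whose age exceeds `deadline`.
    """
    queue = []  # min-heap of (lba, enqueue_tick)
    heapq.heappush(queue, (900_000, 0))  # one high-LBA request at tick 0
    for now in range(ticks):
        # A new low-LBA request arrives every tick...
        heapq.heappush(queue, (now % 100, now))
        # ...and the slow "disk" completes exactly one request per tick.
        heapq.heappop(queue)
    return [
        (lba, ticks - queued)
        for lba, queued in queue
        if ticks - queued > deadline
    ]


if __name__ == "__main__":
    # The high-LBA request never reaches the front of the queue.
    print(simulate(ticks=1000, deadline=500))
```

Because the age is measured from enqueue time, a request can trip the check without ever having been issued to the device.
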
21:36

rzezeski

I was literally out for a walk with nothing of significance running on the host (AFAIK)
21:37

richlowe

rmustacc: gist.github.com/richlowe/398a1ddf8a0db48685e7445f736fd8d8 makes the sky turn clear and the sun shine brighter, so I'm just worried about something adverse I haven't thought of
21:38

rzezeski

::zfs_dbgmsg shows two "slow spa_sync" messages followed by a "SLOW IO".
21:38

richlowe

rzezeski: jclulow is someone I would talk to
21:38

richlowe

but that's almost always true, unfortunately
21:39

jbk

i wonder if maybe the HBA lost a completion interrupt (though I would think it would then time out)
21:39

rzezeski

oh and I see in the deadman function that right before it panics it writes this SLOW IO message to the debug log
21:39

jbk

which the HBAs are usually pretty (sometimes extremely) vocal about I/O timeouts
21:40

jbk

did you do any sort of scrub prior?
21:40

rzezeski

nope
21:40

rzezeski

my pool consists of two NVMe M.2 devices
21:41

jbk

we had a customer that kept hitting that
21:41

jbk

because the i/o throttling in the scrub code needs help
21:42

jbk

(for night court fans, I dubbed the drive behavior 'tortoise nervosa' :P)
21:42

jbk

that was contributing to it
22:14

rzezeski

for the morbidly curious I added more info from mdb/zdb in that gist, but at this point can't say I'm any closer to understanding why this would have happened, I guess I'll just have to write up an issue and move on
22:21

jbk

maybe i missed it, but what is that sched thread on cpu 10 (...d7c20) doing?
22:44

jbk

random question.. ISTR (though I could easily be confusing and misremembering).. didn't someone write a utility that'd print the device tree a bit like prtconf, but interactive?
23:14

sommerfeld

richlowe: well the good thing about CTASSERT is that if it compiles you don't have to worry about it again so it seems like a low-risk change...
