#illumos

05:58

yuripv

are smartos issues visible publicly? wondering about the OS-6216 (TritonDataCenter/illumos-joyent 80caa2e)
06:43

rmustacc

yuripv: smartos.org/bugview/OS-6216
06:43

fenix

→ OS-6216: VOP_ACCESS() use in sdev_readdir() leads to deadlock (Resolved) | joyent/illumos-joyent 80caa2e
06:43

yuripv

rmustacc: thank you!
10:09

jlevon`

yuripv: oh god the horror
10:47

yuripv

jlevon`: why? :)
10:47

yuripv

we seem to be hitting that issue intermittently
11:48

jimklimov

got ZFS crashes: pasteboard.co/1gedb0wwniJJ.png (after an update 2 days ago), pasteboard.co/aBtHf4N3xvsg.png (from possibly a month-old OI, mid-December'ish)
11:49

jimklimov

A pool scrub from the Linux rescue ISO (the only one DigitalOcean lets you use) did not complain. I think the crashes may be triggered by znapzend activity, but can't vouch for that.
11:53

jimklimov

the later one is from OpenIndiana Hipster 2023.10 Version illumos-00f13cd38a 64-bit (per early boot prompt)
11:55

jimklimov

disabled znapzend and launched a scrub from OI... will see if it crashes or finds anything that illumos codebase dislikes...
12:24

jimklimov

so, illumos zpool scrub also passed well
12:24

otis

did you try to import without zpool cache file?
12:24

otis

i vaguely remember that i've seen mysterious crashes with solaris 11 zfs that were connected to zpool cache somehow
12:26

jimklimov

I think rpool gets imported without one - it's got nowhere to read it from yet?
12:26

otis

ah ok, it goes on rpool.
12:26

otis

(i did not check your pastebins)
12:27

jimklimov

but FWIW, /etc/zfs/zpool.cache is updated today on the running system (probably during boot, roughly same age as uptime)
12:30

otis

i was looking for screenshot from "my" crashes (i surely have some, but can't seem to find them)
12:35

jlevon`

yuripv: just that sdev is a locking horrorshow
14:07

jimklimov

Checking that the fault was related to a running znapzend service - and indeed it seems to have been due to its clean-up of older snapshots (in manual run with debug), as seen in topmost line at pasteboard.co/OdhyUQgFbPoG.png . The `zpool scrub` is clean however...
14:09

jimklimov

Locking up during reboot (claimed in earlier discussions to be something between illumos kernel and QEMU running the VM) is annoying - gotta power-cycle the VM or suffer an outage of about an hour or two until it does reboot unattended. But to get such screenshots it is actually helpful :D
14:13

jimklimov

I'll try to clean that snapshot away manually, maybe with the DigitalOcean Linux recovery ISO, in hopes that it is the only such troublemaker
14:26

jimklimov

that was fun... I told DigitalOcean to turn off the vm, it said it did. I changed the boot device and turned it on - and the console still showed the ZFS stacktrace and attempt to reboot. Power-cycling helped, though...
14:34

jimklimov

so, linux zfs had no qualms dropping that snapshot chain
14:34

jimklimov

trying znapzend from OI again...
14:53

jimklimov

yeah, seems that one snap was somehow special; now znapzend steps through a lot of other obsolete iterations without crashing the kernel
15:12

yuripv

jlevon: luckily for me, you already fixed at least this issue :D
15:22

jlevon

true
16:02

jimklimov

yep, so not the whole run of znapzend went well
16:02

jimklimov

having competing implementations is useful :)
16:40

sommerfeld

jimklimov: traceback mentions zio_ddt_free - do you have dedup enabled on any filesystems in the pool?
20:19

jclulow

jimklimov: illumos.org/issues/14526
20:19

fenix

→ BUG 14526: illumos guest hangs on reboot under QEMU 6.0.0 (New)
20:19

jclulow

You are most welcome to debug it!
20:20

jclulow

It's not surprising that it fucks up the DO control plane FWIW. I expect they're just proxying a request to reboot through to QEMU through the monitor protocol, and it's pretty clear that QEMU has a ridiculous bug that wedges the whole emulation stack in this case.
20:21

jclulow

If you turn it off, they probably go and actually kill the process and then fire it up on a new server when you start it up again.
