-
rmustacc
I think it should be okay to keep builtins around and take other snapshots as things change and on an as-needed basis.
-
rmustacc
At least, that would be my general disposition / expectation. But I dunno.
-
tozhu
Hello all, a simple question regarding ZFS: if a machine has only SSDs, e.g. 12 x 7.6T SSD, do you advise using a slog? And which RAID layout is recommended: mirror, raidz1, or raidz2? Best wishes
-
Nixkernal
No. Only use a slog if you can at least mirror it; L2ARC is OK, however.
openzfs.github.io/openzfs-docs/Basic%20Concepts/RAIDZ.html
-
Nixkernal
There's no single recommendation; it depends on your usage pattern, availability needs, etc.
-
jbk
generally, having a separate slog device is useful when it is noticeably faster than the rest of the storage in the pool
-
jbk
e.g. SSD slog, HDD disks in the pool
-
jbk
if it's already SSD, having a separate slog device is likely not to be as useful
-
jbk
since it's the log, so (broadly speaking) I/Os get written there first
-
jbk
so if it's the same speed as the rest of the disks in your pool, you're artificially bottlenecking yourself
-
jbk
(I've seen this happen where someone created the log device on another HDD and then complained their pool was slow :P)
-
jbk
heh.. i think i'm going to have to knock some heads while i'm at the office next week
-
papertigers
jbk: although I believe having a slog that's at least as fast as the pool can still be useful for cutting down on fragmentation iirc
-
jbk
but it kills your performance unless the pool is basically a single disk
-
jbk
because you end up only being able to write as fast as that slog device
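
A deliberately crude sketch of the bottleneck described above, assuming (as stated in the discussion) that synchronous writes land on the log device first. The function name and the throughput figures are hypothetical, chosen only for illustration; the model ignores latency, which is where a fast slog in front of HDDs mainly helps.

```python
from typing import Optional


def sync_write_ceiling(pool_mb_s: float, slog_mb_s: Optional[float]) -> float:
    """Toy upper bound on sustained sync-write throughput, in MB/s.

    With no separate log device the pool itself is the limit.
    With one, every sync write passes through it first, so the
    slower of the two sets the ceiling.
    """
    if slog_mb_s is None:
        return pool_mb_s
    return min(pool_mb_s, slog_mb_s)


# Hypothetical numbers: an all-SSD pool that can absorb 6000 MB/s in
# aggregate, fronted by a single SSD log device that manages 500 MB/s.
print(sync_write_ceiling(6000.0, None))
print(sync_write_ceiling(6000.0, 500.0))
```

In this toy model a log device that is no faster than a single pool member drags a wide pool down to single-device write speed, which is the situation described above.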
-
nomad
again, that's usage dependent.
-
nomad
Though I can't think of a usage where having a single-drive write speed is optimal.
-
pilonsi
Hello! I was trying to pass through the USB ports in my server to a bhyve VM as a dirty hack to install Windows Server 2022. I mapped the PCI address of the USB controller in /etc/ppt_aliases and /etc/ppt_matches, and after rebooting, the active BE fails to boot, complaining that it cannot read the pool label and therefore cannot mount root.
-
pilonsi
I'm able to boot with an older BE, mount the failing root ZFS filesystem, and access it. So far I've deleted the /etc/ppt_* files, deleted /etc/zfs/zpool.cache, booted from an install ISO and imported and exported the pool, disabled and enabled the BE, and created a new BE based on it and tried to boot that... But I still get the same error.
-
pilonsi
It's not critical since I could go back to the old BE, reinstall some packages and copy the config files from the broken BE, but I really would like to know what I messed up and how to fix it.
-
pilonsi
The full error message in the console is NOTICE: Performing full ZFS device scan!
-
pilonsi
NOTICE: Cannot read the pool label from '/pci@0,0/pci103c,870c@17/disk@2,0:b'
-
pilonsi
NOTICE: spa_import_rootpool: error 5
-
pilonsi
Cannot mount root on /pci@0,0/pci103c,870c@17/disk@2,0:b fstype zfs
-
jbk
hrm.. 5 is EIO which isn't particularly helpful
-
richlowe
that's EIO trying to read the label from that device
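
For reference, a numeric error like the 5 in `spa_import_rootpool: error 5` can be turned back into its symbolic name with Python's standard `errno` module. This is a minimal sketch, assuming the usual POSIX numbering (which illumos and Linux share for EIO); `describe_errno` is a helper name made up for this example.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    # errno.errorcode maps numbers to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's human-readable description.
    return f"{name}: {os.strerror(code)}"


print(describe_errno(5))
```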
-
pilonsi
I'm running OmniOS r151046
-
richlowe
it might mean the device in the pool config is not correct, and we can't find where it now is
-
pilonsi
But I'm able to boot from the old BE which is just a different FS on the same pool
-
richlowe
have you scrubbed the pool?
-
pilonsi
No
-
pilonsi
Should I?
-
richlowe
my only idea is maybe it really has got issues reading the new BE, if everything else seems to work, but you might want to wait for someone more expert to agree with me (or not)
-
pilonsi
Could be.. I hadn't thought about it... Because in the 1st reboot it must have been accessing the filesystem, then read the /etc/ppt_* files, mapped the pci thing which apparently also had the root disk hanging off it to the ppt device and immediately lost access to it
-
pilonsi
I'll try the scrub, thank you!
-
richlowe
I'm not sure if ppt info is stored anywhere else that may have not got nuked, pmooney would be the one who knew that
-
richlowe
and jbk probably your best bet of people currently awake for zfs
-
pmooney
or sjorge, honestly
-
pmooney
I've not had a big hand in the ppt stuff, besides continuing maintenance
-
jbk
yeah, i've not done anything really with ppt
-
jbk
and i think Woodstock is on vacation (ISTR he did a lot of the PPT work)
-
pmooney
yeah, he did a bunch of the porting work
-
jbk
you could maybe try booting w/ kmdb and poking around with ::prtconf to see if that device is there
-
jbk
though I have no idea if that was getting passed-thru what it would look like in that
-
jbk
or what to look for
-
pilonsi
The scrub worked!
-
pilonsi
It reported 0 errors, but I rebooted just in case and it booted the main BE fine
-
pilonsi
It's behaving a little weird, I have no IP address now but that probably is a separate issue
-
pilonsi
I'll get to fix that and keep an eye on it, thank you very much for your help!
-
pilonsi
I have been using illumos for a few weeks only but it really feels very solid and reliable, I'm liking it a lot
-
richlowe
that is... odd
-
pilonsi
Maybe the scrub made it refresh some internal device identifiers or something?.. I don't know the inner workings
-
jbk
yeah..
-
jbk
the only thing I can think of (and this is a stretch) is that maybe the disk name (e.g. c0t0d0) changed because of the pass-thru so maybe the cache file was wrong?
-
pilonsi
The ip thing was me disconnecting the ethernet cable while I was fiddling with cables to connect a monitor and keyboard to it to debug why it was not booting... :)
-
jbk
though i think it should still be able to figure things out
-
jbk
pilonsi: it's always nice when it's a simple fix :)
-
pilonsi
Yeah, and the other BE could boot fine off that disk as well.. it's strange
-
pilonsi
Anyways I should start getting used to creating backup BEs before I do weird things
-
jbk
well the cache is per-BE
-
jbk
though also... that might end up in the ramdisk...
-
jbk
not sure offhand
-
jbk
(just thinking out loud)
-
pilonsi
I did delete the /etc/zfs/zpool.cache file just in case, but it had no effect
-
pilonsi
Unless it has some other cache, pool information, etc.. files stored under / ?
-
vetal
pilonsi: did you delete it from the mounted dataset of the failed BE or from the booted dataset?
-
pilonsi
from the mounted dataset of the failed BE
-
vetal
pilonsi: Can you check whether 'prtconf -v' finds that device path '/pci@0,0/pci103c,870c@17/disk@2,0:b' ?
-
vetal
pilonsi: something like this: prtconf -v |grep -A 2 'dev_path=/pci@0,0/pci103c,870c@17/disk@2,0:b'
-
pilonsi
dev_path=/pci@0,0/pci103c,870c@17/disk@2,0:b
-
pilonsi
spectype=blk type=minor
-
pilonsi
dev_link=/dev/dsk/c1t2d0s1
-
pilonsi
dev_path=/pci@0,0/pci103c,870c@17/disk@2,0:b,raw
-
pilonsi
spectype=chr type=minor
-
pilonsi
dev_link=/dev/rdsk/c1t2d0s1
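
The pasted output pairs each `dev_path` with the `/dev` link that points at it. As a rough sketch, the pairing can be pulled out programmatically; this only handles the grep-filtered shape shown above, not full `prtconf -v` output, and `map_dev_paths` is a name invented for this example.

```python
# Sample text copied from the output pasted above.
SAMPLE = """\
dev_path=/pci@0,0/pci103c,870c@17/disk@2,0:b
spectype=blk type=minor
dev_link=/dev/dsk/c1t2d0s1
dev_path=/pci@0,0/pci103c,870c@17/disk@2,0:b,raw
spectype=chr type=minor
dev_link=/dev/rdsk/c1t2d0s1
"""


def map_dev_paths(text: str) -> dict:
    """Map each dev_path to the list of dev_link entries that follow it."""
    mapping = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("dev_path="):
            current = line[len("dev_path="):]
            mapping.setdefault(current, [])
        elif line.startswith("dev_link=") and current is not None:
            mapping[current].append(line[len("dev_link="):])
    return mapping


for path, links in map_dev_paths(SAMPLE).items():
    print(path, "->", ", ".join(links))
```

Here the block minor node of the boot device resolves to /dev/dsk/c1t2d0s1 and the raw node to /dev/rdsk/c1t2d0s1, so the device path from the error message does exist on the running system.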
-
vetal
pilonsi: BTW, could you check that /etc/path_to_inst in both datasets has that path '/pci@0,0/pci103c,870c@17/disk@2,0' ?
-
vetal
pilonsi: Also I would suggest recreating '/platform/i86pc/amd64/boot_archive' in the failing dataset
-
vetal
pilonsi: but save a copy before.
-
pilonsi
In both /etc/path_to_inst I get the same:
-
pilonsi
"/pci@0,0/pci103c,870c@17/disk@2,0" 1 "sd"
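
The entry above has three fields: the quoted physical device path, an instance number, and the quoted driver name. A small sketch of splitting such a line, assuming that three-field layout and `#` comment lines; `InstEntry` and `parse_path_to_inst_line` are hypothetical names for this example, not part of any illumos tool.

```python
import shlex
from typing import NamedTuple, Optional


class InstEntry(NamedTuple):
    path: str      # physical device path
    instance: int  # instance number assigned to the device
    driver: str    # driver bound to the device


def parse_path_to_inst_line(line: str) -> Optional[InstEntry]:
    """Parse one path_to_inst-style line; return None for blanks/comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # shlex.split removes the double quotes around the path and driver.
    fields = shlex.split(stripped)
    if len(fields) != 3:
        return None
    path, instance, driver = fields
    return InstEntry(path, int(instance), driver)


entry = parse_path_to_inst_line('"/pci@0,0/pci103c,870c@17/disk@2,0" 1 "sd"')
print(entry)
```

Running the same check against the file from each BE makes it easy to compare whether both map the disk to the same instance and driver.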
-
pilonsi
Why do you think I should recreate the boot archive in the dataset that was failing?
-
pilonsi
Even if it's already booting ok?
-
pilonsi
(And how could I do it?)
-
richlowe
`bootadm update-archive` creates/updates the archive, but if it's outdated on boot we recreate it and reboot again (if we get far enough)
-
pilonsi
I didn't know about the bootadm tool.. What is the boot archive? Is it a Solaris/illumos specific thing?
-
richlowe
it's (just about) what linux calls an initrd
-
pilonsi
Thanks, I figured.. There is still some illumos terminology I have to get used to
-
tozhu
jbk: Thank you
-
jbk
?
-
jbk
what did i do? :)
-
tozhu
thanks for the answer about setting up a slog for an all-SSD pool :)
-
jbk
ahh ok