#illumos

01:08

yuripv

looks like my subscription to advocates list is "awaiting approval" since March, could someone approve it, pretty please?
01:09

yuripv

(and allow an rti message through)
05:01

tozhu

hello all, a simple question regarding the illumos compile process: I see there is no ‘-O2’ option, can we add ‘-O2’ for the illumos build?
11:23

andyf

In general, we are careful to favour observability over optimisation, which is why -O is used - and even then we turn off some of the optimisations that -O enables. It's quite likely that some of the things that -O2 adds would be beneficial, but we'd have to validate that they don't break things like DTrace fbt probes.
11:25

yuripv

looking at nightly.log, there are a lot of "-O2" in gcc invocations
11:34

yuripv

so it looks like we build 32bit object with -O and 64bit ones with -O2
11:37

andyf

Ah right - my searching for -O\d was not working, but they're specified with the studio flags, and converted by cw
11:38

andyf

Makefile.master - amd64_COPTFLAG= -xO3
11:39

yuripv

yeah, I didn't look at Makefile.master as there's just too much magic involved in translation
11:40

yuripv

oh, tozhu isn't here
15:10

yuripv

rmustacc: is having both debug/non-debug builds a new requirement for rti? i'm pretty sure i didn't do both previously
15:11

rmustacc

No, not new.
15:12

rmustacc

illumos.org/docs/contributing/#submitting-a-patch
15:13

yuripv

it doesn't say anything about having both though
15:14

yuripv

(it would be good to have that mentioned explicitly)
15:14

rmustacc

Happy to improve it. That's what full has always meant to me.
15:15

yuripv

my understanding was "non-incremental" :)
15:15

yuripv

and "with all checks enabled"
15:15

rmustacc

Feel free to make a pr with verbiage you'd find clarifying.
15:16

rmustacc

Or if not, I'll try to get to it at some point. But I can't promise you'll be happier. ;)
15:18

rmustacc

Given there are different warnings that are checked and one can make one build work but not the other, that's why it's there.
15:25

yuripv

understandable, it's just first time I was asked for both
15:31

rmustacc

Not everyone always checks, especially when we end up doing builds before pushing anyways. Sorry for the confusion.
15:39

tsoome_

cmpldev() in getaudit_addr() is failing for 32-bit consumers:(
15:41

tsoome_

cmpldev:entry int64_t 0x21600000006b3 cmpldev:return int64_t 0 -- value passed on call and return code.
15:46

rmustacc

What's the exact path you're going?
15:50

tsoome_

I found that getaudit_addr() call in userland is failing (like with auditconfig -getaudit); so I did trace a bit and found that getaudit_addr() syscall in kernel is returning getaudit_addr:return int64_t 0x4f (EOVERFLOW), and there are exactly 2 cases where we can return it -- the argument len is correct, but we do get error from cmpldev().
15:52

tsoome_

so cmpldev() does dev >> L_BITSMINOR to get major (getting value 0x21600) and as L_MAXMAJ32 is 0x3fff, the if (major > L_MAXMAJ32) is true and we return 0 (and NODEV32).
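A rough model of the check described above, written in Python rather than the kernel's C. Only `L_MAXMAJ32 = 0x3fff` and the shifted value 0x21600 come from the log; the other constants (`L_BITSMINOR = 32`, `L_BITSMINOR32 = 18`, `L_MAXMIN32 = 0x3ffff`, `NODEV32 = 0xffffffff`) are assumptions recalled from `sys/mkdev.h` and should be checked against the source. The `(status, value)` return tuple is a stand-in for the C out-parameter.

```python
# Sketch of the dev_t -> dev32_t compression check (assumed constants, see above).
L_BITSMINOR = 32        # assumed: minor bits in a 64-bit dev_t
L_MAXMIN = 0xFFFFFFFF   # assumed: mask for the 64-bit minor
L_BITSMINOR32 = 18      # assumed: minor bits in a 32-bit dev32_t
L_MAXMAJ32 = 0x3FFF     # from the log: largest major a dev32_t can hold
L_MAXMIN32 = 0x3FFFF    # assumed: largest minor a dev32_t can hold
NODEV32 = 0xFFFFFFFF    # assumed: "no device" marker for 32-bit consumers


def cmpldev(dev):
    """Return (1, dev32) if dev fits in 32 bits, else (0, NODEV32)."""
    major = dev >> L_BITSMINOR
    if major > L_MAXMAJ32:
        return 0, NODEV32
    minor = dev & L_MAXMIN
    if minor > L_MAXMIN32:
        return 0, NODEV32
    return 1, (major << L_BITSMINOR32) | minor


status, dev32 = cmpldev(0x21600000006B3)   # the value seen at cmpldev:entry
print(status, hex(dev32))
```

With the traced value, the computed major is 0x21600, which exceeds 0x3fff, so the sketch reports failure. That is consistent with the `cmpldev:return` value of 0 in the trace.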
15:54

tsoome_

so, the interesting question is, is the 0x21600 something we have to accept as is, or can it be further translated....
15:55

tsoome_

or... should getaudit_addr() syscall handle the NODEV32 better than just erroring out...
15:56

rmustacc

What does the device correspond to?
15:59

tsoome_

huh... we are translating cmpldev(&dev, ainfo->ai_termid.at_port) --- should this be pts?
16:00

tsoome_

hm...
16:04

tsoome_

it does not really make sense as such, need to see how they are getting this value in first place....
17:12

tsoome_

eh:D well. this terminal id is not getting translated to dev because it is not a dev, it is a network address with port numbers and remote ip. as simple as that.
17:14

gitomat

[illumos-gate] 15261 "ExpDataSN mismatch in SCSI Response" error on FreeBSD 13.1 initiators -- Yuri Pankov <ypankov⊙dc>
17:15

tsoome_

uint_t at_type = 0x10 (that's AU_IPv6)
19:32

gitomat

[illumos-gate] 16729 disambiguate generic & pci memlist functions -- Luqman Aden <luqman⊙oc>
19:56

gitomat

[illumos-gate] 14916 ehci_qh_pool_size is probably too low -- Joshua M. Clulow <josh⊙so>
21:26

richlowe

I realized I recently said I'd never seen the zfs 'bn > dn_maxblkid' issue be more than two blocks over. That's not true. I just saw 3
21:40

tsoome_

got it with zfs send?
21:45

richlowe

yes
21:45

richlowe

it's always send/recv
21:45

richlowe

I just don't want to have talked about the bug and left wrong info in the IRC logs without a correction :)
21:47

richlowe

it feels like it's using the large...thingy feature without turning it on? or without checking if it's off?
21:47

richlowe

is it large_blocks or large_dnode?
21:49

richlowe

if I understand `zpool upgrade` properly, I don't have either enabled on the source pool
21:50

richlowe

so it's almost like `zfs send` is inventing a big something-or-other, or `zfs recv` is deciding to create one when it should not
21:50

hadfl

there is also another zfs issue that causes even more havoc than just printing a warning message: if there are labels left at the end of the device, zpool expansion will blow up.
21:53

richlowe

yes, that one too!
21:56

hadfl

it's particularly annoying as for aarch64 we are obviously dd'ing new images to the same device over and over again for testing. the only workaround so far is to do a `zpool create -f ... <dev>`; `zpool destroy ...`; `zpool labelclear <dev>`
22:07

tsoome_

"if there are labels left at the end of the device" -- left how? overwriting pool without doing zpool destroy first?
22:09

richlowe

labelclear is the important part
22:10

richlowe

zfs puts labels at the start and end, if the end one still exists (we think), zfs expansion chokes irrecoverably.
22:10

richlowe

I say "we think", the workaround I'm pretty sure is 100% reliable
22:11

richlowe

so in this case, you have an sd card, you dd your new image to it, plug it back in, and assume it'll autoexpand, instead it'll take irrecoverable errors the moment it tries.
22:11

richlowe

iff the trailing label still exists
22:12

hadfl

zpool create -f; zpool destroy; is just to make labelclear work. as rich pointed out this is the thing that helps
22:13

hadfl

tsoome_, if you want to test it, use a device, dd a zpool to it, expand it. dd the same zpool to the larger device again and try to expand it again
22:19

tsoome_

so that in this scenario you would overwrite labels at the beginning of the disk with new image, but labels at the end are left intact as your image will hope to use its own label copies from the end of the image?
22:21

hadfl

yeah, the end of the image (i.e. end of the new zpool) is not the end of the device where there are still old labels left
22:25

tsoome_

there are few problems with left over labels --- first one, the last two are searched based on the size of the device. Now, if you are overwriting the same image again and again, it means the pool, device etc guids in those labels will match. And worse, as the source of the pool is from the same image, the transaction group numbers on "old" image are larger than ones in the "new" image, therefore the "old" labels
22:25

tsoome_

are more recent than the ones from the image.... and you get very confused pool:)
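The layout being discussed can be sketched roughly as follows. It assumes the usual ZFS arrangement of four 256 KiB labels, two at the front and two at the back of the device, with the back pair located from the device size. The image and device sizes, the txg numbers, and the "highest txg wins" selection are illustrative simplifications, not a description of the actual ZFS code path.

```python
# Sketch: which old labels survive when a small image is dd'ed onto a larger device.
LABEL_SIZE = 256 * 1024   # assumed size of one vdev label
LABEL_COUNT = 4           # two at the front, two at the back


def label_offsets(size):
    """Byte offsets of the four labels on a device (or image) of `size` bytes."""
    usable = size - size % LABEL_SIZE   # align down to a label boundary
    front = [i * LABEL_SIZE for i in range(LABEL_COUNT // 2)]
    back = [usable - (LABEL_COUNT - i) * LABEL_SIZE
            for i in range(LABEL_COUNT // 2, LABEL_COUNT)]
    return front + back


def surviving_old_labels(image_size, device_size):
    """Device label offsets that writing `image_size` bytes from offset 0 leaves untouched."""
    return [off for off in label_offsets(device_size) if off >= image_size]


def newest(labels):
    """Pick the label with the highest txg (a deliberate simplification)."""
    return max(labels, key=lambda label: label["txg"])


GiB = 1 << 30
print(surviving_old_labels(4 * GiB, 32 * GiB))   # both end-of-device labels remain

# Hypothetical txg values: a freshly built image versus a pool that was in use.
candidates = [
    {"source": "new image, front labels", "txg": 40},
    {"source": "old pool, back labels", "txg": 90000},
]
print(newest(candidates)["source"])
```

In this model the image's own trailing labels land in the middle of the larger device, while the two offsets computed from the device size still hold the previous pool's labels, and those carry the higher txg.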
22:34

hadfl

well, it does not necessarily need to be the same image. but i guess any assembled image has fewer transactions than a pool that was in use. since this issue seems to be known and "explainable" what's the best workaround?
22:34

hadfl

i doubt re-imaging a device is a non-standard use-case
22:35

richlowe

it's also immediately and terminally fatal to the pool
22:37

tsoome_

well, if it is not the same image, you have the 2 first labels from your new image and the 2 last ones from the old pool, and that's kind of the worst case, because which ones should we believe to be true? at least "good old" SVM was using majority based voting with metadb, so 50% available metadb replicas did stop the boot, and you had to fix it manually.
22:39

hadfl

the interesting part is that everything is happy as long as we don't try to expand the pool
22:39

jclulow

In this case you do actually have four labels that make sense, FWIW, they're just not in the right spot according to the slice
22:39

jclulow

Right
22:39

richlowe

so ok, but what happens in practice is everything works (it read a start-of-disk label), until you try to expand, when if it sees an end-of-disk label _then_ it shits the bed.
22:39

hadfl

i can dd a new pool with old labels on the device as many times as i want
22:39

hadfl

everything works
22:39

hadfl

as long as i don't want to expand the pool
22:39

jclulow

As far as I know, the pool knows how big the used region of the slice is; I'm sure it's just getting confused on re-open
22:40

richlowe

right and then it uses the labels that don't even agree with the labels it was using a second ago
22:40

jclulow

So I think it would be pretty legitimate to have ZFS locate the labels where it expects them to be, as it seems to be doing (except for expand) and just trust those -- and to not screw up if there's random data at the end of the slice
22:41

tsoome_

richlowe: and that is a bug, it must make sure the newly found labels are not poisoned.
22:41

jclulow

I suspect what we really ought to do, prior to expansion, is not even look -- just erase the target label region
22:42

jclulow

if that fails, we don't try to do anything else
22:42

jclulow

but otherwise we know that we can't then become confused
22:43

tsoome_

yep, the fact that we are in process of expanding means that we should not care about what is there.
22:43

jclulow

yeah.
23:14

tozhu

andyf, yuripv thank you for the answer regarding the ‘-O2’ compile option, thank you
