00:44:19 looks like a reboot between runs is advisable to clear stuff that got mounted by tests and not unmounted.
00:45:21 Yes, and I've also had occasions where the dump device is left set to a volume on the testpool, preventing it being destroyed.
00:50:10 Is it specific to the ZFS tests that you're seeing that?
00:53:15 yes.
02:13:57 you can also clean up /var/tmp between runs without having to reboot
02:29:42 I saw at least one left-over mount point.
17:21:03 It really seems like the kind of test suite that you don't want to run on a machine with things you care about
17:21:20 i.e., it should probably run in a throw-away VM
17:21:28 haha no..
17:21:50 which reminds me.. at some point i need to finish setting that up
17:22:37 i managed to get pkg(5) working in a smartos zone and i was going to create a dedicated test vm separate from the one used to build/dev
17:23:51 it's silly, but i don't want to give up my current omni vm because by some cosmic coincidence, the UUID ended up with the first 5 digits of pi (hence the name) :)
17:25:32 jclulow: indeed, but these issues should be tracked down and fixed because they cause damage to other tests in the same suite!
17:28:27 oh, to be clear, I am also a broken record on the subject of needing someone to make a complete catalogue of flakes in the test suite
17:28:59 I just think, even if it's 100% reliable, that I would still only expect to run it in a throw-away location
17:29:19 Unlike, say, the libc test suite or whatever
18:09:00 yeah, josh has been... vocal about how this needs fixing
18:09:14 the challenge is that it requires such a throwaway setup to run, and many of us don't have one
18:09:24 or have one unique enough it's hard to tell if the problem is the setup, or the tests
18:13:03 jbk: it _is_ silly, but whenever I get a "good" commit hash, I'm sad to lose it
18:13:19 if zfs tests are 100% success, then it must be safe to run; failing tests are obviously potentially harmful.
18:14:57 you need to make that more obviously deadpan
18:16:30 in my test vm, I have a list of 9 failing tests, but it may not be 100% true.
18:20:44 yeah, i have my list -- though I thought I filed tickets for all of them (to help others) even if I wasn't able to root cause all of them
18:22:04 for example, in my vm the l2arc tests FAIL, while andyf had them PASS.
18:22:49 yeah, some are unreliable
18:23:06 danmcd runs the ZFS tests as part of the smartos release process
18:23:13 so is a good person to talk to
18:24:12 yep, that too.
18:26:03 A thing that rmustacc did with nvme, which is wonderful, is split his test suite into "destructive" and "not"
18:26:33 that would help the zfs tests in so many ways (even if there are not that many non-destructive)
18:27:09 ^^^
18:28:01 I have a VMware Fusion VM specifically tailored for taking the abuse of such a run. It has 6 virtual SATA drives just for ZFS tests, a virtual NVMe drive that appears to pass all but one corner-case test I'm chalking up to VMware madness...
18:28:25 ... and it's called "new-from-scratch-SmartOS" so its usage is right there on the label.
18:28:40 Some are very unreliable.
18:28:40 I was using iscsi luns to get trim tests running in a vmware fusion vm.
18:29:11 for trim tests, this is the only one failing: Test: /opt/zfs-tests/tests/functional/cli_root/zpool_trim/zpool_trim_verify_trimmed (run as root) [00:02] [FAIL]
18:30:10 I have a test setup in a bhyve vm with multiple host-zvols imported as additional disks.
18:30:34 yeah, I have an illumos-image-builder setup that did it (though not with me)
18:31:01 it's a bit hinky to use, but jclulow's image builder set up to build a fresh disk out of your proto area is a _wonderful_ test setup.
18:31:46 and I'm queuing up a couple test fixes for some of the tests that are failing for me to go along with the vdev_indirect fix I'm testing.
18:32:11 that's nice.
18:32:11 I feel confident they will be appreciated as much as, if not somehow more than, the fix
18:33:34 yeah, with all due respect to the work that'I don't think I'd ever put myself in a position where I'd want to use zpool remove on a system I cared about.
18:33:42 err, that escaped early.
18:33:59 with all due respect to the folks who did the work on it, I don't think I'd ever put myself in a position where I'd want to use zpool remove on a system I cared about.
18:34:43 Yeah, in 20240418 there were 24 skips, all ZFS trim related.
18:34:50 It's good that the capability exists.
18:35:15 49 failures (not counting fenix OS-8542 ):
18:35:16 OS-8542: SmartOS bash's built-in echo breaks illumos#16437 (Open)
18:35:17 ↳ https://smartos.org/bugview/OS-8542
18:35:43 4 are BHYVE ones that won't work due to nested-virt stuff
18:36:48 1 mapfile due to the bizarre way we seem to build libgcc
18:36:50 but it's not needed often enough to be exercised enough to be dependable, compared to alternatives like zfs send|zfs recv
18:37:46 2 aforementioned NVMe test failures
18:37:56 2 ksensor failures (also likely VMware-caused)
18:38:01 and 40 ZFS failures.
18:38:11 ksensor should work despite vmware.
18:38:54 But likely https://www.illumos.org/issues/15873 which I saw had been filed the other day.
18:38:55 → BUG 15873: ksensor tests should accept the ksensor_test driver already being installed (New)
18:39:06 That is, the ksensor tests don't rely on any actual hardware thing.
18:39:11 Oh, and it's one ksensor failure, not 2. The other non-zfs failure is svr4 packaging, which we don't ship.
18:39:12 40 still? huh, could you mail me a link?
18:39:18 Hang on...
18:39:26 @rmustacc /opt/os-tests/tests/ksensor/ksensor_basic.32
18:39:46 I was kind of hoping that since we both run tests on fusion vm, the resulting list should be similar :D
18:40:11 ```
18:40:15 @rmustacc
18:40:21 ksensor_basic.32: TEST FAILED: failed to open /dev/sensors/test/test.volt.0.1: No such file or directory
18:40:31 tsoome: list forthcoming...
18:40:32 danmcd: the ksensor failures are a race condition
18:40:51 I'd talked to robert about this on arm, the problem is that the pre-test hook loads the test module, but doesn't (can't?) wait until they're ready
18:41:03 so if you're unlucky, the actual tests then run before the test module is initialized
18:41:06 Oh yes!!! That's right, I have to do that DIFFERENTLY because it assumes it's there.
18:41:17 (Perhaps I should put a delay in?)
18:41:31 I'd have thought about this harder, but I thought it was unique to emulated arm, I'm sorry.
18:41:47 I even filed a bug for it I think...
18:42:20 danmcd: if there's something that you can poll for, a timed polling loop would be good. but a fixed sleep is going to be either too long (in which case it wastes test time) or too short (in which case it's flaky..)
18:43:08 and if there isn't something you can poll for, for driver init, maybe there's a missing interface...
18:45:39 the devices existing, would be the obvious thing.
18:45:42 fenix illumos#14886
18:45:43 BUG 14886: vmm_drv_test needs to be less IPS-dependent (New)
18:45:43 ↳ https://www.illumos.org/issues/14886
18:45:51 but I'm not sure if that compromises what the tests are testing
18:45:57 (And it was an inconsistency...)
18:46:04 ANYWAY... I owe tsoome a list of tests.
18:46:37 danmcd: that's vmm _not_ doing what ksensor does (14886)
18:46:56 what you want in 14886 is what ksensor does, what you want in ksensor is something to poll for readiness of the test driver _because_ it does what you want
18:47:07 (I hope that made sense)
18:48:28 I had forgotten the bit on ARM and a race there.
I can try to look at fixing it up. Sorry about that. It is unfortunate, but there is probably a cheap way to guarantee it's ready with devfsadm.
18:49:31 me too, really, it didn't occur to me anyone else could lose that race
18:49:53 But I expect a devfsadm call should block on that much, but hard to say.
18:49:53 is there any reason our various kernel printf functions don't support '%#x'?
18:51:42 Probably just a lack of traditional support for the alternate form / need.
18:52:03 rzezeski: I have been asked before in code review not to use it, because people didn't immediately know what it meant
18:52:30 perhaps related
18:52:53 (and to be fair, outside of %x, I don't know what it means for any other format)
18:53:45 Okay, well FWIW it would help to have it because stuff like the cxgbe common code uses it all over the place. And it means instead of getting useful data in the log you just get the character 'x'.
18:54:04 Yeah, to be clear, I'm very in favour of it existing and working and being used
18:54:32 We may find it confusing, but for better or worse we've got others' code in our gate that likes to use it.
18:55:49 richlowe: How hard is this to support (I know nothing about printf stuff, I thought it was part of the compiler)? I could just add it as part of this T7 wad.
18:55:52 rzezeski: %#x goes back at least as far as C89 (I have the ansi spec in dead tree form).
18:57:02 rzezeski: it's implemented in C as part of the library in userland and in the kernel for the kernel.
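The "timed polling loop" suggested at 18:42:20 for the ksensor race could look roughly like the sketch below. It is only an illustration under stated assumptions: `wait_for_path` is a hypothetical helper, not something in the test suite, and it assumes that the device node appearing (for example the `/dev/sensors/test/test.volt.0.1` path from the failure message) is an adequate readiness signal, which the log itself leaves open. The real pre-test hook may well be a shell script, and a devfsadm-based approach might make polling unnecessary.

```c
#define	_POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: wait until `path` exists, checking every
 * `interval_ms` milliseconds and giving up after about `timeout_ms`
 * milliseconds. Returns true if the path appeared in time, false on
 * timeout or on any stat() error other than "does not exist yet".
 */
bool
wait_for_path(const char *path, long timeout_ms, long interval_ms)
{
	struct timespec start, now, nap;
	struct stat st;
	long elapsed_ms;

	assert(path != NULL && interval_ms > 0);

	nap.tv_sec = interval_ms / 1000;
	nap.tv_nsec = (interval_ms % 1000) * 1000000L;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return (false);

	for (;;) {
		if (stat(path, &st) == 0)
			return (true);		/* ready: stop immediately */
		if (errno != ENOENT)
			return (false);		/* unexpected failure */

		if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
			return (false);
		elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L +
		    (now.tv_nsec - start.tv_nsec) / 1000000L;
		if (elapsed_ms >= timeout_ms)
			return (false);		/* bounded: never hangs a run */

		(void) nanosleep(&nap, NULL);
	}
}
```

A caller might use something like `wait_for_path("/dev/sensors/test/test.volt.0.1", 5000, 50)`; the 5 second bound and 50 ms interval are arbitrary. Unlike a fixed sleep, the loop returns as soon as the node shows up and only pays the full timeout when something is actually wrong.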
18:57:14 rzezeski - if the kernel printf functions you're using are tagged with __KPRINTFLIKE, you will also need to teach gcc that it's ok
18:57:27 rzezeski it would be nice to have :) we use it in a few places in loader ;)
18:57:49 rzezeski: take a look at usr/src/uts/common/os/printf.c
18:57:59 and unfortunately then wait for everyone to upgrade their gcc
18:58:00 sommerfeld: k, well maybe I'll see if I can get it working
18:58:15 rzezeski: it should be easy for simple cases like that
18:58:26 but I've been wrong in _ndprnt-like places before
18:58:39 sommerfeld: though i don't think that's used in userland, libc has its own (slightly convoluted) _doprnt
18:58:45 or whatever it's called
18:59:05 it does, userland supports %#x obviously, because standards.
18:59:22 the kernel has a different non-standard printf
18:59:46 which is also where andyf's comment about possibly having to teach GCC that # is supported on %x comes from (on the other hand, there's a better than 50% chance that it thinks it's supported already...)
19:00:10 Well, yes, but I hoped the same about %h and %z (or something in that area)
19:00:22 I mean, if this has been around since C89, wouldn't everyone already know about it :)
19:00:25 I think they were supported by the kernel, but gcc didn't believe so
19:00:27 Well, I'm placing the bet because ryan says that his code is already using the format and it's just not working.
19:00:38 if gcc were grumpy, it would presumably be grumpy _now_ :)
19:00:54 Yea, this is code shared across FreeBSD/Linux/us
19:01:01 so I'm guessing they have it
19:01:19 no, the kprint stuff is illumos-specific
19:01:29 but it'd still be firing on your format strings already, if it were upset
19:01:40 sorry I can't words good
19:01:55 rzezeski: actually you want to fix vsnprintf in usr/src/common/util/string.c
19:02:04 Haha, yea, all I mean is, this code is compiled/running on other platforms. And I figure it works there since Chelsio chose to use it.
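For anyone unsure what the `#` ("alternate form") flag does in userland, where it is standard C, here is a small sketch. It only exercises the C library's `snprintf`; it says nothing about the kernel's own printf, which is the part under discussion. `format_alternate` and `demo_alternate` are made-up names for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format `v` four ways into `out`: plain hex, alternate-form hex,
 * plain octal, alternate-form octal.
 */
void
format_alternate(unsigned int v, char *out, size_t outlen)
{
	(void) snprintf(out, outlen, "%x %#x %o %#o", v, v, v, v);
}

/* Print and sanity-check the results for a nonzero value and for zero. */
void
demo_alternate(void)
{
	char buf[64];

	/* Nonzero: "#" adds "0x" for hex and a leading "0" for octal. */
	format_alternate(255, buf, sizeof (buf));
	assert(strcmp(buf, "ff 0xff 377 0377") == 0);
	(void) printf("%s\n", buf);

	/* Zero gets no "0x" prefix, and octal zero is just "0". */
	format_alternate(0, buf, sizeof (buf));
	assert(strcmp(buf, "0 0 0 0") == 0);
	(void) printf("%s\n", buf);
}
```

The zero case is the one that tends to surprise people: the standard only adds the `0x` prefix to nonzero results.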
19:04:12 richlowe - good point. I don't see it in https://github.com/illumos/gcc/blob/il-10_4_0/gcc/config/sol2-c.c#L43-L75 but it might be somehow silently allowed
19:14:58 digging around in ancient history: looks like 7th edition unix doesn't document %#x while 4.2BSD documents it.
19:17:36 so it's probably established enough to be safe to use, i mean we're not postgres :)
19:20:04 we have absolute control of our compilers jbk, we can use all _kinds_ of things we want to, when they work.
19:20:24 the downsides are annoying, and I see why folks complain, but when you get the upside it sure is nice :)
19:22:16 and for the kernel/standalone printf which doesn't do floating point types, the # modifier only has to deal with #x and #o
19:25:36 computers are so much fun
19:26:02 it's a laugh a minute
20:18:44 ../../common/inet/ip/keysock.c:2457:26: error: unknown conversion type character '#' in format [-Werror=format=]
20:18:44 cmn_err(CE_NOTE, "Test %#x\n", keystack);
20:19:19 It looks like an easy patch to gcc though, if we end up adding support.
20:20:28 andyf: help me out here because I'm dumb, the cxgbe code is already using it and compiles just fine, it just results in bogus output. What am I not understanding?
20:20:55 Which printf-like functions is it using with that?
20:21:13 and is the error (-Wformat) possible turned off?
20:21:18 *possibly
20:21:54 snprintf and then passing that to vcmn_err
20:23:47 I see nothing about turning off the format warning
20:24:48 cxgb_printf()
20:24:52 that's what it calls
20:25:03 Perhaps the compiler can't follow the chain?
20:25:44 Unless cxgb_printf is declared as printf-like, the compiler won't follow the chain.
20:26:12 KPRINTFLIKE in this case
20:26:14 If you were to mark cxgb_printf as __KPRINTFLIKE(3) or whatever, you'd probably get the error
20:26:15 (for it to fail)
20:26:37 but yeah, we should add support to the illumos gcc, even if you're safe right now
20:26:38 yeah, KPRINTFLIKE for kernel printf rules.
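The point about the compiler not "following the chain" can be shown with a userland analogue. This is a sketch, not illumos code: `log_unchecked` and `log_checked` are hypothetical wrappers, and the annotation uses the standard `printf` archetype of the GCC/Clang `format` attribute rather than the kernel-flavoured rules that `__KPRINTFLIKE` selects.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* No annotation: the compiler treats `fmt` as an ordinary string. */
int
log_unchecked(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}

/* Annotated: parameter 3 is the format, variadic arguments start at 4. */
int log_checked(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int
log_checked(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}

void
demo_wrappers(void)
{
	char buf[32];

	/* Both accept a valid format and produce the same text. */
	(void) log_unchecked(buf, sizeof (buf), "%#x", 255u);
	assert(strcmp(buf, "0xff") == 0);
	(void) log_checked(buf, sizeof (buf), "%#x", 255u);
	assert(strcmp(buf, "0xff") == 0);

	/*
	 * With -Wformat, a mismatched call such as
	 *	log_checked(buf, sizeof (buf), "%d", "text");
	 * should draw a diagnostic, while the same call through
	 * log_unchecked() compiles silently. Neither is executed here
	 * because the mismatch is undefined behaviour.
	 */
}
```

That is the situation described for cxgb_printf: an unannotated wrapper lets any format string through at compile time, so a conversion the underlying formatter does not implement only shows up as bad output at run time.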
20:26:40 because then we can all use it
20:28:11 richlowe: yea, I mean I can go and update all of the cxgbe common code, but then the next person to update this driver might not realize we don't support # and reintroduce this issue. Or we could add the KPRINTFLIKE so it's impossible to miss. But it does add additional noise to the delta between us and upstream. It just seems like we should allow #.
20:28:58 Oh, definitely, we should allow it, and update our gcc branches.
20:32:10 yeah, I'm apparently being super unclear, I absolutely thoroughly want you to allow it (though it's not really up to me, obviously, it's up to you). We're just saying where the rest of the bear trap is hiding
20:32:50 unfortunately, the best person we have at translating what I'm trying to say has already said no for today :)
20:33:57 richlowe: I was also being unclear in that I was agreeing with you. haha. Consider this 3 votes for adding # support.
20:47:25 I may be too obsessive about this archaeology because I found and was just looking at 4.1BSD's doprnt.s, which has # support that isn't mentioned in the manpage.
20:47:38 (and, yes, it's in VAX assembler...)
20:51:13 the unix history repo does that to a person
20:53:49 and there is a doprnt.s.old that doesn't have it. so that looks very much like at least one version of %# was born in or around 4.1bsd
20:59:33 rzezeski: be glad that our printf core is in C, not assembler.
21:04:10 after my first build for debug and non-debug I could start doing incremental builds in nightly?
21:04:44 Or just use bldenv.
21:05:39 if you did debug _and_ non-debug, and don't have multi_proto set, incrementals are weird
21:05:59 in general if you're developing and what incrementals and convenience, only build one way or the other, until your RTI build
21:06:03 I'd just use bldenv after a nightly.
incrementals might work ok if the nightly is either debug or non-debug but not both at the same time in the same tree
21:06:04 s/what/want/
21:13:06 oh ok, thanks for the pointers. I'll read about bldenv and the whole workflow again, I usually was doing full rebuilds (omnios bi) but that is taking around 3 hrs.
21:14:03 `bldenv ` will spawn a shell you can work in where 'make install' will work to put things into the proto area
21:14:27 much more convenient, I find, is `bldenv `, so like `bldenv ../foo 'make -wej 28 install'`
21:14:53 (the documentation is probably clearer and better at explaining)
21:20:30 richlowe thanks!, I'll spend more time in the docs
21:30:29 neirac: https://illumos.org/books/dev/workflow.html#incremental-building in case it helps.
21:53:22 rmustacc thanks, I was looking at the illumos.org/docs/developers/build. I'll check the workflow next to have the whole picture.