07:52:46 jbk: my initial mistake was not updating the perl version in illumos.sh. after fixing that I learned that OI dropped Sun/Solaris/Utils.pm from perl 5.36, but if I build illumos-gate with perl 5.38 it somehow still searches for Sun/Solaris/Utils.pm in the 5.36 directories. that could be a bug in my perl, because I still run an OI from Nov 2023, which I can't update because of strange effects on my OpenWRT bhyve VM: the ppt forwarding seems not to work afterwards
07:53:03 tl;dr: I will set up a dedicated build bhyve vm for illumos-gate :)
10:36:41 Agnar: Have you filed a bug about the ppt business?
10:37:07 Even if you don't know what's wrong, the symptoms, and when it was working and when it stopped working, etc, would be good
10:39:49 jclulow: not yet, I suspect a problem with OpenWRT as well. after my previous comment here, I decided to update OpenWRT before the OI update and I'm going to reboot now into a new BE - let's see if it works then
10:52:02 ok, doesn't work. I'll open an issue
11:03:34 Thanks, I'll be interested to see what's going on there.
11:07:45 I have opened https://www.illumos.org/issues/16467 and added you and andyf as watchers
11:07:46 → BUG 16467: pass-through intel wifi stops working after upgrading OpenIndiana (New)
11:47:52 Thanks, I'll take a look at the weekend and see if I can spot anything in that last sync that might have caused it.
12:50:41 andyf: I just got the idea to copy the old ppt to the new BE and check whether that works
12:50:48 andyf: will update the issue then
15:45:06 I got an HP Z4 G4 with a Xeon W-2245. It failed to boot OpenIndiana: pci: unable to get IRQ routing information, required buffer space of 291 entries exceeds max 15
15:45:24 Any ideas what went wrong?
15:58:15 we seem to have a hard-coded limit of 255, though I'm not sure offhand why (e.g. just an arbitrary 'this should be enough' limit that's no longer true, some deeper architectural requirement, etc.)
15:59:20 https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/i86pc/os/pci_bios.c?r=cd0d4b40#129
15:59:55 though are you using efi boot or legacy bios?
16:02:30 Legacy BIOS
16:04:15 HP Z4 G4 is not so new. This machine is roughly 4 years old.
16:04:37 wacki: did it work before?
16:04:54 if there's an option for efi, it might be worth trying that
16:05:38 I just bought the machine.
16:06:22 It was my first attempt. With a new boot image from April 10th that I recently created.
16:06:54 I had to create a DVD because booting from USB stick didn't work at all.
16:07:11 might be worth trying older versions. i'm seeing some pci weirdness on one of my older machines that started some time in '23.
16:11:14 I had the same issue with an OI 2023/05 boot medium.
16:14:29 With UEFI boot from DVD I don't get an error message but it ends up in maintenance mode.
16:22:36 does it say what failed to drop it into maint mode?
16:47:10 By the way, reading the messages above: a year ago I failed to install OI illumos in EFI mode in a Linux KVM/qemu virtual machine on Debian/testing. Only installing in BIOS Legacy mode was possible.
16:48:05 Maybe it was even 1.5 to 2 years ago.
16:53:13 wacki jbk that's most likely the problem with devfsadm being run late, so it is missing the device nodes to access the install media
16:53:53 Aedil do you have more details, an open issue perhaps?
16:58:37 tsoome: No, I was not able to install. I don't know any more specific details.
17:35:10 @tsoome I have created new boot media to have your slimsource fix included.
17:35:59 wacki and it is failing with it too?
17:37:18 wacki then we need to find out the device path and see why that discovery program cannot detect it...
17:45:26 Woodstock: What are the symptoms of pci weirdness?
17:46:01 rmustacc: npe WARNING: nvme1: regs[1] does not have a valid MMIO address
17:46:17 rmustacc: nvme: WARNING: nvme1: failed to map regset 1
17:46:58 rmustacc: the system is an AMD Bulldozer and has two quite old (v1.0) intel nvme ssds
17:47:33 nvme0 works fine, only nvme1 has this problem
17:48:22 this box runs smartos, and when running an old PI both nvme devices work
17:52:37 What is the value it ended up with there?
17:54:59 reg property: value=00090000.00000000.00000000.00000000.00000000.03090010.00000000.00000000.00000000.00004000
17:55:35 the one that works looks like this: value=00020000.00000000.00000000.00000000.00000000.03020010.00000000.00000000.00000000.00004000
17:57:17 When you say one that works you mean nvme0 vs. nvme1 as opposed to across the two boots?
18:02:15 yes
18:02:28 old: nvme0 and nvme1 work, new: nvme0 works, nvme1 doesn't
18:06:06 All good, just trying to make sure that those were for two different devices which is what I'd expect given the firstu32.
18:06:07 i haven't bisected it yet, but the "old" PI was at least a year old, iirc
18:06:09 *first u32
18:07:47 There were a few major changes to the PCIe subsystem and boot path from both Andy and myself.
18:08:38 Between 15743 and 15587 (and friends).
18:16:20 Woodstock - can you grab "assigned-addresses" too?
18:18:09 working: value=83020010.00000000.fe510000.00000000.00004000
18:18:22 broken: value=83090010.00000000.00000000.00000000.00004000
18:19:58 OK, so we didn't assign an address, aha.
18:23:24 tsoome: Do you have instructions for me?
18:24:06 Booting with pci_autocinfig:pci_debug_boot = 1 might be informative if that's easy enough to do.
18:25:13 How could I do that with my ISO image?
18:25:23 Sorry, that was for Woodstock
18:25:33 :)
18:27:50 and *pci_autoconfig
18:29:48 wacki if you got a shell prompt, then the rmformat command would hopefully show the device name for cd/dvd
18:30:40 I will check it.
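Aside on the `reg` and `assigned-addresses` values quoted above: they appear to follow the IEEE 1275 PCI bus binding, where each entry is five 32-bit cells (phys.hi, phys.mid, phys.lo, size.hi, size.lo) and phys.hi packs the address space, bus, device, function and register offset. The Python sketch below decodes the two `assigned-addresses` strings under that assumption; the function names are made up for illustration and are not taken from illumos.

```python
# Space codes from the "ss" field of phys.hi (IEEE 1275 PCI binding).
SPACE_NAMES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


def decode_phys_hi(phys_hi):
    """Split the first cell (npt000ss bbbbbbbb dddddfff rrrrrrrr)."""
    return {
        "non_relocatable": bool((phys_hi >> 31) & 0x1),
        "prefetchable": bool((phys_hi >> 30) & 0x1),
        "space": SPACE_NAMES[(phys_hi >> 24) & 0x3],
        "bus": (phys_hi >> 16) & 0xFF,
        "device": (phys_hi >> 11) & 0x1F,
        "function": (phys_hi >> 8) & 0x7,
        "register": phys_hi & 0xFF,
    }


def parse_entries(value):
    """Parse a dotted hex property value into 5-cell entries."""
    cells = [int(cell, 16) for cell in value.split(".")]
    entries = []
    for i in range(0, len(cells), 5):
        hi, mid, lo, size_hi, size_lo = cells[i:i + 5]
        entry = decode_phys_hi(hi)
        entry["address"] = (mid << 32) | lo
        entry["size"] = (size_hi << 32) | size_lo
        entries.append(entry)
    return entries


# The two assigned-addresses values pasted in the log.
working = parse_entries("83020010.00000000.fe510000.00000000.00004000")[0]
broken = parse_entries("83090010.00000000.00000000.00000000.00004000")[0]

for label, entry in (("working", working), ("broken", broken)):
    print(label, "bus", entry["bus"], "BAR offset", hex(entry["register"]),
          "address", hex(entry["address"]), "size", hex(entry["size"]))
```

Read this way, both devices have a 64-bit memory BAR at offset 0x10 of size 0x4000; the device on bus 2 has an address assigned and the one on bus 9 has address 0, which matches the "we didn't assign an address" remark.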
18:30:44 wacki from it you can get the physical device path
18:31:57 wacki I cannot remember the name of the tool the miniroot is using to scan for install media... should find the name from /lib/svc/method/* scripts
18:34:54 rmformat is not available
18:59:18 does dmesg list dvd?
19:11:16 dmesg is not available
20:05:40 uh, such a rain...
20:06:29 wacki oh right, that's really early startup and most commands are missing...
20:07:55 Yes, and in case of EFI boot without any hints.
20:08:55 is prtconf there?
20:13:47 yes
20:13:56 But no "more"!
20:14:35 the tools to list cd and usb devices are /sbin/listcd and /sbin/listusb
20:15:36 Both are silent
20:16:31 basically, if they cannot find cd/dvd, it hints that we are missing a driver or this driver is not attached, or those list* commands are somehow buggy.
20:17:32 in loader, lsdev -v should print out the device paths in UEFI mode, that should give some hint where to look in the /devices tree
20:23:38 PciRoot (0x0) / Pci(0x17, 0x0) / Sata(0x4, 0x0, 0x0)
20:28:53 Is it worth trying to boot from a USB-attached cd drive?
20:29:47 definitely
20:30:03 jbk: hmm, well another drive died and evidently that config entry I had added didn't keep the LED lit
20:32:48 ahhh crap, typod it to be dev.io not vdev.io
20:35:42 This looks promising. It only supports legacy boot from the usb cd drive and thus throws the IRQ routing error but it continues to load and I am in the installation menu now.
20:39:52 [illumos-gate] 16459 want emlxs to support Oracle branded LP adapters -- Olaf Bohlen
20:43:04 wacki: so just slow?
20:44:56 jbk: so an interesting piece of the puzzle. Restarting fmd after setting the correct events to subscribe to made the fmtopo command take an eternity (and also not turn on the LED automatically)
20:45:27 is there something accidentally quadratic (or perhaps there's some nasty XML tree parsing going on that's slow) that's per IO error or something?
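Aside on the loader output `PciRoot (0x0) / Pci(0x17, 0x0) / Sata(0x4, 0x0, 0x0)` quoted above: in the UEFI text form, `Pci(Device,Function)` names a PCI device and function, and `Sata(...)` starts with the HBA port number. The Python sketch below parses that string and derives the unit address one would expect to see after the `@` of the matching node under /devices, assuming the usual `dev` or `dev,func` hex form. The helper names are hypothetical, and the rest of the /devices path is not derived here.

```python
import re


def parse_device_path(text):
    """Split a UEFI device path text into (node name, [integer args])."""
    nodes = []
    for name, args in re.findall(r"(\w+)\s*\(([^)]*)\)", text):
        values = [int(arg.strip(), 0) for arg in args.split(",") if arg.strip()]
        nodes.append((name, values))
    return nodes


def pci_unit_address(device, function):
    """Format a PCI unit address; function 0 is assumed to be omitted."""
    if function == 0:
        return f"{device:x}"
    return f"{device:x},{function:x}"


nodes = parse_device_path("PciRoot (0x0) / Pci(0x17, 0x0) / Sata(0x4, 0x0, 0x0)")
for name, values in nodes:
    if name == "Pci":
        # Look for a child of the PCI root whose name ends in this "@..." part.
        print("look for a node ending in @" + pci_unit_address(*values))
    elif name == "Sata":
        print("SATA HBA port", values[0])
```

With the path from the log this points at a controller node ending in `@17` and at SATA port 4, which could narrow down where to look in `prtconf` output.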
20:45:58 when the faulted drive was yanked suddenly fmtopo performed reasonably (not fast, mind you, but seconds not a minute or so)
20:46:48 it seems more likely it was probably trying to query the bad drive for something and was stuck there
20:47:26 sd.c will i think re-try up to something like 6 minutes for certain requests
20:47:43 which is probably not great if that's holding up other stuff
20:48:05 hah, sorry just realized you weren't the person on the ticket at the time, but I think you had joined the conversation. Looks like jclulow had been the one
20:50:24 that's good, because i was trying to dig to remind myself of the specifics and was starting to come up blank :)
20:50:29 I can't say that it is slow. It's the same as always when installing from a DVD medium.
20:50:30 https://pastebin.com/78UZLakR seems to indicate the disk hanging on some SCSI commands, but fmadm considered it faulted, I'd have thought it would have given up
20:50:59 Is there something special I need to do to make something link against 64-bit libumem?
20:51:09 Despite the error message everything else seems to work. I have already rebooted the machine after the installation.
20:51:41 I have something I'm converting from 32 to 64-bit that has `LIBS = -lumem` and `$(PROG): $(OBJ:%=obj/%) $(LIBS)`.
20:51:42 bahamat: not as far as I know.. sometimes you need -L[/usr]/lib/amd64
20:52:04 Something is implicitly turning that `-lumem` into `/lib/libumem.so`.
20:52:06 or /usr/local
20:52:52 I could hard code `/lib/64/libumem.so`, but it seems like if it's already using magic, there should be some magic to say no, the 64-bit one.
20:52:55 how are you linking? calling gcc or ld directly?
20:53:32 jbk: It's here: https://github.com/TritonDataCenter/zfs_snapshot_tar/blob/master/Makefile#L32-L38
20:55:04 if you add '-m64' to CFLAGS, does that make any difference?
20:55:22 I tried that already, and no.
20:56:15 jclulow: (or jbk) anything else I should look at that might make the https://www.illumos.org/issues/16353 issue more useful?
20:56:16 → BUG 16353: disk fault lights should come on for ZFS vdev-level faults (New)
20:56:20 tsoome: Thanks for your help!
20:56:23 jbk: I get this: ld: fatal: file /lib/libumem.so: wrong ELF class: ELFCLASS32
20:56:37 Which is the same error without using `-m64`
20:56:49 Which, yeah, that makes sense.
20:56:58 it sounds like something's messed with the version of gcc being invoked
20:57:22 you could try adding the -L/lib/amd64 and see if that at least works around the issue
20:57:25 It's just gcc
20:57:28 wacki yw. it is still a good idea to try to figure out why it was failing with sata
20:57:42 to CFLAGS? or LIBS?
20:58:02 LIBS i guess based on the makefile
20:58:36 (it'd usually be LDFLAGS, but that's not being used here, and if it's just for investigation...)
20:58:39 Yes, you are right. It would also be good to get rid of the IRQ limitation.
20:59:02 jbk: make: *** No rule to make target '-L/usr/amd64', needed by 'zfs_snapshot_tar'. Stop
20:59:19 Maybe we can try to investigate further on another day. I will need to go to bed soon.
20:59:28 Everything I can think of to "magic" it to 64-bit gets that same error.
20:59:29 oh hrm.. LIBS is in the dep line
20:59:34 Yeah.
20:59:39 could just add it before $(LIBS) in the recipe line
20:59:41 and on multi-arch it works.
20:59:46 but are you using the strap gcc or the pkgsrc gcc?
20:59:57 pkgsrc
21:00:29 might try the strap one (I'm assuming there's a smartos-live repo somewhere) to see if it makes any difference
21:00:54 my guess (though I've not delved too deeply into this stuff with gcc) is that maybe something's not quite right with a spec file
21:01:10 IIRC that's how gcc figures out how to invoke ld
21:01:14 that didn't work either :-(
21:01:33 Something is still implicitly sticking /lib/libumem.so in there.
21:02:02 jbk: This is being built as a sub-component of cn-agent, so there's no strap available.
21:02:21 but that gcc isn't working either?
21:02:26 mostly as a test
21:02:29 if that gcc works
21:02:32 but pkgsrc doesn't
21:02:39 then you have something to compare
21:02:43 If I hard code it with /lib/64/libumem.so it compiles and works fine.
21:02:51 It just feels like the wrong thing to do.
21:03:55 are you running make on that directly
21:03:56 ?
21:04:00 Yes
21:04:14 at least in my smartos zone i used to build smartos-live on my home machine
21:04:20 it seems to DTRT:
21:04:21 Like this: make CTFCONVERT=/bin/true CTFMERGE=/bin/true STRIP=/bin/true
21:05:00 It does the right thing if I use 15.4.1-multiarch.
21:05:09 Building 32-bit is fine.
21:05:12 https://pastebin.com/M4Z82She
21:05:50 It's not `-lumem` in itself that's the problem
21:06:10 It's that something is magically inserting `/lib/libumem.so`
21:08:06 `env` show anything?
21:08:32 No references to umem or LIBS in my env.
21:11:06 what does 'gcc -dumpspecs' show (might need to paste and add in some newlines to make it readable)
21:11:33 nothing mentioning umem
21:12:22 If I do this: LIBS = deps/libarchive/.libs/libarchive.a
21:12:33 and this: $(CC) $(CFLAGS) -o $@ $^ -lumem $(LIBS)
21:12:34 then it works
21:12:48 Which, maybe that's how it was always supposed to be?
21:13:19 Even compiling on multiarch, I get this message: ld: warning: file /lib/libumem.so: attempted multiple inclusion of file
21:13:27 But when I do it this new way, I don't get that message.
21:15:30 the $^ is maybe a factor
21:15:40 that's all the prerequisites IIRC
21:16:10 which means you end up with something like $(CC) $(CFLAGS) $(OBJ:%=obj/%) $(LIBS) $(LIBS)
21:16:16 Yeah, and it's weird that `-lumem` would be listed as a prereq.
21:16:59 maybe try replacing $^ with $(OBJ:%=obj/%)
21:17:11 well
21:18:20 and maybe just add the libarchive dep explicitly
21:20:57 I'm starting to think that $(LIBS) should never have been on the $(CC) line.
21:22:45 Doing this: LIBS = deps/libarchive/.libs/libarchive.a
21:22:56 and: $(CC) $(CFLAGS) -o $@ $^ -lumem
21:22:58 also works.
22:59:40 bahamat: Yes I think that is probably a mistake I made there in #4
22:59:57 It used to be a list of archives and then I committed a category error
23:00:22 Haha. That makes much more sense now!
23:00:40 I would do something like LDLIBS=-lumem and include that _just_ in the $(CC) args
23:00:56 and then remove LIBS from the $(CC) args, and just leave it in the dependency list
23:01:15 This is what I have so far: https://github.com/TritonDataCenter/zfs_snapshot_tar/pull/7/files
23:02:04 That seems to work, but I think doing the LDLIBS thing is a good idea.
23:02:06 I wouldn't put it in CFLAGS because it shouldn't get passed to all the *.o compilation
23:02:13 Even if it appears to be working
23:03:04 But yeah otherwise I think you're in the right place
23:04:17 Ok, how's that now?
23:17:07 bahamat: lgtm!
23:17:17 jclulow: Awesome, thanks!!
23:25:40 bahamat: You're welcome! Somewhat impressive that zfs_snapshot_tar lives on haha
23:26:42 With this, I finally move cn-agent out of node-bitrot state.
23:27:24 I mean, it's still only v6, but getting to this point is going to make getting things onto a modern version of node much easier.
23:28:31 I do wonder if it would not be easier to rewrite cn-agent in a modern language haha
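Aside on the `-lumem` to `/lib/libumem.so` mystery above: one likely explanation, assuming the `make` in use is GNU make (the "No rule to make target" message looks like it), is GNU make's documented special case for prerequisites written as `-lNAME`. It searches for `libNAME.so` and then `libNAME.a` (the default `.LIBPATTERNS`) in the current directory, any vpath/VPATH directories, and then /lib, /usr/lib and prefix/lib, and substitutes the file it finds, so `$^` carries a full path. That search knows nothing about `-m64`. The Python sketch below only models the lookup; the function names and the temporary directories standing in for /lib and /lib/64 are made up for illustration.

```python
import os
import tempfile

# Default value of GNU make's .LIBPATTERNS variable.
LIBPATTERNS = ("lib%.so", "lib%.a")


def resolve_prerequisite(name, search_dirs, patterns=LIBPATTERNS):
    """Return the path substituted for a -lNAME prerequisite in this model."""
    if not name.startswith("-l"):
        return name  # ordinary prerequisites are used as written
    stem = name[2:]
    for pattern in patterns:
        filename = pattern.replace("%", stem)
        for directory in search_dirs:
            candidate = os.path.join(directory, filename)
            if os.path.exists(candidate):
                return candidate
    return name  # nothing found: left as written


def link_inputs(prerequisites, search_dirs):
    """Model what $^ contributes to the link line after the search."""
    return [resolve_prerequisite(p, search_dirs) for p in prerequisites]


# Hypothetical stand-ins for /lib (32-bit) and /lib/64, built in a temp dir.
root = tempfile.mkdtemp()
lib32 = os.path.join(root, "lib")
lib64 = os.path.join(lib32, "64")
os.makedirs(lib64)
for directory in (lib32, lib64):
    open(os.path.join(directory, "libumem.so"), "w").close()

# Only the 32-bit directory is on the model's search list, as with /lib.
print(link_inputs(["obj/main.o", "-lumem"], [lib32]))
```

In this model the 32-bit library path lands on the link line no matter what flags the compiler receives, which is consistent with the fix discussed above: keep `-lumem` out of the prerequisite list (for example in an `LDLIBS` variable used only in the recipe) and leave only real files such as the libarchive archive as dependencies.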