-
Agnar
jbk: my initial mistake was not updating the perl version in illumos.sh. after fixing that I learned that OI dropped Sun/Solaris/Utils.pm from perl 5.36, but if I build illumos-gate with perl 5.38 it somehow still searches for Sun/Solaris/Utils.pm in the 5.36 directories. that could be a bug in my perl, because I still run an OI from Nov 2023, which I can't update: after an update, ppt forwarding to my OpenWRT bhyve VM seems to stop working
-
Agnar
tl;dr: I will set up a dedicated build bhyve vm for illumos-gate :)
-
jclulow
Agnar: Have you filed a bug about the ppt business?
-
jclulow
Even if you don't know what's wrong, the symptoms, and when it was working and when it stopped working, etc, would be good
-
Agnar
jclulow: not yet, I suspect a problem with openWRT as well. after my previous comment here, I decided to update OpenWRT before OI update and I'm going to reboot now into a new BE - let's see if it works then
-
Agnar
ok, doesn't work. I'll open an issue
-
andyf
Thanks, I'll be interested to see what's going on there.
-
Agnar
I have opened illumos.org/issues/16467 and added you, andyf, to the watchers
-
fenix
→ BUG 16467: pass-through intel wifi stops working after upgrading OpenIndiana (New)
-
andyf
Thanks, I'll take a look at the weekend and see if I can spot anything in that last sync that might have caused it.
-
Agnar
andyf: I just got the idea to copy the old ppt to the new BE and check if that works
-
Agnar
andyf: will update the issue then
-
wacki
I got an HP Z4 G4 with a Xeon W-2245. It failed to boot OpenIndiana: pci: unable to get IRQ routing information, required buffer space of 291 entries exceeds max
-
wacki
Any ideas what went wrong?
-
jbk
we seem to have a hard coded limit of 255, though I'm not sure offhand why (e.g. just arbitrary 'this should be enough' limit that's no longer true, some deeper architectural requirement, etc.)
-
jbk
though are you using efi boot or legacy bios?
-
wacki
Legacy BIOS
-
wacki
HP Z4 G4 is not so new. This machine is roughly 4 years old.
-
Woodstock
wacki: did it work before?
-
jbk
if there's an option for efi, it might be worth trying that
-
wacki
I just bought the machine.
-
wacki
It was my first attempt. With a new boot image from April 10th that I recently created.
-
wacki
I had to create a DVD because booting from USB stick didn't work at all.
-
Woodstock
might be worth trying older versions. i'm seeing some pci weirdness on one of my older machines that started some time in '23.
-
wacki
I had the same issue with an OI 2023/05 boot medium.
-
wacki
With UEFI boot from DVD I don't get an error message but it is in maintenance mode now.
-
jbk
does it say what failed to drop it into maint mode?
-
Aedil
Incidentally, reading the messages above: a year ago I failed to install OI illumos in EFI mode into a Linux KVM/qemu virtual machine on Debian/testing. Only installing in legacy BIOS mode was possible.
-
Aedil
Maybe it was even 1.5 to 2 years ago.
-
tsoome
wacki jbk that's most likely the problem with devfsadm being run late, so it is missing the device nodes needed to access the install media
-
tsoome
Aedil do you have more details, issue open perhaps?
-
Aedil
tsoome: No, I was not able to install. I don't know any more specific details.
-
wacki
@tsoome I have created new boot media to have your slimsource fix included.
-
tsoome
wacki and it is failing with it too?
-
tsoome
wacki then we need to find out the device path and see why that discovery program can not detect it...
-
rmustacc
Woodstock: What are the symptoms of pci weirdness?
-
Woodstock
rmustacc: npe WARNING: nvme1: regs[1] does not have a valid MMIO address
-
Woodstock
rmustacc: nvme: WARNING: nvme1: failed to map regset 1
-
Woodstock
rmustacc: the system is an AMD Bulldozer and has two quite old (v1.0) intel nvme ssds
-
Woodstock
nvme0 works fine, only nvme1 has this problem
-
Woodstock
this box runs smartos, and when running an old PI both nvme devices work
-
rmustacc
What is the value it ended up with there?
-
Woodstock
reg property: value=00090000.00000000.00000000.00000000.00000000.03090010.00000000.00000000.00000000.00004000
-
Woodstock
the one that works looks like this: value=00020000.00000000.00000000.00000000.00000000.03020010.00000000.00000000.00000000.00004000
-
rmustacc
When you say one that works you mean nvme0 vs. nvme1 as opposed to across the two boots.
-
Woodstock
yes
-
Woodstock
old: nvme0 and nvme1 work, new: nvme0 works, nvme1 doesn't
-
rmustacc
All good, just trying to make sure that those were for two different devices which is what I'd expect given the firstu32.
-
Woodstock
i haven't bisected it yet, but the "old" PI was at least a year old, iirc
-
rmustacc
*first u32
-
rmustacc
There were a few major changes to the PCIe subsystem and boot path from both Andy and myself.
-
rmustacc
Between 15743 and 15587 (and friends).
-
andyf
Woodstock - can you grab "assigned-addresses" too?
-
Woodstock
working: value=83020010.00000000.fe510000.00000000.00004000
-
Woodstock
broken: value=83090010.00000000.00000000.00000000.00004000
-
rmustacc
OK, so we didn't assign an address, aha.
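
The two "assigned-addresses" values quoted above can be decoded by hand. What follows is a minimal Python sketch, assuming the standard IEEE 1275 PCI bus binding layout for the first cell (phys.hi: npt000ss bbbbbbbb dddddfff rrrrrrrr) followed by a 64-bit address and a 64-bit size. The function names are illustrative only and are not part of any illumos tool.

```python
# Assumed layout (IEEE 1275 PCI binding): phys.hi, phys.mid, phys.lo, size.hi, size.lo
SPACE_NAMES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


def decode_phys_hi(phys_hi):
    """Split the first 32-bit cell into its bit fields."""
    return {
        "non_relocatable": bool((phys_hi >> 31) & 0x1),
        "prefetchable": bool((phys_hi >> 30) & 0x1),
        "aliased": bool((phys_hi >> 29) & 0x1),
        "space": SPACE_NAMES[(phys_hi >> 24) & 0x3],
        "bus": (phys_hi >> 16) & 0xFF,
        "device": (phys_hi >> 11) & 0x1F,
        "function": (phys_hi >> 8) & 0x7,
        "register": phys_hi & 0xFF,  # config-space offset, 0x10 is BAR0
    }


def parse_assigned_address(value):
    """Parse one five-cell entry written as dot-separated hex words."""
    hi, mid, lo, size_hi, size_lo = (int(w, 16) for w in value.split("."))
    entry = decode_phys_hi(hi)
    entry["address"] = (mid << 32) | lo
    entry["size"] = (size_hi << 32) | size_lo
    return entry


if __name__ == "__main__":
    for label, value in [
        ("working", "83020010.00000000.fe510000.00000000.00004000"),
        ("broken", "83090010.00000000.00000000.00000000.00004000"),
    ]:
        e = parse_assigned_address(value)
        print(label, "bus", e["bus"], "reg", hex(e["register"]),
              e["space"], "addr", hex(e["address"]), "size", hex(e["size"]))
```

Under that assumed layout, both entries describe a 16 KiB 64-bit memory BAR at config offset 0x10. The working device on bus 2 has address 0xfe510000, while the one on bus 9 has address 0, which matches the observation that no address was assigned.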
-
wacki
tsoome: Do you have instructions for me?
-
andyf
Booting with pci_autocinfig:pci_debug_boot = 1 might be informative if that's easy enough to do.
-
wacki
How could I do that with my ISO image?
-
andyf
Sorry, that was for Woodstock
-
wacki
:)
-
andyf
and *pci_autoconfig
-
tsoome
wacki if you got shell prompt, then rmformat command would hopefully show the device name for cd/dvd
-
wacki
I will check it.
-
tsoome
wacki from it you can get physical device path
-
tsoome
wacki I can not remember the name of the tool the miniroot is using to scan for install media... should find the name from /lib/svc/method/* scripts
-
wacki
rmformat is not available
-
tsoome
does dmesg list dvd?
-
wacki
dmesg is not available
-
tsoome
uh, such an rain...
-
tsoome
wacki oh right, that's really early startup and most commands are missing...
-
wacki
Yes, and in case of EFI boot without any hints.
-
tsoome
is prtconf there?
-
wacki
yes
-
wacki
But no "more"!
-
tsoome
the tools to list cd and usb devices are /sbin/listcd and /sbin/listusb
-
wacki
Both are silent
-
tsoome
basically, if they can not find cd/dvd, it hints that we are missing a driver or the driver is not attached, or those list* commands are somehow buggy.
-
tsoome
in loader, lsdev -v should print out the device paths in UEFI mode, that should give some hint about where to look in the /devices tree
-
wacki
PciRoot (0x0) / Pci(0x17, 0x0) / Sata(0x4, 0x0, 0x0)
-
wacki
Is it worth trying to boot from a USB-attached cd drive?
-
tsoome
definitely
-
KungFuJesus
jbk: hmm, well another drive died and evidently that config entry I had added didn't keep the LED lit
-
KungFuJesus
ahhh crap, typod it to be dev.io not vdev.io
-
wacki
This looks promising. It only supports legacy boot from the usb cd drive and thus throws the IRQ routing error but it continues to load and I am in the installation menu now.
-
gitomat
[illumos-gate] 16459 want emlxs to support Oracle branded LP adapters -- Olaf Bohlen <olbohlen⊙ed>
-
jbk
wacki: so just slow?
-
KungFuJesus
jbk: so an interesting piece of the puzzle. Restarting fmd after setting the correct events to subscribe to made the fmtopo command take an eternity (and also not turn on the LED automatically)
-
KungFuJesus
is there something accidentally quadratic (or perhaps there's some nasty XML tree parsing going on that's slow) that's per IO error or something?
-
KungFuJesus
when the faulted drive was yanked suddenly fmtopo performed reasonably (not fast, mind you, but seconds not a minute or so)
-
jbk
it seems more likely it was probably trying to query the bad drive for something and was stuck there
-
jbk
sd.c will i think re-try up to something like 6 minutes for certain requests
-
jbk
which is probably not great if that's holding up other stuff
-
KungFuJesus
hah, sorry just realized you weren't the person on the ticket at the time, but I think you had joined the conversation. Looks like jclulow had been the one
-
jbk
that's good, because i was trying to dig to remind myself of the specifics and was starting to come up blank :)
-
wacki
I can't say that it is slow. It's like always when installing from a DVD medium.
-
KungFuJesus
pastebin.com/78UZLakR seems to indicate the disk hanging on some SCSI commands, but fmadm considered it faulted, I'd have thought it would have given up
-
bahamat
Is there something special I need to do to make something link against 64-bit libumem?
-
wacki
Despite the error message everything else seems to work. I have already rebooted the machine after the installation.
-
bahamat
I have something I'm converting from 32 to 64-bit that has `LIBS = -lumem` and `$(PROG): $(OBJ:%=obj/%) $(LIBS)`.
-
jbk
bahamat: not as far as I know.. sometimes you need -L[/usr]/lib/amd64
-
bahamat
Something is implicitly turning that `-lumem` into `/lib/libumem.so`.
-
jbk
or /usr/local
-
bahamat
I could hard code `/lib/64/libumem.so`, but it seems like if it's already using magic, there should be some magic to say no, the 64-bit one.
-
jbk
how are you linking? calling gcc or ld directly?
-
jbk
if you add '-m64' to CFLAGS, does that make any difference?
-
bahamat
I tried that already, and no.
-
KungFuJesus
jclulow: (or jbk) anything else I should look at that might make the illumos.org/issues/16353 issue more useful?
-
fenix
→ BUG 16353: disk fault lights should come on for ZFS vdev-level faults (New)
-
wacki
tsoome: Thanks for your help!
-
bahamat
jbk: I get this: ld: fatal: file /lib/libumem.so: wrong ELF class: ELFCLASS32
-
bahamat
Which is the same error without using `-m64`
-
bahamat
Which, yeah, that makes sense.
-
jbk
it sounds like something's messed with the version of gcc being invoked
-
jbk
you could try adding the -L/lib/amd64 and see if that at least works around the issue
-
bahamat
It's just gcc
-
tsoome
wacki yw. it is still a good idea to try to figure out why it was failing with sata
-
bahamat
to CFLAGS? or LIBS?
-
jbk
LIBS i guess based on the makefile
-
jbk
(it'd usually be LDFLAGS, but that's not being used here, and if it's just for investigation...)
-
wacki
Yes, you are right. It would also be good to get rid of the IRQ limitation.
-
bahamat
jbk: make: *** No rule to make target '-L/usr/amd64', needed by 'zfs_snapshot_tar'. Stop
-
wacki
Maybe we can try to investigate further on another day. I will need to go to bed soon.
-
bahamat
Everything I can think of to "magic" it to 64-bit gets that same error.
-
jbk
oh hrm.. LIBS is in the dep line
-
bahamat
Yeah.
-
jbk
could just add it before $(LIBS) in the recipe line
-
bahamat
and on multi-arch it works.
-
jbk
but are you using the strap gcc or the pkgsrc gcc?
-
bahamat
pkgsrc
-
jbk
might try the strap one (I'm assuming there's a smartos-live repo somewhere) to see if it makes any difference
-
jbk
my guess (though I've not delved too deeply into this stuff with gcc) is that maybe something's not quite right with a spec file
-
jbk
IIRC that's how gcc figures out how to invoke ld
-
bahamat
that didn't work either :-(
-
bahamat
Something is still implicitly sticking /lib/libumem.so in there.
-
bahamat
jbk: This is being built as a sub-component of cn-agent, so there's no strap available.
-
jbk
but that gcc isn't working either?
-
jbk
mostly as a test
-
jbk
if that gcc works
-
jbk
but pkgsrc doesn't
-
jbk
then you have something to compare
-
bahamat
If I hard code it with /lib/64/libumem.so it compiles and works fine.
-
bahamat
It just feels like the wrong thing to do.
-
jbk
are you running make on that directly
-
jbk
?
-
bahamat
Yes
-
jbk
at least in my smartos zone i used to build smartos-live on my home machine
-
jbk
it seems to DTRT:
-
bahamat
Like this: make CTFCONVERT=/bin/true CTFMERGE=/bin/true STRIP=/bin/true
-
bahamat
It does the right thing if I use 15.4.1-multiarch.
-
bahamat
Building 32-bit is fine.
-
bahamat
It's not `-lumem` in itself that's the problem
-
bahamat
It's that something is magically inserting `/lib/libumem.so`
-
jbk
`env` show anything?
-
bahamat
No references to umem or LIBS in my env.
-
jbk
what does 'gcc -dumpspecs' show (might need to paste and add in some newlines to make it readable)
-
bahamat
nothing mentioning umem
-
bahamat
If I do this: LIBS = deps/libarchive/.libs/libarchive.a
-
bahamat
and this: $(CC) $(CFLAGS) -o $@ $^ -lumem $(LIBS)
-
bahamat
then it works
-
bahamat
Which, maybe that's how it was always supposed to be?
-
bahamat
Even compiling on multiarch, I get this message: ld: warning: file /lib/libumem.so: attempted multiple inclusion of file
-
bahamat
But when I do it this new way, I don't get that message.
-
jbk
the $^ is maybe a factor
-
jbk
that's all the prerequisites IIRC
-
jbk
which means you end up with something like $(CC) $(CFLAGS) $(OBJ:%=obj/%) $(LIBS) $(LIBS)
-
bahamat
Yeah, and it's weird that `-lumem` would be listed as a prereq.
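
A likely source of the implicit `/lib/libumem.so`: GNU make treats a prerequisite of the form `-lNAME` specially. It searches for `libNAME.so` and then `libNAME.a` in the current directory, any VPATH directories, and then `/lib`, `/usr/lib` and the install prefix's lib directory, and `$^` then carries the path it found. That search knows nothing about 32-bit versus 64-bit library directories. The Python sketch below only models that lookup to illustrate the effect. It is a simplification (no vpath handling, hypothetical function names) and not make's actual implementation.

```python
import os
import tempfile


def resolve_lib_prereq(prereq, search_dirs, patterns=("lib%.so", "lib%.a")):
    """Model how a '-lNAME' prerequisite is turned into a file path.

    Each pattern is tried in turn across all directories; the first
    existing file wins. Anything else is returned unchanged.
    """
    if not prereq.startswith("-l"):
        return prereq
    name = prereq[2:]
    for pattern in patterns:
        filename = pattern.replace("%", name)
        for directory in search_dirs:
            candidate = os.path.join(directory, filename)
            if os.path.exists(candidate):
                return candidate
    return prereq  # real make would report a missing target here


def expand_prereqs(prereqs, search_dirs):
    """Approximate what $^ would contain for a list of prerequisites."""
    return [resolve_lib_prereq(p, search_dirs) for p in prereqs]


def demo():
    # Build a fake tree with a 32-bit lib/ and a 64-bit lib/64/ copy.
    with tempfile.TemporaryDirectory() as root:
        lib = os.path.join(root, "lib")
        os.makedirs(os.path.join(lib, "64"))
        for d in (lib, os.path.join(lib, "64")):
            open(os.path.join(d, "libumem.so"), "w").close()
        # Only lib/ is searched, so the lib/64/ copy is never considered.
        return expand_prereqs(["obj/main.o", "-lumem"], [lib])


if __name__ == "__main__":
    print(demo())
```

In this model the `-lumem` prerequisite always resolves to the copy directly under `lib/`, whatever `-m64` or `-L` flags the compiler is later given, which is consistent with the `ELFCLASS32` error reported earlier.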
-
jbk
maybe try replacing $^ with $(OBJ:%=obj/%)
-
jbk
well
-
jbk
and maybe just add the libarchive dep explicitly
-
bahamat
I'm starting to think that $(LIBS) should never have been on the $(CC) line.
-
bahamat
Doing this: LIBS = deps/libarchive/.libs/libarchive.a
-
bahamat
and: $(CC) $(CFLAGS) -o $@ $^ -lumem
-
bahamat
also works.
-
jclulow
bahamat: Yes I think that is probably a mistake I made there in #4
-
jclulow
It used to be a list of archives and then I committed a category error
-
bahamat
Haha. That makes much more sense now!
-
jclulow
I would do something like LDLIBS=-lumem and include that _just_ in the $(CC) args
-
jclulow
and then remove LIBS from the $(CC) args, and just leave it in the dependency list
-
bahamat
That seems to work, but I think doing the LDLIBS thing is a good idea.
-
jclulow
I wouldn't put it in CFLAGS because it shouldn't get passed to all the *.o compilation
-
jclulow
Even if it appears to be working
-
jclulow
But yeah otherwise I think you're in the right place
-
bahamat
Ok, how's that now?
-
jclulow
bahamat: lgtm!
-
bahamat
jclulow: Awesome, thanks!!
-
jclulow
bahamat: You're welcome! Somewhat impressive that zfs_snapshot_tar lives on haha
-
bahamat
With this, I finally move cn-agent out of node-bitrot state.
-
bahamat
I mean, it's still only v6, but getting to this point is going to make getting things onto a modern version of node much easier.
-
jclulow
I do wonder if it would not be easier to rewrite cn-agent in a modern language haha