-
SamuelMarks
hi
-
SamuelMarks
Can someone show me the output of their `uname`?
-
nomad
; uname
-
nomad
SunOS
-
nomad
that's on OmniOS 151046 (LTS)
-
andyf
The complete `uname -a` is "SunOS web0 5.11 omnios-r151046-c6bc13f710 i86pc i386 i86pc"
-
andyf
and `uname -o` outputs "illumos"
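A minimal sketch, in Python, of splitting the `uname -a` line quoted above into fields to check whether the system name still reads "SunOS". The helper name `parse_uname` and the field labels are illustrative assumptions, based on the seven-field layout of that one sample.

```python
def parse_uname(line: str) -> dict:
    """Split a `uname -a` line into labelled fields.

    Assumes the seven-field layout of the sample quoted in the chat:
    sysname nodename release version machine processor platform.
    """
    labels = ["sysname", "nodename", "release", "version",
              "machine", "processor", "platform"]
    fields = line.split()
    if len(fields) != len(labels):
        raise ValueError(f"expected {len(labels)} fields, got {len(fields)}")
    return dict(zip(labels, fields))


# The sample string is the one quoted in the chat.
sample = "SunOS web0 5.11 omnios-r151046-c6bc13f710 i86pc i386 i86pc"
info = parse_uname(sample)
print(info["sysname"])  # prints "SunOS"
```

The check is only as good as the assumed layout; a hostname or version string containing spaces would break the simple whitespace split.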
-
gitomat
[illumos-gate] 16206 lmrc can race against itself, causing a panic in lmrc_tran_start() -- Hans Rosenfeld <rosenfeld⊙gho>
-
Woodstock
danmcd: are you around?
-
danmcd
Woodstock: yes
-
danmcd
(pardon latency)
-
Woodstock
danmcd: it seems 16118 doesn't build without 16115
-
danmcd
Ouch?
-
danmcd
fenix illumos#16118
-
fenix
BUG 16118: Fix problems with libnwam communication between 32-bit and 64-bit applications (Closed)
-
danmcd
fenix illumos#16115
-
fenix
BUG 16115: Want 64-bit libnwam (Pending RTI)
-
danmcd
I built locally from scratch with 16118; did so before I actually pushed it.
-
danmcd
Do you perhaps need to `make clobber` ? Is this a bit of a flag day?
-
Woodstock
i got a build failure in 'dmake check' as amd64/libnwam.so was missing
-
Woodstock
hm maybe
-
danmcd
I'm about to merge with your just-pushed into illumos-joyent fix. I'll see if SmartOS build ALSO builds or not.
-
danmcd
So I'm rebuilding -gate on HDC and just-merged illumos-joyent right now.
-
danmcd
Oh damn.
-
danmcd
I missed it.
-
danmcd
==== cstyle/hdrchk errors ====
-
danmcd
dmake: Warning: Command failed for target `lib'
-
danmcd
dmake: Warning: Command failed for target `libnwam-nodepend'
-
danmcd
dmake: Warning: Don't know how to make target `amd64/libnwam.so.1'
-
danmcd
dmake: Warning: Target `check' not remade because of errors
-
rmustacc
The TYPECHECK target won't work if we're not building both libraries.
-
danmcd
@Woodstock
-
danmcd
Yeah.
-
danmcd
So back it out?
-
rmustacc
If you delete $(TYPECHECK) from check, then that should work. A general question to follow up on, then, is the mail_msg we had with the clean build.
-
Woodstock
danmcd: it might be easier to just push 16115
-
danmcd
It's up for RTI, isn't it?
-
Woodstock
yes
-
Woodstock
i wrote it originally, but marcel has taken care of review and RTI
-
danmcd
Hang on.
-
Woodstock
so i can't just approve it :)
-
danmcd
In OI for 2mos, huh?
-
danmcd
You're pushing it though.
-
danmcd
Thanks @Woodstock, and sorry for not seeing this when I built. (I checked mail_msg directly on the build box and was sloppy).
-
danmcd
@rmustacc worth an e-mail?
-
gitomat
[illumos-gate] 16115 Want 64-bit libnwam -- Hans Rosenfeld <rosenfeld⊙gho>
-
rmustacc
danmcd: Yes, it's worth two e-mails. One as a heads up. The second as a question as to how the build didn't fail for the submitter.
-
rmustacc
And to be clear, stuff like this happens, so just want to make sure folks are aware for the next time.
-
SamuelMarks
Can someone show me the output of their `uname`?
-
rmustacc
What options do you want?
-
rmustacc
(Also several people did yesterday and it's in the logs after your ask, fyi)
-
danmcd
What are you trying to figure out SamuelMarks?
-
SamuelMarks
danmcd - Trying to see if it still reads "SunOS" somewhere
-
danmcd
Ahhh.
-
danmcd
Yes, `uname -a` on two different distros starts with "SunOS".
-
SamuelMarks
rmustacc - darn I went to bed; log link?
-
SamuelMarks
danmcd - SmartOS and OpenIndiana?
-
danmcd
SmartOS and OmniOS
-
SamuelMarks
oh 2.5 hours later!
-
SamuelMarks
Thanks that's helpful
-
SamuelMarks
nomad andyf - Thanks
-
danmcd
Okay, with 16115 pulled in no more `make check` breakage. Thanks @Woodstock, and sorry @carba for not catching this myself pre-push.
-
rmustacc
If there's something else you need, don't hesitate to ask SamuelMarks.
-
SamuelMarks
thx
-
SamuelMarks
I remember seeing a weird runner years ago that spun up a vm inside macos to run illumos for testing under github actions - what's the recommended CI tool for illumos testing?
-
Woodstock
rmustacc: do you know a method to reserve 32bit dma memory at boot, or alternatively force driver to attach as early as possible?
-
SamuelMarks
OpenIndiana
-
rmustacc
Woodstock: Not off hand no. I think I'd figure out if other devices grabbing from that 32-bit DMA region need to be or if we biased towards the wrong arena or not.
-
rmustacc
You can force attach early with some modifications I guess.
-
rmustacc
Otherwise, I think the longer term answer is IOMMU.
-
tsoome
Woodstock ddi-forceattach(9p)?
-
rmustacc
The driver.conf file won't be read ahead of the driver normally being processed.
-
rmustacc
It also doesn't prevent detach.
-
rmustacc
It just forces it to reattach, IIRC.
-
Woodstock
hm
-
rmustacc
Other things that you can maybe do will depend on the BIOS/platform.
-
rmustacc
For example, we move config space up to be above 4 GiB on our platforms to save 32-bit MMIO space.
-
Woodstock
i see that mlxcx allocates gigabytes worth of dma memory, and lmrc attach is starving trying to get 37mb for SGL chains :)
-
rmustacc
So mlxcx supports full 64-bit addresses.
-
Woodstock
yes
-
rmustacc
So I think my question is why isn't it getting 64-bit DMA?
-
Woodstock
is there a way to find out?
-
Woodstock
that is, whether it's 32bit or 64bit dma?
-
Woodstock
the driver looks like it wants 64bit dma
-
rmustacc
Probably by looking at the PAs and the kmem cache or vmem segments?
-
rmustacc
Dunno.
-
rmustacc
There's a way. How painful I dunno.
-
Woodstock
lmrc could do 64bit dma, too, at least in theory. but i haven't seen a controller which switches that config bit on.
-
Woodstock
hm yes
-
rmustacc
Probably can walk all the DMA handles it allocated.
-
rmustacc
I don't know of anything off hand.
-
jbk
Woodstock: and then some :) we had a server where 64GB of ram wasn't enough for mlxcx
-
rmustacc
That seems problematic. Where's arekinath to ship us some DRAM?
-
danmcd
Now I'm wondering about 64-bit DMA being enabled in the BIOS or not on my own test boxes...
-
jbk
thankfully, the integrator seems to have forgotten a few DIMMs, so once it had 256gb of ram, it was fine
-
jbk
(it had multiple mlxcx ports aggregated together + vnics, which wasn't helping with memory usage)
-
sommerfeld
jbk: too many oversized rx queues?
-
jbk
well i think each instance is allocating <# CPUs> rings * a few thousand DMA buffers per ring
-
jbk
plus vnics on top of that I think with mlxcx will also allocate additional rings
-
sommerfeld
yep, too many oversized rx queues at least for most workloads..
-
jbk
not sure offhand if the mlxcx driver lets the hw segment inbound rx packets or not
-
jbk
if not, jumbo frames would be even worse
-
jbk
always fun when dladm hangs for 10 minutes because it's blocked while the kernel (essentially) defrags (if only it had a graphic :P)
-
antranigv
oh, more users here!
-
antranigv
can anyone explain this? I have no idea where to start googling :)
antranigv.am/misc/omnios_installation_err.jpg
-
sjorge
do we support Intel chips with both E and P cores?
-
danmcd
You mean with them mixed?
-
rmustacc
We should boot on it. The scheduler isn't doing anything special with it.
-
rmustacc
Though I have not had any such system that I've tried to boot on.
-
sommerfeld
antranigv: what hardware? the IDE errors suggest old (but it may be a SATA put into IDE emulation mode which you should probably turn off in the BIOS)
-
antranigv
sommerfeld you are right! it was in IDE mode. we changed it to AHCI and now it all works fine
-
sommerfeld
ah, good that you figured it out...
-
tsoome
cmdk needs some love...
-
antranigv
damn this rge0 interface
-
antranigv
I can't even see the ARP packets going out
-
neirac
antranigv I have a rge card it works, what problem do you have?
-
antranigv
neirac nothing is coming out. I can't even see the ARP packets at the other end. lemme check the switch
-
neirac
antranigv if you see the nic with dladm show-phys, means the driver attached and should work, I'm using one right now in omnios
-
antranigv
neirac rge0 ethernet unknown 0 unknown rge0; this is the output of dladm
-
rmustacc
Have you created an ip interface on top of it?
-
antranigv
rmustacc yes! I can see it in `ipadm`
-
rmustacc
OK, then I'm a bit surprised that dladm says the link is in an unknown state (versus explicitly up/down), but given that it's not up, I guess it's not really a surprise then that you're not seeing packets.
-
antranigv
rmustacc bad driver?
-
rmustacc
Don't know.
-
rmustacc
Not enough information to make even a speculative guess.
-
antranigv
rmustacc how can I debug such an issue? sorry, completely #NewHere
-
rmustacc
So what'd probably help someone is to gist/pastebin/whatever (please don't dump into the chat), the full output of ipadm and dladm.
-
antranigv
I'll get to it. it's hard to do that when there's no inet :D
-
rmustacc
Sure, I understand. Or an image or something.
-
rmustacc
Just trying to make sure we understand the state of the system.
-
rmustacc
The other thing I'd check is whether there is anything in /var/adm/messages from the rge driver.
-
antranigv
okay, so I'm not in front of the display, I asked my intern to run some commands, make an image and send it. I will upload them all in a single page
-
antranigv
I've never seen anything like this in the last 5 days that I've been playing around with illumos xD
-
rmustacc
It's not something I know right away either. So it's basically going to first be about confirming the basic set up, links plugged in, the driver internally called the MAC mc_start(9E) entry point, etc.
-
danmcd
antranigv: You can always `grep rge /var/adm/messages` and see if rge(4D) whined at all? Long shot, but sometimes one gets lucky...
-
antranigv
okay, so.
-
antranigv
ipadm
-
antranigv
dladm
-
antranigv
dladm show-ether
-
antranigv
dladm show-phys
-
antranigv
grepping the messages
-
antranigv
should I get any more info?
-
ararat
ping
-
ararat
test
-
rmustacc
Can you also grab /usr/lib/pci/pcieadm show-devs output antranigv?
-
rmustacc
That'll just let us confirm what exact rge device we're working with.
-
antranigv
sure!
-
antranigv
ararat can you run these commands in the server? send the images to me, I will upload them to a server
-
antranigv
ararat to be specific, the file we're grepping (you were not here) is the following: grep rge /var/adm/messages
-
neirac
antranigv this is the output of prtconf -vp and this is the nic I have
termbin.com/yodo check for Ethernet controller and name, mine is name: 'pci1462,7721'
-
rmustacc
If you're trying to get the name neirac /usr/lib/pci/pcieadm show-devs rge is a friendlier way there.
-
rmustacc
The name can be a little deceptive as that's defaulting to a device subsystem ID.
-
rmustacc
So there's a lot more variance in it.
-
neirac
rmustacc let me try that
-
rmustacc
No real need as all the info is in the output you had.
-
rmustacc
The device-name string is more useful there and the primary device id as opposed to the subsystem tells us what particular chipset it is.
-
neirac
rmustacc I don't find pcieadm in omnios is that in a package?
-
rmustacc
Yeah, pkg:/diagnostic/pci.
-
neirac
awesome 1/0/0 PCIe Gen 1x1 rge0 RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
-
neirac
looks a lot nicer
-
antranigv
ararat thank you!
-
antranigv
here's the images
-
antranigv
and as much output as I could
-
antranigv
rmustacc turns out pcieadm was in packages, so we were not able to install atm
-
antranigv
neirac looks like we both have the same card
-
neirac
but the model is the same? like the device-id ?
-
antranigv
neirac yup!
-
antranigv
neirac the revision is different
-
antranigv
but the device-id is the same: 00008168
-
ararat
test
-
antranigv
neirac we'll try rebooting, sometimes that works :)))
-
antranigv
ararat can you reboot the machine please?
-
neirac
what I see related to the error message regarding MSI is
src.illumos.org/source/xref/illumos…n/io/rge/rge_main.c?r=0dc2366f#1731 , I don't see how to disable MSI in rge
-
ararat
sure
-
rmustacc
I don't see an error there.
-
antranigv
neirac in BSD land, that would be using sysctls.
-
rmustacc
Just a note that we opted to use MSI interrupts, which is generally fine.
-
rmustacc
There is nothing in that to suggest there was an interrupt delivery problem.
-
neirac
rmustacc oh ok, just grepped the first thing I saw in the pictures
-
ararat
ping still doesn't work
-
ararat
for .1
-
antranigv
ararat I wonder if there's something in BIOS settings (AGAIN) that can help us, but probably not
-
antranigv
ararat is there another network card on the desktop? or is that the only NIC?
-
neirac
rmustacc is there a way to check with dladm or other command if a cable is connected?
-
ararat
yhy
-
antranigv
neirac the cable is connected for sure in our case, as rebooting into Linux, we can get an IP via DHCP. but I think the LINK is the answer to your question, no?
-
ararat
`cannot plumb rge0: already exists`
-
rmustacc
neirac: Ethernet PHYs don't usually distinguish between a cable being present and a carrier there.
-
rmustacc
Unlike an SFF transceiver, there isn't a dedicated presence pin that I know of.
-
neirac
rmustacc thanks
-
neirac
I don't know what else to check based on the pictures, maybe rem_drv rge and add_drv?
-
neirac
antranigv run rem_drv rge and then add_drv rge, and check dmesg
-
rmustacc
That's not going to work while there's a valid ip interface.
-
antranigv
I think we can delete the IP interface
-
antranigv
and delete the if
-
antranigv
and then do that
-
ararat
rem_drv requires reboot
-
antranigv
rebooting
-
ararat_
69.55/24
-
antranigv
ararat_ still can't ping it
-
antranigv
ararat_ what's the new state now, after you rebooted?
-
antranigv
disabled?
-
ararat_
yes
-
antranigv
can you try neirac 's suggestion now and run `add_drv rge` ?
-
ararat_
it says driver set in system but can't attach
-
rmustacc
Unfortunately, using rem_drv manually removes all the PCI aliases.
-
rmustacc
So a standalone add_drv won't get it to attach.
-
rmustacc
I'm not sure what this was going to accomplish, but I can try to go figure out the right syntax for you in a few minutes.
-
antranigv
rmustacc danke <3
-
antranigv
rmustacc wanna get this done soon, as ararat_ is still in the data center trying to have illumos up and running :DDD ararat_ is it cold in there? xD
-
ararat_
a lil bit, it's fine
-
rmustacc
update_drv -i '"pcie10ec,8168"' rge
-
neirac
rmustacc I thought rem_drv rge only removed that driver
-
rmustacc
It does, but it removes all the aliases that were installed.
-
neirac
rmustacc then add_drv -i adds that alias to /etc/driver_aliases and tries to attach?
-
rmustacc
You would need to put that in the add_drv line again.
-
rmustacc
But in general, I also don't really remove/add the driver for this unless I have a theory for what's going on because you end up fighting a bit against packaging.
-
neirac
rmustacc is there any other way to debug when a driver attaches ?
-
neirac
I mean force to attach/reattach
-
rmustacc
Sorry, I'm not in a place where I can lead a debugging session. But generally with stuff like this I use mdb to understand the driver's state and then DTrace.
-
antranigv
ararat_ after running `update_drv -i '"pcie10ec,8168"' rge` let us know what the output is. if there's no output, then send me a picture of `dmesg`
-
rmustacc
In terms of loading and unloading, modunload and modload will probably show up. But if we don't understand anything, I'm not sure that an add_drv/rem_drv will do anything that different from a reboot.
-
ararat_
at least one of m/i/P/p must be specified with -a and -d
-
rmustacc
Ah, sorry, I forgot the -a.
-
rmustacc
So toss that in before the -i.
-
ararat_
failed to attach
-
antranigv
ararat_ update_drv -a -i '"pcie10ec,8168"' rge
-
antranigv
damn
-
rmustacc
Ah, sorry, one other typo there it looks like. Sorry, typing this out without a system to try this on.
-
rmustacc
update_drv -a -i '"pciex10ec,8168"' rge
-
rmustacc
Basically, the alias was pciex, not pcie. My mistake.
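A small sketch, in Python, of how the alias string in that command is put together from the vendor and device IDs, so the `pciex` versus `pcie` prefix is not typed by hand. The helpers `pci_alias` and `update_drv_command` are hypothetical names for illustration; the lowercase, unpadded hex format is an assumption taken from the aliases that appear in this log.

```python
def pci_alias(vendor_id: int, device_id: int, express: bool = True) -> str:
    """Build a driver alias such as "pciex10ec,8168".

    Assumption: IDs are written as lowercase hex without padding, and
    PCI Express devices use the "pciex" prefix (plain PCI uses "pci").
    """
    prefix = "pciex" if express else "pci"
    return f"{prefix}{vendor_id:x},{device_id:x}"


def update_drv_command(alias: str, driver: str) -> str:
    """Format the update_drv invocation quoted in the chat.

    The alias is wrapped in double quotes inside single quotes so the
    shell passes the inner double quotes through unchanged.
    """
    return f"update_drv -a -i '\"{alias}\"' {driver}"


# Vendor 10ec and device 8168 are the IDs discussed above.
alias = pci_alias(0x10EC, 0x8168)
print(update_drv_command(alias, "rge"))
```

This only formats the string; it does not run anything or check that the alias matches the hardware.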
-
ararat_
ahah
-
ararat_
no error
-
antranigv
ararat_ can you send me the output of `dladm show-phys` ?
-
ararat_
rge0 Ethernet unknown 0 unknown rge0
-
ararat_
link media state speed duplex device
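A rough sketch, in Python, of reading a `dladm show-phys` row against that header to flag a link whose state is not "up". The helper names are illustrative assumptions, and the column list is copied from the header pasted above rather than from any documentation.

```python
# Column names as pasted in the chat (lowercased there).
HEADER = ["link", "media", "state", "speed", "duplex", "device"]


def parse_show_phys_row(row: str) -> dict:
    """Map one whitespace-separated row onto the header columns."""
    values = row.split()
    if len(values) != len(HEADER):
        raise ValueError(f"expected {len(HEADER)} columns, got {len(values)}")
    return dict(zip(HEADER, values))


def link_is_up(row: str) -> bool:
    """True only when the state column reads exactly "up"."""
    return parse_show_phys_row(row)["state"] == "up"


# The row is the one pasted in the chat.
row = "rge0 Ethernet unknown 0 unknown rge0"
print(parse_show_phys_row(row)["state"], link_is_up(row))
```

For the row above, the state column is "unknown", so the link is reported as not up, which matches what was seen on the machine.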
-
antranigv
damn it
-
antranigv
today's not our lucky day