04:33:53 hi
04:34:02 Can someone show me the output of their `uname`?
05:10:40 ; uname
05:10:40 SunOS
05:11:04 that's on OmniOS 151046 (LTS)
10:44:57 The complete `uname -a` is "SunOS web0 5.11 omnios-r151046-c6bc13f710 i86pc i386 i86pc"
10:45:15 and `uname -o` outputs "illumos"
15:01:18 [illumos-gate] 16206 lmrc can race against itself, causing a panic in lmrc_tran_start() -- Hans Rosenfeld
15:03:22 danmcd: are you around?
15:30:38 Woodstock: yes
15:31:10 (pardon latency)
15:36:59 danmcd: it seems 16118 doesn't build without 16115
15:37:19 Ouch?
15:37:35 fenix illumos#16118
15:37:36 BUG 16118: Fix problems with libnwam communication between 32-bit and 64-bit applications (Closed)
15:37:36 ↳ https://www.illumos.org/issues/16118 | https://code.illumos.org/c/illumos-gate/+/3188
15:37:44 fenix illumos#16115
15:37:44 BUG 16115: Want 64-bit libnwam (Pending RTI)
15:37:44 ↳ https://www.illumos.org/issues/16115 | https://code.illumos.org/c/illumos-gate/+/3186
15:38:17 I built locally from scratch with 16118; did so before I actually pushed it.
15:38:31 Do you perhaps need to `make clobber` ? Is this a bit of a flag day?
15:38:47 i got a build failure in 'dmake check' as amd64/libnwam.so was missing
15:38:54 hm maybe
15:39:28 I'm about to merge with your just-pushed into illumos-joyent fix. I'll see if SmartOS build ALSO builds or not.
15:44:12 So I'm rebuilding -gate on HDC and just-merged illumos-joyent right now.
15:48:15 Oh damn.
15:48:17 I missed it.
15:48:20 ==== cstyle/hdrchk errors ====
15:48:20 dmake: Warning: Command failed for target `lib'
15:48:20 dmake: Warning: Command failed for target `libnwam-nodepend'
15:48:22 dmake: Warning: Don't know how to make target `amd64/libnwam.so.1'
15:48:24 dmake: Warning: Target `check' not remade because of errors
15:48:39 The TYPECHECK target won't work if we're not building both librarierds.
15:48:43 *libraries
15:48:44 @Woodstock
15:48:49 Yeah.
15:49:32 So back it out?
15:49:33 If you delete $(TYPECHECK) from check then that should work and a general question then to follow up on the mail_msg we had with the clean build.
15:49:41 danmcd: it might be easier to just push 16115
15:49:48 It's up for RTI, isn't it?
15:49:52 yes
15:50:06 i wrote it originally, but marcel has taken care of review and RTI
15:50:07 Hang on.
15:50:11 so i can't just approve it :)
15:50:25 In OI for 2mos, huh?
15:51:22 You're pushing it though.
15:53:49 Thanks @Woodstock, and sorry for not seeing this when I built. (I checked mail_msg directly on the build box and was sloppy).
15:54:02 @rmustacc worth an e-mail?
15:57:07 [illumos-gate] 16115 Want 64-bit libnwam -- Hans Rosenfeld
16:08:18 danmcd: Yes, it's worth two e-mails. One as a heads up. The second as a question as to how the build didn't fail for the submitter.
16:09:19 And to be clear, stuff like this happens, so just want to make sure folks are aware for the next time.
16:10:45 Can someone show me the output of their `uname`?
16:11:11 What options do you want?
16:11:28 (Also several people did yesterday and it's in the logs after your ask, fyi)
16:14:26 What are you trying to figure out SamuelMarks?
16:19:14 danmcd - Trying to see if it still reads "SunOS" somewhere
16:19:23 Ahhh.
16:19:51 Yes, `uname -a` on two different distros starts with "SunOS".
16:19:52 rmustacc - darn I went to bed; log link?
16:20:04 danmcd - SmartOS and OpenIndiana?
16:20:14 SmartOS and OmniOS
16:20:17 And log link: https://log.omnios.org/illumos
16:20:41 oh 2.5 hours later!
16:20:48 Thanks that's helpful
16:21:04 nomad andyf - Thanks
16:26:12 Okay, with 16115 pulled in no more `make check` breakage. Thanks @Woodstock, and sorry @carba for not catching this myself pre-push.
16:34:15 If there's something else you need, don't hesitate to ask SamuelMarks.
16:47:23 thx
16:49:28 I remember seeing a weird runner years ago that spun up a vm inside macos to run illumos for testing under github actions - what's the recommended CI tool for illumos testing?
16:51:30 rmustacc: do you know a method to reserve 32bit dma memory at boot, or alternatively force driver to attach as early as possible?
16:51:34 * SamuelMarks is reading https://github.com/rust-lang/libc/issues/1405 which links to https://github.com/twpayne/chezmoi/blob/v2.9.3/.github/workflows/main.yml#L173-L191 which runs vagrant
16:51:57 OpenIndiana
16:53:32 Woodstock: Not off hand no. I think I'd figure out if other devices grabbing from that 32-bit DMA region need to be or if we biased towards the wrong arena or not.
16:53:43 You can force attach early with some modifications I guess.
16:53:52 Otherwise, I think the longer term answer is IOMMU.
16:54:14 Woodstock ddi-forceattach(9p)?
16:55:18 The driver.conf file won't be read ahead of the driver normally being processed.
16:55:26 It also doesn't prevent detach.
16:55:40 It just forces it to reattach, IIRC.
16:55:41 hm
16:56:22 Other things that you can maybe do will depend on the BIOS/platform.
16:56:41 For example, we move config space up to be above 4 GiB on our platforms to save 32-bit MMIO space.
16:57:01 i see that mlxcx allocates gigabytes worth of dma memory, and lmrc attach is starving trying to get 37mb for SGL chains :)
16:57:16 So mlxcx supports full 64-bit addresses.
16:57:19 yes
16:57:24 So I think my question is why isn't it getting 64-bit DMA?
16:57:33 is there a way to find out?
16:57:40 that is, whether it's 32bit or 64bit dma?
16:57:55 the driver looks like it wants 64bit dma
16:58:05 Probably by looking at the PAs and the kmem cache or vmem segments?
16:58:06 Dunno.
16:58:22 There's a way. How painful I dunno.
16:58:24 lmrc could do 64bit dma, too, at least in theory. but i haven't seen a controller which switches that config bit on.
16:58:36 hm yes
16:58:39 Probably can walk all the DMA handles it allocated.
16:58:45 I don't know of anything off hand.
16:59:55 Woodstock: and then some :) we had a server where 64GB of ram wasn't enough for mlxcx
17:01:45 That seems problematic. Where's arekinath to ship us some DRAM?
17:19:24 Now I'm wondering about 64-bit DMA being enabled in the BIOS or not on my own test boxes...
17:20:07 thankfully, the integrator seems to have forgotten a few DIMMs, so once it had 256gb of ram, it was fine
17:20:58 (it had multiple mlxcx ports aggregated together + vnics, which wasn't helping with memory usage)
17:21:14 jbk: too many oversized rx queues?
17:22:13 well i think each instance is allocating <# CPUs> rings * a few thousand DMA buffers per ring
17:22:40 plus vnics on top of that I think with mlxcx will also allocate additional rings
17:25:13 yep, too many oversized rx queues at least for most workloads..
18:26:03 not sure offhand if the mlxcx driver lets the hw segment inbound rx packets or not
18:26:19 if not, jumbo frames would be even worse
18:27:19 always fun when dladm hangs for 10 minutes because it's blocked while the kernel (essentially) defrags (if only it had a graphic :P)
20:06:56 oh, more users here!
20:07:08 can anyone explain this? I have no idea where to start googling :) https://antranigv.am/misc/omnios_installation_err.jpg
20:40:25 do we support intel chips either E and P cores?
20:41:41 You mean with them mixed?
20:42:24 We should boot on it. The scheduler isn't doing anything special with it.
20:42:51 Though I have not had any such system that i've tried to boot on.
20:43:29 antranigv: what hardware? the IDE errors suggest old (but it may be a SATA put into IDE emulation mode which you should probably turn off in the BIOS)
20:44:02 sommerfeld you are right! it was in IDE mode. we changed it to AHCI and now works all fine
20:44:48 ah, good that you figured it out...
20:59:31 cmdk needs some love...
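A rough back-of-the-envelope for the mlxcx estimate at 17:22:13 ("<# CPUs> rings * a few thousand DMA buffers per ring"). This is only a sketch: the function name, the CPU count, the buffers-per-ring figure, the buffer sizes, and the port count are all hypothetical values picked for illustration, not numbers taken from the mlxcx driver or from the server discussed above.

```python
def rx_buffer_bytes(cpus, buffers_per_ring, buffer_size, instances=1, extra_rings=0):
    """Estimate DMA buffer memory for a NIC driver.

    Assumes one ring per CPU per instance (plus any extra rings, e.g. for
    vnics), each ring holding `buffers_per_ring` buffers of `buffer_size` bytes.
    """
    rings = cpus + extra_rings
    return instances * rings * buffers_per_ring * buffer_size


# Hypothetical box: 64 CPUs, 4096 buffers per ring, 4 aggregated ports.
standard = rx_buffer_bytes(cpus=64, buffers_per_ring=4096, buffer_size=2048, instances=4)
# Same box if every buffer had to hold a whole jumbo frame (9 KiB assumed).
jumbo = rx_buffer_bytes(cpus=64, buffers_per_ring=4096, buffer_size=9216, instances=4)

print(f"standard frames: {standard / 2**30:.1f} GiB")
print(f"jumbo frames:    {jumbo / 2**30:.1f} GiB")
```

With these made-up inputs the estimate grows linearly in every factor, which is consistent with the remarks that aggregated ports, vnics, and jumbo frames each make the memory pressure worse.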
21:13:19 damn this rge0 interface
21:13:31 I can't even see the ARP packets going out
21:18:24 antranigv I have a rge card it works, what problem do you have?
21:19:29 neirac nothing is coming out. I can't even see the ARP packets at the other end. lemme check the switch
21:20:38 antranigv if you see the nic with dladm show-phys, means the driver attached and should work, I'm using one right now in omnios
21:26:06 neirac rge0 ethernet unknown 0 unknown rge0; this is the output of dladm
21:29:55 Have you created an ip interface on top of it?
21:31:31 rmustacc yes! I can see it in `ipadm`
21:34:00 OK, then I'm a bit surprised that dladm says the link is in an unknown state (versus explicitly up/down), but given that it's not up, I guess it's not really a surprise then that you're not seeing packets.
21:35:51 rmustacc bad driver?
21:36:39 Don't know.
21:37:09 Not enough information to make even a speculative guess.
21:38:45 rmustacc how can I debug such issue? sorry, completely #NewHere
21:42:25 So what'd probably help someone is to gist/pastebin/whatever (please don't dump into the chat), the full output of ipadm and dladm.
21:43:04 I'll get to it. it's hard to do that when there's no inet :D
21:44:43 Sure, I understand. Or an image or something.
21:44:51 Just trying to make sure we understand the state of the system.
21:45:09 The other thing I'd see is if there is anything in /var/adm/messages from the rge driver.
21:50:49 okay, so I'm not in front of the display, I asked my intern to run some commands, make an image and send it. I will upload them all in a single page
21:51:28 I've never seen anything like this in the last 5 days that I've been playing around with illumos xD
21:53:17 It's not something I know right away either. So it's basically going to first be about confirming the basic set up, links plugged in, the driver internally called the MAC mc_start(9E) entry point, etc.
21:57:41 antranigv: You can always `grep rge /var/adm/messages` and see if rge(4D) whined at all? Long shot, but sometimes one gets lucky...
22:00:11 okay, so.
22:00:12 ipadm
22:00:16 dladm
22:00:20 dladm show-ether
22:00:27 dladm show-phys
22:00:31 grepping the messages
22:00:36 should I get any more info?
22:01:12 ping
22:01:16 test
22:01:27 Can you also grab /usr/lib/pci/pcieadm show-devs output antranigv?
22:01:37 That'll just let us confirm what exact rge device we're working with.
22:01:44 sure!
22:02:04 ararat can you run these commands in the server? send the images to me, I will upload them to a server
22:02:32 ararat to be specific, the file we're grepping (you were not here) is the following: grep rge /var/adm/messages
22:14:46 antranigv this is the output of prtconf -vp and this is the nic I have https://termbin.com/yodo check for Ethernet controller and name, mine is name: 'pci1462,7721'
22:15:57 If you're trying to get the name neirac /usr/lib/pci/pcieadm show-devs rge is a friendlier way there.
22:16:07 The name can be a little deceptive as that's defaulting to a device subsystem ID.
22:16:13 So there's a lot more variance in it.
22:18:10 rmustacc let me try that
22:18:21 No real need as all the info is in the output you had.
22:18:55 The device-name string is more useful there and the primary device id as opposed to the subsystem tells us what particular chipset it is.
22:21:13 rmustacc I don't find pcieadm in omnios is that in a package?
22:21:51 Yeah, pkg:/diagnostic/pci.
22:26:40 awesome 1/0/0 PCIe Gen 1x1 rge0 RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
22:27:31 looks a lot nicer
22:27:57 ararat thank you!
22:28:01 here's the images
22:28:02 https://notes.bsd.am/illumos/rge.html
22:28:10 and as much output as I could
22:28:38 rmustacc turns out pcieadm was in packages, so we were not able to install atm
22:29:43 neirac looks like we both have the same card
22:36:13 but the model is the same? like the device-id ?
22:37:51 neirac yup!
22:38:49 neirac the revision is different
22:39:02 but the device-id is the same: 00008168
22:48:06 test
22:48:14 neirac we'll try rebooting, sometimes that works :)))
22:48:32 ararat can you reboot the machine please?
22:48:50 what I see related to the error message regarding MSI is https://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/rge/rge_main.c?r=0dc2366f#1731 , I don't see how to disable MSI in rge
22:49:02 sure
22:49:35 I don't see an error there.
22:49:45 neirac in BSD land, that would be using sysctls.
22:49:46 Just a note that we opted to use MSI interrupts, which is generally fine.
22:50:04 There is nothing in that to suggest there was an interrupt delivery problem.
22:52:59 rmustacc oh ok, just grepped the first thing I saw in the pictures
22:53:08 ping still doesn't work
22:53:19 for .1
22:54:14 ararat I wonder if there's something in BIOS settings (AGAIN) that can help us, but probably not
22:54:29 ararat is there another network card on the desktop? or is that the only NIC?
22:57:40 rmustacc is there a way to check with dladm or other command if a cable is connected?
22:58:45 yhy
23:03:27 neirac the cable is connected for sure in our case, as rebooting into Linux, we can get an IP via DHCP. but I think the LINK is the answer to your question, no?
23:07:28 `cannot plumb rge0: already exists`
23:07:42 neirac: Ethernet PHYs don't usually distinguish between a cable being present and a carrier there.
23:08:38 Unlike an SFF transceiver, there isn't a dedicated presence pin that I know of.
23:09:15 rmustacc thanks
23:11:49 I don't know what else to check based on the pictures, maybe rem_drv rge and add_drv?
23:15:21 antranigv run rem_drv rge and then add_drv rge, and check dmesg
23:18:29 That's not going to work while there's a valid ip interface.
23:20:31 I think we can delete the IP interface
23:20:41 and delete the if
23:20:46 and then do that
23:21:10 rem_drv requires reboot
23:24:00 rebooting
23:32:42 69.55/24
23:35:01 ararat_ still can't ping it
23:35:11 ararat_ what's the new state now, after you rebooted?
23:35:12 disabled?
23:35:25 yed
23:35:30 yes
23:36:18 can you try neirac 's suggestion now and run `add_drv rge` ?
23:37:21 it says can't driver set in system but cant attach
23:37:35 driver set but cant attach*
23:40:45 Unfortunately, using rem_drv manually removes all the PCI aliases.
23:40:53 So a standalone add_drv won't get it to attach.
23:41:07 I'm not sure what this was going to accomplish, but I can try to go figure out the right syntax for you in a few minutes.
23:42:46 rmustacc danke <3
23:43:17 rmustacc wanna get this done soon, as ararat_ is still in the data center trying to have illumos up and running :DDD ararat_ is it cold in there? xD
23:43:59 a lil bit, it's fine
23:46:11 update_drv -i '"pcie10ec,8168"' rge
23:46:37 rmustacc I thought rem_drv rge only removed that driver
23:47:08 It does, but it removes all the aliases that were installed.
23:47:50 rmustacc then add_drv -i adds that alias to /etc/driver_aliases and tries to attach ?
23:48:04 You would need to put that in the add_drv line again.
23:48:24 But in general, I also don't really remove/add the driver for this unless I have a theory for what's going on because you end up fighting a bit against packaging.
23:49:02 rmustacc is there any other way to debug when a driver attaches ?
23:49:25 I mean force to attach/reattach
23:49:37 Sorry, I'm not in a place where I can lead a debugging session. But generally with stuff like this I use mdb to understand the driver's state and then DTrace.
23:50:03 ararat_ after running `update_drv -i '"pcie10ec,8168"' rge` let us know what the output is. if there's no output, then send me a picture of `dmesg`
23:50:20 In terms of loading and unloading, modunload and modload will probably show up. But if we don't understand anything, I'm not sure that an add_drv/rem_drv will do anything that different from a reboot.
23:50:25 at least one of m/i/P/p must be specified with -a and -d
23:50:48 Ah, sorry, I forgot the -a.
23:50:53 So toss that in before the -i.
23:51:16 failed to attach
23:51:27 ararat_ update_drv -a -i '"pcie10ec,8168"' rge
23:51:31 damn
23:52:14 Ah, sorry, one other typo there it looks like. Sorry, typing this out without a system to try this on.
23:52:30 update_drv -a -i '"pciex10ec,8168"' rge
23:52:39 Basically, the alias was pciex, not pcie. My mistake.
23:53:46 ahah
23:53:48 no error
23:54:46 ararat_ can you send me the output of `dladm show-phys` ?
23:55:26 rge0 Ethernet unknown 0 unknown rge0
23:55:41 link media state speed duplex device
23:56:23 damn it
23:56:29 today's not our lucky day
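For reading rows like the `dladm show-phys` line pasted at 23:55:26, a small helper along these lines might save some squinting. It is a sketch only: it assumes each row has exactly the six whitespace-separated columns named in the header typed at 23:55:41, and the function names are made up for illustration.

```python
# Column names as typed in the log at 23:55:41.
FIELDS = ("link", "media", "state", "speed", "duplex", "device")


def parse_show_phys(line):
    """Split one pasted `dladm show-phys` row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}: {line!r}")
    return dict(zip(FIELDS, values))


def links_not_up(lines):
    """Return the names of links whose state is anything other than 'up'."""
    return [row["link"] for row in map(parse_show_phys, lines) if row["state"] != "up"]


# The row from the log: state and duplex are both "unknown", speed is 0.
sample = ["rge0 Ethernet unknown 0 unknown rge0"]
print(links_not_up(sample))
```

Treating "unknown" the same as "down" matches the observation at 21:34:00 that a link which is not explicitly up would not be expected to pass packets.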