00:02:44 btw, OpenZFS 2.2 was released with many new features and improvements; will the ZFS in the illumos kernel get the same features and improvements any time soon?
01:45:22 jbk: Thanks for the help. I changed line 50 in usr/src/cmd/cmd-crypto/decrypt/Makefile from '$(ROOTDECLINK):' to '$(ROOTDECLINK): $(PROG)', then compiled and got this error: https://pastebin.com/m1ME4Uhh Is there still an issue with some Makefile?
01:46:27 i'm not sure..
01:46:36 it's strange since i've been building that exact same commit just fine
01:51:43 though i need to rebuild some stuff anyway... and i have to run for a bit, so I'll clean and do a full build and see what happens
01:53:22 well, I don't _need_ to, but now that I'm able to talk to my tpm2 module, I want to see if I can get it working as a kernel RNG provider
02:18:49 jbk: Thank you, I'll wait for the fix. Thanks for the great help.
03:48:48 hmm.. i just did a build which included that commit and didn't have any issues
04:20:06 jbk: the build is with commit e5c93d6afd576eba5ed58a0f188357e3cc604b61; the Makefile was changed in that commit
06:07:59 eee, the install target is missing 'all' there, so decrypt is not getting built
06:10:26 that is, after make clobber, make install does not build anything.
08:20:21 Something caps sftp transfer speed from an LX zone that is behind another gateway zone at 100Mbit, and I don't know what causes it.
13:59:06 tozhu: looks like a fix for your issue is up for review now
13:59:13 i had the wrong line
13:59:48 jbk: Thank you, which line is the fix on?
14:00:18 I'm going to verify the fix in my env
14:02:13 https://code.illumos.org/c/illumos-gate/+/3132/2/usr/src/cmd/cmd-crypto/decrypt/Makefile
14:02:27 on line 48 -- add 'all' to the list of dependencies
14:08:03 okay, I have changed it manually and am going to verify it now
14:43:09 Hi, I've just seen "OS-8496 Deprecate docker registry access". I'm using docker images on standalone smartos; does this go away now?
14:43:10 https://smartos.org/bugview/OS-8496
14:47:31 jbk: I have verified it; the fix works well
15:18:20 nahamu: did you get anywhere with further testing of the wireguard-go rebase?
15:29:20 @jbk & @tozhu (pardon the latency) ==> That fix needs to be in -gate as well.
15:29:57 See illumos#16057 ( jinni )
15:29:57 https://www.illumos.org/issues/16057
15:30:55 I'll be RTI-approving it today if it flies by.
15:51:09 yeah.. andyf fixed it and has it up for review
15:51:25 just this time slot every day is pretty much always booked
15:51:31 and often runs over
15:55:14 also, i really wish the materials on kcf had made it out...
18:08:25 jperkin: not yet, no.
18:09:19 But I don't think the one I did is the one to ship. I want to rebase from a tagged version rather than arbitrary commits on master.
19:51:41 Does anyone have any of the Intel CPU models mentioned here?
19:51:48 https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00950.html
19:52:00 I don't, but I would like to find someone who does.
20:03:18 Is there a way to get smartos to cough up the ID numbers instead of the human-readable names for the installed processors?
20:03:24 danmcd: We have Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz
20:03:51 for testing from a vendor. It's in semi-production now, but I can probably check with the boss how long he wants to test it and squeeze in some time before we return it
20:04:37 if you want to run smartos on it; otherwise it runs debian now and I have root to get some data
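For anyone following the decrypt Makefile thread above: a minimal sketch of how the fix could be re-verified, assuming a bldenv shell where $SRC points at usr/src; the per-directory workflow here is illustrative, not an official procedure.

```
# Reproduce the symptom, then confirm the fix (adding 'all' to install's
# dependency list on line 48 of the decrypt Makefile, per the review above).
cd $SRC/cmd/cmd-crypto/decrypt
dmake clobber
dmake install   # before the fix this built nothing after a clobber;
                # with the fix it should build decrypt first, then install it
```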
20:05:46 Perfect, that's Ice Lake: https://www.intel.com/content/www/us/en/products/sku/215277/intel-xeon-silver-4310-processor-18m-cache-2-10-ghz/specifications.html
20:06:02 How do you want your SmartOS? .iso? .usb? Platform image for Triton?
20:06:37 (I was hoping "today or tomorrow" since we cut SmartOS tomorrow night.)
20:07:00 pjustice: `psrinfo -vp` will emit something after the "GenuineIntel" at the bottom that will help.
20:07:08 It's supermicro, I can boot an ISO. I'll check with the boss in the morning, but my guesstimate is a week or two ETA
20:07:44 Oh damn... that's gonna be too long. (I may put the updated ucode into SmartOS anyway if I can't get it into -gate in the next 24 hours.)
20:08:03 the other box we're testing has 'Intel(R) Xeon(R) E-2378 CPU @ 2.60GHz'
20:08:06 Ok, I have at least one 606A6 Ice Lake Xeon Scalable Silver 3309Y
20:08:07 but probably the same timeframe
20:08:34 pjustice: I'm building SmartOS with this as I type.
20:08:52 i forget, do we have the CPUID and platform ID values somewhere easily accessible?
20:08:56 i might have one of those CPUs
20:09:08 `psrinfo -vp`; look for the string after "GenuineIntel" at the bottom
20:11:12 danmcd: I also have an Intel(R) Xeon(R) Bronze 3106 CPU @ 1.70GHz that's not in production
20:11:22 (https://ark.intel.com/content/www/us/en/ark/products/123540/intel-xeon-bronze-3106-processor-11m-cache-1-70-ghz.html)
20:11:37 well, it's a Xeon-D but it says '50663'
20:11:57 Nope.
20:12:12 For those who speak Intel code names: we need "Ice Lake" or later.
20:12:25 I have Xeon Ds of that flavor, jbk
20:12:53 x86 (GenuineIntel 50654 family 6 model 85 step 4 clock 2200 MHz)
20:12:54 Intel(r) Xeon(r) D-2123IT CPU @ 2.20GHz
20:14:16 Looks like all of the relevant processors are 6xxxx or higher.
20:14:41 I have two of the 606A6 family machines.
20:16:11 pjustice: what flavor of SmartOS can you try? Or do you want to use ucodeadm and apply them yourself?
20:16:25 (ISO, USB, or PI?)
20:16:31 PI
20:16:38 Lemme push it to kebe.com
20:16:42 Does ucodeadm require a reboot?
20:18:28 You're packaging 0d0003B9?
20:18:49 You don't need to reboot with ucodeadm, IIRC.
20:19:04 (I haven't ever done it myself, so I have no experience upon which to draw.)
20:19:09 PI is here: https://kebe.com/~danmcd/webrevs/16058/platform-20231113T204650Z.tgz
20:19:16 MD5 == b4a5ebc3da29a35187b2d88898bad14e
20:19:35 If you swap the PI you will DEFINITELY need to reboot.
20:20:35 aye
20:20:38 pjustice: also, before-and-after outputs of these:
20:20:42 psrinfo -vp
20:20:55 ucodeadm -v
20:22:32 Would be nice to avoid a reboot if possible. Does ucodeadm get you enough info?
20:23:51 I think it might. Lemme check history. What are you running right now PI-wise, pjustice?
20:24:12 joyent_20230504T000449Z
20:25:48 Hmmm. I was hoping it'd be recent enough to include illumos#15846 in it ( jinni )
20:25:48 https://www.illumos.org/issues/15846
20:25:56 Still...
20:26:38 Well, lemme declare an emergency reboot, then we can just be sure. :)
20:27:24 Thanks and sorry.
20:28:06 This is damned near what 20231116 will be (which in addition to being a SmartOS release will ALSO be a Triton release, so it'll go on the `release` channel).
20:37:37 danmcd: is that the Xeon Gold family?
20:37:51 "Ice Lake"?
20:38:07 If it's the right *generation*. Yes, Ice Lake (not all Xeon Golds are Ice Lake; older ones are Skylake).
20:38:32 * danmcd hopes pjustice had a happy reboot...
20:39:26 wait, I have "Sky Lake"; does that work for what you need? I'm on the last "release" channel PI for triton on SmartOS
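A minimal sketch of the before/after capture danmcd asks for above; the file names and the final diff are illustrative, and per the discussion only a PI swap (not ucodeadm by itself) should need the reboot step.

```
# Before (on the old PI / old microcode):
psrinfo -vp  > /var/tmp/cpu-before.txt
ucodeadm -v >> /var/tmp/cpu-before.txt
# ...boot the new PI (reboot required when swapping the PI)...
# After:
psrinfo -vp  > /var/tmp/cpu-after.txt
ucodeadm -v >> /var/tmp/cpu-after.txt
diff /var/tmp/cpu-before.txt /var/tmp/cpu-after.txt   # the microcode revision should be the line that changes
```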
20:40:06 Skylake doesn't.
20:40:11 Skylake isn't affected.
20:40:35 pjustice: I hope you took "pre-reboot" output of `psrinfo -vp ; ucodeadm -v`
20:40:53 I have recently noticed strange behavior when provisioning a new bhyve VM on E5-series (Haswell/Broadwell) and on "Sky Lake": the VM just hangs on start until I do a `vmadm reboot UUID -F`, then it boots up like it should.
20:41:25 Fedora 38 is the guest.
20:41:43 vmadm console is just black
20:43:01 Does this guest use a fabric network? There's still a race between varpd coming fully up and VM boot.
20:43:17 You'll see something in /var/adm/messages on the GZ about it.
20:43:19 It does, on Intel 40Gb NICs
20:43:29 It's device-independent.
20:44:05 what should I grep for?
20:44:08 It's about zones beating varpd to being fully initialized.
20:44:17 I see different errors which could be what you're referring to
20:44:45 Something like this:
20:44:51 Interesting, I hadn't seen that one, but it would make sense
20:44:57 larry root: [ID 702911 daemon.error] zone fa7130e6-9907-4851-8980-72d2f4ed9588 failed to create overlay device sdc_overlay4385813 with command 'dladm create-overlay -e vxlan -s svp -p svp/host=portolan.kebecloud.work.kebe.com -p svp/underlay_ip=192.168.69.14 -p vxlan/listen_ip=192.168.69.14 -p mtu=8500 -v 4385813 sdc_overlay4385813
20:45:07 So "failed to create overlay device"
20:45:23 I don't have anything like that
20:45:29 2023-11-14T17:34:05.683912+00:00 40-a6-b7-22-54-20 ip: [ID 722105 kern.warning] WARNING: ip_interface_cleanup: cannot open /devices/pseudo/udp@0:udp: error 13
20:45:37 Okay, so much for that theory.
20:45:48 2023-11-14T17:34:05.687992+00:00 40-a6-b7-22-54-20 mac: [ID 736570 kern.info] NOTICE: vnic1057 unregistered
20:45:51 2023-11-14T17:34:05.689636+00:00 40-a6-b7-22-54-20 mac: [ID 736570 kern.info] NOTICE: vnic1056 unregistered
20:45:53 (That's a shutdown warning, noisy but not helpful.)
20:45:54 2023-11-14T17:34:18.422486+00:00 40-a6-b7-22-54-20 genunix: [ID 408114 kern.info] /pseudo/zconsnex@1/zcons@0 (zcons0) online
20:45:57 2023-11-14T17:34:19.570779+00:00 40-a6-b7-22-54-20 mac: [ID 469746 kern.info] NOTICE: vnic1058 registered
20:46:00 2023-11-14T17:34:19.570829+00:00 40-a6-b7-22-54-20 mac: [ID 435574 kern.info] NOTICE: vnic1058 link up, 40000 Mbps, unknown duplex
20:46:03 2023-11-14T17:34:19.675923+00:00 40-a6-b7-22-54-20 mac: [ID 469746 kern.info] NOTICE: vnic1059 registered
20:46:06 2023-11-14T17:34:19.730053+00:00 40-a6-b7-22-54-20 genunix: [ID 408114 kern.info] /p
20:46:09 That's the gist of them
20:46:12 Unknown duplex looks interesting
20:46:32 ah
20:47:14 Guess I'll figure out how to get a dump next time it happens. I already sdc-oneachnode rebooted all of the offenders.
20:47:28 I provisioned 20 guests at the same time and 2/3rds of them did this
20:47:44 on 20 different compute nodes with no workload at the time.
20:47:50 OH... this is post-provision. Yeah, probably losing races of a DIFFERENT flavor.
20:47:59 Might be more of a #triton problem.
20:48:14 * danmcd is nervous about not having heard from pjustice.
20:48:19 Yeah, it's definitely on triton. I provisioned them from node-triton in a for loop
20:49:00 frustrating, because after provisioning I ran an ansible script to set them up. I find it every time because the ansible_host cannot SSH into the IP
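The check danmcd suggests above, gathered in one place; the grep string is taken from his sample log line, and the varpd service FMRI is an assumption.

```
# In the compute node's global zone, right after a guest hangs at first boot:
grep 'failed to create overlay device' /var/adm/messages
svcs -xv svc:/network/varpd:default   # assumed FMRI; check whether varpd is actually online and healthy
```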
20:49:13 anyway, sorry to jack your thread
20:49:44 oh, another question for you, danmcd
20:50:43 Do you remember the openbgpd issue that another illumos user had with opening bgp messages to other neighbors? I don't remember exactly what the issue was, but you had mentioned it was an illumos bug
20:51:04 I don't expect you to remember with that shitty description either, lol
20:51:13 But if you do, I was just wondering if it had been fixed yet
20:51:30 I have no memory of that right now, sorry.
20:51:40 Cool, no expectations
20:52:01 I think we were talking about that when both of us were still at joyent
20:52:51 probably a bit late for jperkin to know the pkgsrc status. I'm just tired of running quagga.
20:54:47 I did successfully port rustybgp a couple of years ago, along with the gobgp client binaries. I may look back into that too.
21:05:30 danmcd: before/after https://yagi.h-net.org/2023.4-IPU-psrinfo.txt
21:11:23 You forgot `ucodeadm -v`
21:11:46 If you have another same-CPU server you can run it there for "before" and on your rebooted-PI one for "after"
21:12:20 Sorry for not seeing that sooner, pjustice
21:12:32 * danmcd makes sure he mentioned `ucodeadm -v`... yes he did.
21:18:25 Sorry, failed at reading.
21:18:30 The file at the above link is updated.
21:18:50 But I seem to have the same ucode on both machines? (Only rebooted one.)
21:19:53 The "before" ucodeadm output comes from the one I didn't reboot.
21:23:15 No you don't: ...3b9 (new) != ...389 (old). This is good.
21:23:58 This also tracks with Intel's data (and MAN, your old PI is old... you're missing an update):
21:24:01 | ICX-SP | Dx/M1 | 06-6a-06/87 | 0d0003a5 | 0d0003b9 | Xeon Scalable Gen3
21:24:27 THANK YOU @pjustice !!! I owe you a reasonably-priced beverage of your choice.
21:24:33 * danmcd is updating the testing notes.
21:24:41 Did I mention blind? :)
21:25:26 danmcd: the boss says the tests should be finished by the end of next week; then the boxes are available.
21:25:33 The old PI was from May.
21:25:36 I thought.
21:25:39 Hang on...
21:26:01 You missed this from August:
21:26:02 15834 Update Intel microcode to 20230808
21:26:05 that explains it.
21:26:40 jvl: Not as urgent now. But please boot the latest smartos you can if you do.
21:27:15 Yeah, ok. That tracks.
21:27:34 danmcd: will do and get back to you.
21:28:09 pjustice: you're now mentioned (as is your URL) on the bug report for illumos#16058 ( fenix )
21:28:10 https://www.illumos.org/issues/16058
21:30:39 Infamy!
21:31:20 Is your workload shared by adversarial parties (e.g. a public cloud)? IF SO, you really, really ought to update to this week's fresh PI.
21:31:34 The REX prefix flaw here can lead to privilege escalation.
21:32:18 Kinda glad it affects Ice Lake and later (which we @MNX do not have as of yet).
21:35:26 I've got 3 racks of Ice Lake coming ~30th November. Glad this was fixed first, lol.
21:35:38 We don't have directly adversarial workloads, but otoh we run PHP, so...
21:42:46 HAH! (php...)
21:42:59 What NICs are you using with those three racks' worth?
21:43:01 * danmcd is curious.
21:44:58 Us? Whatever was on board.
21:45:33 Oh.... I really would like to see the spec sheet on it.
21:45:44 I'm not sure; with no 100Gb Intel support I tried to get Chelsio in place, but our mfg arrangement was difficult. We may be stuck with those.
21:45:44 * danmcd is worried it's 800-series Intel
21:46:12 Hopefully we got the Chelsios. They were supposed to procure them. But if not, then I guess linux compute nodes it is
21:46:30 100Gb? Chelsio is your best choice. I'm trying to get Mellanox 100G working (it works, but not very fast right now).
21:47:26 I love Chelsio. Their VP of sales was working very diligently with me, but his last day was last Friday. They're literally an Uber ride from our integration facilities in San Jose, but mfg refused to procure in the US; it had to be done in Taiwan
21:47:52 And by "I" I actually mean "Alex Wilson, Robert Mustacchi, and I". I just happen to have more motivation (if less clue).
21:48:23 and it also had to be with their preferred distributor. It was a nightmare. So I either need to find 120 used Intel 40Gb NICs on eBay and shelve these until 100Gb is ready... or use linux :(
21:48:34 lol
21:49:34 I worked through the linux compute node setup issues, but I feel like it still has a ways to go. I don't really want to use it either, but if push comes to shove, I can't have $1.5m of hardware sitting idle.
21:51:28 OCP was rolling out 1.6Tb switching fabric this year.
21:51:42 danmcd: I have a question on how piadm installs the PI on all drives. Last week I stepped on a rake... I had a mirrored pool with 6 devices (3 mirrors) and needed to remove a device pair. All went well, but on reboot the bootloader went directly into a panic. I booted from a USB stick, removed all the existing PIs, and installed the latest on those 4 remaining drives, but it was... unexpected.
21:51:44 I think port speeds are 800Gb aggs
21:52:55 jvl: You can't boot a pool with multiple vdevs. (3 mirrors == 3 vdevs)
21:53:20 That you were able to at all is, honestly, kinda miraculous, or a bug.
21:53:57 danmcd: Intel DPDK is the holdback on drivers, right?
21:54:30 eh? :) why am I always the one who doesn't know that something can't be done and does it? :)
21:54:32 Nope. There is an ice(4D) driver that exists in FreeBSD and Linux; it looks similar to i40e in some ways. There may even be a prototype out there somewhere.
21:54:35 Just wondering, if the illumos community writes one, whether it would "theoretically" work on the upcoming 400/800G parts, etc.
21:55:17 ah. I knew BSD/Linux had it already. Is the intention to port the BSD version?
21:55:36 Probably depends on who's doing the work. When I started it, I skipped the common code.
21:55:41 But it became a non-thing for me.
21:55:56 I would not hold my breath that the same NIC logic would hold for a 200/400/800G part.
21:55:57 danmcd: https://pastebin.com/DkyMx0Z8 this boots fine
21:56:07 There is a case for not using the common code. (I think of 13230 in i40e and my blood pressure rises.)
21:56:27 jvl: seriously, how did "piadm bootable" work?
21:56:34 The I/O engine is the same in i40e and ice, so finding a way to reuse some of the same logic would probably help a bunch.
21:56:35 Damn... so this is a "big deal" then?
21:57:05 barfield: it's nontrivial. I'd rather spend my cycles on mlxcx(4D) all the same, thank you.
21:57:17 hehe
21:57:37 Or finding someone to regression test the 13230 changes. (It's a big test space, unfortunately.)
21:57:40 Means I need to fly back to San Jose and kiss some ass with Chelsio and company
21:58:17 I may have someone, but I would need to check his bandwidth
21:58:40 jvl: Serious question: did you install standalone smartos on this with 3 mirrors? Or did you start as a USB or ISO booter and then utter "piadm bootable -e zones"?
21:59:30 If you need 100 GbE today and don't really want to do driver work, Chelsio's T6 or a Mellanox CX-5 are probably where I'd start.
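A quick way to see the "3 mirrors == 3 vdevs" point above on a live system; the pool name comes from the thread and the output is illustrative.

```
zpool status zones
#   NAME        STATE     READ WRITE CKSUM
#   zones       ONLINE       0     0     0
#     mirror-0  ONLINE       0     0     0
#     mirror-1  ONLINE       0     0     0
#     mirror-2  ONLINE       0     0     0   <- three top-level vdevs: the layout danmcd says the loader can't boot
```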
21:59:31 Because "zpool create -B " will fail. "zpool create " will succeed, and maybe "piadm bootable -e" needs to be more careful
22:00:30 danmcd: the devices were 4x 480G SSD + 2x 240G SSD. I'm not 100% sure, but I think I installed to the 4x 480G in a mirrored pool with two vdevs and then attached another mirror
22:00:46 Also, jvl --> `piadm bootable -r ` will do its best to update all the devices (intended for single-vdev mirrored or raidz)... maybe you needed to do that before yanking drives?
22:00:59 rmustacc: I told them the T6 was perfect, but they really want me on T7s, which come out in Q1. The T7s have a very interesting new offloading engine included.
22:01:18 danmcd: now I was just removing the 2x 240G SSDs. I was trying bootable -d and -e from the stick, but that didn't help. I just removed all the PIs and grabbed the latest one, and then it booted.
22:01:41 I've been using T5s on my BSD IPS appliances for probably 8 years at this point. Those are great NICs
22:01:45 I'm aware of what the T7 brings, don't worry. And waiting for samples. But Q1 isn't today, which is why I haven't mentioned it.
22:02:05 lol, trust me, I get it. Additional workload not needed at this time.
22:02:44 danmcd: it really was a new installation, booting from the SSDs from day one, not upgraded to boot from local devices later
22:02:54 Worse than the 3 new racks we have coming, we're rearranging all of our DC ops. A 90-day project to move into all-Equinix facilities
22:03:41 Which is proving to be nontrivial with 415V 3-phase power requirements
22:04:08 jvl: not sure what to say right now, save that I may have some questions for you later.
22:04:58 danmcd: no worries, I'll re-test the same thing when I have those servers available at the end of next week. I usually lurk here in tmux, so if something crosses your mind, I'm here.
22:06:26 Thanks.
22:06:45 likewise
23:04:16 danmcd: Dan, fwiw, I've just installed the latest smartos from ISO into a virtualbox VM with 4 32G drives. The installer suggested raidz1, so I manually did `zpool create zones mirror d0 d1 mirror d2 d3` and went through. Removed the virtual ISO and it booted just fine.
23:04:38 AHA!
23:04:41 Thank you!
23:05:25 By default the installer tries to create with `-B`, which will fail in your two-mirrors scenario; then it falls back to creating without -B and succeeds.
23:06:08 what does that mean? That only the first drives are bootable, or that it just tries something else and makes all of them bootable?
23:06:12 I guess you're booting BIOS too, not EFI. You cannot boot a pool with EFI unless you created it with 'zpool create -B'.
23:06:23 yes, this is legacy boot
23:06:43 "piadm bootable -r " will attempt to update the MBR (and the ESP on EFI-bootable pools) for every drive in the pool.
23:07:39 refresh, got it. thanks!
23:07:57 raidz1 should be bootable too? Last time I tried, I wasn't all that lucky...
23:08:18 Raidz1 is officially bootable (you can even do raidz1 with `zpool create -B`).
23:08:51 `zpool create -B` is the gold standard for bootable pools.
23:08:58 I was trying on an old HP DL360 G7 with 8 drives; it was trying to get blocks in 64K increments and after 30 minutes *I* gave up...
23:09:10 You don't have to boot off zones either; if you have, say, a spare drive, you can dedicate a boot pool.
23:09:18 My triton head node on the in-house "Kebecloud" does this.
23:10:02 Unsure of your TZ, but mine is US/Eastern and it's dinnertime.
23:10:05 before you did your magic with piadm, I was perfectly fine with remotely upgrading the USB stick and rebooting, without KVM attached... for like 6-7 years :))
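A minimal sketch of the bootable-pool recipe the thread converges on, assuming a standalone SmartOS box; the device names are illustrative and the comments paraphrase the statements above rather than the man pages.

```
zpool create -B zones mirror c1t0d0 c1t1d0   # -B is the "gold standard"; required for EFI boot
piadm bootable -e zones                      # make the (single-vdev) pool bootable
piadm bootable -r zones                      # after adding/removing disks, refresh the MBR (and ESP) on every drive
```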
23:10:42 Europe/Prague. Have a good one and enjoy your meal. Thanks for everything! :)
23:11:07 Thank you!