-
tozhu
btw, OpenZFS 2.2 was released with many new features and improvements. Will the ZFS in the illumos kernel get the same features and improvements any time soon?
-
tozhu
jbk: Thanks for the help. I changed line 50 in usr/src/cmd/cmd-crypto/decrypt/Makefile from '$(ROOTDECLINK):' to '$(ROOTDECLINK): $(PROG)', then compiled and got this error:
pastebin.com/m1ME4Uhh Is there still an issue with one of the Makefiles?
-
jbk
i'm not sure..
-
jbk
it's strange since i've been building that exact same commit just fine
-
jbk
though i need to rebuild some stuff anyway... and i have to run for a bit, so I'll clean and do a full build and see what happens
-
jbk
well don't _need_ to, but now that I'm able to talk to my tpm2 module, I want to see if I can get it working as a kernel RNG provider
-
tozhu
jbk: Thank you, I’ll wait for the fix, thanks for the great help
-
jbk
hmm.. i just did a build which included that commit and didn't have any issues
-
tozhu
jbk: my build is at commit e5c93d6afd576eba5ed58a0f188357e3cc604b61; the Makefile was changed in this commit
-
tsoome
eee, the install target is missing the 'all' target as a dependency there, so decrypt is not getting built
-
tsoome
that is, after make clobber, make install does not build anything.
-
nikolam
Something caps sftp transfer speed at 100Mbit from an LX zone that is behind another gateway zone, and I don't know what causes it.
-
jbk
tozhu: looks like a fix for your issue is up for review now
-
jbk
i had the wrong line
-
tozhu
jbk: Thank you, what’s the fix line?
-
tozhu
I’m going to verify the fix in my env
-
jbk
-
jbk
on line 48 -- add 'all' to the list of dependencies
-
tozhu
okay, I have changed it manually, and am going to verify it now
-
jvl
Hi, I've just seen the "OS-8496 Deprecate docker registry access". I'm using docker images on standalone smartos - this goes away now?
-
tozhu
jbk: I have verified, the fix works well
-
jperkin
nahamu: did you get anywhere with further testing of the wireguard-go rebase?
-
danmcd
@jbk & @tozhu (pardon latency) ==> That fix needs to be in -gate as well.
-
danmcd
See illumos#16057 ( jinni )
-
danmcd
I'll be RTI-approving it today if it flies by.
-
jbk
yeah.. andyf fixed it and has it up for review
-
jbk
just this time slot every day is pretty much always booked
-
jbk
and often runs over
-
jbk
also, i really wish the materials on kcf had made it out...
-
nahamu
jperkin: not yet, no.
-
nahamu
But I don't think that the one I did is the one to ship. I want to rebase from a tagged version rather than arbitrary commits on master.
-
danmcd
Does anyone have any of the Intel CPU models mentioned here?
-
danmcd
-
danmcd
I don't, but would like to find someone who does.
-
pjustice
Is there a way to get smartos to cough up the id numbers instead of the human name for the installed processors?
-
jvl
danmcd: We have Intel(R) Xeon(R) Silver 4310 CPU @ 2.10GHz
-
jvl
for testing from vendor. It's in semi-production now, but I can probably check with boss how long he wants to test it and squeeze some time before we return it
-
jvl
if you want to run smartos on it. otherwise it runs debian now and I have root to get some data
-
danmcd
-
danmcd
How do you want your SmartOS? .iso? .usb? Platform image for Triton?
-
danmcd
(I was hoping "today or tomorrow" since we cut SmartOS tomorrow night.)
-
danmcd
pjustice: `psrinfo -vp` will emit something after the "GenuineIntel" at the bottom that will help.
-
jvl
It's a Supermicro, I can boot an ISO. I'll check with the boss in the morning, but my guesstimate is a week or two ETA
-
danmcd
Oh damn... that's gonna be too long. (I may put the updated ucode into SmartOS anyway if I can't get it into -gate in the next 24 hours).
-
jvl
other box we're testing has 'Intel(R) Xeon(R) E-2378 CPU @ 2.60GHz'
-
pjustice
Ok, I have at least one 606A6 Ice Lake Xeon Scalable Silver 3309Y
-
jvl
but probably same timeframe
-
danmcd
pjustice: I'm building SmartOS with this as I type.
-
jbk
i forget, do we have the CPUID and platform ID values somewhere easily accessible?
-
jbk
i might have one of those CPUs
-
danmcd
`psrinfo -vp` look for the string after "GenuineIntel" on the bottom
-
jvl
danmcd: I also have Intel(R) Xeon(R) Bronze 3106 CPU @ 1.70GHz that's not in production
-
jvl
-
jbk
well it's a Xeon -D but it says '50663'
-
danmcd
Nope.
-
danmcd
For those who speak intel code names we need "Ice Lake" or later.
-
danmcd
I have Xeon Ds of that flavor, jbk
-
danmcd
x86 (GenuineIntel 50654 family 6 model 85 step 4 clock 2200 MHz)
-
danmcd
Intel(r) Xeon(r) D-2123IT CPU @ 2.20GHz
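-
(editor's sketch)
A minimal sketch of how the string after "GenuineIntel" relates to the family/model/step numbers on the same line. It assumes the identification line always looks like the one danmcd pasted, and that the number after the vendor is the hexadecimal CPUID signature (as "606A6" elsewhere in this log suggests). The names `IDENT_RE`, `parse_ident`, and `decode_signature` are made up for illustration.

```python
import re

# Matches the identification line as pasted in this log, e.g.
# "x86 (GenuineIntel 50654 family 6 model 85 step 4 clock 2200 MHz)".
# Assumption: real `psrinfo -vp` output always follows this shape.
IDENT_RE = re.compile(
    r"\((?P<vendor>\w+) (?P<sig>[0-9A-Fa-f]+) "
    r"family (?P<family>\d+) model (?P<model>\d+) step (?P<step>\d+)"
)


def decode_signature(sig_hex):
    """Decode a CPUID leaf-1 EAX signature into (family, model, stepping)."""
    sig = int(sig_hex, 16)
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    # The extended fields only apply to family 6 and family 15 parts.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def parse_ident(line):
    """Return the fields of an identification line, or None if it does not match."""
    m = IDENT_RE.search(line)
    if m is None:
        return None
    return {
        "vendor": m["vendor"],
        "signature": m["sig"].upper(),
        "family": int(m["family"]),
        "model": int(m["model"]),
        "stepping": int(m["step"]),
    }


line = "x86 (GenuineIntel 50654 family 6 model 85 step 4 clock 2200 MHz)"
info = parse_ident(line)
print(info["signature"], decode_signature(info["signature"]))
print("606A6", decode_signature("606A6"))
```

Decoding the signature gives the same family, model, and step that the line already prints, so the signature alone is enough to tell CPU generations apart.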
-
pjustice
Looks like all of the relevant processors are 6xxxx or higher.
-
pjustice
I have two of the 606A6 family machines.
-
danmcd
pjustice: what flavor of SmartOS can you try? Or do you want to use ucodeadm and apply them yourself?
-
danmcd
(ISO, USB, or PI ?)
-
pjustice
PI
-
danmcd
Lemme push it to kebe.com
-
pjustice
Does the ucodeadm require reboot?
-
pjustice
You're packaging 0d0003B9?
-
danmcd
You don't need to reboot with ucodeadm, IIRC.
-
danmcd
(I haven't done it myself ever so I have no experience upon which to draw.)
-
danmcd
-
danmcd
MD5 == b4a5ebc3da29a35187b2d88898bad14e
-
danmcd
If you swap PI you will DEFINITELY need to reboot.
-
pjustice
aye
-
danmcd
pjustice: also before-and-after outputs of these:
-
danmcd
psrinfo -vp
-
danmcd
ucodeadm -v
-
pjustice
Would be nice to avoid a reboot if possible. Does ucodeadm get you enough info?
-
danmcd
I think it might. Lemme check history. What are you running right now PI-wise, pjustice?
-
pjustice
joyent_20230504T000449Z
-
danmcd
Hmmm. I was hoping it'd be recent enough to include illumos#15846 in it ( jinni )
-
danmcd
Still...
-
pjustice
Well, lemme declare an emergency reboot, then we can just be sure. :)
-
danmcd
Thanks and sorry.
-
danmcd
This is damned near what 20231116 will be (which in addition to being a SmartOS release, will ALSO be a Triton release, so it'll go on the `release` channel.)
-
barfield
danmcd: is that the Xeon gold family?
-
barfield
"Ice Lake"?
-
danmcd
If it's the right *generation*. Yes, Ice Lake (not all Xeon Golds are Ice Lake, older ones are Skylake)
-
» danmcd hopes pjustice had a happy reboot...
-
barfield
wait I have "Sky Lake" does that work for what you need? I'm on the last "release" channel PI for triton on SmartOS
-
danmcd
Skylake doesn't.
-
danmcd
Skylake isn't affected.
-
danmcd
pjustice: I hope you took "pre-reboot" output of `psrinfo -vp ; ucodeadm -v`
-
barfield
I have noticed recently a strange behavior when provisioning a new bhyve VM on E5- series (haswell/broadwell) and on "Sky Lake" where the VM just hangs on start until I do a `vmadm reboot UUID -F` then it boots up like it should.
-
barfield
Fedora38 is the guest.
-
barfield
vmadm console is just black
-
danmcd
Does this guest use a fabric network? There's still a race between varpd coming fully up and VM boot.
-
danmcd
You'll see something in /var/adm/messages on the GZ about it.
-
barfield
It does, on Intel 40GB NICs
-
danmcd
It's device-independent.
-
barfield
what should I grep out?
-
danmcd
It's about zones beating out varpd going to fully initialized.
-
barfield
I see different errors which could be what you're referring to
-
danmcd
Something like this:
-
barfield
Interesting I hadn't seen that one but it would make sense
-
danmcd
larry root: [ID 702911 daemon.error] zone fa7130e6-9907-4851-8980-72d2f4ed9588 failed to create overlay device sdc_overlay4385813 with command 'dladm create-overlay -e vxlan -s svp -p svp/host=portolan.kebecloud.work.kebe.com -p svp/underlay_ip=192.168.69.14 -p vxlan/listen_ip=192.168.69.14 -p mtu=8500 -v 4385813 sdc_overlay4385813
-
danmcd
So "failed to create overlay device"
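-
(editor's sketch)
To answer "what should I grep out?", here is a rough sketch that scans log lines for the "failed to create overlay device" message and pulls out the zone UUID and overlay name. The regular expression is derived only from the one line danmcd pasted, so treat the exact message layout as an assumption; `OVERLAY_RE` and `find_overlay_failures` are illustrative names, and the sample lines are abbreviated from this log.

```python
import re

# Assumption: the message always reads
# "zone <uuid> failed to create overlay device <name> ..."
OVERLAY_RE = re.compile(
    r"zone (?P<zone>\S+) failed to create overlay device (?P<overlay>\S+)"
)


def find_overlay_failures(lines):
    """Return (zone, overlay) pairs for every matching log line."""
    failures = []
    for line in lines:
        m = OVERLAY_RE.search(line)
        if m:
            failures.append((m["zone"], m["overlay"]))
    return failures


# Sample lines abbreviated from the messages pasted in this conversation.
sample = [
    "larry root: [ID 702911 daemon.error] zone "
    "fa7130e6-9907-4851-8980-72d2f4ed9588 failed to create overlay device "
    "sdc_overlay4385813 with command ...",
    "2023-11-14T17:34:05.687992+00:00 40-a6-b7-22-54-20 mac: "
    "[ID 736570 kern.info] NOTICE: vnic1057 unregistered",
]

for zone, overlay in find_overlay_failures(sample):
    print(zone, overlay)
```

On a global zone the same function could presumably be fed an open handle on /var/adm/messages instead of the sample list.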
-
barfield
I dont have anything like that
-
barfield
2023-11-14T17:34:05.683912+00:00 40-a6-b7-22-54-20 ip: [ID 722105 kern.warning] WARNING: ip_interface_cleanup: cannot open /devices/pseudo/udp@0:udp: error 13
-
danmcd
Okay so much for that theory.
-
barfield
2023-11-14T17:34:05.687992+00:00 40-a6-b7-22-54-20 mac: [ID 736570 kern.info] NOTICE: vnic1057 unregistered
-
barfield
2023-11-14T17:34:05.689636+00:00 40-a6-b7-22-54-20 mac: [ID 736570 kern.info] NOTICE: vnic1056 unregistered
-
danmcd
(That's a shutdown warning, noisy but not helpful.)
-
barfield
2023-11-14T17:34:18.422486+00:00 40-a6-b7-22-54-20 genunix: [ID 408114 kern.info] /pseudo/zconsnex@1/zcons@0 (zcons0) online
-
barfield
2023-11-14T17:34:19.570779+00:00 40-a6-b7-22-54-20 mac: [ID 469746 kern.info] NOTICE: vnic1058 registered
-
barfield
2023-11-14T17:34:19.570829+00:00 40-a6-b7-22-54-20 mac: [ID 435574 kern.info] NOTICE: vnic1058 link up, 40000 Mbps, unknown duplex
-
barfield
2023-11-14T17:34:19.675923+00:00 40-a6-b7-22-54-20 mac: [ID 469746 kern.info] NOTICE: vnic1059 registered
-
barfield
2023-11-14T17:34:19.730053+00:00 40-a6-b7-22-54-20 genunix: [ID 408114 kern.info] /p
-
barfield
Thats the gist of them
-
barfield
Unknown duplex looks interesting
-
barfield
ah
-
barfield
Guess I'll figure out how to get a dump next time it happens. I already sdc-oneachnode rebooted all of the offenders.
-
barfield
I provisioned 20 guests at the same time and 2/3rds of them did this
-
barfield
on 20 different compute nodes with no workload at the time.
-
danmcd
OH... this is post-provision. Yeah, probably losing races of a DIFFERENT flavor.
-
danmcd
Might be more of a #triton problem.
-
» danmcd is nervous about not having heard from pjustice.
-
barfield
Yeah its definitely on triton. I provisioned them from node-triton in a for loop
-
barfield
frustrating because after provisioning I ran an ansible script to set them up. I find it every time because the ansible_host cannot SSH into the IP
-
barfield
anyway, sorry to jack your thread
-
barfield
oh another question for you danmcd
-
barfield
Do you remember the openbgpd issue that another illumoser had opening bgp messages to other neighbors? I dont remember exactly what the issue was but you had mentioned it was an illumos bug
-
barfield
I dont expect you to remember with that shitty description either lol
-
barfield
But if you do I was just wondering if it had been fixed yet
-
danmcd
I have no memory of that right now, sorry.
-
barfield
Cool, no expectations
-
barfield
I think we were talking about that when both of us were still at Joyent
-
barfield
probably a bit late for jperkin to know pkgsrc status. I'm just tired of running quagga.
-
barfield
I did successfully port rustybgp a couple of years ago, along with gobgp client binaries. May look back into that too.
-
pjustice
-
danmcd
You forgot `ucodeadm -v`
-
danmcd
If you have another same-CPU server you can run it on there for "before" and on your rebooted-PI one as "after"
-
danmcd
Sorry for not seeing that sooner pjustice
-
» danmcd makes sure he mentioned `ucodeadm -v`... yes he did.
-
pjustice
Sorry, failed at reading.
-
pjustice
File at the above link is updated.
-
pjustice
But I seem to have the same ucode on both machines? (Only rebooted one.)
-
pjustice
"before" ucodeadm comes from the one I didn't reboot.
-
danmcd
No you don't. ...3b9(new) != ...389(old). This is good.
-
danmcd
This also tracks with Intel data (and that MAN your old PI is old... you're missing an update):
-
danmcd
| ICX-SP | Dx/M1 | 06-6a-06/87 | 0d0003a5 | 0d0003b9 | Xeon Scalable Gen3
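-
(editor's sketch)
A small sketch of reading the Intel table row danmcd quoted. It assumes the columns mean processor, stepping, family-model-stepping/platform ID, old revision, new revision, and products; that interpretation is the editor's, not stated in the chat. It also shows how the "06-6a-06" field corresponds to the "606A6" signature pjustice reported (valid for family 6 parts only). All function names are illustrative.

```python
def parse_ucode_row(row):
    """Split a '|'-delimited row into named fields (column meanings assumed)."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    processor, stepping, fms_pf, old_rev, new_rev, products = cells
    fms, _, platform = fms_pf.partition("/")
    return {
        "processor": processor,
        "stepping": stepping,
        "fms": fms,
        "platform": platform,
        "old": old_rev,
        "new": new_rev,
        "products": products,
    }


def fms_to_signature(fms):
    """Turn 'FF-MM-SS' (hex) into a CPUID signature string; family 6 only."""
    family, model, stepping = (int(part, 16) for part in fms.split("-"))
    sig = ((model >> 4) << 16) | (family << 8) | ((model & 0xF) << 4) | stepping
    return format(sig, "X")


def needs_update(running_rev, row):
    """True if the running microcode revision is older than the row's new one."""
    return int(running_rev, 16) < int(row["new"], 16)


row = parse_ucode_row(
    "| ICX-SP | Dx/M1 | 06-6a-06/87 | 0d0003a5 | 0d0003b9 | Xeon Scalable Gen3"
)
print(fms_to_signature(row["fms"]))
print(needs_update(row["old"], row), needs_update(row["new"], row))
```

Comparing the revisions as integers rather than strings avoids surprises if the hex digits differ in case.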
-
danmcd
THANK YOU @pjustice !!! I owe you a reasonably-priced beverage of your choice.
-
» danmcd is updating the testing notes.
-
pjustice
Did I mention blind? :)
-
jvl
danmcd: boss says the tests should be finished by end of next week. then the boxes are available.
-
pjustice
old PI was May.
-
pjustice
I thought.
-
danmcd
Hang on...
-
danmcd
You missed this from August:
-
danmcd
15834 Update Intel microcode to 20230808
-
danmcd
that explains it.
-
danmcd
jvl: Not as urgent now. But please boot the latest smartos you can if you do.
-
pjustice
Yeah, ok. That trafcks.
-
pjustice
-f
-
jvl
danmcd: will do and get back to you.
-
danmcd
pjustice: you're now mentioned (as is your URL) on the bug report for illumos#16058 ( fenix )
-
pjustice
Infamy!
-
danmcd
Is your workload shared by adversarial parties (e.g. a public cloud)? IF SO you really really ought to update to this week's fresh PI.
-
danmcd
As the REX prefix flaw here can lead to privilege escalation.
-
danmcd
Kinda glad it affects Ice Lake and later (which we@MNX do not have as of yet).
-
barfield
I've got 3 racks of ice-lake coming ~30th november. Glad this was fixed first lol.
-
pjustice
We don't have directly adversarial workloads, but otoh, we run PHP, so...
-
danmcd
HAH! (php...)
-
danmcd
What NICs are you using with those 3-racks-worth?
-
» danmcd is curious .
-
pjustice
Us? Whatever was on board.
-
danmcd
Oh.... I really would like to see the spec sheet on it.
-
barfield
I'm not sure, with no 100GB Intel support I tried to get Chelsio in place. But our mfg arrangement was difficult. We may be stuck with those.
-
» danmcd is worried if it's 800-series Intel
-
barfield
Hopefully we got the Chelsio's. They were supposed to procure them. But if not then I guess linux compute nodes it is
-
danmcd
100GB? Chelsio is your best choice. I'm trying to get Mellanox 100G working (it works, but not very fast right now).
-
barfield
I love Chelsio. VP of sales was working very diligently with me but his last day was last friday. They're literally an uber drive from our integration facilities in San Jose but mfg refused to procure in the US. Had to be done in Taiwan
-
danmcd
And by "I" I actually mean "Alex Wilson, Robert Mustacchi, and I". I just happen to have more motivation (if less clue).
-
barfield
and also had to be with their preferred distributor. It was a nightmare. So I either need to find 120 Intel 40GB NIC's used on Ebay and shelve these until 100GB is ready...or use linux :(
-
barfield
lol
-
barfield
I worked through the linux computenode setup issues. But I feel like it still has a ways to go. I dont even really want to use it either. But if push comes to shove. I can't have $1.5m of hardware sitting idle.
-
barfield
OCP was rolling out 1.6TB switching fabric this year.
-
jvl
danmcd: I have a question on how piadm installs the PI on all drives. Last week, I stepped on a rake ... I had a mirrored pool with 6 devices (3 mirrors) and needed to remove a device pair. All went well, but on reboot, the bootloader went directly into panic. I booted from a USB stick, removed all existing PIs and installed the latest on those 4 remaining drives, but it was ... unexpected.
-
barfield
I think port speeds are 800GB aggs
-
danmcd
jvl: You can't boot a pool with multiple vdevs. (3 mirrors == 3 vdevs)
-
danmcd
That you were able to at all is, honestly, kinda miraculous, or a bug.
-
barfield
danmcd: intel dpdk is the hold back on drivers right?
-
jvl
eh? :) why am I always the one who doesn't know that something can't be done and does it? :)
-
danmcd
Nope. There is an ice(4D) driver that exists in FreeBSD and Linux; it looks similar to i40e in some ways. There may even be a prototype out there somewhere.
-
barfield
Just wondering, if the illumos community writes one, whether it would "theoretically" work on the upcoming 400, 800, etc.
-
barfield
ah. I knew BSD/Linux had it already. Is the intention to port BSD version?
-
rmustacc
Probably depends on who's doing the work. When I started it, I skipped the common code.
-
rmustacc
But it became a non-thing for me.
-
rmustacc
I would not hold my breath that the same NIC logic would hold for a 200/400/800G part.
-
jvl
danmcd:
pastebin.com/DkyMx0Z8 this boots fine
-
danmcd
There is a case for not using the common code. (I think of 13230 in i40e and my blood pressure rises.)
-
danmcd
jvl: seriously how did "piadm bootable" work?
-
rmustacc
The I/O engine is the same in i40e and ice, so finding a way to reuse some of the same logic would probably help a bunch.
-
barfield
Damn ... so this is a "big deal" then?
-
danmcd
barfield: it's nontrivial. I'd rather spend my cycles on mlxcx(4D) all the same, thank you.
-
barfield
hehe
-
danmcd
Or finding someone to regression test the 13230 changes. (It's a big test space, unfortunately.)
-
barfield
Means I need to fly back to San Jose and kiss some ass with Chelsio and company
-
barfield
I may have someone but would need to check his bandwidth
-
danmcd
jvl: Serious question, did you install standalone smartos on this with 3-mirrors? Or did you start as a USB or ISO booter and then uttered "piadm bootable -e zones" ?
-
rmustacc
If you needed 100 GbE today and don't want to do driver work really, Chelsio's T6 or a Mellanox CX-5 are probably where I'd start.
-
danmcd
Because "zpool create -B <your layout>" will fail. "zpool create <your layout>" will succeed, and maybe "piadm bootable -e" needs to be more careful
-
jvl
danmcd: the devices were 4x 480G SSD + 2x 240G SSD. I'm not 100% sure, but I think I've installed to 4x 480G in mirrored pool with two vdevs and then attached another mirror
-
danmcd
Also jvl --> `piadm bootable -r <pool>` will do its best to update all the devices (intended for single-vdev mirrored or raidz)... maybe you needed to do that before yanking drives?
-
barfield
rmustacc: I told them T6 was perfect but they really want me on T7's which come out Q1. T7's have a very interesting new offloading engine included.
-
jvl
danmcd: now I was just removing the 2x 240G SSDs. I was trying bootable -d and -e from the stick, but that didn't help. I just removed all the PIs and grabbed the latest one and then it booted.
-
barfield
I've been using T5's on my BSD IPS appliances for probably 8 years at this point. Those are great NICs
-
rmustacc
I'm aware of what the T7 brings, don't worry. And waiting for samples. But Q1 isn't today, which i why I haven't mentioned it.
-
rmustacc
*which is
-
barfield
lol, trust me, I get it. Additional workload not needed at this time.
-
jvl
danmcd: it was really new installation to boot from the SSDs from day one. not upgraded to boot from local devices later
-
barfield
Worse than the 3 new racks we have coming we're rearranging all of our DC ops. 90 project to move into all equinix facilities
-
barfield
90 day*
-
barfield
Which is proving to be nontrivial with 415v 3 phase power requirements
-
danmcd
jvl: not sure what to say right now save I may have some questions for you later.
-
jvl
danmcd: no worries, I'll re-test the same thing when I have those servers available end of next week. I usually lurk here in tmux, so if something crosses your mind, I'm here.
-
danmcd
Thanks.
-
jvl
likewise
-
jvl
danmcd: Dan, fwiw, I've just installed the latest smartos from ISO into a virtualbox VM with 4 32G drives. The installer suggested raidz1, so I manually did zpool create zones mirror d0 d1 mirror d2 d3 and went through. Removed the virtual ISO and it booted just fine.
-
danmcd
AHA!
-
danmcd
Thank you!
-
danmcd
By default the installer tries to create the pool with `-B`, which will fail in your two-mirrors scenario; then it falls back to creating without -B and succeeds.
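-
(editor's sketch)
The "3 mirrors == 3 vdevs" point can be illustrated by counting top-level data vdevs in a `zpool create` layout. This is only a sketch of the counting rule, not what the installer or piadm actually does; `count_data_vdevs` and the keyword tuples are illustrative, and whether a given layout is accepted with `-B` is taken from danmcd's description rather than checked here.

```python
# Keywords that start a grouped top-level vdev.
GROUP_KEYWORDS = ("mirror", "raidz", "raidz1", "raidz2", "raidz3")
# Keywords that start auxiliary device classes (not data vdevs).
AUX_KEYWORDS = ("log", "cache", "spare", "special", "dedup")


def count_data_vdevs(layout):
    """Count top-level data vdevs in a tokenized `zpool create` layout."""
    count = 0
    in_group = False
    for token in layout:
        if token in AUX_KEYWORDS:
            break  # sketch: stop counting at the first auxiliary class
        if token in GROUP_KEYWORDS:
            count += 1  # each mirror/raidz keyword opens one top-level vdev
            in_group = True
        elif not in_group:
            count += 1  # a bare disk before any keyword is its own vdev
        # otherwise the device belongs to the currently open group
    return count


layouts = {
    "jvl's VM test": "mirror d0 d1 mirror d2 d3",
    "single raidz1": "raidz1 d0 d1 d2 d3",
    "single mirror": "mirror d0 d1",
}
for name, layout in layouts.items():
    n = count_data_vdevs(layout.split())
    print(f"{name}: {n} top-level vdev(s)")
```

Going by the conversation, only the single-vdev layouts would be expected to pass `zpool create -B`.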
-
jvl
what does it mean? that only first drives are bootable or that it just tries something else and makes all of them bootable?
-
danmcd
I guess you're booting BIOS too, not EFI. You cannot boot a pool with EFI unless you created it with 'zpool create -B'.
-
jvl
yes, this is legacy boot
-
danmcd
"piadm bootable -r <pool>" will attempt to update MBR (and ESP on EFI-bootable pools) for every drive in the pool.
-
jvl
refresh, got it. thanks!
-
jvl
raidz1 should be bootable too? last time I tried I wasn't all that lucky ...
-
danmcd
Raidz1 is officially bootable (you can even do raidz1 with `zpool create -B`).
-
danmcd
`zpool create -B` is the gold standard for bootable pools.
-
jvl
I was trying on old hp dl360g7 with 8 drives and it was trying to get blocks in 64K increments and after 30 minutes *I* gave up ...
-
danmcd
You don't have to boot from zones either: if you have, say, a spare drive, you can dedicate a boot pool.
-
danmcd
My triton head node on in-house "Kebecloud" does this.
-
danmcd
Unsure of your TZ, but mine is US/Eastern and it's dinnertime.
-
jvl
before you did your magic with piadm, I was perfectly fine with remotely upgrading USB stick and rebooting, without KVM attached ... for like 6-7 years :))
-
jvl
Europe/Prague. Have a good one and enjoy your meal. Thanks for all! :)
-
danmcd
Thank you!