-
tozhu
Has anyone built oxidecomputer’s omicron? I hit an issue on OmniOS:
pastebin.com/VHXHnCYr How do I resolve the problem? Does it support OmniOS or SmartOS? Best wishes
-
tozhu
please point it out if anyone knows the issue, thanks
-
jbk
have you tried doing what the error suggests and running 'tools/ci_download_dendrite_openapi' ?
-
jclulow
tozhu: OmniOS is quite similar to our distribution, but with a few differences here and there.
-
jclulow
There is some more detail on getting started with a simulated environment here:
github.com/oxidecomputer/omicron/bl…main/docs/how-to-run-simulated.adoc
-
jclulow
I would note two things though: It's a complex consolidation of software that is seeing a lot of churn -- we're actively working on it; and, it's really only targeted to run on Oxide hardware on top of our Helios distribution, in a rack with our specific switches and so on. It's not really a general purpose control plane you can just install on some Dell servers etc.
-
sjorge
danmcd: did the i40e stuff ever get integrated? I sort of lost track of that a while ago.
-
danmcd
It hasn't because it requires a LOT of regression tests. See
illumos.topicbox.com/groups/network…ce26c8a54-M02e81fae4b931966e92d75bd for what I mean.
-
danmcd
The fix should be straightforward, but making sure it doesn't break other promisc corner-cases is a nightmare requiring time and multiple i40e boards.
-
sjorge
ah right we needed a range of hw to test
-
danmcd
And a range of tests (per the email).
-
sjorge
right, so it seems unlikely to get all that arranged
-
papertigers
danmcd: is the patch small enough that you could hide it behind a configuration in system.d like sjorge did for the bhyve viona stuff? Then at least people with the part could opt into the new behavior and provide feedback etc
-
sjorge
IIRC it changes the way erm some HW based mac filters are initialized, so not sure it's easy to toggle
-
sjorge
If it is, that could indeed be an option
-
jclulow
danmcd: I'm struggling a little to understand exactly what cards are what in that mail chain
-
jclulow
Is there a concise list of the parts we need?
-
jclulow
It's possible I can buy something, and we have a 40G-capable switch here, etc
-
jclulow
I feel like we could parcel that up behind an /etc/system toggle tbh
-
rzezeski
So I've been working on building zig-0.11.0 on OmniOS (making use of the excellent omnios-extra facilities), and through a few small patches I made it to attempting to build stage3. However, as part of that process, the zig2 binary creates a bogus ELF executable where the header claims it's a 64-bit executable but the interpreter section points to 32-bit ld.so.
-
rzezeski
As I debugged this more I found the executable is built with lld which is baked into the zig2 executable. Here is how the zig2 executable is called:
-
rzezeski
-
rzezeski
So, I don't know much about linkers, but I do know the common refrain of "you should be using the system linker". However, zig appears to be taking the strong approach of controlling the linking itself. Right now they rely on lld, but they have already been working on their own linker and the plan is to eject LLVM from zig itself.
-
rzezeski
So I guess I'm wondering, what are the big reasons for insisting on the system ld(1)? I have notions of why we take this stance, but I'd love to hear specifics. If zig is going this route is it just doomed to never run on illumos?
-
rzezeski
One thing I need to ask the zig folks: would they allow an escape hatch to farm out to the system ld. That's code I could write. However, I think I've also heard that you typically want to use gcc as a front-end to ld for various reasons. So perhaps even directly invoking system ld(1) is a bit of work IF the zig folks are amenable to such a tactic on illumos.
-
rzezeski
I'm also wondering if, in the meantime, I can get zig's embedded ld.lld to build a proper executable on illumos. Is that possible?
-
jbk
for building, i suspect changing `-dynamic-linker /usr/lib/ld.so.1` to `-dynamic-linker /usr/lib/amd64/ld.so.1` would probably help
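-
(editor's note)
A minimal sketch of how one might check for the mismatch described above: it reads the ELF class from the header and the PT_INTERP path from the program headers, using only the standard ELF layout. The helper names are made up for illustration, and the "64-bit interpreter lives under an amd64 or 64 directory" rule is only an assumption drawn from the paths quoted in this chat, not a statement of illumos policy.

```python
import struct

ELFCLASS32, ELFCLASS64 = 1, 2
ELFDATA2LSB = 1
PT_INTERP = 3


def elf_class_and_interp(data: bytes):
    """Return (bits, interpreter path or None) for an ELF image in memory."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    endian = "<" if ei_data == ELFDATA2LSB else ">"

    if ei_class == ELFCLASS64:
        bits = 64
        # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38
        (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    elif ei_class == ELFCLASS32:
        bits = 32
        # ELF32 header: e_phoff at 0x1C, e_phentsize at 0x2A, e_phnum at 0x2C
        (e_phoff,) = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    else:
        raise ValueError("unknown ELF class %d" % ei_class)

    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", data, off)
        if p_type != PT_INTERP:
            continue
        if bits == 64:
            # ELF64 phdr: p_offset at +8, p_filesz at +32
            (p_offset,) = struct.unpack_from(endian + "Q", data, off + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", data, off + 32)
        else:
            # ELF32 phdr: p_offset at +4, p_filesz at +16
            (p_offset,) = struct.unpack_from(endian + "I", data, off + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", data, off + 16)
        raw = data[p_offset:p_offset + p_filesz]
        return bits, raw.rstrip(b"\0").decode()
    return bits, None


def interp_matches_class(bits: int, interp: str) -> bool:
    """Heuristic (assumption): 64-bit objects name an interpreter under
    an amd64/ or 64/ directory, 32-bit objects do not."""
    looks_64 = "/amd64/" in interp or "/64/" in interp
    return looks_64 == (bits == 64)
```

To try it on a real file, pass the file's bytes, e.g. `elf_class_and_interp(open(path, "rb").read())`, then feed the result to `interp_matches_class`.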
-
rzezeski
jbk: yea that's one thing I wrote down in my notes
-
jbk
as for built-in ld vs. not.. rich is probably the best person to know (though I thought i saw he's going to be out of pocket for a bit).. if I had to guess, maybe similar to 'use libc' vs. 'issue syscalls directly'
-
rzezeski
I guess part of my ignorance here is that I've oft heard that we should always be using the illumos ld(1). And I get why we might say that. But I'm also wondering if that's because other linkers just straight up cannot build an executable that runs on our system or if it's more for subtle reasons we say this.
-
jbk
in that it might allow us to insulate over some lower-level stuff that might otherwise cause issues
-
jbk
(but as I said, it's more of a guess than anything, so could easily have it wrong)
-
jbk
also i suppose, it removes some amount of potential versioning hell if worrying about specific versions of third party linkers
-
rzezeski
Yea, I understand our stance around "we provide the system API/ABI for you", and I imagine part of that is the linker part. But zig seems to want to draw a hard line in the sand here, and I've also noticed an uptick of other languages (like Rust) that seem to want to experiment more with newer linkers.
-
rzezeski
I came across some issue in Rust that was basically "use lld everywhere", but obviously they haven't actually fully committed to that path at this point
-
danmcd
@jclulow There are two big types to test:
-
danmcd
1.) 710 family: X710 (10), XXV710 (25), XL710 (40)
-
danmcd
2.) 722 family: X722 (I have this and THIS is the one that exhibits problems). There might be other X722 actual boards, but I'm not sure. X722 is often found on motherboards.
-
jbk
could probably even do that with a dladm private property as well
-
» rzezeski tries to pretend not to see i40e talk
-
rzezeski
danmcd: that fix certainly feels vaguely familiar, many many years ago I remember discussing with Robert that I need a one-line change that Linux had made. But because of other work things I never got around to it. And that _might_ be the change I was thinking of.
-
jbk
rzezeski: another aspect might be that we might want extra 'stuff'.. e.g istr talk of eventually integrating the ctf stuff into ld so you don't need to do any post processing
-
danmcd
rzezeski: It's not documented and Linux has this "if not VF do this simple thing". PITA. See the fix in
kebe.com/~danmcd/webrevs/13230-newtry
-
rzezeski
danmcd: yea that's what I was referring to, though I feel like the change I wanted to make was during init, not during set promisc
-
rzezeski
but it had something to do with the default VSI
-
rzezeski
and I remember looking at Linux and being like "oh we have to do that too"
-
rzezeski
and I never got around to it
-
danmcd
Eeesh I hope it's not a both thing?
-
rzezeski
well the problem is this was like 4-5 years ago? lol
-
danmcd
X722 experiments show the extra dups disappear. The other cases need testing, though.
-
rzezeski
I have an X710, but that's just not something I'm willing to jump into right now, something like that would probably take me a week and I just don't have that desire right now. Maybe if I was getting paid for it.
-
rzezeski
jbk: yea CTF as part of ld would certainly be a good reason not to rewrite all this in zig, though they seem to be determined to own all codegen/linking...
-
jbk
since that also means things like dtrace SDT probes can work
-
jbk
(and for some reason, i wanted to typo that like it's an embarrassing disease :P)
-
jclulow
rzezeski: Basically, the system linker is the one we can actually control, support, improve, debug, etc
-
jclulow
The suggestion to use the compiler driver (e.g., gcc) instead of ld directly is, I feel, mostly because it's probably easier when what you already have are C-like objects
-
jclulow
There is no reason not to invoke ld itself, if you know what you're doing with it, etc
-
jclulow
There's no reason other linkers can't be made to improve the fidelity and correctness of their outputs for illumos
-
jclulow
Just ... it's a lot of work, and if we're going to improve a linker I would personally improve our own haha
-
jclulow
Go also emits their own ELF binaries, and they basically mostly work as well as can be expected at this point -- Zig can certainly do the same if they want to
-
rzezeski
jclulow: makes sense, and yea, my gut feeling is: other linkers can probably build executables for illumos, but it might not always produce the best result and you should prefer the system's ld
-
rzezeski
I did NOT know that about Go
-
jclulow
I mean, Go and Zig are philosophically quite close together as far as I can tell
-
jclulow
We've also definitely had bullshit problems because of Go's DIY linker, but it's all just software. People can fix things.
-
rzezeski
Yes I think in some ways, and I knew go avoided libc, but I didn't realize it emits its own ELF. Though I guess I shouldn't be surprised by that.
-
jbk
i've not looked closely at zig, but for the love of god, i hope it has a better ffi story than go
-
jclulow
I think they still get the OSABI value wrong for example? But it's also probably not _critical_ because they also don't do any of our extensions or features lol
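-
(editor's note)
For anyone wanting to see what OSABI value a given binary carries, here is a small sketch that reads the `EI_OSABI` byte (index 7 of `e_ident`) and maps a few well-known constants from the ELF specification. The function name is made up for illustration. Which value illumos tooling expects is not stated in this chat, so the sketch only reports what is in the file.

```python
EI_OSABI = 7  # index of the OS/ABI byte within e_ident

# A few well-known values from the ELF specification; not exhaustive.
OSABI_NAMES = {
    0: "ELFOSABI_NONE (System V)",
    3: "ELFOSABI_LINUX",
    6: "ELFOSABI_SOLARIS",
    9: "ELFOSABI_FREEBSD",
}


def elf_osabi(data: bytes):
    """Return (numeric value, name) of the EI_OSABI byte of an ELF image."""
    if len(data) <= EI_OSABI or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    value = data[EI_OSABI]
    return value, OSABI_NAMES.get(value, "unknown")
```

Reading the first 16 bytes of a file is enough input, e.g. `elf_osabi(open(path, "rb").read(16))`.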
-
rzezeski
jclulow: in any event, those are very helpful answers
-
jclulow
jbk: It would be an impressive feat to come out worse FFI-wise
-
rzezeski
thanks
-
jbk
true
-
rzezeski
zig FFI is top-notch AFAICT
-
jbk
there was an article that described it as 'golang is an island'.. which is i think apt
-
jclulow
ELF is also pretty bloody wishy-washy
-
jbk
rzezeski: oh thank god
-
jclulow
rzezeski: Even if I'm not looking to write any Zig right now, I appreciate first-class support in up-and-coming runtimes and toolchains, so thanks for digging in there!
-
jclulow
I know that, e.g., ncdu (a utility I love a lot) has been rewritten in Zig
-
jclulow
I would be sad to not have ncdu
-
rzezeski
jclulow: Yea, this is stuff I don't know much about, and it made me wonder if the SYSV ABI covers some of this stuff, but not nearly enough...kind of like POSIX
-
jclulow
What I'm definitely sure of is that we could make a lot of this clearer on
illumos.org/docs
-
jclulow
e.g., it would be good to refer to what actually constitutes our ABI etc
-
nbjoerg
heh expecting useful changes to the SYSV ABI
-
jclulow
I'm sure rmustacc would have thoughts about that
-
nbjoerg
nowadays it is just legalizing GCC bugs
-
rzezeski
jclulow: haha, oh I'm sure he would
-
jclulow
I think it will be even more important now that we're gradually minting a new platform (ARM)
-
andyf
rzezeski - did you look at how zig 0.9.1 is currently built from omnios extra? lld is pulled in there.
-
andyf
hadfl added that, he might know more.
-
jbk
speaking of languages, i should look at re-porting pony again.. i almost had it working on illumos a long time ago, but ran into a cpp macro abuse from hell issue that i could never completely untangle.. though i think they (rightly) got rid of that
-
rzezeski
andyf: yea, I'm using a modified version of that to build 0.11.0
-
rzezeski
0.9.1 predates the self-hosted compiler
-
jclulow
danmcd: Do you think _any_ single board in the 710 family is enough to make you happy?
-
andyf
That could well be why it's still on 0.9.1
-
rzezeski
andyf: yea, there has been at least one other person that tried to get self-hosted working and failed, we'll see what happens to me
-
rzezeski
maybe not "failed", but got stuck
-
andyf
I know some people have been experimenting with using the mold linker with rust for building illumos executables, so other linkers do work in some way or other.
-
danmcd
jclulow: Honestly, the spanning-space for me would be:
-
danmcd
- Any 710
-
danmcd
- Any 722 with X557 PHY
-
danmcd
- Any 722 standalone board (if it exists).
-
danmcd
AIUI only the middle one manifests the problem. :upside-down:
-
sjorge
as far as we know 😅
-
danmcd
I have a 710 mobo edition (with X557 PHY) on Kebecloud. I know where I can find a 722 with X557 phy (either sjorge or there's an MNX box with it), the 722 standalone may be harder.
-
danmcd
sjorge: I've tested 710 with X557 phy on Kebecloud.... doesn't exhibit the problem.
-
danmcd
(But the combinatoric explosion per board... I've only done half of it (3/6 regression tests stated on that email) on the X710/X557 mobo edition.)
-
danmcd
I should update 13230 with explosion...
-
sjorge
ebay seems to indicate there are X722-DA4's but those are SFP not RJ45
-
sjorge
found a DA2 on ebay but at the 2nd hand price point it's a bit steep for me
-
sjorge
they are all SFP+ it seems
-
danmcd
I could use an SFP board anyway so it can speak with another SFP board (Mellanox ConnectX 6)...
-
danmcd
I've updated fenix illumos#13230
-
fenix
BUG 13230: i40e has duplicate traffic when used with bhyve/snoop running (New)
-
danmcd
See the test plan yourself.
-
sjorge
ebay.com prices are a tiny bit better, but 400-500 range is still a bit high for me to buy one
-
sjorge
what's a LOM? those are way cheaper
-
nomad
lights out management?
-
nomad
aka ilom or ipmi or iDRAC or ...
-
sjorge
that doesn't make sense from a NIC perspective
-
sjorge
it seems to be a formfactor
-
nomad
ILOM is frequently a dedicated NIC, sometimes on a daughter card.
-
» nomad has no further insight so shuts up again.
-
ptribble
LOM = Lan On Motherboard ?
-
jbk
lights out management
-
nomad
I'm glad I'm not the only one who said that. :)
-
jbk
though if you've ever used intel's.. i think 'lots of misery' might also be appropriate
-
jbk
i've not looked to see what chip they actually use for it on their server systems
-
jbk
but if it turned out that they had a whole bunch of 8088s laying around and decided to repurpose them as their BMCs, it would not shock me
-
nbjoerg
nah, they sold most of those to NASA already
-
jbk
browsing the internet over a 56k modem is generally more responsive than their BMCs
-
sjorge
7%7Ctkp%3ABFBMspvW18li
-
sjorge
oof ok that link is horrible
-
sjorge
sorry
-
sjorge
Normal PCIe seems to be 4 or 5x that to start even 2nd hand
-
sjorge
Perhaps it's some HP specific formfactor, wouldn't put it past them
-
sjorge
Seems to indeed stand for Lan on Motherboard
-
sjorge
TIL
-
jclulow
sjorge: wow
-
sjorge
Was kind of hoping it was just PCIe, as that price isn't too bad but a reg X722-DA2 or DA4, oh boy.