00:08:18 yuripv: so it doesn't have an entire network switch (complete with LLDP and DCBx) running on the nic? :)
08:07:42 [illumos-gate] 16739 dladm: dangling pointer 'buf' to 'bbuf' may be used -- Toomas Soome
12:58:01 [illumos-gate] 16688 libipadm: storing the address of local variable -- Toomas Soome
15:54:47 [illumos-gate] 16728 libppt: not all devices have a subsystem id and subsystem vendor id -- Hans Rosenfeld
16:01:00 jbk: I didn't have a real plan, but probably would just fall back to something like single MSI/INTx means single rx/tx pair and group.
16:01:03 Just to keep life simple.
16:02:25 I think the HWRM interface has its limitations too. The grass isn't all greener on the other side.
18:55:20 yep, but it looked just a bit more human-readable to me :)
18:56:25 BTW, does it make sense to support INTx these days?
18:58:13 For a new PCIe card like this, likely not.
18:58:31 I'm going to say "maybe"
18:58:35 pcplusmp as an interrupt distributor only gave us a very small number of interrupts.
18:58:46 We should definitely still continue to support INTx.
18:58:50 'cos robert is thinking very x86-ily
18:59:15 Whether a driver uses it or not is very dependent on it.
18:59:22 But virtualization and other cases are still challenging.
18:59:59 Yeah, it's true I am. I guess if we really have run out because someone has loaded up the system and we only have one CPU's worth of IPL4 or IPL5 interrupts, you're going to have a bad day.
19:00:09 In other cases like virtualization, INTx is also still important.
19:00:30 everything is screwy, and you'll never escape
19:00:42 I see. FF6 doom train no escape.
19:01:16 If I were working on it I would probably treat non-MSI/X as basically one rx ring/one tx ring/one group.
19:01:55 yuripv: I think the Intel datasheets are a bit better than my memory of some of those headers.
19:02:27 But IIRC we also have the problem that you basically have to send the various rx and tx rings to a completion queue in bnxt.
19:02:44 And that the interrupt can only be enabled or disabled on a per-MSI/X basis and not a per-CQ basis.
19:02:49 But it's been a long time.
19:34:15 i guess the thing with the intel nics is what OSes are actually using all of that complexity with the multiple forms of virtualization, the embedded network switch, all the network services (LLDP, DCBx, ...) that run on the NIC, etc. vs just as a plain old NIC w/ a bunch of tx and rx rings?
19:35:11 I mean we use some of them like the VSIs and related.
19:35:37 Put differently, bnxt/hwrm was too simple to do the more complex rings/groups in the normal config we have in ixgbe and i40e.
19:35:41 Though not impossible.
19:35:51 But as for who uses those other things, telcos.
19:35:58 If you're using DCB, you need LLDP.
19:36:22 Not saying it's the way I would go if I were designing the NIC, but if we had access to programming the pipeline that'd be valuable.
19:37:18 I mean at least the E810 supports the OCP APIs and you do get at least some of that programming IIRC (i've read through it, but at almost 3000 pages long, I don't remember all of it :P)
19:37:43 It's certainly probably changed since I talked with the team in early 2020.
19:38:07 Look, we may complain about length and what's left out, but trust me, that's much better than the no-datasheet case.
19:41:43 I have a datasheet for bnxt, and it helped immediately, as the BARs you had in there from FreeBSD (I guess?) were incorrect, so yes, having datasheets really helps :D
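[Editor's sketch of the single-interrupt fallback discussed at 16:01:00 and 19:01:16 above: allocate MSI-X if it's there, otherwise fall back to a single MSI or INTx vector and expose just one rx ring, one tx ring, and one group. The driver itself is hypothetical; mynic_t, MYNIC_MAX_VECS, and mynic_alloc_intrs are invented names, and only the standard illumos DDI interrupt routines are real interfaces.]

/*
 * Hypothetical illumos NIC driver interrupt setup (not real code).
 */
#include <sys/ddi.h>
#include <sys/sunddi.h>

#define	MYNIC_MAX_VECS	8

typedef struct mynic {
	dev_info_t		*mn_dip;
	ddi_intr_handle_t	mn_intrs[MYNIC_MAX_VECS];
	int			mn_intr_type;
	int			mn_nintrs;
	uint_t			mn_nrx_rings;
	uint_t			mn_ntx_rings;
} mynic_t;

static int
mynic_alloc_intrs(mynic_t *mn)
{
	int types, actual, ret;

	if (ddi_intr_get_supported_types(mn->mn_dip, &types) != DDI_SUCCESS)
		return (DDI_FAILURE);

	if (types & DDI_INTR_TYPE_MSIX) {
		ret = ddi_intr_alloc(mn->mn_dip, mn->mn_intrs,
		    DDI_INTR_TYPE_MSIX, 0, MYNIC_MAX_VECS, &actual,
		    DDI_INTR_ALLOC_NORMAL);
		if (ret == DDI_SUCCESS && actual > 0) {
			mn->mn_intr_type = DDI_INTR_TYPE_MSIX;
			mn->mn_nintrs = actual;
			/*
			 * Keep vector 0 for admin/async events and give
			 * each remaining vector an rx/tx ring pair.
			 */
			mn->mn_nrx_rings = (actual > 1) ? actual - 1 : 1;
			mn->mn_ntx_rings = mn->mn_nrx_rings;
			return (DDI_SUCCESS);
		}
	}

	/*
	 * Single MSI or INTx: keep life simple, one rx ring, one tx
	 * ring, one group.
	 */
	if (types & DDI_INTR_TYPE_MSI)
		mn->mn_intr_type = DDI_INTR_TYPE_MSI;
	else if (types & DDI_INTR_TYPE_FIXED)
		mn->mn_intr_type = DDI_INTR_TYPE_FIXED;
	else
		return (DDI_FAILURE);

	ret = ddi_intr_alloc(mn->mn_dip, mn->mn_intrs, mn->mn_intr_type,
	    0, 1, &actual, DDI_INTR_ALLOC_NORMAL);
	if (ret != DDI_SUCCESS || actual != 1)
		return (DDI_FAILURE);

	mn->mn_nintrs = 1;
	mn->mn_nrx_rings = 1;
	mn->mn_ntx_rings = 1;
	return (DDI_SUCCESS);
}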
19:42:28 i just mean, I think it does give you some of the access to the pipeline (which is new w/ it)... i just can't remember the details despite reading the specs (because of the size of them)
19:42:38 since it wasn't an area i'm focused on atm
19:42:58 and it didn't look like it was needed to get things working, more of 'if you want to get fancy with things'
19:43:44 I guess, put differently, my currently ideal nic would have a lot more than the X520, as reliable as it is.
19:44:06 I am so happy to exist in a world where I have never had to think about my "ideal NIC"
19:44:17 jbk: hyperscalers are one market doing that sort of thing so they can (among other things) resell the whole Intel CPU as "bare metal" but not give the customer raw network access.
19:47:32 yeah, but like are Linux and FreeBSD actually using all of those various virtualization features and taking advantage of the built-in switch? or are they just using it as a really fast nic w/ a bunch of rings?
19:47:39 or windows?
19:48:04 I mean, we use the built-in switch in i40e.
19:48:32 And a lot more people in those worlds that are buying these aren't just using the default net setup.
19:48:53 Again, not defending the E810 choices.
19:51:07 we do? I mean aside from just the fact you have to touch it to make traffic flow? I don't think there are any APIs in mac that'd let it basically hw offload vnic support or handle switching packets between instances on the same host (I thought mac will basically hairpin that before it can hit the driver)
19:52:12 I mean, we are using VSIs and VEBs to make our reality mesh.
19:52:22 It's not offloading, but we're using it.
19:53:06 jbk: my impression is that a fair number of the weirdo features get used via userspace networking. Things like: https://research.google/pubs/snap-a-microkernel-approach-to-host-networking/
19:53:14 But I'm not sure it's really worth the distinction, as we're just doing what we need.
19:53:29 Yeah, there's definitely a lot that gets used that way.
20:08:03 jbk: the holy grail for some is to get the host/hypervisor stack out of the datapath for guest traffic. You give each guest some number of specially configured tx/rx rings and their traffic gets encapsulated by the NIC directly onto the wire.
21:13:26 that makes sense
21:15:00 yeah, i mean the nic allows you to do that which is neat, i was just wondering if there is any hypervisor that's actually making use of it though? e.g. is there any sort of kernel API on linux or windows or such that'd allow you to hook into all of that stuff?
21:27:58 I mean, in the google case that isn't going to matter
21:47:29 jbk: most of the detail work in this sort of thing would likely be in privileged userspace code talking to device control queues with minimal kernel involvement.
22:11:14 I think the thing to keep in mind is that this is all based on IEEE standards. And that it was created to solve issues that developed when you moved the VM network switching from the hypervisor to things like VMDq and SR-IOV, providing VFs to guests and taking the hypervisor out of play for data flow. And I think the standards are an attempt to make it easier to consistently deploy ACL/QoS rules across the virtual network.
22:11:51 That said, I have no idea how widely used they are or who uses them. And certainly no one is using this stuff in illumos-land.
22:12:26 But at this point, if you want to program these new Intel NICs, this is the language you need to speak.
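[Editor's sketch to make the bnxt constraint from 19:02:27-19:02:44 concrete: rx and tx rings post completions to a shared completion queue, each completion queue is serviced by one MSI-X vector, and arming/disarming happens only at the vector, so quiescing one ring's CQ also silences every other ring sharing that vector. The types below are invented for illustration and are not bnxt's actual HWRM structures.]

#include <stdint.h>

typedef struct ring {
	uint16_t	r_id;
	uint16_t	r_cqid;		/* completion queue this ring posts to */
} ring_t;

typedef struct cq {
	uint16_t	cq_id;
	uint16_t	cq_vector;	/* MSI-X vector servicing this CQ */
} cq_t;

typedef struct vec {
	int		v_armed;	/* enable/disable lives here, not per CQ */
} vec_t;

/*
 * Quiescing a single ring means disarming the vector behind its CQ,
 * which also stops completion interrupts for every other ring whose
 * CQ shares that vector.
 */
static void
ring_intr_disable(const ring_t *r, const cq_t *cqs, vec_t *vecs)
{
	vecs[cqs[r->r_cqid].cq_vector].v_armed = 0;
}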
22:20:53 I will also say that, as much as I have cursed i40e in the past, now that I have experience writing the ENA driver and getting Chelsio T7 up: Intel's programming manuals are pretty great. They might be wrong in places, and you should tread carefully; but they do a good job of describing the various aspects of how to program their NICs.