-
sommerfeld
so I ran into an odd bug with checksum offload yesterday. Turns out that the IPv6 raw sockets API includes IPV6_CHECKSUM to have the stack compute a checksum of the outgoing packet after source address selection (see section 3 of RFC3542).
-
sommerfeld
But it would appear that this is busted in our stack for i40e - packet gets dropped by the driver; looks like it's a path which bumps the tx_hck_nol4info kstat (haven't isolated further just yet)
-
sommerfeld
patching dohwcksum to 0 with MDB forces IP to compute the checksum and things start working.
-
sommerfeld
not sure if it's a missing piece of the mac_ether_offload_info infrastructure or a bug in i40e
-
sommerfeld
I will be filing a bug but would appreciate it if anyone has any insight into what should be happening here..
-
jbk
someone (patrick?) did some work recently in that area... I don't know if that might address it (or maybe related depending on on current the code is you're running)
-
jbk
err on how current
-
sommerfeld
Looks like that work landed in february and is in the bits I'm running.
-
jbk
or am I thinking of stuff in an RFD
-
sommerfeld
there's IPD 55 & 56 which extend checksum and LSO to certain tunneling protocols
-
rmustacc
sommerfeld: So to understand, our expectation is that the hardware will calculate it or software will? Is this about calculating an ICMP checksum?
-
sommerfeld
OSPF on IPv6, using IPV6_CHECKSUM to specify the packet offset where the checksum lands. The code in ip_output_cksum_v6() sets HCK_PARTIALCKSUM and leaves computing it to the driver.
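For reference, a minimal sketch of what "computing it in software" amounts to for a raw IPv6 socket with IPV6_CHECKSUM set: sum the IPv6 pseudo-header plus the payload and store the complement at the application-supplied offset (12 for an OSPFv3 header). The function names here are hypothetical and this is not the illumos implementation; it only illustrates the arithmetic, and it assumes payloads small enough that a 32-bit accumulator cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add big-endian 16-bit words to a running sum; an odd tail byte is zero-padded. */
static uint32_t
sum16(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return (sum);
}

/* Fold carries back in to get a 16-bit one's-complement sum. */
static uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/* Folded sum over the IPv6 pseudo-header (src, dst, length, next header) and payload. */
static uint16_t
raw6_sum(const uint8_t src[16], const uint8_t dst[16], uint8_t proto,
    const uint8_t *payload, size_t len)
{
	uint8_t tail[8] = { 0 };
	uint32_t sum = 0;

	tail[0] = (uint8_t)(len >> 24);
	tail[1] = (uint8_t)(len >> 16);
	tail[2] = (uint8_t)(len >> 8);
	tail[3] = (uint8_t)len;
	tail[7] = proto;		/* bytes 4..6 stay zero */

	sum = sum16(src, 16, sum);
	sum = sum16(dst, 16, sum);
	sum = sum16(tail, sizeof (tail), sum);
	sum = sum16(payload, len, sum);
	return (fold16(sum));
}

/*
 * Hypothetical helper: zero the 2-byte field at cksum_off, compute the
 * checksum, and store it there. Returns -1 for an odd or out-of-range offset.
 */
static int
raw6_stuff_cksum(const uint8_t src[16], const uint8_t dst[16], uint8_t proto,
    uint8_t *payload, size_t len, size_t cksum_off)
{
	uint16_t ck;

	if ((cksum_off & 1) != 0 || cksum_off + 2 > len)
		return (-1);
	payload[cksum_off] = payload[cksum_off + 1] = 0;
	ck = (uint16_t)~raw6_sum(src, dst, proto, payload, len);
	payload[cksum_off] = (uint8_t)(ck >> 8);
	payload[cksum_off + 1] = (uint8_t)(ck & 0xff);
	return (0);
}
```

A receiver can verify the result by running raw6_sum() over the stuffed packet: a valid checksum makes the folded sum come out as 0xffff.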
-
sommerfeld
so raw socket, not ICMP
-
sommerfeld
("nxge_cksum_workaround" doesn't apply because it's not ICMP..)
-
rmustacc
OK. Gotcha.
-
rmustacc
I expect it shouldn't be going to hardware then.
-
sommerfeld
mac_ether_offload_info sees a proto it doesn't know, doesn't set l4hlen, and ignores DB_CKSUMSTUFF(mp).
-
rmustacc
And we need to do it in software.
-
sommerfeld
i40e sees HCK_PARTIALCKSUM without l4info and throws up its hands
-
rmustacc
Specifically the driver specifying HCKSUM_INET_PARTIAL indicates only TCP/UDP and we've seen this a bunch.
-
rmustacc
That is, there are other devices where they don't support ICMPv6.
-
rmustacc
sommerfeld: I wonder if I did the wrong checksum bit there. But i40e was a long time ago.
-
rmustacc
Given we don't actually use start and the descriptors don't really support partial configurations.
-
sommerfeld
so perhaps the fix would go in ip_output_cksum_v6() and (approximately) change the "proto != ICMPV6" to "(proto == UDP || proto == TCP)"
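A rough sketch of the change being proposed, written as two standalone predicates rather than as the real ip_output_cksum_v6() code. The constant and function names are hypothetical; only the protocol numbers (TCP 6, UDP 17, ICMPv6 58, OSPF 89) are standard. The point is that the current test is a deny-list that lets unknown protocols such as OSPF through to hardware, while the proposed test is an allow-list.

```c
#include <stdbool.h>
#include <stdint.h>

/* IANA protocol numbers; the macro names are made up for this sketch. */
#define	PROTO_TCP	6
#define	PROTO_UDP	17
#define	PROTO_ICMPV6	58
#define	PROTO_OSPF	89

/* Approximation of the current rule: offload anything that is not ICMPv6. */
static bool
may_offload_old(uint8_t proto)
{
	return (proto != PROTO_ICMPV6);
}

/* Approximation of the proposed rule: offload only what drivers understand. */
static bool
may_offload_new(uint8_t proto)
{
	return (proto == PROTO_TCP || proto == PROTO_UDP);
}
```

Under the old rule an OSPF packet on a raw socket is handed to the driver with a partial-checksum request; under the new rule it falls back to the software path.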
-
sommerfeld
looks like the DB_CKSUM* interface is only consumed by nxge
-
rmustacc
I'd have to go back through the datasheet, but I'm not sure if we can do this checksum action with i40e.
-
rzezeski
sommerfeld: that's because nxge is a very old driver, the newer drivers use meoi to compute that stuff
-
sommerfeld
yeah, but every packet pays the cost of filling in fields that no driver but nxge will ever look at..
-
rzezeski
sorry, it's been a while since I've been working on mac/drivers, I might have been wrong to point at meoi, I would have to refresh my memory on all the different pieces, but you can clearly see mac_provider APIs making use of DB_CKSUM. E.g., drivers make use of mac_hcksum_get() to help fill out Tx descriptors. Since nxge is so old it just uses DB_CKSUM* directly.
-
sommerfeld
sorry, I misread "git grep" output. yes, I see that they are in fact still used.
-
rzezeski
There are some dragons in the checksum/capab handling stuff, and it can get confusing fast. I'll be ramping up on all this again in the next couple of weeks and can probably help review whatever fix comes out of this.
-
rmustacc
sommerfeld: Do you have IXAF_SET_RAW_CKSUM set in the ixa when you get to ip_output_cksum_v6?
-
rmustacc
I guess it's not clear to me which path you're in in that code.
-
sommerfeld
yes, I believe IXAF_SET_RAW_CKSUM is set on the packet.
-
wiedi
I've also noticed something broken with ipv6 router advertisements on igc that worked on other cards before, but didn't have time to debug yet. Starting snoop to switch the card into promisc mode makes it work
-
fenix
→ BUG 17593: We should not attempt to offload checksums for raw sockets with IPV6_CHECKSUM set (New)
-
sommerfeld
rzezeski: thanks in advance for taking a look at this..
-
divlamir
Feeling nostalgic about OpenSolaris, it's been more than 15 years since.. What would be the distribution to install today? OmniOS, OpenIndiana? A general purpose one is what I am after
-
divlamir
I'll eventually try more than one, but advice on which to try first is welcome
-
jbk
hrm..
-
jbk
at least on a few NICs, dladm show-linkprop appears to always show the current MTU as the default
-
jbk
e.g. you enable jumbo frames, the 'default' value becomes 9000
-
jbk
shouldn't the 'VALUE' column show 9000, 'DEFAULT' 1500 given pretty much every NIC does in fact default to 1500, and POSSIBLE show <MIN>-<MAX> ?
-
jbk
mlxcx seems to be one
-
richlowe
it certainly seems odd
-
rmustacc
wiedi: If you get more info on that let me know and I'm happy to help as I can. I may have screwed something up there.
-
rmustacc
Well, let's just get a list of them jbk and we can file bugs and fix it.
-
wiedi
thanks, will try to get some pcaps and open a ticket with more details when I have a moment :)
-
richlowe
the only place I have a 9k default is vnics, but I don't have much actual hardware
-
jbk
so ok.. i just wanted to be sure that wasn't the intended behavior (sounds like it's not)...
-
jbk
(just noticed it from output from a customer, but waiting on feedback to determine if it's a renamed nic or a vnic since it's not obvious from the show-linkprop output they gave)
-
rmustacc
vnics and other things will be a different story
-
jbk
yeah, that's why I want the details for the other links shown in the output
-
jbk
the mlxcx ones are fairly obvious
-
jbk
hopefully I'll get that in a bit...
-
danmcd
If I gcore(1) a process, is there a snappy way with mdb to locate *callers* of a function?
-
danmcd
Something like "dis-all-the-functions ! grep $FUNCTION_NAME", or something more clever.
-
richlowe
what is the function?
-
richlowe
if it's non-local, there will be a relocation for every reference
-
richlowe
and you can pull them out of .SUNW_reloc and match them in the symtab
-
tsoome
isnt it easier to dig from source? or there is no source?
-
sommerfeld
wiedi: I have a machine with some spare igc's I could experiment with. (currently only using one of them as a v4-only interface; a few months ago it successfully got a DHCPv6 address and default v6 route from my ISP)
-
richlowe
tsoome: function pointers
-
richlowe
tsoome: not that my solution helps much there either
-
fenix
→ CODE REVIEW 4374: 17593 Don't offload IPV6_CHECKSUM on raw sockets (NEW) | illumos.org/issues/17593
-
jclulow
haha wow
-
jclulow
divlamir: If you mean a general purpose machine without graphics, I would say OmniOS. If you need a desktop environment, though, I think Tribblix or OpenIndiana are the options there generally.
-
richlowe
IPA on the console, finally we can add pronunciations to our manual pages.
-
richlowe
alanc can email the austin group asking for canonical pronunciations for standards things.
-
richlowe
I'm sure it'll be good fun
-
danmcd
@richlowe C_DecryptInit and it's in in.iked or libike (closed-source, sorry @tsoome )
-
tsoome
ah. right, I already forgot about it...
-
» danmcd still runs a punchin server on kebe.com
-
tsoome
dtrace with ustack() for some time?:)
-
danmcd
Oh damn, and it's Function Pointers Everywhere (TM) in libike and in.iked. <sigh> I'll just have to do literally what tsoome ^^^ just said.
-
danmcd
Oh... NVM. I won't have to worry about my problem anyway... no support of AES-[GC]CM in in.iked.
-
tsoome
time to look for something like openiked.org ?:)
-
» tsoome hides
-
jbk
i almost had transport mode working, but i never had a chance to debug it
-
jbk
tunnel mode after that wouldn't have been _too_ bad I don't think
-
jbk
though i guess transport mode doesn't get as much use which is unfortunate
-
jbk
basically you can manage over-the-wire encryption on a per-host basis
-
jbk
instead of having to worry about certs for every single app (which may or may not have bespoke ways of dealing with that)
-
jbk
IIRC, the problem was dealing with the incongruence between IKEv2 traffic selectors and how you represent those in the kernel, and trying to basically 'negotiate'
-
jbk
(kernel wants ADDRESS/MASK (or /PREFIXLEN), IKEv2 does START ADDR-END ADDR, so you have to figure out the intersection(s) that can be expressed as an address+mask)
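To illustrate the mismatch, here is a minimal sketch of turning one START-END range into the smallest list of address/prefix-length pairs that covers it exactly. It is IPv4-only, ignores ports and protocols, and uses hypothetical names (prefix4_t, range_to_prefixes); it is not taken from any IKEv2 daemon. A single range can need many prefixes (up to 62 for IPv4), which hints at why negotiating against a mask-based policy gets awkward.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: an IPv4 address in host byte order plus a prefix length. */
typedef struct {
	uint32_t	addr;
	uint8_t		plen;
} prefix4_t;

/*
 * Cover [start, end] with aligned prefixes, writing at most max entries to
 * out. Returns the number of prefixes required, which may exceed max.
 */
static size_t
range_to_prefixes(uint32_t start, uint32_t end, prefix4_t *out, size_t max)
{
	uint64_t cur = start;	/* 64-bit so stepping past 255.255.255.255 is safe */
	size_t n = 0;

	if (start > end)
		return (0);

	while (cur <= end) {
		uint8_t plen = 32;

		/* Widen the block while it stays aligned and inside the range. */
		while (plen > 0) {
			uint64_t size = 1ULL << (32 - (plen - 1));

			if ((cur & (size - 1)) != 0 || cur + size - 1 > end)
				break;
			plen--;
		}
		if (n < max) {
			out[n].addr = (uint32_t)cur;
			out[n].plen = plen;
		}
		n++;
		cur += 1ULL << (32 - plen);
	}
	return (n);
}
```

For example, 10.0.0.1-10.0.0.6 cannot be expressed as one address+mask; this sketch splits it into 10.0.0.1/32, 10.0.0.2/31, 10.0.0.4/31 and 10.0.0.6/32.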
-
richlowe
danmcd: ustack would be the easiest way, if you can provoke it with sufficient coverage
-
gitomat
[illumos-gate] 17554 Add -p flag for "smbadm lookup" for parsable output -- Chao Wang <cwang⊙rc>
-
gitomat
[illumos-gate] 17556 SMB client test memory leak -- Gordon Ross <gwr⊙rc>
-
gitomat
[illumos-gate] 17557 Memory leak in PKCS11 C_DecryptInit with AES_CCM -- Gordon Ross <gwr⊙rc>