-
jbk
it sounds like it was maybe a research thing, so maybe that would come later?
-
jbk
one of these days when there's > 24h in one, I still have a copy of that branch using doors for FUSE on illumos that'd be nice to revive and play with.. i think as long as libfuse keeps the same API, it'd be largely transparent to any fses
-
sjorge
So IPv6 still mostly broken on newer lx images it seems :(
-
sjorge
I think we're missing some stuff, there seem to be multiple breakages
-
sjorge
having an IPv6 NS will break all name resolution, and just using IPv4 will still make stuff not connect in a lot of cases
-
jbk
rmustacc: if a driver returns mblks in its tx routine (i.e. couldn't transmit them), do you know if mac always retries later, or if it propagates any 'tx failed' errors upstack?
-
rmustacc
When someone performs a mac tx it depends on what they pass.
-
rmustacc
In general for TCP the backpressure will go all the way back up.
-
rmustacc
And will get to sockfs.
-
rmustacc
However, this is specifically where MAC_DROP_ON_NO_DESC and the related flags come in.
-
rmustacc
So for example, the IP fast path mode via dld will get the cookie and set that the queue is full.
-
rmustacc
I should add MAC will consider the tx queue blocked until it receives a call to mac_tx_update() or the ring equivalent.
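
The behavior described above (a driver handing back what it could not send, and the tx queue staying blocked until an update call arrives) can be sketched as a small userland toy model. This is not illumos code: `toy_ring_t`, `toy_driver_tx()`, `toy_mac_tx()`, and `toy_tx_update()` are hypothetical names, packets are just counts, and the `drop_on_no_desc` argument only loosely stands in for a flag like MAC_DROP_ON_NO_DESC. What the real MAC layer reports to its caller in the drop case is not something this sketch claims to reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are real illumos interfaces. */
typedef struct toy_ring {
	size_t tr_free;    /* free tx descriptors in the "hardware" ring */
	bool   tr_blocked; /* upper layer currently treats the ring as full */
	size_t tr_queued;  /* packets held by the upper layer for later */
	size_t tr_dropped; /* packets freed instead of being queued */
	size_t tr_nodesc;  /* times the driver ran out of descriptors */
} toy_ring_t;

/* Driver tx: one descriptor per packet; returns how many were NOT sent. */
size_t
toy_driver_tx(toy_ring_t *r, size_t npkts)
{
	size_t sent = (npkts < r->tr_free) ? npkts : r->tr_free;

	assert(r != NULL);
	r->tr_free -= sent;
	if (sent < npkts)
		r->tr_nodesc++;
	return (npkts - sent);
}

/*
 * Upper-layer tx. Returns true when some packets could not be handed to
 * the driver, i.e. the caller should treat itself as flow controlled.
 */
bool
toy_mac_tx(toy_ring_t *r, size_t npkts, bool drop_on_no_desc)
{
	size_t unsent;

	if (r->tr_blocked) {
		/* While blocked, the driver is not called at all. */
		unsent = npkts;
	} else {
		unsent = toy_driver_tx(r, npkts);
		if (unsent > 0)
			r->tr_blocked = true;
	}

	if (unsent == 0)
		return (false);

	if (drop_on_no_desc)
		r->tr_dropped += unsent;  /* caller asked for drop, not queue */
	else
		r->tr_queued += unsent;   /* hold until the driver catches up */
	return (true);
}

/* Driver recycled descriptors and says so: unblock, then drain the queue. */
void
toy_tx_update(toy_ring_t *r, size_t recycled)
{
	size_t pending = r->tr_queued;

	r->tr_free += recycled;
	r->tr_blocked = false;
	r->tr_queued = 0;

	if (pending > 0) {
		size_t unsent = toy_driver_tx(r, pending);
		if (unsent > 0) {
			r->tr_queued = unsent;
			r->tr_blocked = true;
		}
	}
}
```

In this model, a ring with 2 free descriptors that is handed 3 packets ends up blocked with 1 packet queued; further sends are queued without touching the driver until `toy_tx_update()` is called.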
-
jbk
the context is a system where we're seeing a spurt of tcpListenDrop events, but by the time we get on the system, they're gone... trying to dig into potential causes, i noticed that tx_err_nodescs for a number of rings (i40e) are non-zero and appear to have incremented (unfortunately, the granularity of checking the kstat isn't sufficient to know if it happened at the same time, only that it might)
-
jbk
so was wondering if that might be a potential cause
-
jbk
esp since during the time period some connections (inbound) were successful while others failed w/o any clear pattern
-
rmustacc
So the theory would be that the application is trying to accept connections off the queue, but can't because some response that needs to tx out isn't happening?
-
rmustacc
And therefore subsequent inbound connect attempts are being dropped?
-
jbk
it's a possible theory... but wasn't sure if tx failing like that could actually induce that up stack or not
-
jbk
(trying to think of things we can instrument in case it happens again as well)
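
One low-tech option for the next occurrence is to sample both counters on a fixed interval and look for intervals in which both moved. A minimal sketch of the comparison step follows, assuming the samples have already been collected by some other means. The `sample_t` layout and `overlapping_intervals()` are hypothetical, not part of any existing tool. A shared interval only shows that the two increments could have coincided, so shorter sampling intervals make the result more meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one periodic reading of the two counters. */
typedef struct sample {
	int64_t  s_time;        /* sample time, e.g. seconds since start */
	uint64_t s_listen_drop; /* tcpListenDrop value at s_time */
	uint64_t s_nodescs;     /* tx_err_nodescs value (summed over rings) */
} sample_t;

/*
 * Find intervals (s[i-1], s[i]] in which BOTH counters increased.
 * Stores up to outmax indices i in out[] and returns the total number
 * of such intervals, which may exceed outmax.
 */
size_t
overlapping_intervals(const sample_t *s, size_t n, size_t *out,
    size_t outmax)
{
	size_t found = 0;

	assert(s != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		if (s[i].s_listen_drop > s[i - 1].s_listen_drop &&
		    s[i].s_nodescs > s[i - 1].s_nodescs) {
			if (found < outmax)
				out[found] = i;
			found++;
		}
	}
	return (found);
}
```

Given five samples where the listen-drop counter moves in intervals 2 and 3 and the descriptor counter moves in intervals 1 and 3, only interval 3 is reported.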
-
rmustacc
I think the main initial question is what's the application doing and is it even trying to accept when this happens.
-
jbk
it's smb, and it appears to.. on that side we see ksocket_accept() failing while tcpListenDrops are incrementing...
-
rmustacc
You're getting ECONNABORTED?
-
jbk
yeah (130)
-
sommerfeld
jbk: one thing I'd look at is the full descriptor recycling path for that driver. what thread of control handles that, and what does it do with mblk/dblk/etc., and what can it get entangled with in the process that might slow it down?