-
gwr_
code.illumos.org/c/illumos-gate/+/4795 17694 test runner silently ignores missing tests in runfile
-
fenix
→ CODE REVIEW 4795: 17694 test runner silently ignores missing tests in runfile (NEW) |
illumos.org/issues/17694
-
jbk
sommerfeld: in your investigations around tcp congestion -- did you notice if the amount of 'control packets' (not really sure what the right name is -- they're just IP+TCP headers with no data)
-
jbk
decreased
-
jbk
?
-
jbk
i've noticed in some packet captures over time that we seem to send a fair amount of such packets (but investigating other problems, never had a chance to dig deeper)
-
jbk
to the point I was contemplating for some of the NICs we use adding a separate cache of pre-DMA mapped buffers that are smaller (e.g. 512 bytes) to use when we're handed tiny packets (< 512 bytes)
-
jbk
(since the naive approach most NICs take can result in multiple GB of kernel memory being dedicated just for a NIC)
-
danmcd
jbk -> small caches for ACKs makes some modicum of sense at first glance, but if anyone knows the potential pitfalls it'll be Bill.
-
jbk
i know also in our case, things like SMB traffic can also result in small responses from the protocol as well..
-
jbk
since RAM is so expensive, having a driver allocate so much kernel memory that it could typically buffer multiple seconds of data (at line rate) even for 200Gb NICs gets harder to justify
-
jbk
(I'd love to have something like kmem_cache, but with a cap that could be used)
-
danmcd
You can probably, in your constructor/destructor for a kmem_cache, or as a wrapper to kmem*_alloc(), check if a request exceeds your cap, then either block-and-wait (no KM_NOSLEEP), spin-and-check (KM_NOSLEEP but not _LAZY or existence of KM_NORMALPRI), or return failure (KM_NOSLEEP_LAZY).
-
danmcd
Since you own the driver (or mac if you make it generic) you can restrict to KM_NOSLEEP_LAZY only. :)
-
richlowe
this
-
richlowe
... seems like it's going to make memory availability even harder to predict
-
richlowe
is there really no better source of backpressure than failing the allocator?
-
richlowe
as well, I feel like when OpenBSD introduced the ability to pf to prioritize ACK and other empty packets, they saw big perceived perf. improvements. Would the opposite happen if we ran out of pre-allocated space?
-
jbk
well that's why it'd be nice to have a cap
-
jbk
right now most NIC drivers tend to do things in a simple, but naive way, which usually results in excessive memory use
-
jbk
and in some cases, _extreme_ memory use (60+GB of kernel ram)
-
jbk
and if you do the math, the driver basically can (given the link speed) buffer multiple seconds worth of packets at line rate, which should rarely be necessary (if ever)
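
A rough way to check that math, as a sketch only: the 60 GB and 200 Gb/s figures come from the discussion above, while the ring count, ring size, and buffer size in the second helper's example are made-up illustrative values, not taken from any particular driver.

```python
def buffered_seconds(buffer_bytes: float, link_gbps: float) -> float:
    """Seconds of line-rate traffic that buffer_bytes of memory could hold."""
    bytes_per_second = link_gbps * 1e9 / 8  # link speed in bits/s -> bytes/s
    return buffer_bytes / bytes_per_second


def ring_buffer_bytes(rings: int, ring_size: int, buf_size: int) -> int:
    """Memory used if every descriptor in every ring gets its own buffer."""
    return rings * ring_size * buf_size


# Figures from the chat: 60 GB of kernel memory behind a 200 Gb/s NIC.
secs_60gb = buffered_seconds(60e9, 200)

# Hypothetical driver layout: 16 rings x 4096 descriptors x 9000-byte buffers.
example_bytes = ring_buffer_bytes(16, 4096, 9000)
example_secs = buffered_seconds(example_bytes, 200)
```

With those inputs, 60 GB at 200 Gb/s works out to about 2.4 seconds of traffic at line rate.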
-
Dixie_F
Wouldn't it be cheaper to actually just buffer a part of a second? Nothing urgent should need more than a second to complete and if you are going to allocate the memory, why not get the benefit for the fast moving stuff you care about (FWIW I am not a software guy but may have designed chips in your phone)
-
Dixie_F
As in caching blindly seems better than letting stuff that doesn't make it out of the buffer quickly eat memory
-
Dixie_F
Ah, or is the idea that you ack and thus if your buffer overflows, you have lost the handshake aspect...
-
jbk
though on the TX side, it's technically not a problem -- generally for larger packets we DMA bind the existing memory and just copy the smaller ones, but even if we ran out of smaller ones, we could still bind them (just at a cost of some extra overhead for that packet)
-
MelanieUrsidino
I'm being endofuncted
-
MelanieUrsidino
help
-
MelanieUrsidino
(this is a joke)
-
sommerfeld
jbk: I generally see those described as ack-only packets.
-
sommerfeld
one way to handle them on RX is to copy them into a freshly allocated small mblk (they're small, after all..), freeing the large mapped mblk back where the driver can re-post it for receive.
-
rzezeski
which is what every driver should already be doing via its copy threshold
-
jbk
i mean on tx
-
jbk
it's almost always copied into an MTU-sized buffer
-
jbk
vs. having a separate cache of smaller buffers that could leave the bigger ones for bigger packets
-
jbk
we usually allocate as many MTU-sized buffers as # tx rings x ring size just in case we have a packet split across a bunch of tiny mblk_ts
-
jbk
so we can copy all of those tiny mblk_ts into one larger buffer
-
jbk
since the larger mblk_ts will usually just be bound/mapped directly
-
jbk
(as long as the # of cookies doesn't exceed any NIC limits)
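
One possible reading of the TX scheme being floated here, as a sketch rather than any driver's actual logic: tiny packets go to a small pre-mapped buffer when one is free, otherwise the packet is bound directly if its cookie count fits, and only packets with too many cookies are coalesced into an MTU-sized buffer. The function name, the 512-byte cutoff (taken from the earlier message), and the cookie limit of 8 are all assumptions for illustration.

```python
def choose_tx_path(pkt_len: int, cookies: int,
                   small_free: int, large_free: int,
                   small_buf: int = 512, max_cookies: int = 8) -> str:
    """Pick how to hand one outgoing packet to the NIC (hypothetical policy).

    pkt_len     total packet length in bytes
    cookies     DMA cookies needed if the packet were bound directly
    small_free  free buffers left in the small pre-mapped cache
    large_free  free MTU-sized pre-mapped buffers
    """
    # Tiny packet and a small buffer is free: copy it, leaving the
    # MTU-sized buffers for packets that actually need them.
    if pkt_len < small_buf and small_free > 0:
        return "copy-small"
    # Otherwise bind the existing memory, as long as the NIC can take
    # that many cookies. This also covers running out of small buffers.
    if cookies <= max_cookies:
        return "bind"
    # Too fragmented to bind: coalesce into one MTU-sized buffer.
    if large_free > 0:
        return "copy-large"
    # Nothing available right now; the caller has to retry later.
    return "defer"
```

Under this policy, exhausting the small cache costs some extra per-packet binding overhead rather than a dropped packet, which matches the point made earlier about the TX side.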
-
gitomat
[illumos-gate] 17331 convert mdb(1) to mdoc -- Andy Fiddaman <illumos⊙fn>