-
tozhu
hello all, what does this tcp parameter mean? tcp_rst_sent_rate_enabled: when should it be set to 0? I'm running a PostgreSQL RDBMS; should it be set to 0, or should I use the default value of 1?
-
tozhu
it is in an isolated network
-
danmcd
It's a protection against packet amplification (it rate-limits RST packets). What problem are you actually trying to solve?
-
nomad
I *must* be doing something wrong. I'm trying to set a MTU of 9000 (for jumboframes) on an ixgbe device and I'm told it isn't valid. When I check, I see this:
-
nomad
ipadm show-ifprop -p mtu ixgbe4
-
nomad
IFNAME PROPERTY PROTO PERM CURRENT PERSISTENT DEFAULT POSSIBLE
-
nomad
ixgbe4 mtu ipv4 rw 1500 -- 1500 68-1500
-
nomad
ixgbe4 mtu ipv6 rw 1500 -- 1500 1280-1500
-
nomad
Is 1500 really the max MTU?
-
nomad
(looking at the POSSIBLE column.)
-
jbk
yes, but no
-
jbk
do
-
jbk
illumos separates out the data link layer bits from the IP bits
-
danmcd
nomad: dladm(8)
-
nomad
danke
-
jbk
so you need to see what the MTU is on the underlying link
-
» nomad goes to read more manpages
-
jbk
annoyingly though, to change it, you'll need to tear down your IP interface(s)
-
jbk
change it, then recreate them with ipadm / ifconfig
-
nomad
that's ... ungood.
-
danmcd
The good news is that when you change it with dladm(8) it persists.
-
nomad
LINK PROPERTY PERM VALUE DEFAULT POSSIBLE
-
nomad
ixgbe4 mtu rw 1500 1500 1500-15500
-
jbk
yeah, i've floated some sort of 'apply at next start' option to dladm as a possible way to deal with it, but haven't really done much more than what I just said
-
nomad
so, I have to ipadm delete-if then dladm set-linkprop then ipadm create-if to reset it?
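The sequence discussed above can be sketched as a dry run. This is a minimal sketch, not a confirmed procedure: `plan_mtu_change` is a hypothetical helper that only prints the commands, and it assumes that any addresses on the interface are removed by the delete and have to be re-added by hand afterwards (for example with `ipadm create-addr`).

```shell
#!/bin/sh
# Dry-run sketch: print the commands for changing a link MTU on illumos.
# plan_mtu_change is a hypothetical helper; it echoes commands and runs nothing.
plan_mtu_change() {
    link=$1
    mtu=$2
    # Check the datalink-layer range first (the POSSIBLE column).
    echo "dladm show-linkprop -p mtu $link"
    # The IP interface has to be torn down before the link MTU can change.
    echo "ipadm delete-if $link"
    echo "dladm set-linkprop -p mtu=$mtu $link"
    # Recreate the IP interface; addresses are assumed to need re-adding.
    echo "ipadm create-if $link"
}

plan_mtu_change ixgbe4 9000
```

Printing first makes it easy to review the order on a test host before running anything that drops the interface.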
-
nomad
well, I can test to see if setting jumbo frames impacts iperf3 test results but I'm going to be stuck if it actually turns out to potentially matter.
-
nomad
I guess I'll worry about that after the tests. Thankfully I have two test hosts I can do this on without impacting prod.
-
jbk
the reason for the limitation is basically once you start 'using' the link, (e.g. ipadm create-if, not the only way, but by far most common) pretty much every driver (for performance reasons) creates a pool of MTU-sized buffers for TX and RX (that are pretty much ready to go)
-
jbk
so changing the MTU would mean having to reallocate all of those, which would be rather complex to do if you're also actively using the buffers to pass traffic
-
jbk
(i'm simplifying a bit, but that's the basic gist)
-
jbk
hrm.. i think this has been asked before (and the answer is 'no'), but anyone put any thought into what NVMe over fabric would look like?
-
nomad
well, the good news is it doesn't seem to actually matter to the speed tests I'm doing. The bad news is, it's still slower than it should be.
pastebin.com/TJh7WgNz
-
tozhu
danmcd: Thank you very much. I was just reading through the parameters; my application is PostgreSQL, and I want the network to perform as well as possible for it
-
danmcd
RST won't help you.
-
danmcd
There are PG-savvy folks on the developer list who might be able to help.
-
tozhu
okay, thank you very much
-
danmcd
@nomad ==> I wonder if you're single-CPU-stream bound? Did you try running multiple iperf server processes on multiple ports, followed by clients connecting to those ports?
-
danmcd
iperf isn't MT, but your apps are.
-
danmcd
For a 10G, two should do nicely.
-
danmcd
IF they add up to 10, you're CPU (and yes, possibly driver and/or TCP stack) bound.
-
nomad
danmcd, -P 2 has much better totals.
-
danmcd
Okay.
-
nomad
I forgot about that because iperf3 on a different host sees the full 10G (or close enough).
-
nomad
[SUM] 0.00-30.00 sec 33.0 GBytes 9.44 Gbits/sec sender
-
nomad
[SUM] 0.00-30.00 sec 33.0 GBytes 9.44 Gbits/sec receiver
-
nomad
well... in one direction it does.
-
danmcd
Responder is still single-threaded IIRC.
-
nomad
the same hosts going in the other direction are still way low.
-
nomad
[SUM] 0.00-30.00 sec 17.6 GBytes 5.03 Gbits/sec sender
-
nomad
[SUM] 0.00-30.00 sec 17.6 GBytes 5.03 Gbits/sec receiver
-
danmcd
iperf3 is good for only one thing: Single-stream tests.
-
nomad
what would you use for testing?
-
danmcd
Two iperf -s procs with different ports, and then run two iperf clients, one each to each server.
-
danmcd
You can use pwait(1) as a starting line:
-
danmcd
sleep 3600 &
-
danmcd
( pwait `pgrep sleep` ; iperf -c <... one server port>) &
-
danmcd
( pwait `pgrep sleep` ; iperf -c <...other server port>) &
-
» nomad nods
-
danmcd
pkill sleep
-
danmcd
BOTH go off to the races.
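The starting-line trick above can be collected into one script. This is a rough sketch under several assumptions that are not in the log: it uses `iperf3` with `-p` and `-t 30`, the host name and port numbers are placeholders, and it expects one server per port to be running already on the far side (for example `iperf3 -s -p 5201` and `iperf3 -s -p 5202`). It tracks the gate by PID instead of `pgrep sleep`, so unrelated `sleep` processes do not hold the clients back. By default it only prints the client commands.

```shell
#!/bin/sh
# Sketch of the "starting line" idea: each client blocks in pwait(1)
# until a gate process exits, so all clients start at the same moment.
# Assumptions (not from the log): IPERF_HOST, IPERF_PORTS, and -t 30.
# DRY_RUN=1 (the default here) only prints the client commands.
IPERF_HOST=${IPERF_HOST:-server.example}
IPERF_PORTS=${IPERF_PORTS:-"5201 5202"}
DRY_RUN=${DRY_RUN:-1}

client_cmd() {
    # One iperf3 client per server port.
    echo "iperf3 -c $IPERF_HOST -p $1 -t 30"
}

start_clients() {
    if [ "$DRY_RUN" = 1 ]; then
        for port in $IPERF_PORTS; do client_cmd "$port"; done
        return 0
    fi
    sleep 3600 &                 # the gate process
    gate=$!
    for port in $IPERF_PORTS; do
        # pwait returns once the gate process has exited.
        ( pwait "$gate"; $(client_cmd "$port") ) &
    done
    sleep 1                      # give the subshells time to reach pwait
    kill "$gate"                 # open the gate
    wait                         # collect all clients
}

start_clients
```

Running it with `DRY_RUN=0` on an illumos host would actually launch the clients; adding up the per-client totals then gives the combined figure.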
-
danmcd
Someone should write an iperf server variant that spawns threads upon accept() per connection.
-
nomad
I'll poke at that shortly.
-
nomad
I'm getting a combined total of 8.94Gb/s in one direction and 9.57Gb/s in the other.
-
nomad
so now to tear the interfaces down and reset them to the 1500MTU again.
-
nomad
so, yeah, MTU not making a difference in iperf3 timing. As would generally be expected.
-
nomad
which leaves me confused why it made such a huge difference on the FBSD tests I did over the weekend.
-
gitomat
[illumos-gate] 16603 acl_totext(3SEC) can truncate users and groups -- Gordon Ross <gwr⊙rc>
-
gitomat
[illumos-gate] 16623 Want tests for libsec acl text conversions -- Gordon Ross <gwr⊙rc>
-
gitomat
[illumos-gate] 16591 nvme_field_validate swallows more specific error messages -- Andy Fiddaman <illumos⊙fn>
-
gitomat
[illumos-gate] 16592 Cannot update NVMe firmware on Micron 7300 -- Andy Fiddaman <illumos⊙fn>
-
gitomat
[illumos-gate] 16593 nvme panic when committing partially loaded firmware -- Andy Fiddaman <illumos⊙fn>
-
gitomat
[illumos-gate] 16596 nvmeadm: some firmware activation controller errors are not -- Andy Fiddaman <illumos⊙fn>