-
antranigv
-
ajk203
Hi, all. Does anyone know where in the source I can look for the NFS client? I'm specifically looking for where it sets up its TCP socket. I'm looking for how the client sets its TCP keepalives. Can anyone point me in the right direction? Many thanks.
-
gitomat
[illumos-gate] 16035 ::msgbuf help missing whitespace -- Jason King <jason.brian.king⊙gc>
-
rmustacc
A bunch of nfs is in uts/common/fs/nfs and then cmd/fs.d/nfs which has a bit for the libraries, and related. I've not looked at that off hand so I can't speak to where the bit you're looking for is.
-
danmcd
For NFS service it's rather complex thanks to very old design decisions. NFS server starts in kernel using <ick> TLI/XTI endpoints. $UTS/common/fs/nfs/nfs_server.c has this:
-
danmcd
/* Create a transport handle. */
-
danmcd
error = svc_tli_kcreate(fp, readsize, buf, &addrmask, &xprt,
-
danmcd
sctp, NULL, NFS_SVCPOOL_ID, TRUE);
-
danmcd
which deep down should open a TLI endpoint and bind port 2049 to it.
-
sommerfeld
danmcd: question was about the client side. ajk203: look for the "clnt" equivalents of the "svc" things Dan mentioned. start looking around usr/src/uts/common/rpc/clnt_*.c
-
sommerfeld
connmgr_connect() in clnt_cots.c looks relevant..
-
jbk
so who wants to update the nfs code to use ksocket? :)
-
jbk
(as I take a step or two back)
-
sommerfeld
"not poo" ?
-
sommerfeld
:-)
-
sommerfeld
(unfortunate subject line truncation, no doubt)
-
sommerfeld
jbk: my understanding is that the issue for #16163 isn't the in-flight I/O's but rather a large worklist of block pointers read from metadata blocks.
-
sommerfeld
(the sorted scrub idea is that you scan metadata blocks and accumulate a sorted list of block pointers, and then process them in LBA-ish order to sequentialize scrub I/O)
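
A minimal sketch of that sort-then-issue idea, assuming a simplified record type. `sketch_sio_t` and `sketch_sort_for_issue` are hypothetical names for illustration only, not the real `scan_io_t` or any function in dsl_scan.c:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a recorded block pointer; NOT the real scan_io_t. */
typedef struct sketch_sio {
	uint64_t offset;	/* on-disk offset, used as the LBA-ish sort key */
	uint64_t size;		/* size of the block to read */
} sketch_sio_t;

/* Order two records by ascending on-disk offset. */
static int
sketch_sio_compare(const void *a, const void *b)
{
	const sketch_sio_t *l = a;
	const sketch_sio_t *r = b;

	if (l->offset < r->offset)
		return (-1);
	if (l->offset > r->offset)
		return (1);
	return (0);
}

/*
 * Phase 1 (not shown) walks metadata and appends records in discovery order.
 * Phase 2 sorts them so the reads can then be issued in roughly
 * sequential disk order.
 */
void
sketch_sort_for_issue(sketch_sio_t *list, size_t count)
{
	qsort(list, count, sizeof (sketch_sio_t), sketch_sio_compare);
}
```

The accumulated list is what costs memory here: every block pointer found in metadata has to be held until it is sorted and issued.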
-
sommerfeld
but I've got to run now..
-
jbk
oh heh.. when I copied it, it left off the l
-
jbk
as i see it highlighted in the window
-
jbk
do you know from the dumps if the memory is that and not zio_ts?
-
jbk
we've seen the scrub prefetcher basically go full throttle as it goes through the metadata
-
jbk
so it can generate these absolutely massive bursts of largely prefetch I/Os
-
jbk
it didn't help that the specific model drives from the vendor (who shall remain nameless) seemed to come down with random bouts of what I dubbed 'tortoise nervosa'
-
jbk
but with no actual errors
-
jbk
in any log pages or anything
-
jbk
just get slow for a bit, then fine
-
jbk
(they also had the fun side effect that right as it would get near finishing resilvering the pool, another disk would fail, triggering more resilvering)
-
jbk
that would exaggerate the bias in how the zios were distributed
-
jbk
there is an openzfs change that will switch the zio scheduler into LIFO mode if the queue depth gets too large (or too old), but that seemed more extensive
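
A rough sketch of the depth-triggered FIFO-to-LIFO switch as described above. This is not the actual OpenZFS change; `sketch_queue_t`, `SKETCH_LIFO_DEPTH` and the fixed-size array are assumptions for illustration, and the "too old" trigger is left out:

```c
#include <stddef.h>

#define	SKETCH_CAP		64	/* hypothetical fixed capacity */
#define	SKETCH_LIFO_DEPTH	4	/* hypothetical depth threshold */

/* Hypothetical pending-I/O queue holding request ids in arrival order. */
typedef struct sketch_queue {
	int	items[SKETCH_CAP];
	size_t	head;	/* index of the oldest pending item */
	size_t	tail;	/* one past the newest pending item */
} sketch_queue_t;

/* Append a request; returns -1 once the array has been used up. */
int
sketch_push(sketch_queue_t *q, int id)
{
	if (q->tail == SKETCH_CAP)
		return (-1);
	q->items[q->tail++] = id;
	return (0);
}

/*
 * Take the next request to service. At or below the threshold the oldest
 * item is served (FIFO); above it the newest item is served (LIFO).
 */
int
sketch_pop(sketch_queue_t *q, int *out)
{
	size_t depth = q->tail - q->head;

	if (depth == 0)
		return (-1);
	if (depth > SKETCH_LIFO_DEPTH)
		*out = q->items[--q->tail];
	else
		*out = q->items[q->head++];
	return (0);
}
```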
-
ajk203
@danmcd. thanks. @sommerfeld thanks I'll take a look at the clnt code. many thanks.
-
sommerfeld
jbk: sorry, was off running errands.
-
sommerfeld
see the "Grand Theory Statement" in dsl_scan.c; BP's found in metadata get recorded as a scan_io_t which is "the minimum information needed to reconstruct a zio for sequential scanning."
-
jbk
yeah, i was just wondering whether, in those dumps, that's where the actual space being used was -- in the issue we had, we'd routinely see > 100gb of zio_ts queued in the pool, and pretty much all of it was prefetching
-
jbk
we only saw it with one customer (but it was a $@#$@#$ to root cause for various reasons) and it seems like something that needs a large pool and lots of ram
-
jbk
and probably helped by disk performance dropping for unexplained reasons for stretches at a time during the resilver
-
jbk
i've mentioned it before, but we've seen the same conceptual problem with other bits in zfs (basically generating load w/o any backpressure or throttle and hoping the system can handle it)
-
jbk
e.g. zfs diff
-
sommerfeld
jbk: I didn't look into where the memory was going in those dumps.
-
sommerfeld
taking a look now
-
sommerfeld
~900MB in the sio_cache_* caches which is where the sorted block pointers go (this is a 24G machine).