-
gitomat
[illumos-gate] backout: 15405 Port NFSv41 base (needs work; see 15869) -- Patrick Mooney <pmooney⊙pc>
-
tsoome
really?
-
sjorge
my non plex vm also had a weird issue when I woke up (like 10 min ago), not sure it's related as nothing in any logs but it's again something that uses sqlite dbs in its /var/lib (which is on NFS for me)
-
sjorge
Will poke it more later tonight when I have time
-
sjorge
Internet seems to say, just don't use sqlite on NFS. Fair I guess, but it's been working fine for years for me. (Note it's single host -> ds, so there are no multiple hosts accessing the same files)
-
tsoome
sjorge ok, sorry to hear
-
sjorge
Current plan is: use plex with vers=4.0 for a bit more, to make sure that's a valid workaround
-
jclulow
sjorge: It's trendy to take a dim view of NFS, but if it was previously working and now it's not working it's a regression which is why we backed it out -- to give us time to debug it and fix it so we can integrate it again with the fix
-
sjorge
That's fair
-
jclulow
(that is: I'm sure people say "don't use SQLite on NFS", but I don't think there's any reason it shouldn't work, even if it doesn't perform super well, provided you're only using it from one system at a time)
-
sjorge
I'll also try and replicate it with vers=4.1 and get a pcap dump
-
tsoome
sjorge we definitely need samples of network stream from the time of problem.
-
jclulow
sjorge: That would be very helpful thank you
-
sjorge
Just want to make sure vers=4.0 is fine and vers=4.1 isn't running the new bits, that would hopefully indicate there is a new feature that is tripping up the client
-
sjorge
Will report back later this weekend
-
jclulow
tsoome: For what it's worth we were discussing it with you and Vitality on the bug tracker
-
jclulow
We'll make sure it gets integrated once the issues are worked out
-
jclulow
File system issues are always critical issues, and we need to take them seriously
-
jclulow
*Vitaliy
-
jclulow
Phone autocorrect is the worst
-
sjorge
qq, does sharectl's nfs property of server_versmax take minors? e.g. can I set that to server_versmax=4.0 or server_versmax=4.1 ?
-
jclulow
I suspect it does not
-
tsoome
no, versmax is only for major version
-
sjorge
OK, I know the min/max_protocol for smb takes minors.
-
tsoome
because till 4.1 there were no minors
-
jclulow
Right
-
sjorge
Would have been a nice easy server side workaround if it's indeed all good on 4.0
-
sjorge
OK, off to do saturday offline obligations
-
jclulow
Do weekend things! Enjoy
-
tsoome
jclulow in my book, discussion means also responses are involved. That is, you hear out what other people have to say.
-
tsoome
it does not mean I object to the backout as such, but I do object to how this was done.
-
jclulow
tsoome: I understand, and I sympathise. We had to make a call to unbreak master while people were around to make it happen, and to make time to debug without demanding people do that on their weekend
-
jclulow
It's tough to do all of that asynchronously with people in different timezones
-
jclulow
I promise that once this gets sorted out it can go back in
-
jclulow
We just can't have known regressions lingering for days, it makes it hard for people to ship master to users
-
tsoome
this is another problem: we cannot expect that a single developer, company or single developer can get everything "properly" tested. hell, we still find bugs more than *decades* old. So, if you want a stable version, then a stable branch must be built, but we cannot block development just because we fail to handle versions.
-
tsoome
single person*
-
jclulow
tsoome: I agree that no testing is ever going to cover everything. In this case, we made a judgement call to accept the test plan you presented and integrate the patch. But then a regression was found, and when we find those we have to back things out
-
jclulow
It's all a balance
-
jclulow
If a regression was found months from now, then it would be different -- but it only just landed
-
jclulow
You talk about how hard it is to find resources, but I have no idea where the resources would come from to run a stable branch. Who's going to be responsible for taking things from unstable to stable? That's also a full time job.
-
tsoome
well, sure, we could have been arguing if backout is necessary given there is simple mount -o vers= option, but now there is no point.
-
jclulow
As I said, we'll get the packet traces and we'll get it debugged, and in the meantime we're not asking anyone to work around a known defect with mount options on their clients.
-
jclulow
There's nothing to stop sjorge from running a system with the patch to get those in the next couple of days and then we'll have it squared away.
-
tsoome
it is full time work, and there are two options for this - either we do this at gate level, or we assume distributions do this, and currently we *assume* distributions do this *and* we are requesting gate to receive very stable patches, but that means we have practically no updates. just compare the number of commits done on FreeBSD and on illumos - the numbers are there on github.
-
jclulow
We definitely don't assume distributions do this. OI ships master to people straight up. Joyent and Oxide both stay close to master, merging often, and with often a fairly short runway between those merges and users running the software.
-
jclulow
At any rate it's after 0100 here. I'm sorry we weren't able to have a full interactive discussion before we backed out the patch. I hope we can all come together to get it integrated with a fix this week.
-
jclulow
(I guess s/Joyent/MNX/ now -- old habits!)
-
sjorge
It's a damned if you do, damned if you don't situation. I run OmniOS Bloody and upgrade every 2 weeks to 1 month. Because I like shiny things, and generally it's pretty darn stable for me with the occasional minor issue. e.g. the sudo RequireTTY default change that bit me. But because I run bloody Andy was quickly able to spot the difference and find the issue. If bloody was very unstable, I would probably just run OmniOS Stable. Meaning we probably would not have spotted the sudo issue anytime soon.
-
sjorge
Pending more data, if vers=4.0 is a decent workaround for my workload I don't mind running like that for a bit. So a backout would not be needed, on the other hand, I'm not a company that depends on it for business reasons. Dan having just cut a SmartOS release with those bits, even if 1/100 of the customer workloads is impacted, it will be a big deal for them.
-
sjorge
I'll probably file a feature request later for having server_versmin/server_versmax support minors too
-
sjorge
That would also have allowed for a smoother integration, the default could have been server_versmax=4.0 until the 4.1/4.2 stuff had baked a bit more. Hindsight 20/20 and all that though.
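
For readers following along, here is a minimal sketch of what minor-aware version bounds could look like. It is purely illustrative: `parse_vers` and `allowed` are hypothetical helpers, not part of sharectl or the NFS server, and treating a bare `4` as `4.0` is an assumption a real design would have to decide on.

```python
def parse_vers(text: str) -> tuple[int, int]:
    """Split a version such as "4.1" into (major, minor).

    Assumption: a bare major such as "4" is read as minor 0.
    """
    major, _, minor = text.partition(".")
    return int(major), int(minor or 0)


def allowed(requested: str, versmin: str, versmax: str) -> bool:
    """Return True if the requested version lies within [versmin, versmax]."""
    return parse_vers(versmin) <= parse_vers(requested) <= parse_vers(versmax)


if __name__ == "__main__":
    # With a hypothetical server_versmax=4.0, a 4.1 client request is refused.
    print(allowed("4.0", "2", "4.0"))
    print(allowed("4.1", "2", "4.0"))
```

Tuple comparison keeps the check short: the major is compared first and the minor only breaks ties.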
-
andyf
The backout won't be in omnios bloody until this time next week, most likely, so you have time to test the mount option workaround and grab packet traces or whatever will help.
-
sjorge
andyf was there a way to install the debug variant into a new BE ?
-
sjorge
While leaving the current be at the normal variant?
-
hadfl
iirc something like `pkg change-variant --be-name <be name> debug.illumos=true`
-
sjorge
that sounds vaguely familiar
-
sjorge
thanks, will give that a go if needed
-
sjorge
3h of playback and a full library scan on vers=4.0 without issues, I'll try to replicate it with vers=4.1 now and get a capture
-
tsoome
cool
-
sjorge
Good news is
-
sjorge
I can trigger it with a simple library scan
-
sjorge
with vers=4.0 it completes with maybe one or two warnings that it took 300ms to respond to the db
-
sjorge
But with vers=4.1 I get
-
sjorge
-
sjorge
And I only switched the plex dataset to vers=4.1 I kept the ones with the scanned media on vers=4.0
-
sjorge
Let me erm hard reset the vm and try to capture the packets
-
sjorge
tsoome: so you want an a/b capture? one on vers=4.0 where there are no issues and one on vers=4.1 ?
-
Smithx10
rmustacc: the nvme 2.0 rfc will that include nvme over tcp client ?
-
tsoome
if you can make both, that would be great
-
sjorge
of course I can't replicate it with a tcpdump running, where would be the fun in that
-
sjorge
It still gets the super long query times warnings though but not the full lockup
-
sjorge
Stop the dump, restart the scan and instantly trigger it
-
sjorge
Got it on the 3rd try!
-
rmustacc
Smithx10: Not at all.
-
rmustacc
I'm working on what I need. The library is providing the ability to admit if it's that.
-
rmustacc
But I'm not making a new target.
-
rmustacc
Always happy to collaborate with folks who do.
-
sjorge
tsoome got both, uploading now... very interesting: on vers=4.0 it never shows queries higher than 500ms and it finishes the scan in like ~1min, dump is smallish
-
sjorge
With vers=4.1 it never finishes and the dump is huge
-
sjorge
For those reading along, tsoome now has a 4.0 and 4.1 dump where I trigger a library scan, where the latter just hangs plex and the former completes in ~1 minute or so
-
jclulow
Neat
-
jbk
from what (little) I read, it does look like NVMe fabric and NVMe over TCP seem to be going in the direction that feels like 'scsi, without all the legacy stuff'
-
jbk
for lack of a better description
-
jbk
granted, i suppose what's in a standard vs. what actually exists is probably also a different matter
-
jbk
e.g. you can use IKEv2 over SCSI to do key exchange to encrypt SCSI frames over whatever transport you're using, but I'm not sure i've found anything that actually implements it
-
sjorge
so in theory that could be a iSCSI replacement?
-
sjorge
i have a love/hate (mostly the latter) relationship with iSCSI
-
jbk
NVMe over TCP does (at least from a very superficial point of view) look like the NVMe equivalent of iSCSI
-
jbk
and yes, i hate iSCSI
-
vetal_
sjorge: Hi, Where did you put pcaps ?
-
sjorge
tsoome has them, i removed them once he downloaded them
-
vetal_
sjorge: I suppose tsoome is not online now. How big were those files?
-
sjorge
90M for the vers=4.0 where I triggered a library scan, AKA the one where everything works and responds fine
-
sjorge
450M for the vers=4.1 one
-
sjorge
That one did not finish and resulted in plex hanging and requiring a pkill -z <zonename> -9 bhyve
-
sjorge
Nothing in dmesg, even with the echo thing from the ticket
-
sjorge
It's getting late, but tsoome should still be online. I think he's -1 or +1 from me
-
sjorge
So 22:00-00:00 range
-
tsoome
00:07 here:P
-
vetal_
sjorge: Did you try "echo w > /proc/sysrq-trigger" on Ubuntu VM?
-
sjorge
Yes, that's what I meant with the echo from the ticket
-
vetal_
sjorge: Good. Thanks!
-
sjorge
I did not wait super long after it got stuck though, I know the first time round plex had the Z status once I got to it, like an hour after it hung. I could probably snapshot, force the issue and wait for a bit tomorrow to see if one pops up way later
-
sjorge
The good news is that at least vers=4.0 is a workaround for now.
-
vetal_
sjorge: Could you check vers=4.0 mount, but before set this on the server side: sharectl set -p server_delegation=off nfs
-
vetal_
sjorge: The assumption is that missing delegation can be the reason why you have had problems.
-
sjorge
So you want me to try 4.0 with delegation off ?
-
sjorge
Or 4.1 with delegation off ?
-
vetal_
sjorge: 4.0 with delegation off
-
sjorge
And you expect it to also fail ?
-
vetal_
sjorge: yes.
-
vetal_
sjorge: the nfsv4.1 bits do not support delegation for now; it is planned as follow-up work.
-
sjorge
should I restart nfs/server after setting server_delegation ?
-
vetal_
sjorge: I guess it should be restarted by the sharectl command.
-
sjorge
online 22:05:47 svc:/network/nfs/server:default
-
sjorge
svcs seems to agree
-
sjorge
You might be on to something
-
sjorge
ERROR - [MusicAnalysis] Waited over 10 seconds for a busy database; giving up.
-
sjorge
Same error and plex hangs
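
For context, the "busy database" message is consistent with how SQLite reports lock contention: a writer that cannot get the database lock retries until its busy timeout expires, then gives up. The sketch below reproduces that on a local temporary file, not on NFS, and it is not Plex's code; `demo_busy_timeout` is a hypothetical helper and the 0.2 second timeout is arbitrary.

```python
import os
import sqlite3
import tempfile


def demo_busy_timeout(timeout_s: float = 0.2) -> str:
    """Return the error a second connection sees while the first holds the lock."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connection in autocommit mode, so the
    # BEGIN/ROLLBACK statements below are the only transaction control.
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (x INTEGER)")
    holder.execute("BEGIN EXCLUSIVE")  # take the database lock and keep it
    holder.execute("INSERT INTO t VALUES (1)")

    # The second connection waits up to timeout_s for the lock, then fails.
    waiter = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        waiter.execute("INSERT INTO t VALUES (2)")
        return "no error"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        waiter.close()
        holder.execute("ROLLBACK")
        holder.close()


if __name__ == "__main__":
    print(demo_busy_timeout())
```

On NFS the same file locks are serviced by the NFS client and server, which is presumably why slow or stuck lock handling shows up as a busy database in the application.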
-
sjorge
Let me set that back to on and try again
-
sjorge
Would lack of delegation support... cause more lockX calls?
-
sjorge
tsoome saw none in the old 4.0 pcap but lots in the 4.1 pcap
-
sjorge
Hmm restoring that to on seems to leave it in a broken state
-
vetal_
sjorge: yes. It can. Delegation means that client "owns" that file and if someone tries to open, owner will be notified (delegation recall).
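
A toy model of the idea above, for readers following along. It assumes, as described in the chat, that a client holding a delegation can satisfy lock requests locally, while a client without one has to send each lock to the server. All class and function names are hypothetical, and this is nowhere near the real NFSv4 state machine.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    delegation_enabled: bool
    lock_calls: int = 0                              # LOCK requests seen by the server
    delegations: dict = field(default_factory=dict)  # path -> client holding it
    recalls: list = field(default_factory=list)      # (path, previous holder)

    def open(self, client: str, path: str) -> None:
        holder = self.delegations.get(path)
        if holder is not None and holder != client:
            # Someone else opens the file: recall the delegation from its owner.
            self.recalls.append((path, holder))
            del self.delegations[path]
        elif self.delegation_enabled and holder is None:
            self.delegations[path] = client

    def lock(self, client: str, path: str) -> None:
        self.lock_calls += 1


@dataclass
class ToyClient:
    name: str
    server: ToyServer

    def open(self, path: str) -> None:
        self.server.open(self.name, path)

    def lock(self, path: str) -> str:
        if self.server.delegations.get(path) == self.name:
            return "local"  # delegation held: no round trip to the server
        self.server.lock(self.name, path)
        return "LOCK rpc"


def count_lock_rpcs(delegation_enabled: bool, n_locks: int = 100) -> int:
    """Count server-side LOCK calls for one client locking one file n_locks times."""
    server = ToyServer(delegation_enabled)
    client = ToyClient("vm", server)
    client.open("/var/lib/app.db")
    for _ in range(n_locks):
        client.lock("/var/lib/app.db")
    return server.lock_calls


if __name__ == "__main__":
    print(count_lock_rpcs(True), count_lock_rpcs(False))
```

In this model, turning delegation off moves every lock operation onto the wire, which matches the observation of no LOCK calls in the 4.0 capture and many in the 4.1 one.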
-
sjorge
Interesting, so I set sharectl set -p server_delegation=on nfs again, restarted the vm
-
sjorge
still broken
-
sjorge
even with vers=4.0
-
sjorge
but tcpdump shows LOCK calls too so that's new
-
sjorge
brb, gonna reboot the entire box so my bouncer will be gone for a bit
-
vetal_
sjorge: interesting. I would expect that the GRACE PERIOD can cause some failures, but LOCK calls shouldn't be seen.
-
sjorge
Not sure those were caused by plex
-
sjorge
as I was poking around myself a bit
-
sjorge
But server_delegation=off and vers=4.0 on the client has plex hung too
-
sjorge
Even after restoring server_delegation=on
-
sjorge
VMs still booting so hopefully a full physical host reboot solves it
-
sjorge
All good again after a host reboot
-
sjorge
So your theory on delegations is probably correct
-
vetal_
sjorge: That's good. Thanks! I am going offline for now.
-
sjorge
I should head to bed
-
sjorge
thanks for looking into it vetal_ tsoome
-
vetal_
sjorge: And you!