-
jbk
hrm.. is there a way to set the signal sent to a parent when a child process dies? (e.g. equiv of freebsd procctl(P_PID, -, PROC_PDEATHSIG_CTL, ..) or linux prctl(PR_SET_PDEATHSIG, xxx)) ?
-
jbk
maybe something with contracts?
-
jbk
err i have that backwards.. want child to get signal when parent dies
-
denk
as a variant: register via atexit() a function to send a signal to the children
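
A minimal sketch of that atexit() idea, in C. The names `spawn_tracked`, `kill_children` and `MAX_CHILDREN` are made up for illustration, and SIGTERM is just an assumed choice of signal. One caveat: handlers registered with atexit() only run on normal termination (exit() or returning from main), so this does nothing if the parent is killed by a signal or calls _exit().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_CHILDREN 16          /* hypothetical fixed limit for the sketch */

static pid_t children[MAX_CHILDREN];
static int nchildren;

/* atexit() handler: signal every child the parent has recorded. */
static void
kill_children(void)
{
    for (int i = 0; i < nchildren; i++)
        (void) kill(children[i], SIGTERM);
}

/*
 * fork() wrapper that records the child's pid in the parent.
 * Returns the child's pid in the parent, 0 in the child, -1 on error.
 */
pid_t
spawn_tracked(void)
{
    static int registered;

    if (nchildren >= MAX_CHILDREN)
        return (-1);
    if (!registered) {
        if (atexit(kill_children) != 0)
            return (-1);
        registered = 1;
    }

    pid_t pid = fork();
    if (pid > 0)
        children[nchildren++] = pid;
    else if (pid == 0)
        nchildren = 0;   /* child inherits the handler; empty its list */
    return (pid);
}
```

The parent calls spawn_tracked() instead of fork(); when it later calls exit(), each recorded child gets SIGTERM.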
-
sommerfeld
jbk: contract should work though I believe you'd get the notification via a fd rather than a signal. A child could also potentially poll with getppid()
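
For the getppid() polling option, a rough C sketch follows. The function name and parameters are hypothetical. It assumes the caller captured the parent's pid before fork() (e.g. via getpid() in the parent), since a getppid() call made in the child after the parent has already died would return the new parent's pid and never appear to change.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Poll getppid() until it no longer matches original_parent.
 * Returns 1 once the parent has changed (i.e. the original parent is gone),
 * or 0 if max_polls checks passed without a change.
 */
int
wait_for_parent_exit(pid_t original_parent, long interval_ms,
    unsigned int max_polls)
{
    struct timespec ts;

    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (interval_ms % 1000) * 1000000L;

    for (unsigned int i = 0; i < max_polls; i++) {
        if (getppid() != original_parent)
            return (1);
        (void) nanosleep(&ts, NULL);
    }
    return (getppid() != original_parent);
}
```

A child would typically run this in a helper thread or fold the getppid() check into its existing event loop, then exit (or raise a signal on itself) when it returns 1.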
-
igork
hi all, i have no perms for review. i'm interested in why my pers was removed
-
igork
*permissions
-
sommerfeld
igork: you asked that question about two days ago - did you see jbk's response to you? (have you tried signing in again?)
-
igork
sommerfeld: yes, i saw, and i tried his proposed solution - it is not working
-
igork
i saw some of my +1s on richlowe's changes with sparc updates get removed by jclulow, and after that i have no perms for review on tsoome's changes either
-
igork
jbk: thanks for your proposal, but it is not working for me: i still have no review permissions, and i have no idea why
-
andyf
daleg - yes illumos 16373 (fenix).
-
fenix
BUG 16373: ena network interface occasionally hangs (In Progress)
-
andyf
That is the same issue - you just ran `dladm` or something else that tried to send an admin command after it hung.
-
andyf
I have implemented reset for the driver, which lets it get over that bump but don't know the root cause yet.
-
daleg
andyf: thanks. alright. I have some crash dumps, but have not hit upon a way to reliably reproduce this. It seems to happen in spates - a few panics and reboots due to this, followed by a long period of it not occurring.
-
andyf
Here's the change that adds reset, and fixes some other bugs -
code.illumos.org/c/illumos-gate/+/3367?usp=search
-
fenix
→ CODE REVIEW 3367: 16391 ena driver could support reset 16392 ena driver async event queue stalls (NEW) |
illumos.org/issues/16391
-
andyf
I could spin a hotfix if you're on omnios. I'm working on finding the root cause but no breakthrough yet.
-
daleg
thanks. I'll try those in a local branch and try some more to get this to happen. It's on bsros, so I'll just do my own local application of the patch and try it.
-
andyf
I have never managed to reproduce it myself in AWS, but some CI runners are hitting it fairly regularly. That patch is getting them to run to completion at least while I work on it.
-
andyf
daleg - what's the instance type? In case it's relevant - I'm mostly testing on m7a.medium
-
daleg
yeah I've been trying to get our nitro story in shape and hit upon this using an m5.large. All seems quite fine otherwise.
-
danmcd
If anyone here has a modern (Skylake-era or later) Atom processor running any illumos, please contact me.
-
jbk
i'm thinking this is the software trying to use something it shouldn't, but i'm getting a warning on R_AMD64_COPY: file <shlib> <sym>: relocation bound to a symbol with STV_PROTECTED visibility
-
jbk
if i'm wrong, is there something to fix?
-
jbk
(specifically, this is trying to link against pkgsrc libcrypto)
-
jperkin
datapoint: I only see one such error in my bulk builds, and it's with ncmpcpp trying to link against libicuuc.so
-
jbk
annoyingly, i think it's the openssl macros itself that are responsible (still digging)
-
jbk
ex: ASN1_TIME_it
-
gitomat
[illumos-gate] 16311 ps: Inconsistant formatting of options in usage output -- rigzba21 <jonathan.velando01⊙gc>
-
sommerfeld
is that macro part of the published openssl API/ABI? (they've been tightening up on that lately..)
-
jbk
that i'm not sure
-
jbk
though doing some more digging, i guess it's maybe non-fatal (makefiles don't make that clear)
-
jbk
but at least now i have a hopefully working tpm2 binary for testing
-
KungFuJesus
so what on earth is nfsfind?
-
KungFuJesus
and why is it removed nfs.*?
-
KungFuJesus
removing*
-
tsoome
-
KungFuJesus
hmmm, is that necessary? Is there a way to kill it. This particular file system is pretty dense
-
KungFuJesus
lots of files
-
tsoome
those .nfs files are leftovers, so it is good to have cleanup. it is run via root cron, if you do not have nfs service, you can disable it.
-
sommerfeld
KungFuJesus: unix semantics require that you be able to continue doing I/O to an unlinked file that you have open. NFS doesn't allow for that so NFS clients don't unlink open files -- they rename them, instead. And unlink them later, but later may not happen if the client crashes.
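
The semantics described above can be seen with a small C sketch (the function name and message are made up for illustration): it creates a file, unlinks it while the descriptor is still open, and then keeps writing and reading through that descriptor. On a local filesystem the name is gone immediately; on an NFS mount the client would instead rename the file to a .nfs* name, as described.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create path, unlink it while it is still open, then write and read
 * back through the open descriptor.
 * Returns 0 if the I/O after unlink() worked, -1 otherwise.
 */
int
io_after_unlink(const char *path)
{
    static const char msg[] = "still here";
    char buf[sizeof (msg)];

    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return (-1);

    /* Remove the name; the open descriptor keeps the file alive. */
    if (unlink(path) != 0) {
        (void) close(fd);
        return (-1);
    }

    int ok = write(fd, msg, sizeof (msg)) == (ssize_t)sizeof (msg) &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, buf, sizeof (buf)) == (ssize_t)sizeof (buf) &&
        memcmp(buf, msg, sizeof (msg)) == 0;

    (void) close(fd);    /* last reference gone: storage is freed here */
    return (ok ? 0 : -1);
}
```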