-
luser
ant-x: I use openbox.
-
tm512
this process of bisecting drm-kmod to pinpoint the commit that fixes the amdgpu GPU hangs is making me feel like I'm going crazy. I seemingly wasted a lot of time
-
tm512
started by going up to current so that I could test the 6.10 drivers at the HEAD of drm-kmod, no GPU hangs, then compiled and installed the 6.9_1 tag to check that it's broken as on 15-STABLE, it was, used those as endpoints for git bisect
-
tm512
but every commit I was testing wasn't hanging, even as the number of commits related to amdgpu dwindled. as a sanity test I went back to the 6.9 drivers, and for the life of me I can't get it to hang now
-
tm512
going back to my 15-STABLE BE, which currently has 6.9_1, I was able to get it to hang within a few minutes
-
tm512
I even completely purged out my git clone of all artifacts left over after make clean, double-checked that I had 6.9_1 checked out, and recompiled. I'm pretty baffled. all of this is on the same snapshot of 16, though I ran pkg upgrade -r FreeBSD-ports a couple of times, but I don't think anything got upgraded that would've fixed this
-
tm512
testing my 15 BE was in part to see if the fact that I'm now using iwlwifi rather than iwm might've had any effect, like if it's rooted in LinuxKPI rather than the driver and iwlwifi changes enough to affect the hangs
-
tm512
the 15 BE wasn't connected to the internet though since I don't have the firmware installed there, maybe that tainted the test, if the driver didn't fully load
-
stl
were there power cycles involved?
-
stl
(e.g. if the 6.10 driver does _something_ that doesn't get undone by 6.9)
-
tm512
I'm rebooting every time, not reloading the kmod
-
tm512
I can't help but suspect iwlwifi because it's the only major change to this setup between me starting this bisect process (right after confirming 6.9 hangs on 16), and now where I'm struggling to get 6.9 to hang. I've got like half a dozen youtube video tabs open in firefox, and I keep on opening up videos in mpv, which should reliably hang it
-
tm512
on 15 with 6.9, it hung even without firefox open, like the 5th or 6th time I opened up mpv
-
tm512
even though iwx was causing me issues, I guess I should configure my system to use that and reboot, see if that changes the situation even though I randomly lose connectivity with that driver
-
tm512
hmm, a potentially relevant package upgrade was mpv 0.41.0, which I installed on the 18th which I think was after I started bisecting
-
tm512
this entire time mpv was *the* trigger for the hangs, I think I've only had firefox cause a GPU hang on its own like one single time and I'm not even sure it was the same issue, I think that was a ring vcn_dec timeout, not ring gfx
-
tm512
time to test iwx
-
tm512
iwlwifi isn't even loaded, mpv is downgraded back to 0.40.0_8, it's still not hanging. I'll give it a bit more of a chance, like I guess the amount that I'd give a commit during the bisect before marking it fixed
-
tm512
only other thing I can think of is making a new 15 BE that's fully up to date on packages, to see if the issue is magically resolved there, perhaps it's not mpv but this new patch for ffmpeg. otherwise I'm just at a loss
-
tm512
I kinda need my system in a state where it is broken on the drivers older than 6.10, the main other guy who has this GPU hangs issue is on a slightly different architecture (Renoir rather than Picasso), and it has different triggers for him
-
tm512
if the only BE I have where it hangs at this point is a slightly older 15-STABLE snapshot, which can't run 6.10, might need to just take the personal W and hope others can take it from there
-
nimaje
hm, GPU hangs don't sound like something that could be caused by userland software, but maybe triggered, could you try to get the mpv from the BE that experiences hangs to the one without hangs and try if you get the hangs then?
-
karolyi
hey, is anything blocking the merging of bugs.freebsd.org/bugzilla/show_bug.cgi?id=292129? if not, can we have it merged please?
-
tm512
the current version of mpv I have (downgraded from the recent-ish 0.41.0 update) should be the exact one that triggered hangs here on 16: mpv-0.40.0_8,1
-
tm512
although the version on the 15 BE still experiencing the hangs is mpv-0.40.0_7,1
-
tm512
this 16 BE is basically only a day newer, I dunno if 7 or 8 would be the one it would've installed during the 15->16 upgrade
-
ridcully
tm512: does the hang only dead-lock the video playback with e.g. vulkan or the whole frontend (e.g. X)?
-
tm512
previously, it completely locked up graphics system-wide (couldn't even switch to a VT), but recently, at least after I brought my testing to recent builds of 15/16, it only locks up X, I can switch to a VT and kill and restart X
-
tm512
I think I'll clone my current 15 BE where the hangs were still occurring, then bring the clone's (non-base) packages up to date, if the hangs disappear there, then I guess I have to track down which package update changed things
-
tm512
but that will have to wait until after I've gotten some sleep. if I continue investigating this I'll probably be up an hour later than I otherwise would be
-
ivy
armin: fwiw, you aren't the only person who doesn't notice UPDATING. i'm not actually sure anything does, this is an actual problem with ports
-
ivy
s/anything/anyone
-
ketas
people only look at UPDATING if it fails
-
nimaje
well, UPDATING should only be relevant to users of ports, not of binary packages from the official repos, but the messages pkg prints at install/upgrade are relevant to all users, though they are sometimes easy to miss if you get a wall of those "port unmaintained" messages
-
ivy
yeah, i'm pretty sure no one reads those either since it constantly reprints older messages you already read or that aren't relevant. i'm not sure if that's a pkg(8) issue or people aren't adding the messages correctly in ports...
-
ketas
updating is also free form text
-
ketas
to parse affects: you need actual brain
-
nwe
good evening, maybe a stupid question.. but why was ALTQ removed from PF? so you can't use PF for bandwidth limiting etc?
-
nwe
looks like I need to use ipfw.
-
ivy
nwe: i wasn't aware altq was removed from pf? but if it was, you don't need to use ipfw for bandwidth limits, pf has dummynet integration
-
ivy
you can configure dummynet queues with dnctl then shunt traffic into a queue in pf.conf
-
ketas
altq was removed from pf'
-
ketas
?
-
ketas
also can pf do mac's now?
-
ivy
my pf.conf(5) from main as of a few days ago still documents altq
-
ketas
whole issue with ipfw is that i don't speak its config
-
ketas
which is fun since i installes 4.6
-
ketas
d
-
ketas
but apparently it's so difficult to get its config syntax
-
nwe
oh then I was wrong.. sorry, are you guys using pf or ipfw for bandwidth limiting etc? I'm using pf today, I will look again and try it out :)
-
nwe
sorry guys!
-
nwe
ipfw looks like iptables :P
-
ivy
i'm not currently doing rate-limiting in pf but i did previously do it using dummynet and it seemed to work fine
-
ivy
and it has the benefit that you don't need to compile a custom kernel to use it
-
nwe
ivy: oh thanks will try to get something up and running in my test-vm with pf and dummynet
-
nwe
ivy: so you did something similar? dnctl pipe 1 config bw 10Mbit/s and in pf.conf pass out on em0 proto tcp from 192.168.100.0/24 to any dnpipe 1 ?
-
ivy
i don't remember exactly but it was something like that, yes
-
armin
ivy: I read the message when doing the pkg upgrade even, I copy/pasted the whole post-procedure output to a text file so I can read it later on, but basically all it said was "re-start the damn thing" which wasn't exactly helpful.
-
ivy
oh, you read the message after it already broke it, right
-
armin
ivy: Basically, yes, I could of course have tried installing the 2.x thing from pkg to see if I could attach that way, but at that point I just gave up and actually re-started.
-
armin
ivy: It's not exactly critical, I was just bit by it. :)
-
ivy
armin: i'm not trying to have a go at you, just in case it seems like that, rather i think we need a better way to handle things like this in ports
-
armin
ivy: :)
-
rwp
If we are talking about tmux then it seems that every tmux update will break backward compatibility. I just always assume now that tmux will be broken if tmux is upgraded.
-
bdrewery
ya it sucks
-
AmyMalik
This month, much of BC that originally observed shifting UTC-8/UTC-7 will switch to permanent UTC-7. I assume this impacts on tzdata and tzdata right mode?
-
rtprio
yeah
-
rtprio
I would think so
-
mvanbaak
this might be way above my paygrade, but I'm trying to get podman to work on freebsd. More specifically, the bridged networking thing.
-
mvanbaak
I do have a linux container running, I can ping the ip of the bridge interface on the host, but I have 0 idea how I can get it to talk to the LAN or the internet etc
-
mvanbaak
does anyone have a good resource for me to study this?
-
mvanbaak
the host I run it on has 3 interfaces. a physical interface, and 2 vlan interfaces on top of that. Podman is creating the bridge cni-podman0 and adds the container interface to the bridge
-
mason
mvanbaak: Did you see this one? daemonless.io/guides/networking
-
mvanbaak
Yes, so I opted for the bridge one. Just no idea how to set up the pf.conf
-
mason
I don't know much about the site, but that seemed relevant.
-
mvanbaak
I tried the snippet there, with 'ext_if' set to 'vlan6'
-
mason
Ah, I don't use vlans so I'm probably not a good resource for that.
-
mvanbaak
Yeah, I know I complicated my life with that haha
-
mvanbaak
Thanks for trying to help me.
-
mvanbaak
I'll look into pf debugging. never needed it till now haha
-
mason
mvanbaak: If you get it all going, it'd be cool if you added it to the wiki.
-
mvanbaak
yeah, I will.
-
mvanbaak
I cant be the only one with vlans
-
rtprio
i have used podman on freebsd but only with the most basic of nat/pf with it
-
mvanbaak
ok, got a basic linux container with networking tools running. the debugging begins :)
-
mvanbaak
ok, maybe it is because the vlan interface itself does not have an ip address configured. bridge0 has vlan6 interface as member, and bridge0 has an ip on my local lan
-
mvanbaak
can it be that I have to NAT to (bridge0) instead of (vlan6) ?
-
mvanbaak
in my pf.conf
-
mvanbaak
ok, I'm officially a noob when it comes to freebsd networking and pf
-
mvanbaak
this works
-
mason
mvanbaak: nice!
-
mvanbaak
So I took the example from daemonless.io/guides/networking 'bridged networking' pf.conf. set the variable ext_if to bridge0, BAM, works
-
mvanbaak
but only on vlan6, but that's ok
-
mason
Cool.
-
mvanbaak
now to see if I can run a real app like this (and sorry for all the noise)
-
mason
mvanbaak: Nah, it's useful to see.
-
mason
Good luck pushing it further.
-
mvanbaak
thanks. my brainfart of 'lets take the easy way out. instead of porting all this to freebsd and all lets just run the container. easier' is not really working out hahaha
-
rtprio
mvanbaak: for most of my other things it's bhyve/debian/docker :shrug:
-
mvanbaak
yeah, I'm considering that
-
mvanbaak
isnt networking even worse then?
-
mvanbaak
vlan -> bridge -> bhyve bridge -> docker bridge -> container
-
mvanbaak
something like that
-
rtprio
i don't have any specialized networking for docker
-
rtprio
is the issue with podman that the bridge binds on the wrong vlan?
-
crb_
how can I find out the list of valid time zone acronyms for when I use TZ=UTC in a crontab entry?