-
andyf
nomad - if you get a moment, could you try the updated `dump_spares`?
-
andyf
There is also a hotfix for you from danmcd_ at hf.omnios.org/r50/mpt_sas.p5p
-
andyf
(that one is signed :p )
-
nomad
andyf: : || lvd@chrufs ~ [520] ; ./dump_spares
-
nomad
Segmentation Fault (core dumped)
-
andyf
That's a shame!
-
andyf
Can you share the core?
-
nomad
as long as it doesn't contain any of the data in the zpool, sure.
-
nomad
That host has PII on it.
-
andyf
It should not, but the output of `pstack <core>` is probably enough for me to track this down
-
nomad
: || lvd@chrufs ~ [523] ; pstack core
-
nomad
core 'core' of 1345: ./dump_spares
-
nomad
0000000000401528 sort_spares () + 76
-
nomad
fffffc7fef0bcba6 qsort_r_wrapper (9ecfb0, 9ecfb8, 4014b2) + 16
-
nomad
fffffc7fef0bd1cc qsort_r (9ecfb0, 2, 8, fffffc7fef0bcb90, 4014b2) + 61c
-
nomad
fffffc7fef0bd647 qsort (9ecfb0, 2, 8, 4014b2) + 27
-
nomad
0000000000401643 dump () + e0
-
nomad
fffffc7feed1bc7d zpool_iter (918550, 401563, 0) + 9d
-
nomad
00000000004016b4 main () + 27
-
nomad
00000000004013f7 _start_crt () + 87
-
nomad
0000000000401358 _start () + 18
-
nomad
I can't test the hotfix on these prod hosts. Unfortunately, my test host is currently in a bit of a dysfunctional state while we deal with the 9500 HBA driver question.
-
nomad
oh, the hot fix was for the driver. Right, that I can test.
-
nomad
IIRC, there's some special magic for getting a hot fix recognized in a test BE, right?
-
swinokur
hey there - I'm noticing that the omnios.org website & pkg updates have been slow/unreachable for the past couple days. Is it me, or is there something else going on?
-
andyf
swinokur - I'm not aware of anything. The US mirror was down for update yesterday, but everything is there now.
-
andyf
nomad - You should just be able to `pkg apply-hot-fix --be-name=<name for test BE> <url>`
-
swinokur
hmm... omnios.org (pkg.omnios.org, www.omnios.org) aren't pingable right now
-
andyf
I'm not sure if they ever are pingable
-
nomad
thanks andyf
-
swinokur
okay, and my traceroute stops here 13 168 ms 167 ms 169 ms rou-fw-rz-rz-gw.ethz.ch [192.33.92.169]
-
andyf
That's close enough - the server is hosted at ETH Zürich
-
nomad
looks like the new driver is attaching and drives are visible.
-
swinokur
sorry for the delay -- too many phone calls this morning! If I surf to omnios.org/releasenotes - the page just times out - and the pkg updater runs very very slowly (if at all). If I go to the web page on my mobile phone (not connected to wifi, just to cellular) - that page does load fine. Is it possible that my ip/network is being blocked?
-
nomad
swinokur, I just loaded omnios.org/releasenotes successfully
-
nomad
there was a very brief delay before it loaded.
-
swinokur
thanks! yeah I'm thinking there's something amiss going on over here on the wired network...
-
swinokur
okay I have figured it out - my omnios server and windows desktop machine both have jumbo frames (mtu 9000) enabled. When I set the interface on the omnios machine back to MTU = 1500 then pkg started to work again. This used to work fine so I'm wondering if someone changed something on the omnios networking side?
-
sommerfeld
swinokur: likely to be MTU discovery issues.
-
sommerfeld
path mtu discovery, that is
-
sommerfeld
at some point in your network, there is a point where the link MTU changes from 9k to 1500
-
sommerfeld
the router straddling that boundary needs to be able to send ICMP errors that the packet is too big back to the host sending it.
-
swinokur
nod - although nothing has changed on my network in quite a while, and pkg not working seems quite recent
-
sommerfeld
or all hosts inside need to be aware of the boundary (perhaps by setting the MTU on routes)
-
sommerfeld
could also be path mtu discovery issues on the other end (you're advertising a TCP MSS based on an MTU of 9k, other side sends jumbograms and they don't get through).
-
sommerfeld
path mtu discovery via ICMP is very often broken because of either busted firewalls not letting it through, or busted routers failing to send the error
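The failure mode sommerfeld describes can be illustrated with a toy model. This is only a sketch under stated assumptions: `Hop`, `send_with_df`, and `discover_path_mtu` are hypothetical names invented here, and the model ignores everything about real TCP/IP except packet size, the Don't Fragment bit, and whether the router at the MTU boundary reports the error.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """Hypothetical model of one router on the path."""
    name: str
    link_mtu: int            # MTU of the link this hop forwards onto
    sends_icmp: bool = True  # False: the "packet too big" error is never sent or is filtered


def send_with_df(size: int, path: List[Hop]) -> Optional[int]:
    """Send one packet of `size` bytes with Don't Fragment set.

    Returns None if the packet fits every link, or the MTU reported by the
    first hop that cannot forward it. Raises TimeoutError if that hop drops
    the packet without reporting anything back.
    """
    for hop in path:
        if size > hop.link_mtu:
            if hop.sends_icmp:
                return hop.link_mtu
            raise TimeoutError(f"packet silently dropped at {hop.name}")
    return None


def discover_path_mtu(initial_mtu: int, path: List[Hop]) -> int:
    """Shrink the packet size until it fits, guided by the reported MTUs."""
    size = initial_mtu
    while True:
        reported = send_with_df(size, path)
        if reported is None:
            return size
        size = reported  # always smaller than `size`, so the loop terminates


if __name__ == "__main__":
    working = [Hop("lan", 9000), Hop("edge-router", 1500)]
    print(discover_path_mtu(9000, working))

    broken = [Hop("lan", 9000), Hop("edge-router", 1500, sends_icmp=False)]
    try:
        discover_path_mtu(9000, broken)
    except TimeoutError as err:
        print("black hole:", err)
```

With the error reported, the sender settles on the smaller MTU; with it suppressed, large packets simply vanish, which matches the symptom of connections that open but then stall.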
-
swinokur
it's very strange, because omnios.org is the only website that I've found that's behaving this way (and as I mentioned pkg)
-
swinokur
hm, you mentioned ICMP - would omnios.org not responding to pings be related to that?
-
sommerfeld
yes
-
sommerfeld
if all ICMP is blocked, path mtu discovery can't work
-
nomad
swinokur, are you using IPv6 perchance?
-
sommerfeld
just for yucks: route add 192.132.2.0/24 <your default router> -mtu 1400
-
sommerfeld
and reenable jumbograms
-
swinokur
not using IPv6 myself (who knows what comcast is doing in the middle)
-
swinokur
okay, jumbograms are back on. did the add net command
-
swinokur
pkg stops working again
-
swinokur
oh -- the route command has a typo!
-
swinokur
omnios is 129.132 ;-)
-
swinokur
okay, with the route command corrected and jumbo frames enabled, `pkg update -v` does work
-
sommerfeld
so you're probably better off in the long run setting that MTU value on your default route. (you said comcast, I'm assuming residential cable modem...)
-
swinokur
yeah, residential cable modem
-
swinokur
yeah that's probably safest
-
sommerfeld
and -mtu 1500 will almost certainly work, barring some odd PPPoE config (unlikely on comcast), but I suggested 1400 for the test for the sake of eliminating that variable
-
sommerfeld
one of the more impactful things I did at google was as part of a project that let jumbo(ish)grams be incrementally deployed in their datacenters. If you have authoritative info on where in your network topology the MTU steps are, route-based constraints on MTU work better than MTU discovery.
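A rough sketch of what "route-based constraints on MTU" means in practice: the host looks up the destination in its routing table and uses the MTU attached to the most specific matching route, so no discovery is needed. The table below is an assumption made up for illustration (a jumbo-frame LAN on 192.168.0.0/16 and a 1500-byte default route), and `route_mtu` is a hypothetical helper, not anything from illumos.

```python
import ipaddress
from typing import List, Tuple

Route = Tuple[ipaddress.IPv4Network, int]  # (destination prefix, MTU)

# Hypothetical routing table: jumbo frames inside the house, 1500 elsewhere.
ROUTES: List[Route] = [
    (ipaddress.IPv4Network("0.0.0.0/0"), 1500),
    (ipaddress.IPv4Network("192.168.0.0/16"), 9000),
]


def route_mtu(dest: str, routes: List[Route] = ROUTES) -> int:
    """Return the MTU of the longest-prefix route that matches `dest`."""
    addr = ipaddress.IPv4Address(dest)
    matches = [(net, mtu) for net, mtu in routes if addr in net]
    if not matches:
        raise LookupError(f"no route to {dest}")
    _, mtu = max(matches, key=lambda route: route[0].prefixlen)
    return mtu


if __name__ == "__main__":
    print(route_mtu("192.168.1.20"))  # destination on the jumbo-frame LAN
    print(route_mtu("198.51.100.7"))  # documentation address standing in for an Internet host
```

Because the MTU step is known in advance, the sender never emits a packet that the boundary router would have to reject.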
-
swinokur
yeah that makes sense - I only need jumbo frames inside the house to speed big copies of files around on the 40G & 10G links
-
swinokur
unfortunately doesn't look like windows supports route based MTU hahahah
-
sommerfeld
so your best bet is isolating the windows box via some sort of NAT proxy that does MSS clamping. (see ipnat.conf "mssclamp" option if you want to do it with illumos)
-
swinokur
it turns out that pfSense has this built in. there's a "MSS" setting in the interfaces section. I put that at 1500 and the omnios website now loads again from Windows
-
swinokur
(pfSense does the header size subtraction for us)
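The header size subtraction mentioned here is simple arithmetic, sketched below for IPv4 without IP or TCP options: a 20-byte IP header plus a 20-byte TCP header comes off the MTU to give the MSS. The helper names `mss_for_mtu` and `clamp_mss` are hypothetical, and this is only an illustration of the idea, not pfSense's or ipnat's actual implementation.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one IPv4 packet of `mtu` bytes."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def clamp_mss(advertised_mss: int, path_mtu: int) -> int:
    """Lower an advertised MSS so segments fit the smaller path MTU.

    An MSS that is already small enough is left unchanged.
    """
    return min(advertised_mss, mss_for_mtu(path_mtu))


if __name__ == "__main__":
    jumbo_mss = mss_for_mtu(9000)
    print(jumbo_mss)                   # 8960
    print(clamp_mss(jumbo_mss, 1500))  # 1460
```

A host with a 9000-byte MTU would advertise an MSS of 8960; clamping at the boundary rewrites that to 1460 so the remote side never sends segments too large for the 1500-byte path.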
-
swinokur
Thanks so much for your help sommerfeld!
-
sommerfeld
glad to be able to help.