-
szilard
-
danmcd
FreeBSD just removed le(4D) support yesterday. I'm pretty sure we ripped it out during Solaris 10 bringup.
-
jbk
oh that's a blast from the past..
-
jbk
and the infamous ce
-
danmcd
le was the onboard chip for late Sun-3s and the SPARCstations 1 & 2.
-
danmcd
IIRC my first UltraSPARC I workstation at Sun was an early model that also had le on it.
-
danmcd
-
fenix
→ OpenSolaris issue 4942766: Remove le driver from ON10 (Closed)
-
jbk
I'm trying to remember what was on the sparc 5 & 20
-
jbk
the two sun boxes i ever got access to :)
-
danmcd
-
jbk
ok.. that sounds right..
-
jbk
they were running solaris 2.4 at the time and at one point I believe disksuite was installed (uugh)
-
jbk
(I greatly disliked the admin interface of disksuite.. while maybe not admin hostile, it was certainly admin indifferent :P)
-
jbk
naming is important, and just giving a user an arbitrary number (or these days, guid) is just a giant FU IMO
-
richlowe
almost all the "classic" Sun machines were the lance ethernet
-
richlowe
and optional weirdness like atm and fddi
-
richlowe
I have this vague memory that le v. hme was the difference between the "Ultra 1" and "Ultra 2" and the "Ultra 1 Enterprise" and "Ultra 2 Enterprise"
-
richlowe
along with a framebuffer
-
alanc
yeah, it was early in S10:
-
alanc
PSARC 2003/335 EOL of le Ethernet driver
-
alanc
4942766 Remove le driver from ON10
-
richlowe
those machines would never run 64bit, so on10 was toxic for them eventually anyway
-
richlowe
they had that one bug nobody ever explains
-
alanc
and this many years later, there may be no one left who remembers the details
-
jbk
was 64-bit the 'prize' for the happy meal? :)
-
jbk
heh... and the choice of reusing EBADE instead of just adding a new error code for zfs continues to cause confusion :)
-
ENOMAD
any HBA-driver experts present? My newly imaged OmniOS host is getting the same problem report as
illumos.topicbox.com/groups/develop…55bd13e0d-M68f6ceb0f3e2a1cf3bbeb89d and I'm curious if it is really ignorable or if I can/need to do something to fix it.
-
ENOMAD
In my case the complaint is "Mar 3 12:17:25 fs2 scsi: [ID 107833 kern.warning] WARNING: /pci@95,0/pci8086,352c@5/pci1000,4060@0 (mpt_sas0):#012#011Number of phys reported by HBA SAS IO Unit Page 0 (11) is greater than that reported by the manufacturing information (8). Driver phy count limited to 8. Please contact the firmware vendor about this."
-
ENOMAD
value='SAS3808ALLHBA 9500-8i03-50134-01004SPF3001010'
-
ENOMAD
value='HBA 9500-8i'
-
ENOMAD
value='MPTSAS HBA Driver 00.00.00.24'
-
ENOMAD
value='9500-8i Tri-Mode HBA'
-
richlowe
it seems like it's saying there's two ways to get that value, and they give different answers, we picked the smaller, but you should ask LSI's descendants to make it not do that
-
richlowe
which seems like it's trying to imply you're ok
-
richlowe
I'm not an expert
-
» ENOMAD nods
-
ENOMAD
My 'concern' (said gently) is the "count limited to 8" part. This host could eventually have up to 36 SAS devices connected. Right now we only have 11.
-
jbk
rmustacc: is there a reason we couldn't convert pci_boot.c to use the busra.c interfaces (ndi_ra_XXX)?
-
jbk
(it seems like it'd be nicer, and seems like it'd allow most of the code to not care about the PCI segment it's on)
-
sommerfeld
so I'm trying to understand where mblks can get queued between a NIC driver (specifically i40e) and a tcp socket. there's the receive ring, then there are soft rings in mac and then squeues entering ip. anywhere else? (I'm trying to figure out how a single active tcp connection that's coming from a 1gbit/s link and being actively read by the receiver can cause the i40e driver to run out of receive buffers after loaning out ~1024 of them to
-
sommerfeld
mac and points downstream..)
-
sommerfeld
suggests to me that something is causing large packet batches to accumulate somewhere along the pipeline.
-
richlowe
rzezeski: this sounds like something you know
-
jbk
one thing I've thought about but haven't dug in too deeply to see how difficult it'd be is for mblk_ts going upstack that are being loaned up, to copy and release the original mblk_ts if processing gets deferred for some reason
-
jbk
since the loaned up resources are often shared amongst multiple 'streams' (tcp connections/etc), so can potentially hog loaned out resources from down stack
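
A toy model of the copy-and-release idea above, as a sketch only. Nothing here is illumos code: `RxBufferPool`, `receive`, and the pool size are hypothetical names and numbers chosen for illustration. It assumes every arriving packet needs one buffer from a fixed driver pool and that the upper layer defers processing of all of them.

```python
class RxBufferPool:
    """Hypothetical stand-in for a driver's fixed pool of receive buffers."""

    def __init__(self, size):
        self.free = size

    def loan(self):
        """Loan one buffer up the stack; False means the pool is exhausted."""
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def give_back(self):
        """Return a previously loaned buffer to the pool."""
        self.free += 1


def receive(pool_size, arrivals, copy_on_defer):
    """Return (queued, dropped) when the consumer defers every packet.

    With copy_on_defer the payload is assumed to be copied into ordinary
    memory and the loaned buffer is returned to the driver right away.
    """
    pool = RxBufferPool(pool_size)
    queued = 0
    dropped = 0
    for _ in range(arrivals):
        if not pool.loan():
            dropped += 1          # no buffer left to receive into
            continue
        queued += 1               # upper layer defers processing
        if copy_on_defer:
            pool.give_back()      # copy made, original buffer released
    return queued, dropped


print(receive(1024, 3000, copy_on_defer=False))
print(receive(1024, 3000, copy_on_defer=True))
```

In this model the copying variant never starves the pool, at the price of a copy per deferred packet and a queue that is no longer bounded by the pool size.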
-
jbk
i saw this in a bug I was never able to entirely chase down with inter-zone traffic on the same box
-
sommerfeld
the specific thing I'm chasing is that with a 4M tcp window, throughput sucks on most connections (~200Mbits/s); with a 500k window it goes at 9xx Mbit/s (gigabit-ish line rate).
-
sommerfeld
my working theory is that there are standing waves building up *somewhere* and when the sender fills the tcp window it gives a chance for the receiver to drain and stay caught up.
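
A back-of-the-envelope check of the two window sizes against the roughly 1024 loaned buffers, as a sketch under stated assumptions. It assumes a 1460-byte MSS, one receive buffer per segment, "4M" meaning 4 MiB, and "500k" meaning 500,000 bytes; none of these values are confirmed in the discussion.

```python
MSS_BYTES = 1460                    # assumed TCP payload per 1500-byte frame
LINK_BITS_PER_SEC = 1_000_000_000   # the 1 Gbit/s sender link
LOANED_BUFFERS = 1024               # approximate loan count quoted for i40e


def segments_in_window(window_bytes, mss=MSS_BYTES):
    """Number of full-or-partial MSS segments needed to fill the window."""
    return -(-window_bytes // mss)  # ceiling division


def wire_time_ms(window_bytes, link_bps=LINK_BITS_PER_SEC):
    """Time to put one full window on the wire, ignoring header overhead."""
    return window_bytes * 8 / link_bps * 1000


for label, window in (("4M", 4 * 1024 * 1024), ("500k", 500_000)):
    segs = segments_in_window(window)
    print(f"{label}: {segs} segments, {wire_time_ms(window):.1f} ms on the wire, "
          f"exceeds loaned buffers: {segs > LOANED_BUFFERS}")
```

Under these assumptions a full 4M window is more segments than the driver has on loan, while a full 500k window is not. That is consistent with the theory above but does not show where the packets actually queue.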
-
sommerfeld
jbk: I think the hard part is that there are potentially so many places for mblks to get queued that knowing where to look is half the battle..
-
sommerfeld
ENOMAD: so how are things cabled up? expanders, or multiple 9500's?
-
ENOMAD
sommerfeld, single 9500
-
ENOMAD
I presume expanders. I uploaded the prtconf to the ticket I just opened.
-
sommerfeld
so it's probably at that point only counting the 8 ports on the 9500 and not the other ports on the expander(s) plugged into some of those ports
-
sommerfeld
note that it says "phy count" not something like 'target count'
-
ENOMAD
hmm. Interesting.
-
ENOMAD
Not sure why the number is odd but it is a potentially reasonable interpretation.
-
jclulow
sommerfeld: Which congestion control algorithm are you using?
-
jclulow
(I discovered last year that our "cubic" is possibly rubbish)
-
sommerfeld
i am in fact using cubic
-
sommerfeld
(which seemed to help for long-haul connections which this test was not..)
-
jclulow
sommerfeld: Does it improve if you switch back to sunreno
-
sommerfeld
trying that now
-
jclulow
The conditions where I was seeing this issue were also a speed imbalance: a 10G server into a generally 1G network etc
-
sommerfeld
yah, with both sunreno and newreno set as congestion control on the sender I don't see the speed collapse
-
jclulow
yeeeeah
-
jclulow
sigh
-
sommerfeld
this is the reverse situation (1G sender, 10G receiver)
-
jclulow
I don't think I got around to filing a bug for this (with my apologies) but it definitely seems like a real issue
-
sommerfeld
I still think there's something wrong going on in between driver and tcp on the receiver independent of tcp congestion control
-
sommerfeld
(because the trigger for the aforementioned rx_bind_norcb events in i40e is too many buffers on loan from driver to mac)
-
rmustacc
ENOMAD: I added that comment. There are basically two different log pages that report that and my memory is in this case we had things that were from other devices.
-
rmustacc
ENOMAD: In your case with an LSI 8i you only have 8 actual PHYs on that HBA that can be directly connected anyways.
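
A minimal sketch of the behaviour the warning describes: two sources report a phy count, and the smaller one wins. This is not the mpt_sas source. `effective_phy_count` and its parameter names are hypothetical, and what the real driver does when the IO Unit page reports fewer phys is not covered by the discussion.

```python
from typing import Optional, Tuple


def effective_phy_count(io_unit_phys: int, mfg_phys: int) -> Tuple[int, Optional[str]]:
    """Return the phy count to use plus a warning if the two sources disagree.

    io_unit_phys: count from SAS IO Unit Page 0 (assumed parameter name)
    mfg_phys:     count from the manufacturing information (assumed name)
    """
    if io_unit_phys > mfg_phys:
        warning = (
            f"Number of phys reported by HBA SAS IO Unit Page 0 ({io_unit_phys}) "
            f"is greater than that reported by the manufacturing information "
            f"({mfg_phys}). Driver phy count limited to {mfg_phys}."
        )
        return mfg_phys, warning
    return io_unit_phys, None


count, warning = effective_phy_count(11, 8)
print(count)
print(warning)
```

The limit applies to phys on the HBA itself. Devices reached through expanders are targets, not HBA phys, so a limit of 8 phys does not by itself cap the number of attached SAS devices.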
-
rmustacc
jbk: Eventually we will rewrite pci_boot.c, but the main reason no one has yet is because it has a huge amount of testing implications.
-
rmustacc
But fundamentally one wants this to mostly be able to look like hotplug after a fashion.
-
rmustacc
And not have multiple divergent paths.
-
rmustacc
If I were going to work on that project, I'd go first finish the project to allow me to do arbitrary PCIe bridges in propolis so I can fake up all the different corner cases of devices and resources.
-
jbk
i ask because getting segments to work, you pretty much are having to touch a _lot_ (and I mean a _lot_) of pci_boot.c
-
rmustacc
Doesn't really change what I'd do first.
-
rmustacc
Which is have a good way to test arbitrary topologies with a VM configuration that I can automate.
-
jbk
that's not a very useful answer tbh
-
rmustacc
I mean, if I was going to rewrite it I'd want to do that first.
-
rmustacc
It may make sense and be the right way.
-
rmustacc
But again, how do we test it is the big question that i'd have.
-
rmustacc
It's really high risk.
-
rmustacc
I've not looked at the ndi_busra stuff in detail, sorry. Keith did that on Oxide.
-
rmustacc
So, no, I guess I don't know of a reason, but if it was me, I'd first figure out how to test it all without needing every different hardware config under the sun. There are other ways too to look at it like figuring out how to write it so you can drive it outside of a specific booting config.
-
rmustacc
Dunno, happy to talk live or something if that'd be more useful for you. Not sure if I can give you the answer you want.
-
rmustacc
Or ask the question again and I'll try to do better.
-
rmustacc
I'd probably also see how Rich redid enumeration, which I know has been described and I've forgotten.