-
Smithx10
anyone ever see / experience [ID 433237 kern.warning] WARNING: t4nex1: regs[2] does not have a valid MMIO address
-
Smithx10
i have a bunch of compute nodes with chelsio T62100-lp-cr cards
-
Smithx10
Sometimes the CN comes up clean with 2 and now it's not. The difference between the ones that boot normally and one that doesn't can be seen here.
gist.github.com/Smithx10/0ece643e3f85207390220c62a1fad930
-
rmustacc
Smithx10: I've not seen that before. That'd suggest something in firmware or later pci_boot.c is not programming that correctly.
-
rmustacc
I would grab the prtconf output from it.
-
Smithx10
rmustacc: danmcd suggests something like a partially plugged in card. prtconf output
gist.github.com/Smithx10/8fc260abf19e972634d1ca9759336f04
-
rmustacc
Smithx10: You need -v to get properties.
-
Smithx10
Looks like after re-seating both the risers and cards in one of those boxes we have link lights on both cards.... it's booting now, we'll see if we still have the error
-
Smithx10
yeah that server came up clean afterward. I'm assuming the risers weren't seated fully
-
rmustacc
OK. Doesn't quite add up for me, but fair enough.
-
Smithx10
alright, I just gave the order for him to reseat the 4 others that were affected out of the 20
-
Smithx10
if any have the same issue I'll get a -v sent over
-
rmustacc
But I'm also having trouble tracking down exactly where that error is being generated. Ultimately if it is that and that solves that, all the better.
-
Smithx10
but I think that one isn't even showing the other card... looks like it might be the seating
-
jbk
hrm...
-
jbk
we have errno.d that provides symbolic values for error values (i.e. `inline int EPERM = 1;`)... any feelings about adding a script that can be #included that maps values back to names (in the same vein as truss)?
-
rmustacc
I think you probably just want a subroutine or action that does this ala strerrorname_np() but without the mouthful.
-
rmustacc
At least, that's what I had in the back of my head as a follow up.
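
(For illustration only: the value-to-name mapping discussed above can be sketched outside of DTrace with Python's standard `errno.errorcode` table. This is an analogy, not the proposed D script or subroutine; the helper name `errno_name` and its fallback string are hypothetical.)

```python
import errno


def errno_name(value: int) -> str:
    """Map a numeric errno back to its symbolic name.

    errno.errorcode is the standard-library dict of {number: "NAME"}.
    The fallback text for unknown values is an arbitrary choice made
    for this sketch.
    """
    return errno.errorcode.get(value, f"unknown errno {value}")


if __name__ == "__main__":
    # Look up the name for EPERM using the platform's own constant.
    print(errno_name(errno.EPERM))
```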
-
xv8
This might be a long shot, but does anyone have a good source to buy aftermarket parts for SPARC hardware outside of eBay? Specifically for an M10-4...
-
richlowe
Smithx10: the message usually means we've been given an MMIO address of 0x0, or near it
-
richlowe
not admin error, bug.
-
richlowe
rmustacc: if y'all are trying to make libjedec public, I'd expect an IPD?
-
richlowe
if only to say so, and maybe update some manuals
-
rmustacc
richlowe: I'm not intending to make libjedec a committed interface at this time.
-
richlowe
so it's a kind of contracted private affair?
-
richlowe
if that's the case, I think an ipd makes even more sense to warn people, but also we've never done that before since the Sun days
-
richlowe
and it didn't entirely work great even then
-
rmustacc
Yeah, I don't think we'd be making any promises like that at all imho.
-
rmustacc
As I mentioned in the ticket, it may make sense in a 100% buyer beware case like we do for other libraries for folks to experiment with. But reserving all rights with prejudice I guess.