-
jbk
ahh.. so i think i've found the culprit behind the short reads
-
sommerfeld
so this was the unexpected read of a 512-byte block on a 4k device you were trying to hunt down?
-
jbk
i think so
-
jbk
the sequence makes sense with the other scsi commands being sent
-
tsoome_
eh, that's a nice find.
-
jbk
i'm going to try grabbing the block size from the device-blksize property (falling back to DEV_BSIZE) and use READ(10)
-
jbk
since I think we can safely impose any performance penalty from that read on anyone crazy enough to multi-path with parallel SCSI (if that's even possible)
-
jbk
(AFAICT, using the smallest CDB for READ/WRITE was to improve parallel SCSI performance)
-
jbk
i'd be skeptical if it makes a difference with SAS or FC
-
jbk
annoyingly though, the scsi standards don't say exactly when READ(6) was deprecated
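[editor's sketch] The plan jbk describes above (take the block size from a property, fall back to a default, and issue READ(10) instead of READ(6)) can be illustrated roughly as follows. This is a standalone userland sketch, not the actual driver change: `pick_blksize`, `build_read6`, `build_read10`, `xfer_bytes`, and the `OP_*`/`FALLBACK_BSIZE` names are hypothetical, and the property lookup is reduced to a plain integer argument. Only the CDB byte layouts come from the SCSI block command set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OP_READ_6       0x08    /* READ(6) operation code */
#define OP_READ_10      0x28    /* READ(10) operation code */
#define FALLBACK_BSIZE  512     /* stand-in for DEV_BSIZE (assumption) */

/*
 * Hypothetical helper: use the property value when one was found
 * (positive), otherwise fall back to the default block size.
 */
static uint32_t
pick_blksize(int prop_value)
{
    return (prop_value > 0 ? (uint32_t)prop_value : FALLBACK_BSIZE);
}

/* Bytes the data buffer must hold for nblks logical blocks. */
static size_t
xfer_bytes(uint16_t nblks, uint32_t blksize)
{
    return ((size_t)nblks * blksize);
}

/*
 * READ(6): 21-bit LBA, 8-bit transfer length.
 * A transfer length of 0 means 256 blocks.
 */
static size_t
build_read6(uint8_t cdb[6], uint32_t lba, uint8_t nblks)
{
    assert(lba <= 0x1fffff);        /* only 21 bits of LBA fit */
    memset(cdb, 0, 6);
    cdb[0] = OP_READ_6;
    cdb[1] = (lba >> 16) & 0x1f;
    cdb[2] = (lba >> 8) & 0xff;
    cdb[3] = lba & 0xff;
    cdb[4] = nblks;
    return (6);
}

/*
 * READ(10): 32-bit LBA and 16-bit transfer length, both big-endian.
 * A transfer length of 0 means no blocks are transferred.
 */
static size_t
build_read10(uint8_t cdb[10], uint32_t lba, uint16_t nblks)
{
    memset(cdb, 0, 10);
    cdb[0] = OP_READ_10;
    cdb[2] = (lba >> 24) & 0xff;
    cdb[3] = (lba >> 16) & 0xff;
    cdb[4] = (lba >> 8) & 0xff;
    cdb[5] = lba & 0xff;
    cdb[7] = (nblks >> 8) & 0xff;
    cdb[8] = nblks & 0xff;
    return (10);
}
```

With a 4k device, `xfer_bytes(1, pick_blksize(4096))` yields 4096, whereas a buffer sized from the 512-byte fallback would be too small for a one-block read, which matches the kind of short read discussed above.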
-
jbk
danmcd: I know you said you're not around much, but your mlxcx change seems to have just caused a panic if you've got some time
-
sommerfeld
Yeah, I'd think that on anything modern you're dealing with cache-line-ish chunks between host memory and adapter and 6 vs 10 vs .. is in the noise for command movement..
-
sommerfeld
trying to read the backstory I found a note in a 2005 T10 doc which basically pleaded with people to migrate away from 6
-
tsoome_
uh, do we actually have parallel scsi HBA support?
-
jbk
if you have an ancient enough HBA, i suppose so
-
danmcd
Yeah... jbk --> we need to loosen the VERIFY in mlxcx_explore_pcam() to be an if (!mlxcx_cmd_access_register()) return
-
danmcd
because CX-4 doesn't have PCAM, I'm guessing.
-
danmcd
Also, there's an ASSERT(c->mlc_pcam) that MIGHT just become the if (!c->mlc_pcam) return; we need.
-
danmcd
Sorry for not having CX-4 to test, and good catch. If you've customers getting bitten, you can hotpatch an immediate return in the VERIFY() codepath instead of calling assfail().
-
danmcd
(back to vacation...)
-
jbk
right now this is just a lab box
-
jbk
i'm in the middle of verifying a fix for a more critical issue w/ it and hit that (i just backed it out for now on my test branch)
-
jbk
but once that's resolved, I could probably work w/ you to test/diag/etc
-
jbk
if you have the availability
-
jbk
it has both cx-4 and cx-5 cards
-
jbk
so might be a nice test
-
jbk
i think there's some other boxes w/ cx-6 as well
-
fenix
→ BUG 16129: mlxcx_explore_pcam() in mlxcx(4D) doesn't play nice with ConnectX-4 parts anymore (New)
-
danmcd
Weird.
-
danmcd
I just noticed the caller of mlxcx_explore_pcam() doesn't go there unless the mlcap_general_flags_c (unsupported on CX-4?) reports PCAM is present.
-
jbk
unfortunately, it happens before a dump device is there
-
danmcd
Try modifying the VERIFY() to be a return-if-false instead. That should make your CX-4 not regress, but still be good for CX-5 & CX-6.
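[editor's sketch] The change danmcd suggests, turning a fatal check into an early return, follows a common pattern that can be sketched in standalone C. Nothing here is the real mlxcx code: `fake_dev_t`, `fake_access_register`, `explore_strict`, and `explore_tolerant` are hypothetical stand-ins, and `abort()` stands in for what a failed VERIFY() would do in the kernel.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device state; not the real mlxcx structures. */
typedef struct fake_dev {
    bool fd_has_pcam;    /* does the hardware implement the register? */
    bool fd_pcam_read;   /* did we manage to read it? */
} fake_dev_t;

/* Hypothetical register access: fails when the register is absent. */
static bool
fake_access_register(const fake_dev_t *d)
{
    return (d->fd_has_pcam);
}

/* Strict variant: a failed access is treated as fatal. */
static void
explore_strict(fake_dev_t *d)
{
    if (!fake_access_register(d))
        abort();
    d->fd_pcam_read = true;
}

/*
 * Tolerant variant: a failed access just means the optional
 * register is not there, so skip the rest and carry on.
 */
static void
explore_tolerant(fake_dev_t *d)
{
    if (!fake_access_register(d))
        return;
    d->fd_pcam_read = true;
}
```

In this sketch a device without the register leaves `fd_pcam_read` false after `explore_tolerant()`, while one that has it behaves the same under either variant. Any later code that assumes the register was read would need a matching check, which is the role of the `if (!c->mlc_pcam) return;` idea mentioned earlier in the log.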
-
danmcd
Oh yeah.
-
jbk
so it means just kmdb on the console
-
danmcd
But the fix is around the VERIFY(); if you can compile and replace you're okay too.
-
danmcd
(SERIOUSLY, back to vacation now.)
-
jbk
yeah, that pcam bit is not defined in cx-4
-
jbk
(I'll update the ticket)