19:14:23 ahh.. so i think i've found the culprit behind the short reads
19:14:29 https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/io/scsi/adapters/scsi_vhci/scsi_vhci.c#L3944-L3966
19:20:44 so this was the unexpected read of a 512-byte block on a 4k device you were trying to hunt down?
19:21:05 i think so
19:22:59 the sequence makes sense with the other scsi commands being sent
21:11:00 eh, that's a nice find.
21:48:10 i'm going to try grabbing the block size from the device-blksize property (falling back to DEV_BSIZE) and use READ(10)
21:49:12 since I think we can safely pay any performance penalty on that read on anyone crazy enough to multi-path with parallel SCSI (if that's even possible)
21:50:02 (AFAICT, using the smallest CDB for READ/WRITE was to improve parallel SCSI performance)
21:50:15 i'd be skeptical if it makes a difference with SAS or FC
21:56:38 annoyingly though, the scsi standards don't say exactly when READ(6) was deprecated
22:07:04 danmcd: I know you said you're not around much, but your mlxcx change seems to have just caused a panic if you've got some time
22:07:08 Yeah, I'd think that on anything modern you're dealing with cache-line-ish chunks between host memory and adapter and 6 vs 10 vs .. is in the noise for command movement..
22:07:54 trying to read the backstory I found a note in a 2005 T10 doc which basically pleaded with people to migrate away from 6
22:15:55 uh, do we actually have parallel scsi HBA support?
22:17:04 if you have an ancient enough HBA, i suppose so
22:58:36 Yeah... jbk --> we need to loosen the VERIFY in mlxcx_explore_pcam() to be an if (!mlxcx_cmd_access_register()) return
22:58:53 because CX-4 doesn't have PCAM, I'm guessing.
22:59:33 Also, there's an ASSERT(c->mlc_pcam) that MIGHT just become the if (!c->mlc_pcam) return; we need.
23:00:27 Sorry for not having CX-4 to test, and good catch. If you've customers getting bitten, you can hotpatch an immediate return in the VERIFY() codepath instead of calling assfail().
23:00:32 (back to vacation...)
23:00:58 right now this is just a lab box
23:01:27 i'm in the middle of verifying a fix for a more critical issue w/ it and hit that (i just backed it out for now on my test branch)
23:01:43 but once that's resolved, I could probably work w/ you to test/diag/etc
23:01:49 if you have the availability
23:02:02 it has both cx-4 and cx-5 cards
23:02:11 so might be a nice test
23:02:22 i think there's some other boxes w/ cx-6 as well
23:05:06 @jbk --> https://www.illumos.org/issues/16129
23:05:07 → BUG 16129: mlxcx_explore_pcam() in mlxcx(4D) doesn't play nice with ConnectX-4 parts anymore (New)
23:05:56 Weird.
23:06:30 I just noticed the caller of mlxcx_explore_pcam() doesn't go there unless the mlcap_general_flags_c (unsupported on CX-4?) reports PCAM is present.
23:07:00 unfortunately, it happens before a dump device is there
23:07:00 Try modifying the VERIFY() to be a return-if-false instead. That should make your CX-4 not regress, but still be good for CX-5 & CX-6.
23:07:13 Oh yeah.
23:07:19 so it means just kmdb on the console
23:07:46 But the fix is around the VERIFY(); if you can compile and replace you're okay too.
23:07:59 (SERIOUSLY, back to vacation now.)
23:37:17 yeah, that pcam bit is not defined in cx-4
23:37:20 (I'll update the ticket)
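
Note on the READ(6) vs READ(10) discussion earlier in this log: the sketch below is only an illustration of the two CDB layouts and of the "device-blksize, falling back to DEV_BSIZE" idea. It is not the scsi_vhci code. The function names (`read6_cdb`, `read10_cdb`, `first_block_read`) and the `blksize_prop` parameter are hypothetical, and the value 512 for DEV_BSIZE is assumed here. What it tries to show is that the transfer length in both CDBs is counted in blocks, so the byte count the host expects has to come from the device's real block size.

```python
import struct

READ_6 = 0x08    # SCSI READ(6) opcode
READ_10 = 0x28   # SCSI READ(10) opcode
DEV_BSIZE = 512  # assumed fallback block size for this sketch


def read6_cdb(lba, nblocks):
    """Build a 6-byte READ(6) CDB: 21-bit LBA, 8-bit transfer length."""
    if not 0 <= lba < (1 << 21):
        raise ValueError("READ(6) can only address 21-bit LBAs")
    if not 1 <= nblocks <= 256:
        raise ValueError("READ(6) transfers 1..256 blocks")
    # In READ(6) a transfer length byte of 0 means 256 blocks.
    return bytes([READ_6, (lba >> 16) & 0x1F, (lba >> 8) & 0xFF,
                  lba & 0xFF, nblocks & 0xFF, 0])


def read10_cdb(lba, nblocks):
    """Build a 10-byte READ(10) CDB: 32-bit LBA, 16-bit transfer length."""
    if not 0 <= lba < (1 << 32):
        raise ValueError("READ(10) can only address 32-bit LBAs")
    if not 0 <= nblocks < (1 << 16):
        raise ValueError("READ(10) transfer length is 16 bits")
    # Fields: opcode, flags, LBA (big-endian u32), group number,
    # transfer length (big-endian u16), control.
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, nblocks, 0)


def first_block_read(blksize_prop=None):
    """Return (cdb, expected_bytes) for a one-block read at LBA 0.

    blksize_prop stands in for a looked-up device block size property;
    None (or 0) means the property was absent, so fall back to DEV_BSIZE.
    """
    blksize = blksize_prop if blksize_prop else DEV_BSIZE
    return read10_cdb(0, 1), blksize
```

With a 4k device, `first_block_read(4096)` yields the same one-block CDB as for a 512-byte device, but the expected byte count is 4096 rather than 512, which is the mismatch the short-read hunt above was about.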