01:51:18 hrm.. this is odd... this system (annoyingly I can't get access to it to dig more) is getting SMF_ERROR_NO_RESOURCES when a bunch of disks are connected to it (and failing to boot).. with the disks removed, it comes up fine...
01:52:43 i was thinking maybe /etc/svc/volatile filling up, but at least with the disks removed, df -h is showing GBs of free space...
06:44:56 SCF_ERROR_NO_RESOURCES you meant? altho there is also SCF_ERROR_NO_MEMORY; apparently we do get SCF_ERROR_NO_RESOURCES translated from REP_PROTOCOL_FAIL_NO_RESOURCES, but that too seems to be related to allocation errors.
06:45:37 still there are constructs like: res = backend_tx_run(tx, q, fill_property_callback, &ci); if (res == REP_PROTOCOL_DONE) res = REP_PROTOCOL_FAIL_NO_RESOURCES;
06:50:27 basically, there are functions to return either OK or NO RESOURCES :)
11:40:57 [illumos-gate] 17314 Obsolete reference to mibiisa in the netstat(8) manual -- Peter Tribble
11:44:38 [illumos-gate] 17315 The ddi_ufm_op_readimg(9E) manual is missing -- Peter Tribble
18:05:01 tsoome_: yeah..
18:05:21 though out of memory seems odd.. the system has something like 300+GB of ram
18:06:21 ... though svc.configd and svc.startd are 32-bit
18:06:57 ... i wonder how many problems would crop up if I built an image w/ 64-bit versions of those...
18:07:43 tsoome_: i don't suppose you've tried that yet w/ your work of building things 64-bit? :)
18:27:48 I have done very little to build things 64-bit, just some zfs commands and a few that used isaexec anyhow...
18:32:52 jbk: if you want "64-bit versions of things", check the arm64-gate, they're all that way necessarily
18:33:19 for startd/configd in specific, you want _all_ the arm fixes to cmd/svc and lib/libscf, at least
19:27:22 also the "repcache" stuff is kinda scary, and where one of those bugs lives
19:27:43 heh..
19:27:49 so if tsoome is right in pointing you at that, you probably want to make assumptions that what it seems to be telling you is true
19:28:20 it's a real hoot
19:29:09 the hard part is that I can't actually get my hands on the system
19:29:20 i get at best a description of what comes up on the console
19:29:37 and not even copy/paste or a screenshot
20:46:21 jbk: Is it an install to disk system, or a UFS ramdisk system?
21:20:42 ufs ramdisk (mostly, /etc is re-mounted later as a zfs dataset)
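A minimal sketch of console checks that tie the threads above together: free space in /etc/svc/volatile, inode headroom in the UFS ramdisk root, and the address-space totals of the 32-bit svc.configd/svc.startd. The mount points and daemon names come from the discussion above; the exact commands are illustrative, not part of the conversation.

    # free space on the ramdisk root and the SMF volatile directory
    df -h / /etc/svc/volatile
    # inode usage on the UFS ramdisk root
    df -F ufs -o i /
    # total address-space usage of the 32-bit SMF daemons (pmap's last
    # line is the total; a 32-bit process can exhaust its address space
    # long before 300+GB of physical memory matters)
    pmap -x `pgrep -x svc.configd` | tail -1
    pmap -x `pgrep -x svc.startd` | tail -1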
21:46:55 Really dumb question... should a swap volume have `primarycache=none` set? I mean, it's for swap, i.e. memory, right?
21:47:07 * danmcd is probably missing something obvious...
22:08:02 danmcd: I've used primarycache=metadata on swap volumes on the theory that you want to let indirect blocks and the like stay in cache.
22:08:07 but I haven't measured it
22:08:44 I'm worried about high memory-pressure situations where the reclaim takes so long that IO times out. :( Corner case, but something that prompted me to ask.
22:09:09 danmcd: in memory pressure like that so many things go wrong it's hard to point fingers
22:09:13 (you run out)
22:09:31 Or the ostensibly free memory just can't be reclaimed fast enough. :O
22:10:25 one thing i'm hoping we could upstream soon is we noticed a couple of zfs kmem caches that seemed to have been skipped when doing reaping
22:11:06 since we've noticed the ones that were missing could potentially hold GBs of data in their kmem caches
22:11:23 Whoa!
22:11:33 Which ones?
22:11:57 one related area of work I've thought about but haven't really done anything towards is having our swap system free blocks in the underlying storage when they are no longer needed.
22:12:27 sommerfeld: like UNMAP/TRIM operations?
22:13:43 I have a kernel dump from a too-large-BHYVE/transient-reservoir allocator that Just Froze Everything. Would love to inspect the state of the kmem caches you're targeting on this dump, jbk
22:13:57 sommerfeld: for fragmentation?
22:14:03 danmcd: dnode_cache && dbuf_kmem_cache
22:15:09 yeah. the swap allocator does a clock-hand thing which means that an initially-sparse zvol winds up fully populated even if you only have a little bit of swap usage.
22:16:26 the latter looks like it's named 'dmu_buf_impl_t' if you're looking in mdb
22:17:55 Not even a Gig...
22:18:15 > ::kmem_cache !egrep "dnode|dmu_buf"
22:18:15 fffffdf266f92008 dnode_t 1200 000000 688 636735
22:18:15 fffffdf2871ae008 dmu_buf_impl_t 1000 000000 216 1897380
22:18:17 > !echo "(688 * 636735 + 216 * 1897380)" | bc
22:18:19 847907760
22:18:21 >
22:18:31 not looking for actual overcommit, but it means the underlying pools have more actual free blocks to play with and don't have to preserve the value of swap pages that nobody will ever read again.
22:18:51 @sommerfeld I get it now.
22:18:55 kinda related to that, though perhaps of less interest given semi-recent events.. i do have a change to sd.c that'll issue WRITE SAME w/ a zero-filled block for VMware virtual disks instead of issuing an UNMAP CDB
22:19:03 since the latter does nothing w/ vmware
22:19:15 but the former will actually shrink thin-provisioned vmdk files
22:19:21 WOW!
22:19:33 at least with ESX
22:19:43 but i'm guessing other variants are probably the same
22:21:14 (it actually has three different 'TRIM' implementations -- UNMAP, write same w/ the unmap bit set, write same w/o the unmap bit set)
22:21:41 and it looks at the various vpd pages (and supported ops) to pick the method to use (or override via sd.conf)
22:22:12 (including disabling issuing unmap for the given disk type completely)
22:41:25 jbk: If it's a UFS ramdisk, and something fails when you attach more disks, your UFS ramdisk image is probably out of inodes?
22:42:26 All of the things created in the mounted /dev actually create a shadow object _under_ that in the root file system, to store permissions and such
22:42:53 Which, while useless for persistence in a ramdisk situation, means that you need to have a fair number of inodes in your UFS ramdisk image
22:43:27 I believe it's the -i option to newfs(8)
22:44:41 oh hmm...
22:45:34 "df -F ufs -o i /" should tell you how many you're using now
23:19:27 ahh firmware... don't ever stop being terrible (unrelated to that issue)
23:19:53 now I've identified two firmware bugs today and I wasn't even meaning to
23:49:42 so a regular day
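A minimal sketch of the inode check and of building a UFS ramdisk image with a lower bytes-per-inode ratio, following the newfs -i suggestion above. The image path, size, and ratio are made-up examples, not taken from the conversation.

    # how close the current ramdisk root is to running out of inodes
    df -F ufs -o i /

    # when (re)building the ramdisk image, a smaller -i (nbpi, bytes per
    # inode) makes newfs allocate more inodes for the same image size
    mkfile 256m /tmp/ramdisk.ufs
    LOFIDEV=`lofiadm -a /tmp/ramdisk.ufs`                # e.g. /dev/lofi/1
    RLOFIDEV=`echo $LOFIDEV | sed 's,/lofi/,/rlofi/,'`   # raw device for newfs
    newfs -i 2048 $RLOFIDEV
    lofiadm -d $LOFIDEV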