19:06:38 Our implementation of posix_fallocate(3c) returns EINVAL unless the underlying filesystem is ufs
19:07:20 This sort of makes sense - on zfs, copy-on-write makes any guarantee of allocation worthless
19:09:19 I tripped over this recently, as Java can use posix_fallocate to place the Java heap on a disk-based file rather than memory
19:09:40 Interestingly, it works on Solaris, which confused me somewhat
19:10:41 I wondered if there were opinions on whether implementing it would be regarded as sensible, or if we just say it's unsupported
19:13:18 really? I could have sworn Solaris returned EINVAL for posix_fallocate on zfs for exactly the reason you mention
19:15:21 It used to, and indeed used to be explicit in the man page about what filesystems were supported.
19:16:24 But the man page no longer says 'ufs only' in 11.4 at least, and the Java functionality appears to work fun
19:17:07 oops, s/fun/fine
19:20:35 but apparently I'm thinking of the underlying F_ALLOCSP fcntl, which our posix_fallocate calls, and then if the fcntl returns EINVAL to indicate the file system doesn't support it, we fake it using an F_FREESP fcntl instead
19:22:28 So, basically, be economical with the truth to keep applications happy
19:23:40 the bug discussion mentions that with a COW filesystem like ZFS, actually allocating blocks would increase the chance of failure by using up blocks you may need for the real write later, but at least doing the equivalent of ftruncate() means you can mmap() the expected size
19:24:06 yep
19:24:23 Yeah I think that's unfortunate but reasonable.
In general, software that strongly believes in Preallocation For Performance is going to take the failure to fallocate and write zeroes out anyway
19:24:34 see also: PostgreSQL and their unfortunate WAL behaviour
19:26:37 it also mentions that (at least back in 2012) glibc also did a bit of fakery for filesystems that didn't support the linux syscall for this, and would write out zero bytes the old fashioned way, which is neither good for performance nor for allocations on COW filesystems
19:26:38 this all checks out, but it feels like the greater "we" should maybe like, talk to people about that?
19:27:10 because it's clearly not great, and surely ZFS has traction enough people will 100% care even though we're weird
19:33:01 https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/posix/posix_fallocate.c - apparently its fallback is "write one null byte to every block", which seems awful
19:33:49 that feels like it probably works on ancient filesystems
19:33:59 and ... I guess hurts less on zfs?
19:34:18 but it's definitely a solution that given two ok-ish options, does neither
19:35:32 jclulow: they did at least eventually add a flag to disable that behavior
19:35:48 even if it felt like 'now that I've thought of it, it's a brilliant idea' :)
19:35:57 jbk: Yes, I was there while Jerry and dap were convincing folks to do it haha
19:35:57 https://man7.org/linux/man-pages/man3/posix_fallocate.3.html does warn that the fallback is not great
19:36:47 i remember them arguing for it, but meeting resistance and the actual addition of the flag happening later
19:37:03 richlowe: I don't think writing a single 0 into every block is going to be materially different from writing 0 into the whole block on ZFS
19:39:47 https://github.com/openzfs/zfs/issues/326 has thoughts from a ZFS perspective
19:40:13 jclulow: I thought the variable block size would at least cause ZFS to give away _less_ COW space
19:40:54 (though obviously, if compression is on, neither matters?)
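[Editor's sketch, not part of the log: the "try to preallocate, otherwise just extend the file" behaviour discussed above could look roughly like the following from an application's point of view. This is not the Solaris or illumos libc code; it swaps the F_ALLOCSP/F_FREESP fcntls for portable posix_fallocate() and ftruncate() calls. The helper name reserve_file_size is hypothetical, and treating EINVAL or EOPNOTSUPP as "file system cannot preallocate" is an assumption.]

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any libc): make sure fd is at least
 * `size` bytes long. Returns 0 on success or an error number on
 * failure, following the posix_fallocate() convention.
 */
int
reserve_file_size(int fd, off_t size)
{
	struct stat st;
	int err;

	/* Reject bad sizes up front so a later EINVAL is less ambiguous. */
	if (size <= 0)
		return (EINVAL);

	/* posix_fallocate() returns the error number; it does not set errno. */
	err = posix_fallocate(fd, 0, size);
	if (err != EINVAL && err != EOPNOTSUPP)
		return (err);

	/*
	 * Assumption: the error means the file system cannot preallocate.
	 * Fall back to extending the file without allocating blocks, so
	 * that the expected size can at least be mmap()ed. Never shrink.
	 */
	if (fstat(fd, &st) != 0)
		return (errno);
	if (st.st_size >= size)
		return (0);
	if (ftruncate(fd, size) != 0)
		return (errno);
	return (0);
}
```

[Unlike the write-a-byte-per-block fallback, this makes no attempt to consume space, so a later write can still fail with ENOSPC; on a copy-on-write file system that may be true regardless.]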
19:41:39 Perhaps that's true! I'm not sure.
19:41:54 Compression would certainly change things and probably make it a wash if it was indeed all zeroes
19:41:55 jclulow: fwiw, the 2nd comment alan just linked is exactly what I just DM'd you surprised nobody made the ZFS team do :)
19:42:39 I suspect the difference between writing a single zero byte vs. a whole zero block is mainly copyin() performance for the system call, and bytes transmitted for the NFS case, while producing the same underlying changes on disk
19:42:51 Right
20:07:33 Just to capture this, I've opened https://www.illumos.org/issues/16887
20:07:34 → FEATURE 16887: Status of posix_fallocate (New)
20:11:19 thanks
23:22:38 I've filed #16888 (fenix?) so I don't lose it. If anyone has battled this and won with a better workaround than /* begin/end cstyled */, I'd love to hear.
23:22:39 BUG 16888: cstyle(1ONBLD) can't handle C11 static assertions with continuation lines (New)
23:22:39 ↳ https://www.illumos.org/issues/16888
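[Editor's sketch, not part of the log: the workaround mentioned at 23:22:38 presumably looks something like the following, with the multi-line static assertion wrapped in cstyle's BEGIN/END CSTYLED comments so the checker skips it. The structure and its field names are hypothetical and exist only to give the assertion something to check.]

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, for illustration only. */
typedef struct example_hdr {
	uint32_t eh_magic;
	uint32_t eh_len;
} example_hdr_t;

/*
 * A C11 static assertion that needs a continuation line. The comment
 * markers ask cstyle to skip everything between them.
 */
/* BEGIN CSTYLED */
_Static_assert(sizeof (example_hdr_t) == 2 * sizeof (uint32_t),
    "example_hdr_t must not contain padding");
/* END CSTYLED */
```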