02:59:04 hello all, I hit an issue on an old machine (8 disks in hardware RAID5): when the platform image is 2015xxxx, `vmadm create` is very fast and the VM creation finishes almost immediately; on the same machine, after upgrading SmartOS to a recent release, `vmadm create` times out after about 5 minutes. What's the difference?
02:59:31 are there any big changes to the vmadm creation procedure?
02:59:37 thanks in advance
03:28:22 What does the time-out error message say? Also, are there any other differences between the old and new machines save the platform revision?
04:31:16 tozhu: there is no "triton-exporter" or any such thing, but many components can be polled on a metrics port by prometheus
05:15:10 danmcd: thank you, I'll check the error message
05:15:23 neuroserve: thanks for the advice
14:28:13 tozhu: also, if you can, share the vm manifest json you are using with `vmadm create`
14:49:33 jperkin, it looks like the new release of roundcube moved two of the three skins (classic, larry) into a separate git repo, and thus they're no longer installed. Any chance those could be re-included?
15:19:58 sure
15:37:47 https://github.com/TritonDataCenter/pkgsrc/issues/343
16:33:54 Hi, is CMON's mem_agg_usage metric supposed to be accurate for bhyve?
16:35:35 I'm seeing a discrepancy where mem_agg_usage reports much less (4.4 MB) than `free -m` does (2,173 MB)
17:05:24 blackwood821: No, bhyve's memory is allocated in a way that cmon can't see properly, but that's also true for KVM.
17:05:47 With KVM you see the entire allocation as fully used, regardless of actual memory usage in the guest.
17:06:19 With bhyve you don't see any of it allocated, even though it's hard-allocated similarly to the way KVM is.
17:06:37 (but it's different under the hood, which is why there's different behavior)
17:25:50 bahamat: OK, so basically CMON can't be used for monitoring HVM memory and disk?
17:27:28 disk usage on bhyve is accurate if the guest supports TRIM
17:39:05 blackwood821: you need node_exporter (or something like that) inside the vm
17:41:47 neuroserve: OK, that's what I figured, thanks
17:45:17 bahamat: Is that only true for the zone root, since zones/:vm_uuid is a filesystem but additional disks such as zones/:vm_uuid/disk0, zones/:vm_uuid/disk1, etc. are volumes?
17:49:53 No.
17:52:17 The zone root contains all volumes.
18:02:55 bahamat: Hmm, we ran into issues with that because the ZFS "used" property shows that the entire volume is used
18:04:59 So even though the zone root filesystem contains all the volumes, we didn't see a good way to find out via CMON how much of the volumes was actually used by the guest
18:10:35 Use zfs_logicalused instead
18:14:53 bahamat: I tried that, but it doesn't seem to work well when ZFS compression is enabled
18:16:11 I wonder if zfs_logicalused / zfs_compressratio would work
18:40:28 Looks like that works well for bhyve, but on a test native zone on an encrypted zpool, zfs_logicalused / zfs_compressratio is roughly half a gig less than zfs_used
19:16:06 Any idea why that would be and how to account for it?
19:17:19 I just created a 1 GB file in the zone root and then another 1 GB file in zones/$(zonename)/data/appdata, and the difference between zfs_used and (zfs_logicalused / zfs_compressratio) went from 0.57 GB to 0.56 GB
19:18:01 Are they empty files?
19:18:52 The compressratio is across all data in that dataset. Trying to compare that for individual files isn't going to give you the results you want.
19:19:51 They contain random bytes. I created them via `dd if=/dev/urandom of=sample.txt bs=64M count=16`
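The estimate being discussed can be tracked over time rather than eyeballed. A minimal sketch of Prometheus recording rules, assuming the CMON metric names used in this conversation (zfs_used, zfs_logicalused, zfs_compressratio); the group and rule names here are made up for illustration:

```yaml
# Recording rules following the 18:16:11 idea. Assumes these series
# come from CMON with one series per zone/VM instance.
groups:
  - name: zfs-derived
    rules:
      # Compression-adjusted estimate of the space the guest has
      # actually written into its datasets/volumes.
      - record: zone:zfs_estimated_used:bytes
        expr: zfs_logicalused / zfs_compressratio
      # Gap between raw ZFS accounting and the estimate; this is the
      # roughly-half-a-gig slack observed on the encrypted pool above.
      - record: zone:zfs_used_slack:bytes
        expr: zfs_used - (zfs_logicalused / zfs_compressratio)
```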
19:20:33 My main goal is to have a prometheus alertmanager rule I can rely on to alert me when zone/VM disk usage reaches a certain threshold
19:21:46 Using (zfs_logicalused / zfs_compressratio) in comparison to zfs_quota seems the most flexible, since it accounts for ZFS compression and ZFS volumes (HVM), but I'm curious why the result differs from zfs_used on a native SmartOS zone on a CN with ZFS encryption enabled
19:24:25 When analyzing the metrics for a bhyve VM on an encrypted CN, I found that (zfs_logicalused / zfs_compressratio) was much closer to zfs_used than it is for a native SmartOS zone
19:24:41 blackwood821: So you've got multiple things at play here.
19:24:58 1. when will your guest start experiencing issues because the filesystem is too full?
19:25:14 2. when will the hypervisor start denying writes due to being over quota?
19:27:41 Makes sense. I'd like to have a rule that alerts me before either of those scenarios ;-)
19:37:03 For HVM, it's also a good idea to have in-guest metrics surfaced. The hypervisor can't always see everything in the guest.
19:40:39 bahamat: Understood, I might look into adding metrics in the guest
19:41:28 But do you have any idea what the discrepancy between (zfs_logicalused / zfs_compressratio) and zfs_used would be for native SmartOS with compression?
19:42:02 That's explained in the zfs man page.
19:57:56 I read through those properties in the man page but don't see anything that would explain a difference between SmartOS and HVM
19:58:07 bahamat: Does CMON provide any metrics for the brand?
19:58:44 Not a metric, but it's in the discovery
19:59:02 You can use relabeling to add that label to all metrics for that instance.
20:01:10 Hmm, I'll check that out then. Using zfs_used / zfs_quota for native SmartOS and (zfs_logicalused / zfs_compressratio) / zfs_quota for HVM sounds safest
20:40:37 Because even if I end up relying on in-guest metrics for HVM, I would still need my disk CMON rule to exclude HVM, so having a label for the brand would be handy
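Putting the pieces together: a minimal sketch of the relabeling and the two alert rules settled on at 20:01:10, assuming Prometheus's `triton_sd_config` service discovery (whose metadata includes `__meta_triton_machine_brand`) and the CMON metric names above. The hostnames, account placeholder, brand values, and 90% threshold are illustrative:

```yaml
# prometheus.yml (fragment): scrape CMON via Triton service discovery
# and copy the machine brand onto every metric, per 19:59:02.
scrape_configs:
  - job_name: cmon
    scheme: https
    triton_sd_configs:
      - account: myaccount            # illustrative placeholder
        dns_suffix: cmon.example.com  # illustrative placeholder
        endpoint: cmon.example.com    # illustrative placeholder
    relabel_configs:
      - source_labels: [__meta_triton_machine_brand]
        target_label: brand
---
# rules.yml (fragment): raw used/quota for native zones, the
# compression-adjusted estimate for HVM, each excluding the other via
# the brand label, per 20:01:10 and 20:40:37.
groups:
  - name: zone-disk
    rules:
      - alert: NativeZoneDiskUsageHigh
        expr: zfs_used{brand!~"bhyve|kvm"} / zfs_quota > 0.9
        for: 10m
      - alert: HvmDiskUsageHigh
        expr: (zfs_logicalused{brand=~"bhyve|kvm"} / zfs_compressratio) / zfs_quota > 0.9
        for: 10m
```

Filtering on the brand label on only one side of each division still works because PromQL's vector matching pairs series by their full label sets, so unmatched zfs_quota series simply drop out.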