-
tozhu
hello all, I hit an issue on an old machine (8 disks under hardware RAID5): when the platform image is 2015xxxx, vmadm create <vm> is very fast and the VM creation finishes almost immediately; on the same machine with SmartOS changed to a recent release, vmadm create <vm> times out after about 5 minutes. What's the difference?
-
tozhu
are there any big changes to the vmadm creation procedure?
-
tozhu
thanks in advance
-
danmcd
What does the time-out error message say? Also, are there any other differences between the old and new machines besides the platform revision?
-
neuroserve
tozhu: there is no "triton-exporter" or any such thing, but many components can be polled on a metrics port by Prometheus
-
tozhu
danmcd: thank you, I’ll check the error message
-
tozhu
neuroserve: thanks for the advice
-
papertigers
tozhu: also, could you share the VM manifest JSON you are using with vmadm create?
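For context, a minimal sketch of what such a manifest and invocation might look like for a native (joyent-brand) zone; the image_uuid and sizes here are placeholders, not values from this conversation:

```
# Hypothetical minimal manifest; image_uuid must be an image
# already imported with imgadm (placeholder UUID shown here).
cat > vm.json <<'EOF'
{
  "brand": "joyent",
  "image_uuid": "00000000-0000-0000-0000-000000000000",
  "alias": "test0",
  "max_physical_memory": 512,
  "quota": 10
}
EOF
vmadm create -f vm.json
```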
-
pjustice
jperkin, it looks like the new release of roundcube moved two of the three skins (classic, larry) into a separate git repo, and thus they're no longer installed. Any chance those could be re-included?
-
jperkin
sure
-
blackwood821
Hi, is CMON's mem_agg_usage metric supposed to be accurate for bhyve?
-
blackwood821
I'm seeing a discrepancy where mem_agg_usage is reporting much less (4.4 MB) than `free -m` (2,173 MB)
-
bahamat
blackwood821: No, bhyve's memory is allocated in a way that cmon can't see it properly, but that's also true for KVM.
-
bahamat
With KVM you see whatever the allocation is as fully used, regardless of actual memory usage in the guest.
-
bahamat
With bhyve you don't see any of it allocated, even though it's hard-allocated similarly to the way KVM is.
-
bahamat
(but it's different under the hood, which is why there's different behavior)
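For what it's worth, you can see the host-side view directly with the per-zone memory_cap kstats, which (an assumption here) is roughly where host-side tooling such as the cmon-agent gets its numbers; a quick sketch on the compute node:

```
# Per-zone memory kstats as seen from the global zone (bytes).
# For a bhyve zone the guest's wired memory may not show up in rss,
# which would match the tiny mem_agg_usage value seen above.
kstat -m memory_cap | egrep 'zonename|rss|physcap'
```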
-
blackwood821
bahamat: Ok so basically CMON can't be used for monitoring HVM memory and disk?
-
bahamat
disk usage on bhyve is accurate if the guest supports TRIM
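To make sure TRIM is actually happening, something like this inside a Linux guest should work (a sketch; whether discard is surfaced depends on how the virtual disk is configured):

```
# Check whether the virtual disk advertises discard support
lsblk --discard
# Trim all mounted filesystems that support it
sudo fstrim -av
```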
-
neuroserve
blackwood821: you need node_exporter (or something like that) inside the VM
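A minimal sketch of doing that in a Linux guest; the version and URL below are illustrative, so check the node_exporter releases page for current ones:

```
# Inside the guest: fetch, unpack, and run node_exporter,
# then scrape <guest-ip>:9100/metrics from Prometheus.
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xzf node_exporter-1.8.2.linux-amd64.tar.gz
./node_exporter-1.8.2.linux-amd64/node_exporter --web.listen-address=':9100'
```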
-
blackwood821
neuroserve: Ok, that's what I figured, thanks
-
blackwood821
bahamat: Is that only true for the zone root since zones/:vm_uuid is a filesystem but additional disks such as zones/:vm_uuid/disk0, zones/:vm_uuid/disk1, etc. are volumes?
-
bahamat
No.
-
bahamat
The zone root contains all volumes
-
blackwood821
bahamat: Hmm, we ran into issues with that because the ZFS "used" property shows that the entire volume is used
-
blackwood821
So even though the zone root filesystem contains all the volumes, we didn't see a good way to find out how much of the volumes were actually used up by the guest via CMON
-
bahamat
Use zfs_logicalused instead
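These map to standard ZFS properties, which you can sanity-check on the compute node (the dataset name is a placeholder):

```
# used        = space consumed after compression
# logicalused = space as seen by the consumer, before compression
zfs get used,logicalused,compressratio,quota zones/<vm_uuid>
```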
-
blackwood821
bahamat: I tried that but it doesn't seem to work well when ZFS compression is enabled
-
blackwood821
I wonder if zfs_logicalused / zfs_compressratio would work
-
blackwood821
Looks like that works well for bhyve, but on a test native zone on an encrypted zpool, zfs_logicalused / zfs_compressratio is roughly half a gig less than zfs_used
-
blackwood821
Any idea why that would be and how to account for that?
-
blackwood821
I just created a 1 GB file in the zone root and then another 1 GB file in zones/$(zonename)/data/appdata, and the difference between zfs_used and (zfs_logicalused / zfs_compressratio) went from 0.57 GB to 0.56 GB
-
bahamat
Are they empty files?
-
bahamat
The compressratio is across all data in that dataset. Trying to compare that for individual files isn't going to give you the results you want.
-
blackwood821
They contain random bytes. I created them via `dd if=/dev/urandom of=sample.txt bs=64M count=16`
-
blackwood821
My main goal is to have a Prometheus Alertmanager rule that I can rely on to alert me when zone/VM disk usage reaches a certain threshold
-
blackwood821
Using (zfs_logicalused / zfs_compressratio) in comparison to zfs_quota seems the most flexible since it accounts for ZFS compression and ZFS volumes (HVM), but I'm curious why the result differs from zfs_used on a native SmartOS zone on a CN with ZFS encryption enabled
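As a sketch, a rule built on that expression might look like this, using the metric names from this discussion; the 85% threshold, rule name, and duration are made-up values to adjust:

```
# zone-disk.rules.yml (hypothetical Prometheus rule file)
groups:
  - name: zone-disk
    rules:
      - alert: ZoneDiskNearQuota
        # estimated physical usage vs. quota, compensating for compression
        expr: (zfs_logicalused / zfs_compressratio) / zfs_quota > 0.85
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} is above 85% of its ZFS quota"
```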
-
blackwood821
When analyzing the metrics for a bhyve VM on an encrypted CN, I found that (zfs_logicalused / zfs_compressratio) was much closer to zfs_used than it is for a native SmartOS zone
-
bahamat
blackwood821: So you've got multiple things at play here.
-
bahamat
1. when will your guest start experiencing issues because the filesystem is too full?
-
bahamat
2. when will the hypervisor start denying writes due to being over quota?
-
blackwood821
Makes sense. I'd like to have a rule that alerts me before either of those scenarios ;-)
-
bahamat
For HVM, it's also a good idea to have in-guest metrics surfaced. The hypervisor can't always see everything in the guest.
-
blackwood821
bahamat: Understood, I might look into adding metrics in the guest
-
blackwood821
But do you have any idea what the discrepancy between (zfs_logicalused / zfs_compressratio) and zfs_used would be for native SmartOS with compression?
-
bahamat
That's explained in the zfs man page.
-
blackwood821
Read through those properties in the man page but don't see anything that would indicate a difference between SmartOS and HVM
-
blackwood821
bahamat: Does CMON provide any metrics for the brand?
-
bahamat
Not a metric, but it's in the discovery
-
bahamat
You can use relabeling to add that label to all metrics for that instance.
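If you're scraping CMON via Prometheus's built-in Triton service discovery, a relabel rule along these lines should do it; this assumes the __meta_triton_machine_brand meta label that triton_sd_configs exposes:

```
# prometheus.yml (fragment, inside the CMON scrape job)
relabel_configs:
  # copy the discovered brand (joyent, bhyve, kvm, lx, ...) onto
  # every metric scraped from that instance
  - source_labels: [__meta_triton_machine_brand]
    target_label: brand
```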
-
blackwood821
Hmm, I'll check that out then. Using zfs_used / zfs_quota for native SmartOS and (zfs_logicalused / zfs_compressratio) / zfs_quota for HVM sounds safest
-
blackwood821
Because even if I end up relying on metrics in the guest for HVM, I would still need my CMON disk rule to exclude HVM, so having a label for the brand would be handy
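With that brand label in place, the split might look something like this in PromQL (a sketch; the brand values and the 85% threshold are assumptions):

```
# native zones: on-disk usage vs. quota
zfs_used{brand!~"bhyve|kvm"} / zfs_quota{brand!~"bhyve|kvm"} > 0.85

# HVM zones: estimate physical usage from logical usage and compressratio
(zfs_logicalused{brand=~"bhyve|kvm"} / zfs_compressratio{brand=~"bhyve|kvm"})
  / zfs_quota{brand=~"bhyve|kvm"} > 0.85
```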