04:57:24 Did it work @neirac?
14:03:06 danmcd doing it now; I lost power last night, so resuming the build now
14:04:03 danmcd there is a problem with pkgsrc. I just installed the latest SmartOS ISO to a VM, then after installing python311 to create zones with ansible, pkgin is broken: ld.so.1: pkgin: fatal: libssl.so.1.1: open failed: No such file or directory
14:05:08 neirac, just reinstall the bootstrap
14:17:36 CmdLnKid thanks!
14:25:16 welcome
14:27:57 iirc atm there is a pkgsetup script that comes with smartos that may help you too
14:28:13 don't have access to the machine at the moment
14:28:54 found it by pkg[tab][tab]
15:05:03 hello all, I have a simple question about replacing a 'BAD' disk. I have two spare disks in my pool; the status currently shows one spare disk as 'INUSE'. How can I replace the 'BAD' disk safely? Thanks for any advice; here is the status: https://pastebin.com/Q4AJpYQ8
15:09:35 danmcd build worked perfectly today; root cause should have been that my smartos-live repo was too old
15:30:30 tozhu maybe this helps https://docs.tritondatacenter.com/private-cloud/troubleshooting/disk-replacement
15:31:26 neirac: Thank you very much, I'll take a look
15:32:36 by the way, are there any docs on how to build a Triton image from source? I have a new machine and would like to try building Triton from source
15:44:37 neirac: Phew. :)
15:44:50 I've had enough explode on me this week.
15:44:57 one thing I've been wanting to do is provide the option where (auto) replacement promotes the spare to a regular disk and the swapped disk is added as a spare (assuming the disk being swapped has had a spare kicked in)
15:45:12 instead of reverting the used spare back to a spare
15:45:41 so you don't have to do yet another resilver
15:53:24 jbk: big thanks. According to the instructions, all of the disks are ONLINE and there is no failed disk, so why did the spare disk become part of mirror-4?
15:55:36 did you zpool clear?
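The "reinstall the bootstrap" fix for the broken pkgin above looks roughly like this when run in the global zone; the bootstrap tarball name and quarter here are examples, so check pkgsrc.smartos.org for the current one before running:

```shell
# Re-extract the pkgsrc bootstrap over /opt/local to restore pkgin and its
# libraries (example filename; verify the current tarball on pkgsrc.smartos.org).
BOOTSTRAP_TAR=bootstrap-trunk-x86_64-20240116.tar.gz
curl -O "https://pkgsrc.smartos.org/packages/SmartOS/bootstrap/${BOOTSTRAP_TAR}"
tar -zxpf "${BOOTSTRAP_TAR}" -C /
pkgin -y update   # refresh the package database with the newly restored pkgin
```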
that'd lose any checksum errors
15:55:47 but fmadm faulty might show
15:55:50 or fmdump -e
15:56:37 (fmdump -e shows all the events; fmadm faulty should show if the system determined the events suggest there's a problem)
16:05:18 let me check the status
16:06:49 I have not run 'zpool clear'
16:08:03 when I run 'fmdump -e', it shows 'Nov 17 14:53:22.1049 ereport.io.scsi.cmd.disk.dev.uderr'; where can I check the detailed error for the device?
16:08:22 either add -v or -V (can't remember which offhand)
16:08:28 to fmdump
16:08:48 it should show the sense key and asc/ascq values
16:15:00 got the detailed values via fmdump -e -V, here: https://pastebin.com/2MdRJ35n
16:16:48 could I repair the status by rebooting the system?
16:20:20 you can just remove the spare if the existing disk is online
16:20:52 it looks like that disk (is it connected via usb?) is having issues with reporting its write cache status
16:26:39 thanks for the advice; all of the disks are SAS disks, not USB (the boot disk is a USB flash disk)
16:27:45 those errors are probably then complaining about checking the write cache status on the boot disk
16:28:18 you could take the cXtYdZ name of the boot disk and look at what the symlink in /dev/dsk points to..
16:28:21 if you want to be sure
16:28:40 but the device path for those suggests usb (due to 'hub@1' being in the path)
16:32:56 okay, thank you very much
16:39:32 I'm not sure if this is a bug: if all disks are 'TOSHIBA', the display is correct and the 'TYPE' column shows 'SCSI'. I replaced a BAD disk with a 'SEAGATE', so there are 'SEAGATE' and 'TOSHIBA' disks on the same SAS port, and now the 'TYPE' column displays '-' for the 'TOSHIBA' disks. All disks are SAS
16:39:44 here is the screen output https://pastebin.com/yAypLm8w
16:41:43 here is another machine's screen output https://pastebin.com/GVW8bdd4
16:50:22 jbk, should I remove 'c1t500003970852BA8Ed0' or 'c1t5000039708526F5Ed0' in this case? I'm a little confused about which one should be 'removed'
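The diagnosis steps suggested above, checking FMA telemetry and then confirming whether a cXtYdZ name is actually a USB device, look roughly like this (c1t0d0s0 is a placeholder disk name):

```shell
fmadm faulty      # list faults the diagnosis engine has actually flagged
fmdump -e         # one-line summary of all error telemetry (ereports)
fmdump -eV        # verbose ereports: includes sense key and asc/ascq for SCSI errors

# Map a disk name to its physical device path; 'hub@1' in the symlink
# target indicates a USB-attached device (c1t0d0s0 is a placeholder).
ls -l /dev/dsk/c1t0d0s0
```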
thanks in advance
16:57:33 and a last question: I have 10x U.2 NVMe disks and want the best balance between 'reliability', 'capacity', and 'disk lifetime'. What RAID level is advised? Are a spare disk and log disk advised?
16:58:08 Ultimately it depends on the performance and capacity tradeoffs you want to hit.
17:19:09 rmustacc, I want to get performance and reliability
17:19:42 are mirrors advised? Do I need to create a slog and a spare disk for this case?
17:20:33 neirac: Which image did this happen to you with?
17:20:50 and I have a disk I can't remove or repair, and I don't know how to fix it; any idea/advice? Here are the details https://pastebin.com/Wmt5W7tf thank you very much
17:22:35 I prefer mirrors, but I'm not sure if a slog disk and spare disk are needed if all the disks are U.2. Someone told me a bigger size means a longer life for an SSD
17:23:27 tozhu: If you have all SSDs there's not much point in having a dedicated log device.
17:24:11 if long life requires big size, then raidz2 is better than mirrors, but mirrors are traditionally considered more reliable, so I'm confused
17:24:24 There's always a ZIL. If you have HDDs then you can get some benefit by having a dedicated SSD log device. But if it's all SSD already, a dedicated log device doesn't get you anything.
17:25:02 But you should *always* have spares if you're doing anything serious. Even with mirrors.
17:26:17 bahamat, rmustacc thank you very much
17:26:19 Is there a video / doc on how to build triton service images?
17:26:34 I think I remember there being a "build smartos" thursday talk
17:27:04 Smithx10: I think the office hours was recorded.
17:27:18 Do you by chance know the URL to the video?
17:28:25 would you please advise how to repair this?
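As a rough illustration of the capacity side of the mirror-vs-raidz2 tradeoff discussed above, for 10 equal disks (the 3.84 TB size is a made-up example): striped 2-way mirrors yield n/2 disks of usable space and survive one failure per mirror pair, while a single raidz2 vdev yields n-2 disks of usable space and survives any two failures:

```shell
# Usable raw capacity for 10 equal disks (example size of 3.84 TB each):
# 5 x 2-way mirrors vs one 10-wide raidz2 vdev.
awk 'BEGIN {
  d = 3.84; n = 10
  printf "mirror: %.2f TB usable, 1 failure per pair\n", (n / 2) * d
  printf "raidz2: %.2f TB usable, any 2 failures\n", (n - 2) * d
}'
# mirror: 19.20 TB usable, 1 failure per pair
# raidz2: 30.72 TB usable, any 2 failures
```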
https://pastebin.com/Wmt5W7tf I have tried to remove the disk, but it reports the pool is busy, so I don't know what the next step is
17:28:45 big thanks :-)
17:28:52 Smithx10: No, but I can check
17:30:56 tozhu: I think this might be your clue: removal may already be in progress
17:32:47 if so, how do I check the removal progress?
17:33:13 I'm not sure. You might be able to get something out of zdb or mdb.
17:35:14 but running the 'zpool remove -s zones' command reports: cannot cancel removal: operation is not in progress
17:38:33 try zpool detach zones c1t5000039708526F5Ed0 (assuming the zpool status output hasn't changed since your pastebin)
17:39:35 jbk, thanks for the advice, I'll try it, thank you
17:56:12 jbk, it works, thank you very much for the help
17:58:36 @bahamat looks like they are on youtube https://www.youtube.com/@TritonDataCenter
17:58:41 :)
17:59:11 I figured something like that
18:00:16 np
18:02:40 bahamat: is the jenkins that does all these builds public?
18:03:32 Smithx10: The hvm/lx images aren't built in Jenkins yet, but our Jenkins is public, yes.
18:04:10 nice, I'm trying to get a build pipeline set up internally for sdc-service images so we can experiment in our test DC before nagging folks.
18:04:39 Was hoping to maybe expand the sdc-cloudapi plugins to cover more of the api
18:05:03 I'm not against it.
18:05:37 The plugins used to be able to modify anything. We'd pass in the entire JSON object and let them do whatever before sending it for provisioning.
18:07:10 I think it was rearchitected for docker
18:07:46 Yeah, having our own plugin and keeping some of the custom behavior there is easier to maintain vs keeping a fork of cloudapi
18:08:12 what do you want to expose?
18:14:23 bahamat, just downloaded yesterday's latest iso and installed today
18:16:09 neirac: So you're saying that it's pkgsrc-tools that's broken?
18:18:26 bahamat what I did was to install python311; after that pkgin was broken
18:19:08 in the global zone?
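The fix that eventually worked here, returning the in-use spare to the spare list once the original disk is healthy again, is a plain zpool detach of the spare copy (pool and device names taken from the discussion above):

```shell
zpool status zones                         # confirm the spare is still INUSE
zpool detach zones c1t5000039708526F5Ed0   # detach the spare half of the mirror
zpool status zones                         # spare should now show as AVAIL
```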
18:20:35 bahamat correct, in the gz
18:32:08 Smithx10: if you've done the build process for Triton from source, please record the steps in a doc and share it. Thank you :-)
19:36:33 bahamat: is there a jenkins file or github action for sdc-cloudapi that I can reference to see what image / build zone image we use?
20:08:17 https://github.com/TritonDataCenter/sdc-cloudapi/blob/master/Jenkinsfile :) Looks like it uses triton-origin-x86_64-21.4.0 master-20220322T012137Z-g9382491
20:17:39 https://github.com/TritonDataCenter/triton/blob/master/docs/developer-guide/building.md
20:36:33 yeah, a newer pkgsrc bootstrap will include the latest pkgin, which has a boatload of fixes for upgrade scenarios
20:37:17 Tried a make all and node-gyp hit: import sys; print "%s.%s.%s" % sys.version_info[:3];
20:37:27 Quick guess is the python version?
20:38:13 https://gist.github.com/Smithx10/76b2d82ce1e04cc0fa948140f694c92b
20:48:33 Yeah, you'll need python2 for node-gyp
20:55:17 Is triton-origin-x86_64-21.4.0 master-20220322T012137Z-g9382491 a build zone? It looks like it's missing a few things.
21:01:01 https://gist.github.com/Smithx10/c7cd2418e6074606abb8093f4af1c058
21:01:29 Should I use this old version? https://pkgsrc.joyent.com/packages/SmartOS/bootstrap//bootstrap-2019Q4-tools.tar.gz
21:17:59 Were you just trying to test a change to cloudapi or set up something close to our Jenkins setup?
21:21:27 similar to your jenkins; I looked through the groovy lib, we use gitlab
21:21:50 I think I got a make all to run cleanly; trying to find where the image + manifest are output to
21:28:30 jperkin, neirac: OS-8500
21:28:31 https://smartos.org/bugview/OS-8500
21:29:29 Smithx10: We have special jenkins-agent images on updates.tritondatacenter.com
21:29:45 Smithx10: From your headnode: updates-imgadm list name=~jenkins
21:30:11 nice
21:32:05 Smithx10: We use the GitHub Branch Source Jenkins plugin to hook into our GH org and auto import everything.
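One way to handle the node-gyp/python2 issue mentioned above: node-gyp honors a PYTHON environment variable and a --python flag, so it can be pointed at a python2 interpreter explicitly (the /opt/local path is an example for a pkgsrc install; adjust to wherever python2 lives on your build zone):

```shell
# Point node-gyp at python2 for the whole build (path is an example).
export PYTHON=/opt/local/bin/python2.7
# ...or per invocation:
node-gyp rebuild --python=/opt/local/bin/python2.7
```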
21:56:53 make all worked without issue on the latest jenkins image. Will that create artifacts like the image / image json / tar.gz?
21:57:37 Smithx10: This is what most builds use: https://github.com/TritonDataCenter/jenkins-joylib/blob/master/vars/joyBuildImageAndUpload.groovy#L20
21:57:46 when run from Jenkins.
21:58:03 The `bits-upload` part will want to upload it to manta, so you'll want everything except that.
21:58:15 unless I override the env vars for MANTA_URL etc, right?
21:58:32 Some builds don't use BuildImageAndUpload, so you'll need to check the Jenkinsfile for what they do instead.
21:58:38 gotcha
21:59:15 Correct, if you supply your own manta environment variables then it will go to whatever you've configured instead.
22:00:18 Thank You
22:11:27 Got stuck on pkg_add https://gist.github.com/Smithx10/b5a5fab267f226d2445fdcd024ea963b
22:15:42 chroot resolvers :( our company blocks 8.8.8.8 lol
22:16:38 is that hard coded into the chroot?
22:16:53 not sure, I'm grepping now to see if I can find an override
22:19:30 https://gist.github.com/Smithx10/babe8d68d0c3ec969c4b828cbd01672a
22:19:46 yea
23:00:28 bahamat: I'm trying to edit buildimage to copy over my resolv.conf but it doesn't seem like changes to that node file are taking effect
23:01:19 Confusingly, there are multiple copies of buildimage; you need to make sure you're using the right one.
23:01:33 ahhh ok
23:01:33 Or possibly all of them.
23:05:50 bingo, /opt/tools/bin/buildimage
23:45:37 success