01:13:39 jbk: I use the in kernel smb for apple timemachine fwiw. Although I can't remember why I ended up using samba to share music to my sonos speaker. I think I couldn't get guest access without a password working
01:13:46 could have been total user error though
01:34:38 yeah.. i have a nas zone on my smartos box that i use for that and it works pretty well..
02:27:52 papertigers: well, older sonos can only do SMB1 - maybe that was it?
03:51:38 o/
04:03:42 Hey
04:18:26 NCommander: So, I think you have to run this suite via "gmake test"
04:18:50 There is at least one critical bug if you run it directly, which is that THIS_SH doesn't get set in the environment until after it's used the first time
04:24:41 Some of these failures are just because they are using essentially unstable output of programs like od(1)
04:26:38 e.g., for us: "printf '\n' | od -t a" prints "0000000 lf", but apparently GNU od prints "0000000 nl"
04:52:31 how od(d)
04:52:32 :P
04:55:05 There are two pieces of delta that don't otherwise seem to be related to the fact that we don't have some of the less common locales: run-test and run-errors. The run-test thing seems to be a test that's supposed to fail, the "enable sh bash" line
04:55:58 It seems like it happens with an error message structure it was not necessarily expecting
04:57:07 err rather I guess that's the run-errors thing
05:03:54 I'm not sure about the run-test thing, but it's a bit odd. It seems like the test expects that "exec 6>&-" will create file descriptor 6, and it will apparently be a pipe
05:04:09 but ...
I don't see that fd open on the process after using that syntax
05:04:19 Some more digging there required probably
05:04:45 NCommander: I don't see any evidence of bash crashing though, FWIW
05:05:03 (this is OmniOS r151046ac)
05:05:12 oh I was trying to figure out what tab this was coming from
05:06:12 jclulow: that might be environment specific
05:06:22 Give me a moment, I forgot I had this open in the background
05:06:28 ok
05:07:29 bash manpage: "the redirection operator
05:07:30 [n]>&digit-
05:07:30 moves the file descriptor digit to file descriptor n, or the standard
05:07:30 output (file descriptor 1) if n is not specified.
05:07:30 "
05:07:31 I shut down the VM to export it, so I have to boot everything back up
05:07:46 so it's some sort of fd juggle
05:08:35 Ah, gmake check does work
05:09:02 And you're right, no core file. I saw the error about the struct issue. Too used to linux just posting something like that and not coring
05:09:41 I have a core dump of bash dying on login. I also have a few littering around.
05:10:12 FWIW, it's the "enable" built-in using dlsym() to look into itself to find "bash_struct" (to do the "enable bash" thing) and not finding it. It is _expected_ that it doesn't find it, seemingly; I'm not sure what Linux systems do there. It's totally possible the Mac OS X equivalent doesn't use dlsym() but something else.
05:10:19 Given they're Mach-O instead of ELF
05:10:57 So that error pops out there because they print it, in the code in "builtins/enable.def"
05:11:20 I think this test suite is actually in pretty good shape FWIW
05:11:39 https://gist.github.com/NCommander/ad6684898f400c2611bfb19dc3a7470e - yeah it's less broken with gmake
05:11:42 If you run it with the GNU stuff in PATH first, it finds the GNU od and stops whinging about that stuff; e.g., PATH=/usr/gnu/bin:$PATH gmake test
05:11:44 I legitimately forgot about that
05:12:09 The locale stuff is probably because that locale is missing
05:12:22 Well on Mastodon, the locale got brought up as a suspected cause
05:12:37 I just noted they're not installed, since perl was complaining about it when I was trying to build arm64-gate
05:12:38 Yeah I made that suggestion wrt. the segfault/core file stuff
05:13:11 I am uploading the VHDs to Google right now, but they're 72 GiB
05:13:18 So it's taking a while
05:13:27 But because you're ostensibly using en_US.UTF-8 and a stock install, I am less sure it's locale related haha
05:13:48 I've crawled around in bash's internals before
05:13:57 It is not my favourite neighbourhood
05:14:06 (I dunno if you've seen my YouTube channel, I had an adventure with it on AIX)
05:14:13 I have not!
05:14:29 the long story short: old AIX defines strtoimax, but doesn't include it in the headers. It's also buggy
05:14:41 This does *really fun things*, cause a configure check in bash was broken and uh
05:15:12 well needless to say I filed a few bugs on the matter :)
05:15:19 neat
05:15:28 bugs filed is always good haha
05:15:59 Yeah, this is actually me following up on trying to build illumos for arm64, which is largely how I got here.
05:17:45 The stack traceback to me tells me it's dying somewhere in readline
05:18:02 Bash has its own built in version that it uses if the system one isn't present, but I think it's linked to readline
05:19:10 presumably "ldd /usr/bin/bash" includes /usr/lib/64/libreadline.so.8 in there
05:19:38         libreadline.so.8 => /usr/lib/64/libreadline.so.8
05:19:42 yeah
05:19:42 We have ourselves a winner
05:21:09 Readline doesn't have a test suite :/
05:21:25 If you "pargs -e " can you gist the result
05:21:26 Although there are some compiler warnings
05:22:05 https://gist.github.com/NCommander/7ce54489f9dc26f3541fbc0a4f7d93f6
05:22:24 ok
05:22:30 LC_CTYPE=UTF-8 is probably the issue
05:23:09 ncmdr@rosetta2:~/bash-5.2.21$ cat configure.log | grep -i nls
05:23:10 checking whether NLS is requested... yes
05:23:10 checking whether to use NLS... yes
05:23:17 The crash is in readline though
05:23:28 Hold on, let me make a new account which has bash as its shell
05:23:31 so I can test different scenarios
05:24:09 OK I can reproduce it
05:24:21 with... LC_CTYPE=UTF-8 /usr/bin/bash
05:25:00 It is somewhat unconscionable that it handles SIGSEGV haha
05:25:06 ncmdr@rosetta2:~/bash-5.2.21$ LC_CTYPE=UTF-8 /usr/bin/bash -c 'echo HERE'
05:25:07 UTF-8: unknown locale
05:25:07 HERE
05:25:08 ncmdr@rosetta2:~/bash-5.2.21$
05:25:23 It's not crashing here
05:25:43 you're not doing the same thing I was doing
05:25:50 It also does not crash if I pass it a -c argument
05:26:00 with no arguments, it attempts to start an interactive shell
05:26:04 which is when readline comes in
05:26:19 that also worked
05:26:23 I tried that first
05:28:09 Well I can't reproduce the SSH login crash now. I don't think I saved a snapshot before upgrading
05:28:50 what do you get if you: pkg info readline | grep FMRI
05:30:48 on bloody?
05:30:50 er
05:30:51 sorry
05:31:08 ncmdr@rosetta2:~$ pkg info readline | grep FMRI
05:31:08              FMRI: pkg://omnios/library/readline@8:20240402T110232Z
05:31:28 ok and that's ... bloody or stable or?
05:31:59 bloody
05:32:08 ncmdr@rosetta2:~$ uname -a
05:32:08 SunOS rosetta2 5.11 omnios-master-b797611cbb i86pc i386 i86pc
05:32:09 ncmdr@rosetta2:~$
05:32:21 ok tah
05:32:51 The core file for me was generated by selecting /usr/bin/bash as the shell on the OmniOS installer
05:33:05 I could log in on the system console, but it dumped when I logged into SSH until I changed it to ksh93
05:33:30 https://gist.github.com/jclulow/3cc98adaa23ccbc3b7227db5f318331b
05:33:47 this is the stack at the point of the actual SIGSEGV
05:33:55 It has, I believe, passed NULL to strlen()
05:34:07 from _rl_init_eightbit()
05:35:32 I'm going through the code
05:35:38 to see if I can figure out where it blows its brains out
05:35:53 it is in _rl_init_locale()
05:36:04 it is explicitly looking at LC_CTYPE haha
05:36:43 well there's a big giant #if here that relates to a configure check
05:36:53 (looking at readline-8.2 source code)
05:37:14 I believe the #if is true
05:37:27 it is
05:37:34 because "_rl_init_locale::dis ! grep call" in the debugger lists some calls
05:38:40 I...
05:38:43 Well, this is where it jumps into the C library
05:38:52 am not sure you are allowed to pass NULL to setlocale() like that?!
05:39:25 oh I guess maybe you are
05:39:41 Well in the case on Linux where locales won't be installed, I think it would go down that code branch.
05:39:53 It could be a regression in libreadline ...
05:42:32 https://git.savannah.gnu.org/cgit/readline.git/log/nls.c
05:42:35 I'm looking at it in git
05:42:42 This function hasn't changed in like 10 years.
05:46:22 ok so we get "UTF-8" back from the environment lookup
05:46:34 so we are in fact passing it straight to setlocale()
05:47:04 This sounds like the C library is choking
05:47:20 Wasn't NLS one of the bits that was closed source from Sun and had to be replaced?
05:47:35 The i18n stuff
05:47:37 was
05:49:03 ls
05:49:10 this is not my terminal
05:49:18 accurate
05:51:55 I'm trying to find where the other side of that function call goes
05:53:01 https://github.com/illumos/illumos-gate/blob/d363b1b0cb9ef6d6f3febdd8d1cba46507e97098/usr/src/lib/libc/port/locale/setlocale.c#L68
05:54:32 yes I am single stepping through
05:57:18 ok so newlocale() in there returns NULL ultimately
05:57:21 because we parsed it rubbish
05:57:25 *passed
05:57:43 we are back in _rl_init_locale()
05:58:32 yeah it passes NULL to strlen() itself
05:59:03 It's in savestring()
05:59:09 in readline-8.2
05:59:25 It seems to have been inlined (blah)
05:59:40 but at the point where it calls strlen() it is about to pass the result to xmalloc()
05:59:46 which is what their savestring() does
06:00:30 So is the bug that readline passes in NULL or illumos's libc not being happy with that NULL? Cause this all looks like old code
06:01:03 The bug is that in the event that a rubbish locale is passed to setlocale(), setlocale() returns NULL because, well, it failed to set the locale
06:01:06 and then
06:01:27 readline does the wrong thing
06:01:28 wait
06:01:31 by assuming it won't be NULL
06:01:37 the manpage actually defines behavior for nullpointer
06:01:48        A null pointer for locale causes setlocale() to return a pointer to the
06:01:48        string associated with the category for the program's current locale.
06:01:49 We're not passing in NULL, we're _returning_ NULL
06:01:49        The program's locale is not changed.
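The failure described above can be sketched in C. This is a minimal illustration and not readline's actual code: copy_string() is a hypothetical stand-in for savestring(), and select_ctype_locale() is an invented name. It shows the setlocale() result being checked for NULL before it reaches strlen(); falling back to the currently active locale is just one possible choice, not necessarily what upstream would adopt.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a savestring()-style helper: duplicate a string.
 * It must never be handed NULL, because strlen(NULL) is undefined. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Hypothetical helper: try to select `wanted` for LC_CTYPE and return a
 * heap-allocated copy of the name of the locale that is in effect afterwards.
 * The caller frees the result. */
char *select_ctype_locale(const char *wanted)
{
    /* setlocale() returns NULL when it cannot honour the requested name. */
    const char *active = setlocale(LC_CTYPE, wanted);

    if (active == NULL) {
        /* The request was rejected and the locale is unchanged, so ask
         * what is currently active instead (a NULL name means "query"). */
        active = setlocale(LC_CTYPE, NULL);
    }
    if (active == NULL) {
        /* Last-resort assumption: describe the locale as "C". */
        active = "C";
    }
    return copy_string(active);
}
```

With this guard, a request such as select_ctype_locale("UTF-8") on a system that rejects that name returns the name of whatever locale was already active rather than dereferencing NULL.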
06:01:52 Oh
06:02:01 we pass in what you set in your environment for LC_CTYPE
06:02:08 i.e., "UTF-8"
06:02:10 which is not a locale
06:03:26 oh hold on
06:03:35 I think I see what happens on Linux
06:03:37 "LC_CTYPE=mr_stephens bash" also fails FWIW
06:03:48 I think on Linux, this would return 'C'
06:04:12 I suspect it depends on your C library, but that may well be true
06:05:07 at least some linux manpage suggests it could return NULL as well
06:05:15 mcasadevall@lighthouse:~$ ./test_setlocale
06:05:16 setlocale returned: C
06:05:38   printf("setlocale returned: %s\n", setlocale(LC_CTYPE, NULL));
06:05:38 It can definitely return NULL
06:05:48 well to be clear
06:05:57 setlocale(LC_CTYPE, "mrstephens")
06:06:00 not NULL
06:06:12 macOS also returns 'C'
06:07:57 linux actually returns NULL
06:08:08 ?
06:08:13 if all you call is setlocale(LC_CTYPE, "mr_stephens");
06:09:31 at least that's true on Ubuntu 22.04.3 LTS
06:09:45 so glibc, really, rather than Linux
06:10:12 yeah my test case was wrong, sorry, it's late here
06:10:37 I believe Mac OS X also returns NULL there
06:11:23 It does seem like /bin/bash on Ubuntu, at least, does _not_ link against libreadline
06:11:29 at least not dynamically
06:11:34 maybe the internal readline is less ruinous
06:11:38 It's likely using the built in one, bash can be set to /bin/sh
06:11:50 And there are some special considerations for /bin/sh and dpkg
06:11:56 sure
06:12:04 So I guess that's why it's not broken there haha
06:12:33 It's also possible that no one has actually run glibc with an entirely invalid lang environment. That code is pretty much older than dirt.
06:12:33 I can't really see how this would not segfault in the same way if you built it linked against the real libreadline -- unless of course they've patched _that_ in Ubuntu
06:12:58 like if you don't have the environmental variables sent
06:13:02 Do you know how LC_CTYPE is getting into your environment?
06:13:03 it sends NULL into setlocale()
06:13:35 As in my SSH shell or on illumos?
06:13:42 on your illumos system yeah
06:14:03 declare -x LANG="en_US.UTF-8"
06:14:17 I am not sure if it's getting that from ssh or not
06:14:25 Hold on, let me check locale if I log on the system terminal
06:14:41 we're specifically looking for, like, env | grep LC_CTYPE
06:15:04 It's not set, but it's also not crashing now ...
06:15:04 wait
06:15:06 hold on
06:15:18 LC_CTYPE=UTF-8
06:15:25 j'accuse!
06:15:28 I was using cool-retro-term because I was making a video
06:15:34 ah ha
06:15:37 The standard macOS Terminal.app doesn't cause it
06:15:41 but I wanted the fancy graphics
06:15:41 fascinating
06:15:56 I wonder why it does that!
06:16:27 mcasadevall@infinityway ~ % ssh ncmdr@110
06:16:27 (ncmdr@110) Password:
06:16:28 Last login: Thu Apr 4 05:46:02 2024 from 192.168.2.215
06:16:28 OmniOS r151049 omnios-master-b797611cbb April 2024
06:16:29 You have new mail.
06:16:29 Connection to 192.168.0.102 closed.
06:16:30 mcasadevall@infinityway ~ %
06:16:33 And that in fact reproduces the crash
06:17:02 I just checked for good measure and I can ssh into Linux VMs just fine
06:17:03 so uh
06:18:01 You should definitely fix the LC_CTYPE value that cool-retro-term is setting there, I don't believe that's ever valid
06:18:11 I didn't even notice it was broken
06:18:17 cause this was the first time I saw it go kaboom
06:18:28 Well most software probably correctly handles a return of NULL from setlocale() haha
06:18:31 except it's a valid locale
06:18:33 C.UTF_8
06:18:43 I believe that would match through locale rules
06:19:14 LC_CTYPE=UTF-8 locale on Ubuntu complains loudly
06:19:50 hold on
06:19:53 I need to figure out what's doing
06:20:02 my locale when logging into Ubuntu over SSH is being set to C.UTF-8
06:20:08 Instead of what it's coming in on SSH
06:21:39 so if I login over SSH with an invalid locale, it gets set to C.UTF-8
06:21:48 If I log in with an invalid locale to OmniOS
06:21:50
LC_CTYPE=abcdefg
06:22:28 good lord, /etc/profile.d/01-locale-fix.sh
06:23:21 Yeah ok so they've invented some training wheels in /usr/bin/locale-check haha
06:23:30 if you.. LC_CTYPE=frank /usr/bin/locale-check C.UTF-8
06:23:44 it emits LANG=C.UTF-8 LC_CTYPE=C.UTF-8
06:23:50 Why do I feel like this bug was found a decade ago and this was the fix?
06:24:11 I mean, it is on some level true that not having your locale set properly is bad
06:24:19 but it is also true that it should probably _tell you_
06:24:34 because how would you ever know that your terminal is setting it to something that is totally bogus
06:24:50 cool-retro-term should be passing C.UTF-8 there if that's what it wants
06:24:54 or tbh not changing it at all?
06:24:57 I'm pretty sure we are all intended to be coding on VT100's in a dark room ;)
06:25:04 haha
06:25:24 I do recall using Cathode.app a million years ago, before I gave up on the Mac
06:26:03 So the question is what is the actual and proper fix?
06:26:25 I guess patching readline to be less stupid and hope upstream takes it
06:27:00 Yes I believe it is a readline bug
06:27:20 https://github.com/TritonDataCenter/node-manta/issues/61#issuecomment-19653740 is the only evidence I can recall of my Cathode.app use lol
06:28:21 It's possible we could consider sanitising user locale stuff too, though that really feels like a perilous ptah
06:28:23 *path
06:28:32 I'm sure we'll break _other_ things doing stuff like that
06:29:50 Well, hold on, would you? Applications should be expected not to crash if there's a valid locale, and making sure the environment has a safeguard in case of bad terminals is not necessarily a bad thing
06:30:20 It's just a big hammer
06:30:21 Essentially, this is more making getty (or whatever the equivalent is) basically sanity check what comes in, and screech loudly if it dislikes it.
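The "sanity check what comes in, and screech loudly" idea can be sketched as a small C check that tries each locale variable against setlocale() and complains on stderr instead of silently substituting C.UTF-8. This is only an illustration under stated assumptions, not the code of Ubuntu's locale-check: the function names and the two variables checked are hypothetical, and which names a given C library accepts will vary.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if setlocale() accepts `name` for
 * `category`, else 0. The locale in effect on entry is restored. */
int locale_name_is_usable(int category, const char *name)
{
    assert(name != NULL);

    /* The string returned by setlocale() may be overwritten by later
     * calls, so keep a private copy of the current setting. */
    const char *current = setlocale(category, NULL);
    char *saved = NULL;
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    int usable = (setlocale(category, name) != NULL);

    if (saved != NULL) {
        setlocale(category, saved);   /* put things back as we found them */
        free(saved);
    }
    return usable;
}

/* Hypothetical helper: check a couple of locale variables from the
 * environment and warn on stderr about any the C library rejects.
 * Returns the number of rejected variables. */
int warn_about_bad_locale_env(void)
{
    static const struct { const char *var; int category; } checks[] = {
        { "LC_CTYPE",   LC_CTYPE   },
        { "LC_COLLATE", LC_COLLATE },
    };
    int rejected = 0;

    for (size_t i = 0; i < sizeof checks / sizeof checks[0]; i++) {
        const char *value = getenv(checks[i].var);
        if (value == NULL || value[0] == '\0')
            continue;                 /* unset or empty: nothing to check */
        if (!locale_name_is_usable(checks[i].category, value)) {
            fprintf(stderr, "WARNING: %s=%s is not a locale known to this system\n",
                    checks[i].var, value);
            rejected++;
        }
    }
    return rejected;
}
```

A login-time wrapper could call warn_about_bad_locale_env() and leave the environment alone, so the user at least learns that the terminal sent something the system does not recognise.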
06:30:46 Well, getty/ssh don't care
06:31:05 most programs correctly handle rubbish in the locale variables FWIW
06:31:14 it's why you got all those messages about it haha
06:31:23 It was in fact crying out in pain
06:31:50 I got desensitized to it because debootstrapped Ubuntu doesn't install locales so Perl screaming is normal.
06:31:52 >.>;
06:31:55 I am surprised that this "locale-check" program doesn't warn people about it
06:32:02 I didn't even know it existed
06:32:07 Me neither!
06:32:20 It of course has no manual page despite being in /usr/bin
06:32:35 hold on, I can find who owns it
06:33:09 mcasadevall@lighthouse:~$ dpkg-query -S /etc/profile.d/01-locale-fix.sh
06:33:10 base-files: /etc/profile.d/01-locale-fix.sh
06:33:12 lol, "locale-check --help" silently exits 0
06:33:40 is "base-files" the package?
06:34:21 https://bugs.launchpad.net/ubuntu/+source/maas/+bug/1134036
06:35:57 "I'm not a big fan of this approach but I certainly don't have any better idea myself."
06:36:18 Well they were having postgres explode
06:36:21 ...
which uses libreadline
06:36:39 Well I found the common thread here :P
06:36:59 lol
06:37:01 indeed
06:37:12 So I can actually provide some context
06:37:17 because I worked on MaaS, albeit loosely
06:37:32 They were basically deploying absolutely minimum images without locales for bare metal VM hosting
06:37:44 https://git.launchpad.net/ubuntu/+source/base-files/tree/locale-check.c
06:38:28 This is a deeply magical behavior
06:38:47 it truly is
06:38:58 That's what I mean by peril haha
06:39:08 Like, people will set their locales like your terminal did
06:39:14 and never know they were not getting what they asked for
06:39:26 and stuff will just magically work except sometimes when it doesn't
06:39:40 It really feels like this thing should emit a WARNING to stderr
06:40:05 Arguably, OpenSSH should be sanitizing what it's setting the environment to
06:40:16 Because it's entirely valid that you will have a locale on system A that doesn't exist on system B
06:40:29 Well I believe SunSSH used to actually do thta
06:40:31 *thta
06:40:34 sigh *that
06:40:35 And that formatting doesn't necessarily need to be the same across operating systems
06:40:42 it would negotiate the locale
06:40:54 but we dropped SunSSH (a fork of OpenSSH) long ago
06:41:03 and the OpenSSH people are not into complex stuff like that
06:41:09 yeah I'm not surprised
06:41:29 The ubuntu-devel discussion seems to conclude that openssh is doing the wrong thing
06:41:45 We have our OpenSSH built to AcceptEnv the LC_* LANG stuff by default, as is the broad custom
06:41:54 and then it's just up to users to hold it properly basically
06:42:30 The question becomes is it realistic that OpenSSH will get a locale that's valid on linux/mac/windows, but not valid on illumos, and what the correct behavior is in that case
06:42:44 It's unfortunately not really possible to say
06:43:17 If you don't follow the spec, and you don't specify a locale that works on the remote system, the best anyone can really do is
guess
06:43:26 maybe C, maybe C.UTF-8 if we thought you wanted UTF-8
06:43:43 also what if you specify a correct LANG but a bung LC_COLLATE
06:43:51 It is ultimately a mess haha
06:44:03 Is there an actual defined spec for locales?
06:44:32 I feel like some of it is covered in the various standards yes
06:44:35 between C and POSIX
06:45:41 e.g., https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html
06:46:03 which of course doesn't cover UTF-8 haha
06:46:57 I think C.UTF-8 is actually probably a made up debian thing
06:47:02 that we've mostly all subsequently adopted
06:47:52 The category body shall consist of one or more lines of text. Each line shall contain an identifier, optionally followed by one or more operands. Identifiers shall be either keywords, identifying a particular locale element, or collating elements. In addition to the keywords defined in this volume of POSIX.1-2017, the source can contain
06:47:52 implementation-defined keywords
06:47:54 https://www.illumos.org/issues/11661
06:47:55 → BUG 11661: provide C.UTF-8 locale (Closed)
06:48:15 So there's no defined format
06:48:19 for locale names
06:48:38 Or well, there's nothing saying I can't make my own locale of LANG="pirate_ARGGGG", and be against spec
06:49:07 and then ssh into a normal illumos system from my pirate speaking wonderland
06:49:12 I think that's basically true haha
06:49:29 But I'm not an expert in this area
06:49:39 Neither am I
06:50:01 At any rate, I will file an OmniOS bug describing the readline bug
06:50:09 And we can figure out what to do next
06:50:10 This is getting fairly close to the eldritch tomes of lore that AT&T used to forge the One Ring deep in the heart of Murray Hill
06:51:10 I guess fix the readline bug, and see if upstream accepts it, and then deciding if illumos needs to have a locale wrapper, since readline is probably one of the most deployed codebases on the planet
06:54:39 make sure to shoot me the link to that bug, since I'll weigh in.
If nothing else, making a video on describing how a bug in libreadline was found will be interesting
06:58:56 I just had a stupid thought
06:59:10 is there any way this can be exploited for code execution? This is a variable a user controls.
06:59:58 I'm not sure it actually gets you anything, but crashing on bad environment data in a library as well used as readline makes me twitchy
07:03:06 no wait, the bad data causes NULL to strlen, nothing a user writes to
07:03:11 ugh, I need sleep
07:08:40 NCommander: https://github.com/omniosorg/omnios-build/issues/3537
07:08:42 I filed the bug
09:37:47 cheers, got a question about NWAM which manages my OI VM's networking: sometimes the home DHCP server from the access point goes AWOL, and the OI VM loses networking (probably lease expires?) -- is it possible to tell it to use the last served address indefinitely until told otherwise? Or a specific fallback IP?
09:41:28 if a LAN has multiple gateways, how to tell illumos to always use first gateway (and DNS server) and then if it fails the other one(s).
09:42:08 I see it alternating between gateways to access outside world, even if explicitly set what gateway and DNS to use. (SmartOS)
15:13:13 Silly question... *our* ldd(1) doesn't execute the binary at all while trying to find paths, right?
15:13:15 https://jmmv.dev/2023/07/ldd-untrusted-binaries.html
15:15:13 I'm also pretty sure our strings(1) isn't as dangerous as the binutils one used to be...
15:15:18 https://lcamtuf.blogspot.com/2014/10/psa-dont-run-strings-on-untrusted-files.html
15:41:27 our ldd most certainly exec's the binary after setting a bunch of environment variables that cause ld.so to dump its guts rather than run the program.
15:42:28 our ldd does check that the ELF interpreter is in a plausible location (in /lib, /usr/lib, or /etc/lib) which reduces but doesn't eliminate the chance of mischief
16:19:27 strings should probably dumps privs.
16:19:34 s/dumps/drop
16:50:46 danmcd: sorry to disappoint you: https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/sgs/ldd/common/ldd.c#L36-L39
17:57:17 Oh damn...
17:57:48 Well, glad I asked?
21:04:05 [illumos-gate] 16413 Post-barrier Return Stack Buffer (PBRSB) fixes can be detected in HW -- Dan McDonald