19:39:50 veg: I tried a dist upgrade in one of the bookworm images and a lot of things broke. Something in systemd (IIRC) wanted a feature that is not being emulated and just dies very early on in the init process.
19:39:58 I didn't really dig deep though
19:40:07 As I'm ok on bookworm for now
20:07:26 I'm seeing strange behavior in kclient when used against AD with IPv6 enabled. https://pastebin.com/5ZJDV1RC
20:07:39 any ideas? Is this a bug or am I just holding it wrong?
20:28:38 At first glance, it's a bug/missing-feature. Do you know what *process* is failing?
20:29:08 I see some $CMD/krb5 that has AF_INET6/AF_INET duality, but not ALL of it. Would be interesting to know what process is only using IPv4.
20:29:53 "Failed to set account password." seems to be a difference.
20:31:45 Ahh, I think I found the problem. Pardon the multi-line:
20:31:49 kebe(lib/krb5)[0]% git grep -lw AF_INET
20:31:50 kadm5/clnt/changepw.c
20:31:50 kadm5/srv/chgpwd.c
20:31:51 kebe(lib/krb5)[0]% git grep -lw AF_INET6
20:31:53 kebe(lib/krb5)[1]%
20:32:23 A fix in $LIB/krb5 is probably your first step.
20:32:36 danmcd, I wish I knew which process was failing.
20:32:53 so should I be reporting this as an illumos bug?
20:32:56 I actually don't need to know that now. I missed the output.
20:33:03 Assuming it isn't already filed, yes.
20:33:42 https://www.illumos.org/issues/17577 is the catch-all "Update our krb5, dammit."
20:33:43 → BUG 17577: Kerberos code needs update (New)
20:33:59 Your problem COULD MAYBE be fixed by whacking those two lib/krb5 files a bit.
20:35:00 No bugs mention IPv6 by name.
20:35:11 I bumped into this problem while trying to solve another (more important) one.
20:35:22 should I add it to 17577 or create a new one?
20:35:53 The `ksetpw` command is what's failing in your kclient (a shell script) invocation. That binary in $CMD calls one or both of the stuck-in-v4-only routines in $LIB.
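The v4-only symptom described above can be illustrated with a minimal sketch. This is not the lib/krb5 code; it only shows, in Python, how a lookup pinned to AF_INET finds nothing for an IPv6-only server while an AF_UNSPEC lookup does. The function name `candidate_addresses` is hypothetical, the addresses are documentation-range literals, and port 464 (kpasswd) is used only as a plausible example port.

```python
import socket

def candidate_addresses(host, port, family=socket.AF_UNSPEC):
    """Return (family_name, address) pairs a client could try for host:port.

    AI_NUMERICHOST restricts the lookup to address literals, so this sketch
    never touches DNS. A real client would drop that flag.
    """
    infos = socket.getaddrinfo(
        host, port, family, socket.SOCK_DGRAM, 0, socket.AI_NUMERICHOST
    )
    return [(socket.AddressFamily(af).name, sockaddr[0])
            for af, _type, _proto, _canon, sockaddr in infos]

if __name__ == "__main__":
    v6_only_dc = "2001:db8::10"  # hypothetical DC address (documentation prefix)
    # Family-agnostic lookup: the IPv6 address is offered.
    print(candidate_addresses(v6_only_dc, 464))
    # Lookup pinned to AF_INET, the pattern the grep output hints at:
    try:
        candidate_addresses(v6_only_dc, 464, socket.AF_INET)
    except socket.gaierror as err:
        print("v4-only lookup failed:", err)
```

If the real routines behave similarly, that would also be consistent with the /etc/hosts workaround mentioned later helping only when it supplies an IPv4 address, though that is an assumption rather than something confirmed in the discussion.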
20:36:19 17577 states that Oracle open-sourced their formerly-in-ON krb5 bits, or at least the patches?
20:36:32 Maybe a v6 patch is in there somewhere?!? I have no idea...
20:37:49 I can work around this temporarily but it's going to bite us ($job[1]) fairly badly if we ignore it for too long.
20:38:06 workaround being to add the DCs to /etc/hosts on the impacted servers.
20:38:27 oracle moved kerberos into solaris-userland
20:38:39 that would be the best source of knowing what the hell, because the diffs of kerberos v. upstream are a mess
20:38:49 and the XXX comments left behind are also not easy to under
20:38:50 stand
20:39:39 new ticket time?
20:40:10 Yes.
20:40:25 I'm assuming that if you can get what we have to Just Speak IPv6 we're all good?
20:40:34 That alone is worth a distinct ticket IMHO.
20:40:51 yeah, that's probably fixable
20:41:08 the krb5 upgrade probably needs someone like racktop with big AD stuff to deal with
20:41:14 at least if they want it to still work afterward
20:44:14 https://www.illumos.org/issues/17809 submitted
20:44:15 → BUG 17809: kclient + AD + IPv6 = "Setup FAILED" (New)
20:44:51 Please put your pastebin output in there. I'll update with how I got to the lib/krb5 bits.
20:45:40 the pastebin output is already in there. That's 90% of the content. :)
20:49:37 Now to go back to trying to figure out why ldap bind is failing
20:55:54 darnit. I was really hoping the computer account was the issue but nope, nscd is still failing to proxy bind to the AD servers.
21:15:40 anyone using LDAP? Can you tell me the ownership and perms of /var/ldap/ldap_client_* please?
21:16:22 I find it hard to believe it needs to be world readable but:
21:16:28 Jan 6 13:08:01 testfs1 svc.startd[36]: [ID 293258 daemon.warning] libsldap: Status: 2 Mesg: Unable to load configuration '/var/ldap/ldap_client_file' ('').
21:17:25 (FTR, making it world readable and rebooting did not get that error logged.)
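For comparing ownership and permissions of /var/ldap/ldap_client_* between hosts, a small sketch like the following may help. It assumes only the Python standard library; the helper name `describe` is made up for illustration, and it says nothing about which mode libsldap actually requires.

```python
import glob
import os
import stat

def describe(path):
    """Summarise owner and permission bits of one file."""
    st = os.stat(path)
    return {
        "mode": stat.filemode(st.st_mode),                 # e.g. "-r--------"
        "octal": format(stat.S_IMODE(st.st_mode), "04o"),  # e.g. "0400"
        "uid": st.st_uid,
        "gid": st.st_gid,
        "world_readable": bool(st.st_mode & stat.S_IROTH),
    }

if __name__ == "__main__":
    # Path pattern taken from the question above; run as root to stat the files.
    for path in sorted(glob.glob("/var/ldap/ldap_client_*")):
        print(path, describe(path))
```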
21:18:03 it shouldn't -- ldap_cachemgr should be the only thing that needs to read it
21:18:28 the ldapclient command should create all of that for you
21:18:46 I assumed it should be 0400 root:root but that generated the above error.
21:18:59 jbk, it did generate it and gave it the 0400 perms.
21:19:13 the big one is the creds file -- that holds the password for binding
21:19:20 (and the DN of the account to bind to)
21:19:30 there's no error with ldap_client_cred being 0400 root:root.
21:20:16 I'm going to put 0400 back on ldap_client_file and restart and see if the error comes back.
21:24:30 I take it back. I'm still getting that can't load error even with world-read perms on the file.
21:24:39 so something ldapclient is writing is upsetting it, I guess.
22:05:55 I think that error was a red herring.
22:28:17 i think you can run ldap_cachemgr with the -g flag (I think that's it) that'll tell you the status of the configured servers
22:30:11 thanks. I'm currently rebuilding testfs1 but I'll give that a try when I'm back to that stage.
22:30:32 I *think* it is actually working... partly. getent passwd returns some users but not all.
22:32:56 are you trying to allow AD accounts to login to the system?
22:35:15 login? no.
22:35:25 This is a fileserver, I need smb to work.
22:35:55 our current system is causing $boss upset because it uses NTLMv2 and he doesn't like that.
22:36:13 why not just use smbadm join ?
22:36:31 (it uses a simple smbadm join)
22:36:48 jbk: that's the thing that's causing the NTLMv2 problem.
22:36:51 at least recent versions should set up all of the kerberos bits
22:37:08 it does, but for whatever reason we're still getting alerts logged on the DC.
22:37:12 IIUC, that's a client thing -- you have to connect using the FQDN to get the client to use kerberos
22:37:36 fqdn for the client or the server? ... I swear I tried that but maybe I'm just an idiot.
22:38:36 you might want to set the minimum smb version to 2.1 if you haven't done that
22:39:23 I presume that's different from setting lmauth_level=5?
22:39:48 (I have this vague memory of setting smb version to 2.1 years ago but I don't see anything in my notes about it.)
22:40:04 check sharectl get smb
22:40:10 yeah -- if you send an email to illumos-discuss, I might be able to prod Gordon to answer with more detail since he knows all the details..
22:40:21 if he doesn't see it himself
22:40:22 min_protocol=2.1
22:41:28 but I do recall if you connect to the share using something like \\IPADDR\SHARE, that _does_ use NTLM, and you have to use \\SERVER.FQDN\SHARE (while required, that might not be sufficient)
22:41:31 looks like we're not specifying the name of the DC in our join command. just -u admin_username doma.in.name
22:42:01 oh yeah, that's... unlikely :)
22:42:19 our users love their "go to the DNS CNAME \\fs then browse".
22:42:46 yeah -- smbadm join should do DNS SRV lookups on the domain name to locate the DCs
22:43:02 (I think the bits to make it site aware are also upstreamed if you're using sites)
22:43:08 smbadm doesn't have an option to specify DC for the join command and given the \\IPADDR comment I'm guessing that's not relevant anyway.
22:43:30 I mean, clients going to \\IPADDR are going to use NTLM
22:43:36 when you do smbadm join
22:43:44 it's going to do the DNS SRV lookups
22:43:53 create the keytab
22:43:54 etc
22:44:19 sadly, I have no way to force the clients to use the long name.
22:44:28 FQDN
22:45:01 I wonder if that's still going to be a problem if I get this new configuration working. :(
22:46:02 I'd still ask -- ISTR that it's the SMB _client_ (i.e. your desktops) that decide to use NTLM or kerberos -- I'm not sure that the server can force the behavior
22:47:00 * ENOMAD nods
22:47:14 I'll try to write a concise question to post ... tomorrow.
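The \\IPADDR\SHARE versus \\SERVER.FQDN\SHARE distinction above can be turned into a quick triage helper for paths that users report. This is only a sketch of the rule of thumb as stated in the discussion (IP literal means NTLM; FQDN is required for Kerberos but may not be sufficient). The function name `unc_host_kind` and the example hosts are hypothetical, and the discussion does not settle how a short name such as \\fs behaves.

```python
import ipaddress

def unc_host_kind(unc_path):
    r"""Classify the server part of \\server\share as 'ip', 'fqdn', or 'short'."""
    if not unc_path.startswith("\\\\"):
        raise ValueError("not a UNC path: %r" % unc_path)
    host = unc_path[2:].split("\\", 1)[0]
    if not host:
        raise ValueError("empty server name in %r" % unc_path)
    try:
        ipaddress.ip_address(host)
        return "ip"      # per the discussion: the client will use NTLM
    except ValueError:
        pass
    # A dot inside the name (ignoring a trailing root dot) is treated as an FQDN.
    return "fqdn" if "." in host.rstrip(".") else "short"

if __name__ == "__main__":
    for path in (r"\\192.0.2.10\SHARE", r"\\fs1.example.com\SHARE", r"\\fs\SHARE"):
        print(path, "->", unc_host_kind(path))
```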
22:47:33 I've been busting my head against this LDAP stuff for several days and I'm not sure what's what anymore.
22:47:37 or just clean up everything and try smbadm join
22:47:43 (the host join bug didn't help.)
22:47:58 it'll also configure idmap to talk to AD using LDAP using SASL/gssapi
22:48:05 the thing is, the DC is logging the complaint against the file server, not the client.
22:48:37 (for historical reasons, for smb sharing idmap is the thing that actually does all the LDAP lookups)
22:49:03 yeah, when an smb client connects, the server has to basically forward the request to the DC to validate it
22:49:07 I tripped over that idmap setting at some point but figured I'd get there after I got rid of the binding error being logged in /var/adm/messages first.
22:49:21 if you're doing smb, you don't need ldap/client or any of that...
22:49:28 I'll need to go back through the documentation and find that setting again.
22:49:49 for now I'm trying to follow https://omnios.org/setup/ad-connect but it's for non-SSL connections so I'm ... adapting.
22:49:55 if you're trying to share a given share via both SMB and NFS, you'll need to configure idmap to lookup the correct attributes in AD
22:50:07 nope, I refuse to share via both.
22:50:12 they get one or the other, not both.
22:50:24 yeah, for smb, it doesn't use SSL or TLS
22:50:24 To the point where I have separate servers for NFS & SMB.
22:50:27 it uses gssapi
22:50:55 I don't suppose there's a simple cookbook document about this :)
22:51:19 svcadm enable smb/server idmap; smbadm join -u domain.name
22:51:28 that really should be all you need as long as DNS is configured
22:51:44 that's what I have been doing for the past several years.
22:51:47 well and ntp (because kerberos)
22:52:02 but, like I said, $boss wants the NTLM alerts to stop.
22:53:18 previous setup: svcadm refresh ntp; svcadm enable -r smb/server ; sharectl set -p max_workers=2048 smb ; sharectl set -p lmauth_level=5 smb ; sharectl set -p ipv6_enable=true smb ; smbadm join -u ... ; and then a bunch of idmap
22:54:09 oh, and cp /etc/nsswitch.ad /etc/nsswitch.conf
22:54:46 ad shouldn't be needed for the most part (nss_ad.so really needs to be rewritten, but that's a _long_ story)
22:55:23 silly question... but have you tried to block client(s) from using NTLM?
22:55:34 you mostly need it if you have RFC2307 attributes in AD (or equivalent) and want to have people login to a system using their AD creds
22:56:19 except that it duplicates (poorly) a lot of logic in idmap (so it doesn't handle down DCs nearly as well as idmap)
23:02:02 tsoome_, I have not done that. Until a few minutes ago I thought this was a server problem.
23:02:49 tsoome_, If you want to save me a bunch of googling and know the command(s) to try for that I'd appreciate it :)
23:02:55 I mean, first for debugging, because the client falls back to NTLM if kerberos fails....
23:03:12 hmm... thinking about this... the test host doesn't have any clients talking to it yet so why would it be generating those errors?
23:04:59 do not know the commands without google:D
23:05:23 right now I'm trying to google how to set idmap to sasl/gssapi and ... end of day brain isn't helping.
23:09:10 you shouldn't need to do that -- smbadm join takes care of that for you
23:09:35 it doesn't use ldap_cachemgr or ldap/client
23:09:44 then the existing server (which uses smbadm join) should already have done that.
23:09:52 idmap creates its own ldap connections
23:09:57 and has its own config stored in smf
23:10:30 you should see things like info about the machine account, etc if you do a listprop on the idmap service
23:11:11 basically there's a lot of discovery via DNS SRV records that are used to locate AD DCs
23:11:30 idmap does all of that, and will handle locating a new DC if the one it's currently using dies
23:12:13 listprop doesn't have anything with 'sasl' in it (grep -i)
23:12:36 svccfg -s svc:/system/idmap listprop - right?
23:14:09 it's basically baked in
23:14:36 you should see things like config/machine_uuid, config/machine_sid, and config/domain_name populated
23:14:46 yep, I'm seeing those
23:16:05 and you should already have /etc/krb5/krb5.keytab populated with principals for the FQDN of the machine
23:17:04 yep, with DES, ArcFour, AES-128, and AES-256 options.
23:17:33 for host, cifs, HOSTNAME$, nfs, HTTP, and root
23:30:04 I just checked the logs, we're seeing these alerts for fs2 and hvfs2 - both of which have no clients (they're test hosts).
23:30:34 I think I'm going to stop with this for the day and pick up 'fresh and clean' tomorrow.
23:30:44 smbadm list-sessions
23:30:47 make sure of that
23:31:22 : || lvd@fs2 ~ [502] ; sudo smbadm list-sessions
23:31:22 Password:
23:31:22 : || lvd@fs2 ~ [503] ;
23:32:08 same for hvfs2. If they'd had sessions I'd have been very surprised.
23:34:22 specifically, $boss is auditing for "LogName = 'Microsoft-Windows-NTLM/Operational' Id = 4023"
23:36:33 "Check for Missing SPNs: Often, NTLM fallback occurs due to missing or incorrect Service Principal Names (SPNs). Ensure SPNs are correctly registered for the services in Active Directory."
23:38:30 any suggestions on how to do that?
23:39:26 spn is basically service/host.domain format name (or service/host)
23:40:02 so you would need to fetch all ldap entries for host:)
23:41:30 but, I'm not quite sure why is your test server generating those NTLM audit records.
That's something one should dig from source and NTLM related docs - maybe some sort of session setup or whatever like that....
23:43:36 to disable NTLM from illumos server side... I'd check with gwr as suggested:)
23:43:47 setspn -L hostname returns a lot of things but I'm ignorant of their meaning. More reading.
23:46:11 I'll start drafting that email but suspect it won't go out until I've had time to review it in the morning.
23:46:12 some years ago I had to set up AD auth for solaris, but that was just for user access (ldap + kerberos) and haven't touched AD later:)
23:56:42 in my ideal world I wouldn't be touching AD ...
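As a starting point for the "check for missing SPNs" suggestion, here is a minimal sketch that takes SPN strings (for example, copied by hand from the `setspn -L hostname` listing) and reports which service/FQDN combinations are absent. It relies only on the service/host.domain shape described above. The function names, the example host fs2.example.com, and the choice of services to check are all assumptions, as is treating SPNs as case-insensitive.

```python
def parse_spn(spn):
    """Split 'service/host[:port][/name]' into (service, host), lowercased."""
    service, _, rest = spn.strip().partition("/")
    if not service or not rest:
        raise ValueError("not an SPN: %r" % spn)
    host = rest.split("/", 1)[0].split(":", 1)[0]
    return service.lower(), host.lower()

def missing_spns(registered, services, fqdn):
    """Return 'service/fqdn' strings that are not among the registered SPNs."""
    have = {parse_spn(s) for s in registered}
    return ["%s/%s" % (svc, fqdn) for svc in services
            if (svc.lower(), fqdn.lower()) not in have]

if __name__ == "__main__":
    # Hypothetical listing; replace with the real SPNs for the file server.
    registered = ["HOST/fs2.example.com", "HOST/fs2", "cifs/fs2"]
    print(missing_spns(registered, ["host", "cifs"], "fs2.example.com"))
```

An empty result only means the expected names are registered. It does not by itself explain NTLM audit records from a host with no client sessions.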