Opened 13 months ago

Last modified 7 months ago

#63405 new defect

openssh @8.4p1_6 on El Capitan broken today - also keychain gives error message.

Reported by: snowflake (Dave Evans)
Owned by:
Priority: Normal
Milestone:
Component: ports
Version: 2.7.99
Keywords:
Cc: thetrial (alabay), cooljeanius (Eric Gallager)
Port: openssh keychain openssl

Description (last modified by snowflake (Dave Evans))

After the upgrade of openssh and openssl today, the ssh command of openssh no longer works.

Here is the message trying to connect to my beta host

$ ssh beta
Killed: 9

/usr/bin/ssh works

I can not find any diagnostic messages for this error. I have restarted the system in case there are any programs which have not loaded the new openssl libraries.

The Macports port keychain has also stopped working.

Here's the message

 * keychain 2.8.5 ~ http://www.funtoo.org
 * Starting ssh-agent...
 * Adding  1 ssh key(s): /Users/davidevans/.ssh/id_rsa
Enter passphrase for /Users/davidevans/.ssh/id_rsa: 
Bad passphrase, try again for /Users/davidevans/.ssh/id_rsa: 
 * Error: Problem adding; giving up

I can find a crash report for ssh-agent in Logs.

Also the ReportCrash process crashes when trying to create a crash report.

Application Specific Information:
Analyzing process: ssh-agent[1194], path: /opt/local/bin/ssh-agent; parent process: [1], path: /sbin/launchd

This is all happening on El Capitan 10.11.6. On Monterey it is all working.

openssl @1.1.1l_0; keychain @2.8.5_1; openssh @8.4p1_6+kerberos5+xauth

Attachments (2)

ssh-agent_2021-08-25-172843_two.crash (11.2 KB) - added by snowflake (Dave Evans) 13 months ago.
Diagnostic report for ssh-agent
failing_keys.zip (4.0 KB) - added by snowflake (Dave Evans) 10 months ago.
A key that breaks ssh-agent on macOS 10.11.6. Also includes a log of the session with ssh-add (18/Nov/2021)


Change History (24)

Changed 13 months ago by snowflake (Dave Evans)

Diagnostic report for ssh-agent

comment:1 Changed 13 months ago by snowflake (Dave Evans)

Description: modified (diff)
Summary: changed from "openssh @8.4p1_6 on Mountain Lion broken today - also keychain gives error message." to "openssh @8.4p1_6 on El Capitan broken today - also keychain gives error message."

comment:2 Changed 13 months ago by snowflake (Dave Evans)

I activated the previous version of openssl = 1.1.1k_0 and now ssh-agent does not crash when keychain adds a password.


comment:3 Changed 13 months ago by kencu (Ken)

Some fancy business happened with openssl 1.1.1l not building and then being fixed. openssh was revbumped to build against the new openssl 1.1.1l, but I'm not sure that happened correctly for you given the way things worked out.

So if you have an interest, you could

  1. install the current 1.1.1l openssl
  2. rebuild from source openssh against that new openssl

and see if that works.

To rebuild openssh from source, you would uninstall the current version and rebuild it with the -s flag, something like this:

sudo port -f uninstall openssh
sudo port -v -s install openssh

NB. If your current working openssh is critical to you, just leave it until somebody else either fixes the issue, or confirms that this works.

comment:4 Changed 13 months ago by snowflake (Dave Evans)

Thank you.

I think I built from source the first time, but I followed your instructions and the error still persists -- ssh-agent crashes after entering the password in keychain.

comment:5 Changed 13 months ago by snowflake (Dave Evans)

I compiled openssh with debugging symbols. ssh without any arguments crashes.

Here's the lldb log:

Script started on Thu Aug 26 13:26:27 2021
command: lldb -X -f ssh
"crashlog" and "save_crashlog" command installed, use the "--help" option for detailed help
"malloc_info", "ptr_refs", "cstr_refs", "find_variable", and "objc_refs" commands have been installed, use the "--help" options on these commands for detailed help.
(lldb) target create "ssh"
Current executable set to 'ssh' (x86_64).
(lldb) run
Process 42888 launched: '/Users/davidevans/junk/hello/ssh' (x86_64)
Process 42888 stopped
* thread #1: tid = 0x2e178, 0x00007fff85e4083a libsystem_kernel.dylib`close + 10,
   queue = 'com.apple.main-thread', stop reason = EXC_GUARD 
         (code=4611686022722355203, subcode=0x7fff74599568)
    frame #0: 0x00007fff85e4083a libsystem_kernel.dylib`close + 10
libsystem_kernel.dylib`close:
->  0x7fff85e4083a <+10>: jae    0x7fff85e40844            ; <+20>
    0x7fff85e4083c <+12>: movq   %rax, %rdi
    0x7fff85e4083f <+15>: jmp    0x7fff85e3a7f2            ; cerror
    0x7fff85e40844 <+20>: retq   
(lldb) up 1
frame #1: 0x00000001000ad5b6 ssh`closefrom(lowfd=3) + 278 at bsd-closefrom.c:114
   111 			goto fallback;
   112 		for (i = 0; i < r / (int)PROC_PIDLISTFD_SIZE; i++) {
   113 			if (fdinfo_buf[i].proc_fd >= lowfd)
-> 114 				close(fdinfo_buf[i].proc_fd);
   115 		}
   116 		free(fdinfo_buf);
   117 		return;
(lldb) up 1
frame #2: 0x0000000100003b9f ssh`main(ac=1, av=0x00000001006043b0) + 415 at ssh.c:683
   680 		 * Discard other fds that are hanging around. These can cause problem
   681 		 * with backgrounded ssh processes started by ControlPersist.
   682 		 */
-> 683 		closefrom(STDERR_FILENO + 1);
   684 	
   685 		/* Get user data. */
   686 		pw = getpwuid(getuid());
(lldb) quit



comment:6 in reply to:  5 Changed 13 months ago by snowflake (Dave Evans)

Replying to snowflake:

>    680 		 * Discard other fds that are hanging around. These can cause problem
>    681 		 * with backgrounded ssh processes started by ControlPersist.
>    682 		 */
> -> 683 		closefrom(STDERR_FILENO + 1);
>    684 	
>    685 		/* Get user data. */
>    686 		pw = getpwuid(getuid());

I commented out line 683 in ssh.c, as shown above, and it now connects to my hosts. This is NOT a fix! I do not know why closefrom() is not working. Or even why it works if openssl 1.1.1k is installed.
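For readers unfamiliar with closefrom(): here is a minimal sketch of the behaviour it is meant to provide, written as a portable loop rather than the proc_pidinfo-based path visible in the backtrace. The listing above shows a "goto fallback" in bsd-closefrom.c; the loop below is only an assumption about what a fallback of that kind might look like. close_from_fallback is a hypothetical name, not an OpenSSH function, and this sketch neither explains nor fixes the EXC_GUARD stop reported by lldb.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not OpenSSH's closefrom): close every descriptor
 * >= lowfd by walking the descriptor table up to the per-process limit.
 * Returns the number of descriptors that were actually closed.
 */
static int close_from_fallback(int lowfd)
{
    long maxfd = sysconf(_SC_OPEN_MAX);
    int closed = 0;

    assert(lowfd >= 0);

    /* Assumption: cap unknown or very large limits to keep the loop short.
     * Descriptors above the cap are left open in this sketch. */
    if (maxfd < 0 || maxfd > 65536)
        maxfd = 65536;

    for (long fd = lowfd; fd < maxfd; fd++) {
        /* close() fails with EBADF for unused slots; that is expected. */
        if (close((int)fd) == 0)
            closed++;
    }
    return closed;
}
```

In ssh.c the call is closefrom(STDERR_FILENO + 1), so the equivalent here would be close_from_fallback(STDERR_FILENO + 1), which leaves stdin, stdout and stderr open.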


comment:7 Changed 13 months ago by thetrial (alabay)

Cc: thetrial added

comment:8 Changed 13 months ago by sambthompson (Sam Thompson)

Cc: sambthompson added

comment:9 Changed 12 months ago by thetrial (alabay)

I'm afraid there may be also a dependency on #63417.

comment:10 Changed 12 months ago by snowflake (Dave Evans)

I noticed that ticket #63421 comment 6 is now closed, fixing some bugs in openssl.

I reinstalled openssh on 10.11.6 and retried ssh-agent and ssh. ssh now seems to be working. ssh-agent no longer crashes, but I will have to check my scripts to see whether I am using the system ssh-agent, but provisionally it is working. Good work, Macporters!

comment:11 in reply to:  10 Changed 12 months ago by sambthompson (Sam Thompson)

Replying to snowflake:

> I reinstalled openssh on 10.11.6 and retried ssh-agent and ssh. ssh now seems to be working.

Can also confirm ssh is working after re-install.

comment:12 Changed 12 months ago by thetrial (alabay)

Seems to work, yes.

comment:13 Changed 12 months ago by thetrial (alabay)

… or not. Again something with openssh does not work under El Capitan. See #63598.

comment:14 in reply to:  13 Changed 12 months ago by sambthompson (Sam Thompson)

Replying to thetrial:

> … or not. Again something with openssh does not work under El Capitan. See #63598.

Unrelated. #63598 is due to the gsskex patchfile not supporting the upgrade to 8.8.

comment:15 Changed 12 months ago by sambthompson (Sam Thompson)

Cc: sambthompson removed

comment:16 Changed 10 months ago by snowflake (Dave Evans)

I've been using MacPorts openssh successfully since I posted comment 10, 7 weeks ago.

But I have bad news. On macOS 10.11.6, ssh-agent is now broken again. Something has happened in the last couple of days but I don't know what.

ssh-agent now crashes when adding keys with ssh-add. The macOS system program that is supposed to generate crash logs, ReportCrash, is also crashing, so it generates a report about itself and there is no log for ssh-agent. If I start ssh-agent and attach mp-lldb-10 to the PID, the debugger may also be crashing. There is copious output some of which appears to be related to the debugger. It's hard to tell.

I have generated new keys in case the crash was related to my very old keys, but the new keys crash ssh-agent as well.

I will attach the keys that are crashing to this ticket (failing_keys.zip).

Changed 10 months ago by snowflake (Dave Evans)

Attachment: failing_keys.zip added

A key that breaks ssh-agent on macOS 10.11.6. Also includes a log of the session with ssh-add (18/Nov/2021)

comment:17 Changed 10 months ago by snowflake (Dave Evans)

The current versions I'm running: openssh @8.8p1_1+kerberos5+xauth; openssl @3_1

It is only rsa keys that crash ssh-agent when added. rsa1 keys result in an error but the agent does not crash. dsa, ecdsa, ed25519 are all ok.

I've rebuilt openssh with debugging symbols and lldb is now behaving. The crash seems to be in libcrypto.3.dylib, which is part of openssl.

The next step is to build openssl with debugging symbols.

comment:18 Changed 10 months ago by snowflake (Dave Evans)

This is on openssl3 @3.0.0_5+legacy; openssh @8.8p1_1+kerberos5+xauth; macOS 10.11.6

I've now found the last frame in ssh-agent where the crash happens. It is here

frame #26: 0x000000010489fc81 ssh-agent`
sshkey_private_deserialize(buf=0x00007ff133d21fe0,
 kp=0x00007fff5b3738f0) at sshkey.c:3672:7
   3669		switch (k->type) {
   3670		case KEY_RSA:
   3671		case KEY_RSA_CERT:
-> 3672			if (RSA_blinding_on(k->rsa, NULL) != 1) {
   3673				r = SSH_ERR_LIBCRYPTO_ERROR;
   3674				goto out;
   3675			}

It is conditional on an RSA key, which explains why all the other key types work.
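To make the dispatch concrete, here is a toy sketch of the same pattern: only the RSA cases reach the blinding call, so other key types never touch that code path. The enum is a simplified, hypothetical stand-in for the key types in sshkey.c, and reaches_rsa_blinding is an illustrative name, not an OpenSSH function.

```c
#include <assert.h>

/* Simplified, hypothetical key-type enum for illustration only. */
enum key_type {
    KEY_RSA,
    KEY_RSA_CERT,
    KEY_DSA,
    KEY_ECDSA,
    KEY_ED25519
};

/*
 * Returns 1 if deserializing a key of this type would go through the
 * RSA blinding step (and therefore through the random number generator),
 * 0 otherwise. Mirrors the switch shown in the backtrace.
 */
static int reaches_rsa_blinding(enum key_type type)
{
    switch (type) {
    case KEY_RSA:
    case KEY_RSA_CERT:
        return 1;
    default:
        return 0;
    }
}
```

This matches the observation in comment 17 that dsa, ecdsa and ed25519 keys can be added without crashing the agent.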

Frame 4 is the last frame in libcrypto.3

frame #4: 0x0000000104c5390d libcrypto.3.dylib
`syscall_random(buf=0x00007ff133c15710, buflen=32) at rand_unix.c:371:9
   368 	    }
   369 	#    elif defined(OPENSSL_APPLE_CRYPTO_RANDOM)
   370 	
-> 371 	    if (CCRandomGenerateBytes(buf, buflen) == kCCSuccess)
   372 		    return (ssize_t)buflen;
   373 	
   374 	    return -1;
(lldb) up 1
frame #5: 0x0000000104c53691 libcrypto.3.dylib`ossl_pool_acquire_entropy(pool=0x00007ff133c156d0)
 at rand_unix.c:646:21
   643 	        bytes_needed = ossl_rand_pool_bytes_needed(pool, 1 /*entropy_factor*/);
   644 	        while (bytes_needed != 0 && attempts-- > 0) {
   645 	            buffer = ossl_rand_pool_add_begin(pool, bytes_needed);
-> 646 	            bytes = syscall_random(buffer, bytes_needed);
   647 	            if (bytes > 0) {
   648 	                ossl_rand_pool_add_end(pool, bytes, 8 * bytes);
   649 	                bytes_needed -= bytes;

So it looks like there is some bug in random number generation.
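For context, the loop in frame #5 keeps asking a system source for random bytes until enough have been collected or the attempts run out. Here is a minimal sketch of that retry pattern. It is not OpenSSL's code: urandom_read and acquire_entropy are hypothetical names, and reading /dev/urandom is an assumption standing in for the CCRandomGenerateBytes call, so the sketch does not reproduce the crash.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for syscall_random(): one read from /dev/urandom. */
static ssize_t urandom_read(unsigned char *buf, size_t buflen)
{
    int fd = open("/dev/urandom", O_RDONLY);
    ssize_t n;

    if (fd < 0)
        return -1;
    do {
        n = read(fd, buf, buflen);      /* retry if interrupted by a signal */
    } while (n < 0 && errno == EINTR);
    close(fd);
    return n;
}

/*
 * Hypothetical helper: keep asking the source until `needed` bytes are
 * collected or the attempts run out. Returns 0 on success, -1 otherwise.
 */
static int acquire_entropy(unsigned char *out, size_t needed)
{
    int attempts = 3;
    size_t have = 0;

    assert(out != NULL);
    while (have < needed && attempts-- > 0) {
        ssize_t n = urandom_read(out + have, needed - have);
        if (n > 0)
            have += (size_t)n;          /* short reads are topped up next pass */
    }
    return have == needed ? 0 : -1;
}
```

A caller would check the return value and treat -1 as a hard failure rather than continuing with a partially filled buffer.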

On a slightly related note, after I had built and installed openssl3 with debugging symbols I tested it with mp-lldb-10

lldb /opt/local/bin/openssl

It worked once, so I tried it again and lldb crashed. These things are sent to try us.

comment:19 Changed 10 months ago by snowflake (Dave Evans)

In comment 18, all this debugging was the result of adding an RSA key with ssh-add.

comment:20 Changed 10 months ago by snowflake (Dave Evans)

I've created ticket #64008 for the specific problem of random number generation crashing in openssl3. Applying the fix mentioned in #64008 fixes the ssh-agent crashes on older macOS.

comment:21 in reply to:  20 Changed 10 months ago by snowflake (Dave Evans)

Replying to snowflake:

> I've created ticket #64008 for the specific problem of random number generation crashing in openssl3. Applying the fix mentioned in #64008 fixes the ssh-agent crashes on older macOS.

#64008 has now been fixed. ssh-agent is now working again on older macOS.

openssl3 @3.0.0_6

comment:22 Changed 7 months ago by cooljeanius (Eric Gallager)

Cc: cooljeanius added