Opened 6 months ago

Closed 6 months ago

#68777 closed defect (fixed)

postgresql16 @16.1: cannot connect to servers that require SSL since last update

Reported by: davidnich (David Nichols)
Owned by: dgilman (David Gilman)
Priority: Normal
Milestone:
Component: ports
Version: 2.8.1
Keywords:
Cc: barracuda156, neverpanic (Clemens Lang), Dave-Allured (Dave Allured), fgunbin, jrabinow
Port: postgresql15 postgresql16 openssl3

Description

After the last update to postgresql15 and postgresql16, I can no longer connect to servers that require SSL (with either port).

I get the following error with psql for example:

$ psql -Uqorusapi -hsupah -p31432
psql: error: connection to server at "supah" (192.168.16.20), port 31432 failed: FATAL:  no PostgreSQL user name specified in startup packet
connection to server at "supah" (192.168.16.20), port 31432 failed: FATAL:  no PostgreSQL user name specified in startup packet
psql(45426,0x1e07b9ec0) malloc: *** error for object 0x17: pointer being freed was not allocated
psql(45426,0x1e07b9ec0) malloc: *** set a breakpoint in malloc_error_break to debug
zsh: abort      psql -Uqorusapi -hsupah -p31432

This server is running in a postgres-operator cluster. From Linux, a successful connection looks like:

$ psql -Uqorusapi -hsupah -p 31432
Password for user qorusapi: 
Pager usage is off.
psql (15.4, server 15.3)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

qorusapi=> 

Connections to local PostgreSQL DBs and to remote DBs that do not require SSL succeed, so my assumption is that something in the last security patches affected the ability of these ports to connect to PostgreSQL DBs that require SSL.
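
One way to narrow this down is to run the same psql client against the same server with only `sslmode` changed, and to record whether each attempt fails cleanly or is killed by a signal. The following is a minimal sketch, not part of the original report; the helper names (`conninfo`, `psql_command`, `probe`) are hypothetical, and it assumes `psql` is on the PATH and that the connection values contain no spaces.

```python
import subprocess


def conninfo(host, port, user, sslmode):
    """Build a libpq keyword/value connection string.

    Assumption: none of the values contain spaces or quotes.
    """
    return f"host={host} port={port} user={user} sslmode={sslmode}"


def psql_command(conninfo_str):
    # -w: never prompt for a password; -c: run one statement and exit.
    return ["psql", "-w", "-c", "SELECT 1", conninfo_str]


def probe(host, port, user, modes=("disable", "require")):
    """Run psql once per sslmode; return {mode: (return code, last stderr line)}."""
    results = {}
    for mode in modes:
        proc = subprocess.run(
            psql_command(conninfo(host, port, user, mode)),
            capture_output=True,
            text=True,
        )
        stderr_lines = proc.stderr.strip().splitlines()
        results[mode] = (proc.returncode, stderr_lines[-1] if stderr_lines else "")
    return results
```

Calling `probe("supah", 31432, "qorusapi")` would use the host, port and user from the command above. On POSIX systems a negative return code means the process was killed by a signal (for example -6 for SIGABRT), which distinguishes the abort shown above from an ordinary authentication or configuration failure.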

Change History (16)

comment:1 Changed 6 months ago by ryandesign (Ryan Carsten Schmidt)

Cc: barracuda156 neverpanic added; vital.had@… removed
Keywords: postgresql removed
Summary: changed from "postgresql16 16.1 cannot connect to servers that require SSL since last update" to "postgresql16 @16.1: cannot connect to servers that require SSL since last update"

Seems like the update to openssl3 @3.2.0 has caused problems for many ports.

comment:2 Changed 6 months ago by neverpanic (Clemens Lang)

Which version of macOS or OS X are you running?

comment:3 Changed 6 months ago by davidnich (David Nichols)

> Which version of macOS or OS X are you running?

14.1.1 (arm64)

comment:4 Changed 6 months ago by neverpanic (Clemens Lang)

I cannot reproduce that:

$ sudo port install postgresql16-server
$ sudo port select postgresql postgresql16
$ sudo mkdir -p /opt/local/var/db/postgresql16/defaultdb
$ sudo chown postgres:postgres /opt/local/var/db/postgresql16/defaultdb
$ sudo -u postgres /bin/sh -c 'cd /opt/local/var/db/postgresql16 && /opt/local/lib/postgresql16/bin/initdb -D /opt/local/var/db/postgresql16/defaultdb'
$ sudo -u postgres openssl req -new -x509 -days 365 -nodes -out /opt/local/var/db/postgresql16/server.crt -keyout /opt/local/var/db/postgresql16/server.key -subj "/CN=localhost"
$ sudo -u postgres /bin/sh -c "(echo \"ssl = on\"; echo \"ssl_cert_file = 'server.crt'\"; echo \"ssl_key_file = 'server.key'\") >/opt/local/var/db/postgresql16/postgresql.conf"
$ sudo port load postgresql16-server
$ psql "sslmode=require host=localhost user=postgres dbname=postgres"
psql (16.1)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=#
\q

I installed PostgreSQL (both the server and the client) from the precompiled binary, which was built before the OpenSSL 3.2.0 update and should therefore have been built against OpenSSL 3.1.4.

I'm on 13.6.2, also on arm64.

Do you have any further details of the server setup?

There is also a similar report at https://lists.macports.org/pipermail/macports-users/2023-November/052378.html, so I am sure there is some issue, but it would help if I could reproduce it.

comment:5 Changed 6 months ago by neverpanic (Clemens Lang)

I did manage to generate a backtrace that might be related using pgcli:

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace SIGNAL, Code 6 Abort trap: 6
Terminating Process:   Python [17597]

Application Specific Information:
abort() called


Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	       0x1aa948744 __pthread_kill + 8
1   libsystem_pthread.dylib       	       0x1aa97fc28 pthread_kill + 288
2   libsystem_c.dylib             	       0x1aa88dae8 abort + 180
3   libsystem_malloc.dylib        	       0x1aa7aee28 malloc_vreport + 908
4   libsystem_malloc.dylib        	       0x1aa7c55d4 malloc_zone_error + 104
5   libsystem_malloc.dylib        	       0x1aa7b8754 free_small_botch + 40
6   libpq.5.15.dylib              	       0x10587a72c freePGconn + 332
7   _psycopg.cpython-310-darwin.so	       0x10582e530 conn_close_locked + 48
8   _psycopg.cpython-310-darwin.so	       0x10582e4e4 conn_close + 76
9   _psycopg.cpython-310-darwin.so	       0x10582f004 connection_dealloc + 48
10  Python                        	       0x104ce9b78 type_call + 352
11  Python                        	       0x104c8fa84 _PyObject_MakeTpCall + 136
12  Python                        	       0x104c90ac4 _PyObject_CallFunctionVa + 312
13  Python                        	       0x104c90d2c _PyObject_CallFunction_SizeT + 48
14  _psycopg.cpython-310-darwin.so	       0x10583b258 psyco_connect + 208
15  Python                        	       0x104cd3494 cfunction_call + 60
16  Python                        	       0x104c9052c _PyObject_Call + 128
17  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
18  Python                        	       0x104d563f8 _PyEval_Vector + 368
19  Python                        	       0x104c903e8 PyVectorcall_Call + 176
20  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
21  Python                        	       0x104d563f8 _PyEval_Vector + 368
22  Python                        	       0x104c92970 method_vectorcall + 124
23  Python                        	       0x104c903e8 PyVectorcall_Call + 176
24  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
25  Python                        	       0x104d563f8 _PyEval_Vector + 368
26  Python                        	       0x104c8fe4c _PyObject_FastCallDictTstate + 208
27  Python                        	       0x104cf1368 slot_tp_init + 196
28  Python                        	       0x104ce9b34 type_call + 284
29  Python                        	       0x104c9052c _PyObject_Call + 128
30  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
31  Python                        	       0x104d563f8 _PyEval_Vector + 368
32  Python                        	       0x104c92970 method_vectorcall + 124
33  Python                        	       0x104c903e8 PyVectorcall_Call + 176
34  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
35  Python                        	       0x104d563f8 _PyEval_Vector + 368
36  Python                        	       0x104c92970 method_vectorcall + 124
37  Python                        	       0x104d61d00 call_function + 124
38  Python                        	       0x104d57f9c _PyEval_EvalFrameDefault + 3164
39  Python                        	       0x104d563f8 _PyEval_Vector + 368
40  Python                        	       0x104c903e8 PyVectorcall_Call + 176
41  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
42  Python                        	       0x104d563f8 _PyEval_Vector + 368
43  Python                        	       0x104c92970 method_vectorcall + 124
44  Python                        	       0x104c903e8 PyVectorcall_Call + 176
45  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
46  Python                        	       0x104d563f8 _PyEval_Vector + 368
47  Python                        	       0x104d61d00 call_function + 124
48  Python                        	       0x104d5a840 _PyEval_EvalFrameDefault + 13568
49  Python                        	       0x104d563f8 _PyEval_Vector + 368
50  Python                        	       0x104c92a88 method_vectorcall + 404
51  Python                        	       0x104d5a7a0 _PyEval_EvalFrameDefault + 13408
52  Python                        	       0x104d563f8 _PyEval_Vector + 368
53  Python                        	       0x104c8fddc _PyObject_FastCallDictTstate + 96
54  Python                        	       0x104cf02a8 slot_tp_call + 196
55  Python                        	       0x104c8fa84 _PyObject_MakeTpCall + 136
56  Python                        	       0x104d61d94 call_function + 272
57  Python                        	       0x104d58098 _PyEval_EvalFrameDefault + 3416
58  Python                        	       0x104d563f8 _PyEval_Vector + 368
59  Python                        	       0x104d56274 PyEval_EvalCode + 104
60  Python                        	       0x104da1be0 run_eval_code_obj + 84
61  Python                        	       0x104da1b44 run_mod + 112
62  Python                        	       0x104da1968 pyrun_file + 148
63  Python                        	       0x104da13b8 _PyRun_SimpleFileObject + 268
64  Python                        	       0x104da0d6c _PyRun_AnyFileObject + 216
65  Python                        	       0x104dbb48c pymain_run_file_obj + 220
66  Python                        	       0x104dbade4 pymain_run_file + 72
67  Python                        	       0x104dba740 Py_RunMain + 856
68  Python                        	       0x104dbb7d8 Py_BytesMain + 40
69  dyld                          	       0x1aa627f28 start + 2236

Thread 1:
0   libsystem_pthread.dylib       	       0x1aa97ad8c start_wqthread + 0

Thread 2:
0   libsystem_pthread.dylib       	       0x1aa97ad8c start_wqthread + 0

This seems to happen during connection closing. I suspect there might be an issue with OpenSSL's teardown that causes a double free.
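
As a rough illustration of how that conclusion can be read off the crash report, the sketch below scans backtrace lines for the first frame that is not in an Apple `libsystem_*` library. The regular expression and the helper name `first_non_system_frame` are assumptions made for this example and only cover the frame layout shown above.

```python
import re

# Matches frames such as:
# "6   libpq.5.15.dylib   0x10587a72c freePGconn + 332"
FRAME = re.compile(r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-f]+\s+(\S+) \+ \d+\s*$")


def first_non_system_frame(report):
    """Return (frame number, image, symbol) of the first frame outside libsystem_*."""
    for line in report.splitlines():
        match = FRAME.match(line)
        if match and not match.group(2).startswith("libsystem_"):
            return int(match.group(1)), match.group(2), match.group(3)
    return None


# A few frames copied from the backtrace above (whitespace normalized).
SAMPLE = """\
4   libsystem_malloc.dylib          0x1aa7c55d4 malloc_zone_error + 104
5   libsystem_malloc.dylib          0x1aa7b8754 free_small_botch + 40
6   libpq.5.15.dylib                0x10587a72c freePGconn + 332
7   _psycopg.cpython-310-darwin.so  0x10582e530 conn_close_locked + 48
"""

print(first_non_system_frame(SAMPLE))
```

For the sample frames this reports frame 6, `freePGconn` in `libpq.5.15.dylib`, reached from psycopg's `conn_close_locked`, which is the connection teardown path.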

The changelog for OpenSSL 3.2 also says "Major refactor of the libssl record layer." It's entirely possible that there is a memory corruption in those changes.

Another TLS-related change is that the default security level (seclevel) went from 1 to 2, which treats RSA, DSA and DH keys shorter than 2048 bits and ECC keys shorter than 224 bits as insecure. I suspect this isn't the issue, since the error message would be different, but it may be worth checking.
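
To make those thresholds concrete, here is a small sketch that encodes only the minimum key sizes quoted above for security level 2. The table and the function name `acceptable_at_level2` are illustrative assumptions; this is not how OpenSSL implements the check, and it ignores the other criteria a security level applies.

```python
# Minimum key sizes (in bits) at security level 2, as quoted in the comment above.
LEVEL2_MIN_BITS = {"RSA": 2048, "DSA": 2048, "DH": 2048, "ECC": 224}


def acceptable_at_level2(key_type, bits):
    """Return True if a key of this type and size meets the level 2 minimum."""
    return bits >= LEVEL2_MIN_BITS[key_type.upper()]


for key_type, bits in [("RSA", 1024), ("RSA", 2048), ("ECC", 256)]:
    verdict = "ok" if acceptable_at_level2(key_type, bits) else "too small"
    print(f"{key_type} {bits}: {verdict}")
```

A server certificate with a 1024-bit RSA key would fall below this threshold, while a 2048-bit RSA key or a 256-bit ECC key would not.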

comment:6 Changed 6 months ago by rufty (Bill Hill)

Cc: rufty added

comment:7 Changed 6 months ago by davidnich (David Nichols)

Cc: rufty removed

> Do you have any further details of the server setup?

It's Crunchy Postgres for Kubernetes: https://access.crunchydata.com/documentation/postgres-operator/latest/quickstart

It's running server v15.3. The setup is pretty plain except for a few custom PostgreSQL and Kubernetes configuration settings.

I'll be happy to provide you whatever info you need to help debug. I can say that the server appears to be working fine; it's just inaccessible from any program on my Mac desktop using MacPorts with openssl 3.2.

On my Mac laptop, which is still on openssl @3.1.4 from MacPorts, everything works normally.

The openssl version on the server is:

[postgres@hq-instance1-8cdx-0 /]$ openssl version
OpenSSL 1.1.1k  FIPS 25 Mar 2021

comment:8 Changed 6 months ago by Dave-Allured (Dave Allured)

Cc: Dave-Allured added

comment:9 Changed 6 months ago by jmroot (Joshua Root)

Potentially relevant: https://github.com/openssl/openssl/pull/22820

comment:10 in reply to:  9 Changed 6 months ago by neverpanic (Clemens Lang)

Replying to jmroot:

> Potentially relevant: https://github.com/openssl/openssl/pull/22820

Thanks, although that seems to be in the DANE code path. Do any of the servers where you see the issue have DANE records configured?

comment:11 Changed 6 months ago by davidnich (David Nichols)

> Thanks, although that seems to be in the DANE code path. Do any of the servers where you see the issue have DANE records configured?

Not in my case.

comment:12 Changed 6 months ago by fgunbin

Cc: fgunbin added

comment:13 Changed 6 months ago by ryandesign (Ryan Carsten Schmidt)

Cc: jrabinow added
Port: openssl3 added

Has duplicate #68791.

comment:14 Changed 6 months ago by dgilman (David Gilman)

Upstream is discussing the issue and has an initial patch. I can do a revision update with the patch once they've merged it. https://www.postgresql.org/message-id/flat/CAN55FZ1eDDYsYaL7mv%2BoSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ%40mail.gmail.com

comment:15 Changed 6 months ago by davidnich (David Nichols)

> Upstream is discussing the issue and has an initial patch. I can do a revision update with the patch once they've merged it. https://www.postgresql.org/message-id/flat/CAN55FZ1eDDYsYaL7mv%2BoSLUij2h_u6hvD4Qmv-7PK7jkji0uyQ%40mail.gmail.com

Amazing - thanks for staying on top of this!

comment:16 Changed 6 months ago by dgilman (David Gilman)

Resolution: fixed
Status: changed from assigned to closed

In 279d7693305477e1209f4aab5a4d22e5a2ba5806/macports-ports (master):

postgresql16: fix openssl 3.2 support

Closes: #68777
