Opened 8 months ago

Closed 5 months ago

#68329 closed defect (fixed)

py311-scipy @1.10.1_0+gfortran+openblas not building on Sonoma apple silicon

Reported by: quintusdias (John G Evans) Owned by: michaelld (Michael Dickens)
Priority: Normal Milestone:
Component: ports Version: 2.8.1
Keywords: Cc: cjones051073 (Chris Jones), greyhare, ddrum2000, khorton (Kevin Horton), bal-agates, markmentovai (Mark Mentovai), mf2k (Frank Schima), Dave-Allured (Dave Allured), jowens (John Owens), jpinedaf (Jaime Pineda), nilason (Nicklas Larsson), amagela (Anthony M. Agelastos), mkuron (Michael Kuron), cooljeanius (Eric Gallager)
Port: py-scipy

Description

I'm testing on Sonoma so everything must be built from source.

py311-scipy fails to build on my M1 laptop, but builds just fine on x86_64. The full log is attached. The problem seems to start with

:info:build Error compiling Cython file:                                            
:info:build ------------------------------------------------------------            
:info:build ...                                                                     
:info:build from libcpp.memory cimport unique_ptr                                   
:info:build np.import_array()                                                       
:info:build IF not NPY_OLD:                                                         
:info:build     from numpy.random cimport bitgen_t                                  
:info:build    ^                                                                    
:info:build ------------------------------------------------------------     

but again, this built just fine on x86_64.

Attachments (3)

main.log (122.8 KB) - added by quintusdias (John G Evans) 8 months ago.
build log on sonoma, arm64
main2.log (118.9 KB) - added by cjones051073 (Chris Jones) 8 months ago.
otool-libopenblas.txt (6.1 KB) - added by bal-agates 7 months ago.
otool -l of libopenblas.dylib

Change History (42)

Changed 8 months ago by quintusdias (John G Evans)

Attachment: main.log added

build log on sonoma, arm64

comment:1 Changed 8 months ago by jmroot (Joshua Root)

Cc: michaelld@… removed
Owner: set to michaelld
Status: new → assigned

comment:2 Changed 8 months ago by cjones051073 (Chris Jones)

I am also running into this, although the error seems to be slightly different for me. The important part would appear to be

INFO: /usr/bin/clang /opt/local/var/macports/build/_Users_chris_Projects_MacPorts_ports_python_py-scipy/py311-scipy/work/.tmp/tmplvun8a3d/opt/local/var/macports/build/_Users_chris_Projects_MacPorts_ports_python_py-scipy/py311-scipy/work/.tmp/tmplvun8a3d/source.o -L/opt/local/lib -lopenblas -o /opt/local/var/macports/build/_Users_chris_Projects_MacPorts_ports_python_py-scipy/py311-scipy/work/.tmp/tmplvun8a3d/a.out
ld: duplicate LC_RPATH '/opt/local/lib/libgcc' in '/opt/local/lib/libopenblas-r1.dylib'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

I strongly suspect this is just the Xcode 15 linker issue again, having trouble with the openblas dylib which was created in part with GCC.

I've tried a few ways to get the build to go back to the legacy linker, which should work with something like

if {${os.platform} eq "darwin" && ${os.major} >= 22} {
    if { [vercmp ${xcodeversion} >= 15.0] || [vercmp ${xcodecltversion} >= 15.0] } {
        # On macOS13 and newer ensure the 'legacy' linker is used as GCC currently has problems
        # with the new default linker in Xcode 15. See e.g.
        # https://developer.apple.com/documentation/xcode-release-notes/xcode-15-release-notes#Linking
        # https://discussions.apple.com/thread/255137447
        # https://developer.apple.com/forums/thread/737707
        # https://github.com/Homebrew/homebrew-core/issues/145991
        configure.ldflags-append  -Wl,-ld_classic
    }
}

but this port does not seem to be respecting the above flags.

Changed 8 months ago by cjones051073 (Chris Jones)

Attachment: main2.log added

comment:3 Changed 7 months ago by greyhare

I'm seeing the same problems that Chris is seeing. I can attach a log if it'll help.

comment:4 Changed 7 months ago by cjones051073 (Chris Jones)

Cc: cjones051073 added

comment:5 Changed 7 months ago by greyhare

Cc: greyhare added

comment:6 Changed 7 months ago by greyhare

If I switch to the +gcc13+openblas variant, I get the failure in #68014.

comment:7 Changed 7 months ago by andeux (Andy Lewis)

I get the same error as Chris Jones reports, both on an M2 machine and on an older Intel Mac. So I don't think this linker issue is architecture related, and it may be an entirely different bug than the one in the original report.

comment:8 Changed 7 months ago by bal-agates

I get the same error that Chris Jones reports:

duplicate LC_RPATH '/opt/local/lib/libgcc' in '/opt/local/lib/libopenblas-r1.dylib'

I am building on MacBook Pro 2021 (M1) with Sonoma (14.0) with Xcode 15.0 installed. Due to a recent [unanticipated] macOS upgrade I deleted all MacPorts and tried to rebuild everything. py311-scipy is one of the few ports that didn't build. I really need this to build QGIS3.

libopenblas-r1.dylib is provided by port OpenBLAS. Note that I successfully built hugin-app that also depends on port OpenBLAS. One difference I see between py311-scipy (link problems) and hugin-app (builds) is that py311-scipy depends on gcc13. So I suspect something in the linker tool chain. There appears to be a lot of "magic" in the Apple linker tool chain [cached system libraries] that I do not understand. Here is what I see on some relevant library paths on my system.

$ otool -L /opt/local/lib/libopenblas-r1.dylib
/opt/local/lib/libopenblas-r1.dylib:
	/opt/local/lib/libopenblas-r1.dylib (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libgfortran.5.dylib (compatibility version 6.0.0, current version 6.0.0)
	@rpath/libquadmath.0.dylib (compatibility version 1.0.0, current version 1.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

$ otool -L /opt/local/lib/libgcc/libgfortran.5.dylib
/opt/local/lib/libgcc/libgfortran.5.dylib:
	@rpath/libgfortran.5.dylib (compatibility version 6.0.0, current version 6.0.0)
	@rpath/libquadmath.0.dylib (compatibility version 1.0.0, current version 1.0.0)
	@rpath/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

$ otool -L /opt/local/lib/libgcc/libgcc_s.1.1.dylib
/opt/local/lib/libgcc/libgcc_s.1.1.dylib (architecture arm64):
	@rpath/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.0.0)

Note that "/usr/lib/libSystem.B.dylib" is one of those cached libraries that doesn't exist but somehow gets magically resolved. I am not sure how to troubleshoot further.

comment:9 Changed 7 months ago by ddrum2000

Cc: ddrum2000 added

comment:10 Changed 7 months ago by khorton (Kevin Horton)

Cc: khorton added

comment:11 Changed 7 months ago by bal-agates

Cc: bal-agates added

Changed 7 months ago by bal-agates

Attachment: otool-libopenblas.txt added

otool -l of libopenblas.dylib

comment:12 Changed 7 months ago by bal-agates

I have verified on my system libopenblas.dylib does have duplicate LC_RPATH's. See the attached otool-libopenblas.txt (with slightly different command line options than what I used above). A snippet is:

Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)

I have read that some older linkers will ignore this but that newer [Apple] linkers generate errors.
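The check described above can be automated. The following is a minimal sketch, not part of the port or of MacPorts, that scans text in the `otool -l` format shown in this comment and reports any LC_RPATH path that occurs more than once. The function name `find_duplicate_rpaths` is hypothetical, and the parser assumes the layout seen in the snippet (`cmd LC_RPATH` followed by a `path ... (offset N)` line).

```python
from collections import Counter


def find_duplicate_rpaths(otool_output: str) -> dict:
    """Return {path: count} for each LC_RPATH path listed more than once.

    Assumes the text layout printed by `otool -l` in this ticket.
    """
    paths = []
    in_rpath = False
    for line in otool_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("cmd "):
            # A new load command begins; remember whether it is an LC_RPATH.
            in_rpath = stripped.split()[1] == "LC_RPATH"
        elif in_rpath and stripped.startswith("path "):
            # Drop the leading "path " and the trailing " (offset N)".
            paths.append(stripped[len("path "):].rsplit(" (offset", 1)[0])
            in_rpath = False
    return {p: n for p, n in Counter(paths).items() if n > 1}


SAMPLE = """\
Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
"""

print(find_duplicate_rpaths(SAMPLE))
```

In practice the input would be the captured output of `otool -l` on the dylib in question; an empty result means no duplicated LC_RPATH entries were found.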

In the log file it looks like "clang" generated the duplicate LC_RPATH error. I am a little confused because the py311-scipy portfile has

compilers.setup         require_fortran -clang -gcc44 -gcc45 -gcc46 \
                        -gcc47 -gcc48 -g95

Doesn't "-clang" mean to disqualify clang? The log file also has

:debug:main compiler clang 1500.0.40.1 not blacklisted because it doesn't match {clang < 1001}

This version of clang appears to be the version supplied by Apple with Xcode 15.0 which I believe is sensitive to duplicate LC_RPATH.

On my system I currently have

$ port select --summary                                                                                           
Name        Selected        Options
====        ========        =======
clang       mp-clang-17     mp-clang-14 mp-clang-16 mp-clang-17 none
gcc         mp-gcc13        mp-gcc13 none

$ gcc --version
Apple clang version 15.0.0 (clang-1500.0.40.1)

$ which gcc
/usr/bin/gcc

but shouldn't gcc be the version provided by the MacPorts gcc13 port?

$ /opt/local/bin/arm64-apple-darwin23-gcc-mp-13 --version
arm64-apple-darwin23-gcc-mp-13 (MacPorts gcc13 13.2.0_3+stdlib_flag) 13.2.0

Questions:

1) Is the best path forward to figure out why libopenblas.dylib has duplicate LC_RPATHs and fix that?

2) Why was clang used when building py311-scipy? Is this just cmake trying to determine whether BLAS and LAPACK libraries exist?

3) What was the intent of "-clang" in the py311-scipy portfile, and is it behaving as expected?

comment:13 Changed 7 months ago by cjones051073 (Chris Jones)

I do not believe the duplicate LC_RPATH messages are the real issue here. They are just warnings, so whilst something we may need to clean up ourselves (assuming GCC upstream doesn't do something to prevent them) I think for now we ignore them.

I am sure the issue here is just the new linker in Xcode 15.0 causing problems when asked to link against binaries made with GCC.

There are ways to address this, by forcing the linker to run a 'classic' version instead. However, the rumours are that Apple have fixed this in Xcode 15.1, so another option is to just wait for that to come out and see if it indeed cures all the Xcode 15.0 issues we are currently seeing.

Last edited 7 months ago by cjones051073 (Chris Jones)

comment:14 Changed 7 months ago by markmentovai (Mark Mentovai)

@cjones051073 comment:13

I do not believe the duplicate LC_RPATH messages are the real issue here. They are just warnings, so whilst something we may need to clean up ourselves (assuming GCC upstream doesn't do something to prevent them) I think for now we ignore them.

I’m not so sure. They’re not warnings, they’re the source of linker failures. Observe:

mark@arm-and-hammer zsh% cat main.c
int main(int argc, char * argv[]) { return 0; }
mark@arm-and-hammer zsh% otool -l /opt/local/lib/libopenblas.dylib| grep -B1 -A2 LC_RPATH
Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 32
         path /opt/local/lib (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 72
         path /opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0 (offset 12)
Load command 20
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/gcc13 (offset 12)
mark@arm-and-hammer zsh% clang main.c -L/opt/local/lib -lopenblas && echo succeeded || echo failed
ld: duplicate LC_RPATH '/opt/local/lib/libgcc' in '/opt/local/lib/libopenblas-r1.dylib'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
failed

But if the duplicate LC_RPATH is removed, the link succeeds:

mark@arm-and-hammer zsh% cp /opt/local/lib/libopenblas.dylib .             
mark@arm-and-hammer zsh% install_name_tool -delete_rpath /opt/local/lib/libgcc libopenblas.dylib
mark@arm-and-hammer zsh% otool -l libopenblas.dylib | grep -B1 -A2 LC_RPATH
Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 32
         path /opt/local/lib (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 72
         path /opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0 (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/gcc13 (offset 12)
mark@arm-and-hammer zsh% clang main.c -L. -lopenblas && echo succeeded || echo failed      
succeeded

You may think that this isn’t compelling enough, in case install_name_tool might be doing something else to the module. (It does, it makes changes in the __LINKEDIT segment.) So also consider:

mark@arm-and-hammer zsh% python3
Python 3.11.6 (main, Oct  6 2023, 10:09:23) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> b = bytearray(open('/opt/local/lib/libopenblas.dylib', 'rb').read())
>>> path = b'/opt/local/lib/libgcc'
>>> b[b.find(path, b.find(path) + 1) + len(path) - 1] += 1
>>> open('libopenblas.dylib', 'wb').write(b)
13067496
>>> exit()
mark@arm-and-hammer zsh% otool -l libopenblas.dylib | grep -B 1 -A 2 LC_RPATH
Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcd (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 32
         path /opt/local/lib (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 72
         path /opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0 (offset 12)
Load command 20
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/gcc13 (offset 12)
mark@arm-and-hammer zsh% clang main.c -L. -lopenblas && echo succeeded || echo failed                                 
succeeded
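The one-liner in the Python session above is dense, so here is a small sketch of the same trick on an in-memory buffer. It finds the second occurrence of a byte string and increments its final byte, which is how `/opt/local/lib/libgcc` became `/opt/local/lib/libgcd` without changing the file size or any offsets. The helper name and the toy `blob` are illustrative assumptions, not real dylib contents.

```python
def bump_last_byte_of_second_match(data: bytes, needle: bytes) -> bytes:
    """Return a copy of data with the last byte of the second needle match incremented."""
    buf = bytearray(data)
    first = buf.find(needle)
    # Start the second search one byte past the first match.
    second = buf.find(needle, first + 1) if first != -1 else -1
    if second == -1:
        raise ValueError("needle does not occur twice")
    # Same length, same offsets: only one byte differs afterwards.
    buf[second + len(needle) - 1] += 1
    return bytes(buf)


# Toy stand-in for two identical LC_RPATH strings inside a binary.
blob = b"\x00/opt/local/lib/libgcc\x00\x00/opt/local/lib/libgcc\x00"
patched = bump_last_byte_of_second_match(blob, b"/opt/local/lib/libgcc")
print(patched)
```

Because nothing else in the buffer moves, a successful link afterwards points at the duplicated path itself as the cause, rather than at any side effect of rewriting the file.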

An “easy” workaround for this would be to run install_name_tool -delete_rpath on libopenblas.dylib. But it would be better to find where the duplicate is coming from in the first place, and fix that.

comment:15 Changed 7 months ago by markmentovai (Mark Mentovai)

Cc: markmentovai added

comment:16 Changed 7 months ago by kencu (Ken)

I have installed Xcode 15.1 beta and the 15.1 beta CLTs (because life is too short to waste time with broken linker).

The build failed as above.

I then ran:

sudo install_name_tool -delete_rpath /opt/local/lib/libgcc /opt/local/lib/libopenblas-r1.dylib

and after that the build completed without issue:

% port -v installed py311-scipy
The following ports are currently installed:
  py311-scipy @1.10.1_0+gfortran+openblas (active) requested_variants='' platform='darwin 23' archs='arm64' date='2023-10-20T20:53:55-0700'
Last edited 7 months ago by kencu (Ken)

comment:17 Changed 7 months ago by markmentovai (Mark Mentovai)

ld-classic, which MacPorts gcc is configured to use on macOS ≥ 14 with Xcode ≥ 15, says that it’s ignoring the duplicate -rpath, but it actually isn’t.

mark@arm-and-hammer zsh% cat lib.c 
void LibFunc() {}
mark@arm-and-hammer zsh% clang -c lib.c -o lib.o
mark@arm-and-hammer zsh% ld-classic -dylib lib.o -o liblib.dylib -L$(xcrun --show-sdk-path)/usr/lib -lSystem -rpath /opt/local/lib/libgcc -rpath /opt/local/lib/libgcc
ld: warning: duplicate -rpath '/opt/local/lib/libgcc' ignored
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 2 LC_RPATH
Load command 11
          cmd LC_RPATH
      cmdsize 24
         path /opt/local/lib/libgcc (offset 12)
Load command 12
          cmd LC_RPATH
      cmdsize 24
         path /opt/local/lib/libgcc (offset 12)

Compare that with ld (not -classic), which says that it’s ignoring the duplicate and then actually does ignore it.

mark@arm-and-hammer zsh% ld -dylib lib.o -o liblib.dylib -L$(xcrun --show-sdk-path)/usr/lib -lSystem -rpath /opt/local/lib/libgcc -rpath /opt/local/lib/libgcc 
ld: warning: duplicate -rpath '/opt/local/lib/libgcc' ignored
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 2 LC_RPATH
Load command 11
          cmd LC_RPATH
      cmdsize 24
         path /opt/local/lib/libgcc (offset 12)

When driven by, say, gcc-mp-13, the compiler driver inserts its own -rpath /opt/local/lib/libgcc. This is in addition to the -rpath /opt/local/lib/libgcc that it inserts when directed to do so by the OpenBLAS build, which specifies -Wl,-rpath,/opt/local/lib/libgcc.

mark@arm-and-hammer zsh% diff -U7 \
    <(gcc-mp-13 -dynamiclib lib.o -o liblib.dylib -v 2>&1 | grep collect2 | tr ' ' '\n') \
    <(gcc-mp-13 -dynamiclib lib.o -o liblib.dylib -rpath /opt/local/lib/libgcc -v 2>&1 | grep collect2 | tr ' ' '\n')
--- /dev/fd/11	2023-10-21 00:17:58
+++ /dev/fd/12	2023-10-21 00:17:58
@@ -11,14 +11,16 @@
 0.0
 -o
 liblib.dylib
 -L/opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0
 -L/opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0/../../..
 lib.o
 -dylib
+-rpath
+/opt/local/lib/libgcc
 -lemutls_w
 -lgcc
 -lSystem
 -lgcc
 -no_compact_unwind
 -rpath
 /opt/local/lib/libgcc

The extra -rpath in the OpenBLAS build is coming from PortGroup compilers 1.0. This appears to be incorrect for the macOS ≥ 14, Xcode ≥ 15 combination. When I remove that locally and build OpenBLAS, I wind up with:

mark@arm-and-hammer zsh% otool -l /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.24/libopenblas-r1.dylib | grep -B 1 -A 2 LC_RPATH
Load command 15
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/libgcc (offset 12)
Load command 16
          cmd LC_RPATH
      cmdsize 32
         path @loader_path (offset 12)
Load command 17
          cmd LC_RPATH
      cmdsize 32
         path /opt/local/lib (offset 12)
Load command 18
          cmd LC_RPATH
      cmdsize 72
         path /opt/local/lib/gcc13/gcc/arm64-apple-darwin23/13.2.0 (offset 12)
Load command 19
          cmd LC_RPATH
      cmdsize 40
         path /opt/local/lib/gcc13 (offset 12)

This is positive, but then the OpenBLAS build doesn’t complete because another portion of it uses clang, which doesn’t know anything about the libgcc -rpath.

So perhaps the solution is to patch MacPorts gcc to handle this situation, to go along with the --with-darwin-extra-rpath patch, since that’s really the underlying cause: if a single -Wl,-rpath matching the --with-darwin-extra-rpath is found, let the first one ride and omit the rest.

Or maybe, adopt the simpler approach of just ignoring a duplicate -rpath heading to the linker, which the ld-classic says it’s going to do anyway but isn’t actually doing.
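To make the second option concrete, here is a rough sketch of "drop any repeated -rpath before it reaches the linker", written in Python purely for illustration. It is not gcc's driver code; `drop_duplicate_rpaths` is a hypothetical name, and it only handles the two-token `-rpath PATH` form seen in the collect2 argument list above.

```python
def drop_duplicate_rpaths(args):
    """Return args with every repeated '-rpath PATH' pair removed, keeping the first."""
    seen = set()
    out = []
    i = 0
    while i < len(args):
        if args[i] == "-rpath" and i + 1 < len(args):
            path = args[i + 1]
            if path not in seen:
                seen.add(path)
                out.extend(["-rpath", path])
            i += 2  # skip the flag and its argument together
        else:
            out.append(args[i])
            i += 1
    return out


# Tail of the collect2 argument list from the diff above.
linker_args = [
    "lib.o", "-dylib",
    "-rpath", "/opt/local/lib/libgcc",
    "-lemutls_w", "-lgcc", "-lSystem", "-lgcc",
    "-no_compact_unwind",
    "-rpath", "/opt/local/lib/libgcc",
]
print(drop_duplicate_rpaths(linker_args))
```

Other repeated arguments, such as the two `-lgcc` entries, are deliberately left alone, since only the duplicated rpath is at issue here.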

Last edited 7 months ago by markmentovai (Mark Mentovai)

comment:18 Changed 7 months ago by kencu (Ken)

It might get tricky to always have no duplicate rpath (or other) flags on the link line, but of course we should do our best. They come in as ldflags, or as -Wl,flags, so they don’t always exactly match on the driver line…

Once Xcode 15.1 is released, our gcc should be able to start using ld instead of ld-classic.

If I read your analysis correctly, that might be enough to fix this issue.

Last edited 7 months ago by kencu (Ken)

comment:19 Changed 7 months ago by mf2k (Frank Schima)

Cc: mf2k added

comment:20 Changed 7 months ago by mf2k (Frank Schima)

I can confirm that commenting out line 182 of the gcc13 Portfile and rebuilding anything that gcc13 touches (including OpenBLAS) allowed py311-scipy to build for me. I'm using the Xcode 15.1 and CLT beta.

comment:21 Changed 7 months ago by Dave-Allured (Dave Allured)

Cc: Dave-Allured added

comment:22 Changed 7 months ago by markmentovai (Mark Mentovai)

I do think that it’s possible to fix this for Xcode 15.0, and in the process, fix a latent bug that may not have been appreciated until now. This can be done in a way that’s not a hack and isn’t wrong for future versions of Xcode.

We don’t have the full source for the new ld, but some of its source code is shared with dyld:

mark@arm-and-hammer zsh% ld -v 2>&1 | head -1
@(#)PROGRAM:ld  PROJECT:dyld-1015.7

The duplicate LC_RPATH messages are in the shared source that we can see, and come from mach_o::Header::validSemanticsRPath. There, you can see the logic that determines whether this situation is a warning or an error: the condition is based on enforceNoDupRPath, which relies on mach_o::Policy::enforceNoDuplicateRPaths, which is in turn based on the SDK version. enforceNoDuplicateRPaths is not supposed to become a hard error until Platform::Epoch::fall2024, which would be next year’s batch of OS releases. So why are we seeing this as a hard error today, in fall 2023?

The SDK used to link a module is embedded in its LC_BUILD_VERSION load command, and can be configured by the -platform_version option to ld. The compiler driver will normally pass rational values to the linker using this option. For example, here’s what Xcode 15.0.1 clang-1500.0.40.1 does:

mark@arm-and-hammer zsh% clang -dynamiclib lib.c -o liblib.dylib -Wl,-rpath,/tmp/lib -Wl,-rpath,/tmp/lib -Wl,-ld_classic -v 2>&1 | grep /ld | tr ' ' '\n' | grep -A 3 platform_version
-platform_version
macos
14.0.0
14.0
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 7 LC_BUILD_VERSION
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk 14.0
   ntools 1
     tool 3
  version 907.0
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 2 LC_RPATH
Load command 11
          cmd LC_RPATH
      cmdsize 24
         path /tmp/lib (offset 12)
Load command 12
          cmd LC_RPATH
      cmdsize 24
         path /tmp/lib (offset 12)

And even though I’ve managed to embed two identical LC_RPATH load commands (by using ld_classic), I can link against this module with “only” a warning:

mark@arm-and-hammer zsh% clang main.c -L. -llib && echo succeeded || echo failed
ld: warning: duplicate LC_RPATH are deprecated ('/tmp/lib')
succeeded

But repeat with MacPorts gcc:

mark@arm-and-hammer zsh% gcc-mp-13 -dynamiclib lib.c -o liblib.dylib -Wl,-rpath,/tmp/lib -Wl,-rpath,/tmp/lib -Wl,-ld_classic -v 2>&1 | grep collect2 | tr ' ' '\n' | grep -A 3 platform_version
-platform_version
macos
14.0.0
0.0
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 7 LC_BUILD_VERSION 
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk n/a
   ntools 1
     tool 3
  version 907.0
mark@arm-and-hammer zsh% clang main.c -L. -llib && echo succeeded || echo failed 
ld: duplicate LC_RPATH '/tmp/lib' in '/private/tmp/liblib.dylib'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
failed

man ld documents -platform_version:

     -platform_version platform min_version sdk_version
             This is set to indicate the platform, oldest supported version of
             that platform that output is to be used on, and the SDK that the
             output was built against.  platform is a numeric value as defined
             in <mach-o/loader.h>, or it may be one of the following strings:
             • macos
[…]
             Specifying a newer min or SDK version enables the linker to
             assume features of that OS or SDK in the output file. The format
             of min_version and sdk_version is a version number such as 10.13
             or 10.14

gcc passing -platform_version macos 14.0.0 0.0 is causing the SDK to not be properly recorded (the “n/a” in otool -l output), which is inadvertently triggering no-duplicate-LC_RPATH enforcement that we wouldn’t ordinarily see until next year.
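A quick way to spot affected modules is to look for the missing SDK in `otool -l` output. The sketch below is an illustration only: `recorded_sdk` is a hypothetical helper that assumes the LC_BUILD_VERSION layout printed in this comment and returns None when the SDK is shown as "n/a" (or when no LC_BUILD_VERSION is present at all).

```python
def recorded_sdk(otool_output: str):
    """Return the SDK version recorded in LC_BUILD_VERSION, or None if absent or 'n/a'."""
    in_build_version = False
    for line in otool_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "cmd":
            # Track whether the current load command is LC_BUILD_VERSION.
            in_build_version = fields[1] == "LC_BUILD_VERSION"
        elif in_build_version and fields[0] == "sdk":
            return None if fields[1] == "n/a" else fields[1]
    return None


GCC_LINKED = """\
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk n/a
   ntools 1
     tool 3
  version 907.0
"""

CLANG_LINKED = GCC_LINKED.replace("sdk n/a", "sdk 14.0")

print(recorded_sdk(GCC_LINKED), recorded_sdk(CLANG_LINKED))
```

A None result for a dylib built with MacPorts gcc would be consistent with the `-platform_version macos 14.0.0 0.0` behaviour described above, although it does not by itself prove that the module also carries duplicate LC_RPATH entries.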

I can show that simply adding the proper SDK version to the module allows it to be linked against even though it contains multiple identical LC_RPATH load commands:

mark@arm-and-hammer zsh% gcc-mp-13 -dynamiclib lib.c -o liblib.dylib -Wl,-rpath,/tmp/lib -Wl,-rpath,/tmp/lib -Wl,-ld_classic -Wl,-platform_version,macos,14.0.0,14.0
ld: warning: duplicate -rpath '/tmp/lib' ignored
ld: warning: passed two min versions (14.0, 14.0.0) for platform macOS. Using 14.0.0.
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 7 LC_BUILD_VERSION 
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk 14.0
   ntools 1
     tool 3
  version 907.0
mark@arm-and-hammer zsh% clang main.c -L. -llib && echo succeeded || echo failed 
ld: warning: duplicate LC_RPATH are deprecated ('/tmp/lib')
succeeded

So where does the 0.0 that gcc uses come from? Well, unfortunately, it’s hard-coded in gcc’s gcc/config/darwin.h. This is new in gcc 032b5da1fc78 (2023-07-13). The older method, using the older -macosx_version_min (or its newer synonym, -macos_version_min), doesn’t record this invalid SDK value.

mark@arm-and-hammer zsh% ld-classic -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.sdk/ -dylib -platform_version macos 14.0.0 0.0 -o liblib.dylib lib.o -lSystem -rpath /tmp/lib -rpath /tmp/lib
ld: warning: duplicate -rpath '/tmp/lib' ignored
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 7 LC_BUILD_VERSION 
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk n/a
   ntools 1
     tool 3
  version 907.0
mark@arm-and-hammer zsh% ld-classic -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.sdk/ -dylib -macos_version_min 14.0.0 -o liblib.dylib lib.o -lSystem -rpath /tmp/lib -rpath /tmp/lib
ld: warning: duplicate -rpath '/tmp/lib' ignored
mark@arm-and-hammer zsh% otool -l liblib.dylib | grep -B 1 -A 7 LC_BUILD_VERSION 
Load command 8
      cmd LC_BUILD_VERSION
  cmdsize 32
 platform 1
    minos 14.0
      sdk 14.0
   ntools 1
     tool 3
  version 907.0

So the easy thing to do here would be to revert gcc 032b5da1fc78 (using -macos_version_min instead of -macosx_version_min, as the latter will now issue deprecation warnings), going back to the older option, which infers the SDK version based on -syslibroot and records it properly. Longer-term, gcc should calculate the proper SDK version and pass it to -platform_version itself.

Under no circumstance should gcc be passing an SDK value of 0.0. Apple has taken to enabling and disabling features based on the SDK that a module was linked against, and by providing an invalid version in this spot, gcc-linked modules are very likely to continue to experience these sorts of unexplained disturbances.

comment:23 Changed 7 months ago by jowens (John Owens)

Cc: jowens added

comment:24 Changed 7 months ago by jowens (John Owens)

I am not sure this is related / the same: my install is failing because "No BLAS/LAPACK libraries found" but I have OpenBLAS installed, so not sure what's gone wrong. I'm vaguely thinking this is a different problem but my install will eventually fail for the same reason as those above?

$ sudo port clean py311-scipy
--->  Cleaning py311-scipy
$ port installed | grep -i openblas
  OpenBLAS @0.3.24_0+gcc13+lapack+native (active)
$ sudo port install py311-scipy
--->  Computing dependencies for py311-scipy
--->  Fetching archive for py311-scipy
--->  Attempting to fetch py311-scipy-1.10.1_0+gfortran+openblas.darwin_23.arm64.tbz2 from http://mirror.fcix.net/macports/packages/py311-scipy
--->  Attempting to fetch py311-scipy-1.10.1_0+gfortran+openblas.darwin_23.arm64.tbz2 from https://packages.macports.org/py311-scipy
--->  Attempting to fetch py311-scipy-1.10.1_0+gfortran+openblas.darwin_23.arm64.tbz2 from https://ywg.ca.packages.macports.org/mirror/macports/packages/py311-scipy
--->  Fetching distfiles for py311-scipy
--->  Verifying checksums for py311-scipy
--->  Extracting py311-scipy
--->  Applying patches to py311-scipy
--->  Configuring py311-scipy
--->  Building py311-scipy
Error: Failed to build py311-scipy: command execution failed
Error: See /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/main.log for details.
Error: Follow https://guide.macports.org/#project.tickets if you believe there is a bug.
Error: Processing of port py311-scipy failed
$ tail -40 /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/main.log
:info:build     the LAPACK environment variable.
:info:build   return getattr(self, '_calc_info_{}'.format(name))()
:info:build INFO: lapack_src_info:
:info:build INFO:   NOT AVAILABLE
:info:build INFO:
:info:build /opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/distutils/system_info.py:1974: UserWarning:
:info:build     Lapack (http://www.netlib.org/lapack/) sources not found.
:info:build     Directories to search for the sources can be specified in the
:info:build     numpy/distutils/site.cfg file (section [lapack_src]) or by setting
:info:build     the LAPACK_SRC environment variable.
:info:build   return getattr(self, '_calc_info_{}'.format(name))()
:info:build INFO:   NOT AVAILABLE
:info:build INFO:
:info:build Traceback (most recent call last):
:info:build   File "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/work/scipy-1.10.1/setup.py", line 533, in <module>
:info:build     setup_package()
:info:build   File "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/work/scipy-1.10.1/setup.py", line 529, in setup_package
:info:build     setup(**metadata)
:info:build   File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/distutils/core.py", line 136, in setup
:info:build     config = configuration()
:info:build              ^^^^^^^^^^^^^^^
:info:build   File "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/work/scipy-1.10.1/setup.py", line 431, in configuration
:info:build     raise NotFoundError(msg)
:info:build numpy.distutils.system_info.NotFoundError: No BLAS/LAPACK libraries found. Note: Accelerate is no longer supported.
:info:build To build Scipy from sources, BLAS & LAPACK libraries need to be installed.
:info:build See site.cfg.example in the Scipy source directory and
:info:build https://docs.scipy.org/doc/scipy/reference/building/index.html for details.
:info:build Command failed:  cd "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/work/scipy-1.10.1" && /opt/local/Library/Frameworks/Python.framework/Versions/3.11/bin/python3.11 setup.py --no-user-cfg config_fc --fcompiler gnu95 --f77exec /opt/local/bin/gfortran-mp-13 --f77flags='-m64 -Os -fno-second-underscore' --f90exec /opt/local/bin/gfortran-mp-13 --f90flags='-m64 -Os -fno-second-underscore' config --cc /usr/bin/clang --include-dirs /opt/local/include --library-dirs /opt/local/lib build -j10
:info:build Exit code: 1
:error:build Failed to build py311-scipy: command execution failed
:debug:build Error code: CHILDSTATUS 83145 1
:debug:build Backtrace: command execution failed
:debug:build     while executing
:debug:build "system {*}$notty {*}$callback {*}$nice $fullcmdstring"
:debug:build     invoked from within
:debug:build "command_exec -callback portprogress::target_progress_callback build"
:debug:build     (procedure "portbuild::build_main" line 8)
:debug:build     invoked from within
:debug:build "$procedure $targetname"
:error:build See /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_python_py-scipy/py311-scipy/main.log for details.

comment:25 Changed 7 months ago by RivetBenoit (Benoit Rivet)

John Owens: read the main.log file carefully and you may be able to pinpoint the primary cause of failure. The last 40 lines of your main.log tell you that something went wrong earlier, but the real culprit is mentioned further up. Who knows? Something like:

duplicate LC_RPATH '/opt/local/lib/libgcc' in '/opt/local/lib/libopenblas-r1.dylib'

may be the culprit.

comment:26 Changed 7 months ago by jowens (John Owens)

@RivetBenoit: Thank you! Yes, I see

:info:build ld: duplicate LC_RPATH '/opt/local/lib/libgcc' in '/opt/local/lib/libopenblas-r1.dylib'

up above. I appreciate your thoughtful answer!

comment:27 Changed 6 months ago by jpinedaf (Jaime Pineda)

Cc: jpinedaf added

comment:28 Changed 6 months ago by nilason (Nicklas Larsson)

Cc: nilason added

comment:29 Changed 6 months ago by jmroot (Joshua Root)

Port: py-scipy added; py311-scipy removed

comment:30 Changed 6 months ago by amagela (Anthony M. Agelastos)

Cc: amagela added

comment:31 Changed 6 months ago by greyhare

So is there a fix or workaround for this? I didn't quite understand all the discussion.

comment:32 Changed 6 months ago by bal-agates

I saw there was an update to the OpenBLAS port so I upgraded all my ports and then tried to install py311-scipy. On my system the build of py311-scipy still fails in the same way.

I think the underlying problem is that the library libopenblas-r1.dylib contains two references to /opt/local/lib/libgcc (a duplicate LC_RPATH). This is an error, but apparently most older linkers ignored it, whereas the Xcode 15.0.X linker treats it as an error. Chris Jones suggested that the Xcode 15.1 linker might solve the issue. Xcode 15.1 hasn't been released yet, but "Xcode 15.1 beta 3" was released on 14-Nov-2023, so the final release is probably close.
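For anyone who wants to check their own library for this condition, here is a minimal sketch that scans the text printed by otool -l and reports any LC_RPATH path that appears more than once. The function names are hypothetical, and the parsing assumes the usual otool -l layout (a "cmd LC_RPATH" line followed by a "path ... (offset N)" line); treat it as an illustration rather than a definitive tool.

```python
import subprocess
from collections import Counter


def find_duplicate_rpaths(otool_output: str) -> dict:
    """Return {rpath: count} for every LC_RPATH path listed more than once."""
    counts = Counter()
    in_rpath = False
    for raw in otool_output.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if fields[0] == "cmd":
            # Each load command starts with a "cmd <NAME>" line.
            in_rpath = len(fields) > 1 and fields[1] == "LC_RPATH"
        elif in_rpath and fields[0] == "path":
            # "path /opt/local/lib/libgcc (offset 12)" -> keep only the path.
            path = raw.split("path", 1)[1].rsplit("(offset", 1)[0].strip()
            counts[path] += 1
            in_rpath = False
    return {p: n for p, n in counts.items() if n > 1}


def read_load_commands(dylib: str) -> str:
    """Run `otool -l` on a library (macOS only) and return its text output."""
    result = subprocess.run(
        ["otool", "-l", dylib], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a Mac this could be used as find_duplicate_rpaths(read_load_commands("/opt/local/lib/libopenblas-r1.dylib")); an empty result means no rpath is listed twice.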

I have not studied the OpenBLAS port to see why there are two references to /opt/local/lib/libgcc. It might be as simple as one reference being specified manually and another being auto-generated by cmake. I have a poor understanding of cmake and what triggers it to do things. Are there any OpenBLAS port maintainers following this ticket?

The only short term workaround I can think of is to manually remove the duplicate reference from libopenblas-r1.dylib. Mark Mentovai commented above that he did this and it fixed the problem. Not sure what command he used. Later on Ken (above) did the same and it fixed the problem. Ken listed the command he used.

comment:33 Changed 6 months ago by markmentovai (Mark Mentovai)

There’s a gcc bug (or two) that should be fixed even independently of Xcode 15.1. I describe this all in comment:22.

Xcode 15.1 may make the situation tolerable at least temporarily, but it’s likely to regress again in the future, such as the macOS 15/Xcode 16 timeframe.

comment:32:

The only short term workaround I can think of is to manually remove the duplicate reference from libopenblas-r1.dylib. Mark Mentovai commented above that he did this and it fixed the problem. Not sure what command he used.

I did show this in comment:14. It was install_name_tool -delete_rpath.
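As a rough sketch of how that workaround could be scripted: the exact invocation from comment:14 is not reproduced here, so the argument order below (rpath first, then the library) follows the general install_name_tool usage and the two file paths are taken from the error message in this ticket. The helper names are hypothetical. It is an assumption that a single call removes only one of the duplicate entries, and modifying a file under /opt/local/lib will most likely need root privileges.

```python
import subprocess


def delete_rpath_command(dylib: str, rpath: str) -> list:
    """Build the install_name_tool call that deletes one rpath entry."""
    return ["install_name_tool", "-delete_rpath", rpath, dylib]


def remove_one_rpath(dylib: str, rpath: str, dry_run: bool = True) -> list:
    """Return the command; run it only when dry_run is False (macOS only)."""
    cmd = delete_rpath_command(dylib, rpath)
    if not dry_run:
        # check=True raises CalledProcessError if install_name_tool fails.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: just show what would be executed.
    print(" ".join(remove_one_rpath(
        "/opt/local/lib/libopenblas-r1.dylib", "/opt/local/lib/libgcc"
    )))
```

Keeping the dry run as the default makes it easy to inspect the command before touching an installed library.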

comment:34 Changed 6 months ago by kencu (Ken)

Mark, if you want to pursue the "gcc needs a patch" idea, you’d best open an issue in Iain’s repo and work it out there with Iain:

https://github.com/iains/gcc-13-branch

Last edited 6 months ago by kencu (Ken)

comment:35 Changed 6 months ago by kencu (Ken)

gcc adds an rpath to all its builds automatically.

MacPorts is adding another rpath to the libgcc libraries in the compilers portgroup, at least.

Some portfiles add yet another rpath as well themselves.

Lots of that needs to be stripped out.
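To illustrate what "stripping out" repeated rpaths could look like at the flag level, here is a small sketch that drops repeated -Wl,-rpath,PATH entries from a list of linker flags. This is not how the compilers portgroup or any Portfile actually works; the function name is hypothetical, and it would not catch the rpath that gcc adds on its own, since that one never appears in the flags.

```python
RPATH_PREFIX = "-Wl,-rpath,"


def dedupe_rpath_flags(ldflags: list) -> list:
    """Keep the first -Wl,-rpath,PATH flag for each path and drop repeats."""
    seen = set()
    kept = []
    for flag in ldflags:
        if flag.startswith(RPATH_PREFIX):
            # Treat "/opt/local/lib/libgcc" and "/opt/local/lib/libgcc/" as equal.
            path = flag[len(RPATH_PREFIX):].rstrip("/")
            if path in seen:
                continue
            seen.add(path)
        kept.append(flag)
    return kept
```

All other flags pass through untouched and in their original order, so the sketch only affects the rpath entries.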

comment:36 Changed 6 months ago by mkuron (Michael Kuron)

Cc: mkuron added

comment:37 Changed 5 months ago by cooljeanius (Eric Gallager)

Cc: cooljeanius added

comment:38 Changed 5 months ago by bal-agates

I am now able to build py311-scipy on macOS 14.2 arm64. A number of things have changed on my system. I suspect 3) below had the most to do with fixing the problem.

1) I upgraded from macOS 14.1 to 14.2.

2) I upgraded Xcode from 15.0.1 to 15.1

3) I upgraded all "outdated" ports on my system. I noticed the OpenBLAS Portfile had changed and that port was rebuilt. OpenBLAS no longer installs /opt/local/lib/libopenblas-r1.dylib. It does install /opt/local/lib/libopenblas.dylib, and that does NOT have a duplicate RPATH for libgcc.

4) The py-scipy Portfile had its revision bumped with comment "OpenBLAS dependants: rev-bump after switch to cmake".

I was also able to build QGIS3 that is dependent on py311-scipy.

If others confirm, I think you can close this issue.

comment:39 Changed 5 months ago by kencu (Ken)

Resolution: fixed
Status: assigned → closed

You are welcome!

We will gradually remove the duplicate rpaths from every port that was touched by the compilers portgroup.

Unfortunately, this has to be done manually, as the default in the portgroup is still to duplicate them, but ... everything takes time.
