Opened 7 months ago

Closed 7 months ago

Last modified 7 months ago

#62727 closed defect (fixed)

OpenBLAS @0.3.14: ld: file not found: loader_path

Reported by: Gregory-Gelfond (Gregory Gelfond)
Owned by: michaelld (Michael Dickens)
Priority: Normal
Milestone:
Component: ports
Version: 2.6.4
Keywords: arm64
Cc: NicosPavlov, michaelld (Michael Dickens), mascguy (Christopher Nielsen), cjones051073 (Chris Jones), Schamschula (Marius Schamschula)
Port: OpenBLAS

Description (last modified by Gregory-Gelfond (Gregory Gelfond))

Installing py39-matplotlib using the latest toolchain and the invocation sudo port install py39-matplotlib on an Apple M1 laptop fails on OpenBLAS:

--->  Fetching distfiles for OpenBLAS
--->  Attempting to fetch OpenBLAS-0.3.14.tar.gz from https://github.com/xianyi/OpenBLAS/tarball/v0.3.14
--->  Verifying checksums for OpenBLAS                                          
--->  Extracting OpenBLAS
--->  Applying patches to OpenBLAS
--->  Configuring OpenBLAS
--->  Building OpenBLAS
Error: Failed to build OpenBLAS: command execution failed
Error: See /opt/local/var/macports/logs/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/main.log for details.
Error: Follow https://guide.macports.org/#project.tickets to report a bug.
Error: Processing of port py39-matplotlib failed

I've attached the main.log file.

Attachments (2)

main.log (3.2 MB) - added by Gregory-Gelfond (Gregory Gelfond) 7 months ago.
main.log file
main_ankita.log (3.7 MB) - added by ankitadhar 7 months ago.

Change History (54)

Changed 7 months ago by Gregory-Gelfond (Gregory Gelfond)

Attachment: main.log added

main.log file

comment:1 Changed 7 months ago by Gregory-Gelfond (Gregory Gelfond)

Description: modified (diff)

Minor revision (added the port invocation used)

comment:2 Changed 7 months ago by jmroot (Joshua Root)

Possibly a duplicate of #61700?

comment:3 Changed 7 months ago by Gregory-Gelfond (Gregory Gelfond)

It's possible, but I'm not sure. First, this failure started from a different port, py39-matplotlib instead of py38-matplotlib, and I don't know whether that distinction matters. Second, the workaround suggested in #61700 is to run port install OpenBLAS +native. That doesn't work here and also ends in a compile error.

comment:4 Changed 7 months ago by NicosPavlov

It seems to be a different error. The Python version should not have any influence, but whereas #61700 fails while compiling some code, here it is the linker that fails after everything has been compiled:

...
:info:build perl ./gensymbol osx arm64 _ 0 0  0 0 0 0 "" "" 0 0 1 1 1 1 > osx.def
:info:build /opt/local/bin/gfortran-mp-devel -O3 -Wall -frecursive -fno-optimize-sibling-calls -fPIC -march=armv8-a  -all_load -headerpad_max_install_names -install_name "/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/xianyi-OpenBLAS-2f6d35c/exports/../libopenblas.1.dylib" -dynamiclib -o ../libopenblas-r1.dylib ../libopenblas-r1.a -Wl,-exported_symbols_list,osx.def  -L/opt/local/lib -L/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1 -L/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1/../../.. -Wl,-rpath,,loader_path -Wl,-rpath,/opt/local/lib -Wl,-rpath,/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1 -Wl,-rpath,/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1/../../.. -L/opt/local/lib -L/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1 -L/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1/../../.. -Wl,-rpath,,loader_path -Wl,-rpath,/opt/local/lib -Wl,-rpath,/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1 -Wl,-rpath,/opt/local/lib/gcc-devel/gcc/aarch64-apple-darwin20/11.0.1/../../..  -lgfortran -lm -lSystem -lgfortran -lm -lSystem
:info:build ld: file not found: loader_path
:info:build collect2: error: ld returned 1 exit status
...

I indeed do not see what this "loader_path" is doing here. I checked my own build (x86 CPU), and there all linking is done through direct path references (i.e. "-L/opt/local/lib -L/opt/local/lib/gcc10/gcc/x86_64-apple-darwin19/10.3.0 ...").
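
One possible reading of the failing link line, offered as an assumption rather than a confirmed diagnosis: the GCC driver splits a -Wl option at its commas, so -Wl,-rpath,,loader_path would hand the linker an empty rpath value followed by a bare loader_path word, which ld then appears to treat as an input file. The helper below (split_wl is a made-up name used only for illustration) prints the pieces of such an option so the difference from the intended -Wl,-rpath,@loader_path is visible:

```shell
# Print each comma-separated piece of a -Wl,... option in brackets,
# one per line. This only mimics the comma splitting; it does not
# invoke the compiler or the linker.
split_wl() {
  opt=${1#-Wl,}          # drop the leading "-Wl,"
  (
    IFS=,                # split on commas only
    set -f               # no filename globbing while splitting
    for tok in $opt; do
      printf '[%s]\n' "$tok"
    done
  )
}

# Option as it appears in the failing log: the middle piece is empty
# and "loader_path" stands alone.
split_wl '-Wl,-rpath,,loader_path'

# Option as it was presumably intended: "@loader_path" is the rpath value.
split_wl '-Wl,-rpath,@loader_path'
```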

comment:5 Changed 7 months ago by mascguy (Christopher Nielsen)

It looks like OpenBLAS is being built with the following variants on ARM Macs, when implicitly installed as a dependency: +gccdevel+lapack+native. And that's causing oodles of buildbot failures.

Does anyone know why those specific variants are selected ONLY for an ARM build? Is ARM support only included in the development version of GCC at the moment...?

Last edited 7 months ago by mascguy (Christopher Nielsen) (previous) (diff)

comment:6 Changed 7 months ago by mascguy (Christopher Nielsen)

Cc: mascguy added

comment:7 Changed 7 months ago by freedomtan ("freedom" Koan-Sin Tan)

I think this is caused by the recent gcc-devel & libgcc-devel, because I was able to build it about one month ago. When I changed gcc-devel and libgcc-devel from 11-20210418 back to 11-20210220, I was able to build OpenBLAS-0.3.14.

No, this is different from #61700.

comment:8 Changed 7 months ago by mascguy (Christopher Nielsen)

This may be causing a downstream issue, when generating Python bindings for opencv4:

issue:62744 - py39-opencv4 4.5.2: staging into destroot fails

comment:9 in reply to:  7 ; Changed 7 months ago by ankitadhar

How do I change gcc-devel and libgcc-devel? Or should I wait for the fixes?

comment:10 in reply to:  9 ; Changed 7 months ago by freedomtan ("freedom" Koan-Sin Tan)

What I did was run port uninstall gcc-devel libgcc-devel, change /opt/local/var/macports/sources/rsync.macports.org/macports/release/tarballs/ports/lang/gcc-devel/Portfile to use 11-20210220 instead of 11-20210418, and then port install them again.

comment:11 Changed 7 months ago by NicosPavlov

I cannot confirm that this is actually the issue, as I don't have an M1 processor, but if you want to install an older version of gcc-devel, you can follow the procedure detailed in how to install older ports.

Last edited 7 months ago by NicosPavlov (previous) (diff)

comment:12 Changed 7 months ago by kencu (Ken)

There are indeed some very recent changes to the gcc arm64 branch that involve switching to @loader_path on most systems, e.g.

https://github.com/iains/gcc-darwin-arm64/commit/fb623616ef18fb9bf8254e41351a08a92ba3f91f

https://github.com/iains/gcc-darwin-arm64/commit/65675a9f317f13b72a20869271bcc127fcfb12b0

It would not be a surprise to find hiccups show up in this process.

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:13 Changed 7 months ago by kencu (Ken)

Nicos - you can help debug this on any system by switching to that branch on Intel if you like.

It's the new plan, and will be coming out in gcc-11 soon enough I imagine.

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:14 in reply to:  10 Changed 7 months ago by ankitadhar

Thank you @freedomtan for your response. However, I am still getting an error, even after changing 11-20210418 to 11-20210220 in the Portfile of gcc-devel. Do I need to change anything else?

Changed 7 months ago by ankitadhar

Attachment: main_ankita.log added

comment:15 Changed 7 months ago by ankitadhar

Attached my log file (main_ankita.log).

comment:16 Changed 7 months ago by NicosPavlov

...
:debug:main epoch: in tree: 4 installed: 4
:debug:main gcc-devel 11-20210418_0 exists in the ports tree
:debug:main gcc-devel 11-20210418_0  is the latest installed
:debug:main gcc-devel 11-20210418_0  is active
:debug:main Merging existing variants '' into variants
...

You have not installed the previous version of gcc. After changing the version in the portfile, you also need to install it.

comment:17 in reply to:  16 ; Changed 7 months ago by ankitadhar

I did the following:

  1. port uninstall gcc-level libgcc-devel
  2. changed Portfile
  3. sudo port install

Did I miss anything?

comment:18 in reply to:  17 ; Changed 7 months ago by NicosPavlov

You can refer to how to install older ports for detailed instructions.

I did the following:

  1. port uninstall gcc-level libgcc-devel
  2. changed Portfile
  3. sudo port install

This command sudo port install (if that is exactly what you ran) does not work. You need to specify which ports to install, i.e. sudo port install gcc-devel libgcc-devel. You can also check which ports are installed with port installed.

Last edited 7 months ago by ryandesign (Ryan Schmidt) (previous) (diff)
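
To confirm which version actually ended up active after reinstalling, the output of port installed can be filtered. This is a minimal sketch, assuming the usual listing format of one "name @version (active)" entry per line; active_ports is a hypothetical helper name, not part of MacPorts:

```shell
# Read `port installed` output on stdin and print "name version"
# for every entry marked (active). Header lines and inactive
# entries are skipped.
active_ports() {
  awk '$NF == "(active)" && $2 ~ /^@/ { print $1, substr($2, 2) }'
}
```

It would be used as port installed gcc-devel libgcc-devel | active_ports. If the listed version is still 11-20210418_0, the older version has not been installed and activated yet.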

comment:19 in reply to:  18 ; Changed 7 months ago by ankitadhar

I am still confused. When I installed MacPorts, I downloaded a package which did the task for me. Following how to install older ports, I am able to clone the repository.

My doubts:

  1. If I clone the commit I am interested in into, say, 'Documents', then go to lang/gcc-devel and run sudo port install, I get the following error:

Error: Unable to execute port: Could not open file: /Users/ankitadhar/Documents/macports-ports/lang/gcc-devel/Portfile. If this is the right approach, how do I overcome this error?

  2. If I am required to replace all the ports in the path /opt/local/var/macports/sources/rsync.macports.org/macports/release/tarballs/ports/lang, what steps do I need to follow next?

I wish to thank you for the time and support. I am really thankful, cuz I am clueless without these chats.

Last edited 7 months ago by ryandesign (Ryan Schmidt) (previous) (diff)

comment:20 in reply to:  19 ; Changed 7 months ago by ankitadhar

I hope my question is clear. If not then do comment.

Last edited 7 months ago by ryandesign (Ryan Schmidt) (previous) (diff)

comment:21 in reply to:  20 Changed 7 months ago by NicosPavlov

Replying to ankitadhar: Sorry, my previous answer was possibly not that clear; I had not realized that you had already cloned the repo.

My doubts:

  1. If I clone the commit I am interested in into, say, 'Documents', then go to lang/gcc-devel and run sudo port install, I get the following error:

Error: Unable to execute port: Could not open file: /Users/ankitadhar/Documents/macports-ports/lang/gcc-devel/Portfile. If this is the right approach, how do I overcome this error?

This method should be correct, but there is apparently a permission problem. Please read the instructions in the link I gave you carefully; your issue is already described there.

Please note that this is also getting off topic for the ticket. If you still have issues installing an older version of a port, the macports-users mailing list might be a better place for such questions.

comment:22 Changed 7 months ago by ryandesign (Ryan Schmidt)

You can't use $HOME/Documents or other locations inside $HOME that other users (specifically the macports user) cannot read.
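
A rough way to check for that situation is to walk up from the clone directory and look at the "other" permission bits of every parent directory. This is only a sketch under stated assumptions: readable_by_others is a hypothetical helper, it parses ls -ld output, and it only looks at classic Unix permission bits, so ACLs or macOS privacy protections on folders such as Documents are not detected.

```shell
# Return 0 if the given absolute directory and all of its parents
# grant read and traverse permission to "other" users; otherwise
# print the first offending directory and return 1.
readable_by_others() {
  p=$1
  case $p in
    /*) ;;
    *) echo "need an absolute path: $p"; return 2 ;;
  esac
  while [ "$p" != "/" ]; do
    # Characters 8-10 of the mode string are the "other" bits, e.g. r-x.
    perms=$(ls -ld "$p" | cut -c8-10)
    case $perms in
      r?x|r?t) ;;                       # readable and traversable
      *) echo "not readable by other users: $p"; return 1 ;;
    esac
    p=$(dirname "$p")
  done
  return 0
}
```

For example, readable_by_others /Users/ankitadhar/Documents/macports-ports would name the first directory in that chain that other users cannot enter. Moving the clone to a location that passes the check is the simpler fix.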

comment:23 in reply to:  13 Changed 7 months ago by cjones051073 (Chris Jones)

Replying to kencu:

Nicos - you can help debug this on any system by switching to that branch on Intel if you like.

It's the new plan, and will be coming out in gcc-11 soon enough I imagine.

GCC 11 is now out.

https://gcc.gnu.org/gcc-11/changes.html

I did quickly try a build yesterday, but it fell over on my macOS11(Intel) machine with stage{2/3} comparison failures (sorry, no log at this time...).

I haven't had time to dig into it. Comparing my new Portfile with the gcc10/devel ones, I did not spot any obvious reason.

If anyone is interested to have a go themselves, you can see how far I got in this branch

https://github.com/cjones051073/macports-ports/tree/gcc11

specifically the last commit there.

p.s. @kencu I note clang 12 is also now out ;)

comment:24 Changed 7 months ago by cjones051073 (Chris Jones)

Cc: cjones051073 added

comment:25 Changed 7 months ago by kencu (Ken)

FYI gcc-11 does not yet have Iain's @loader_path work in it.

Only the arm64 branch currently has it (and that is why we are seeing the @loader_path issues only there.)

I'm talking with Iain now -- he is aware of this @loader_path issue causing failures in some builds. We're working on it.

For now -- I'm afraid the latest arm64 branch will have these issues until a plan gets worked out. Everyone affected will have to roll back, or fix OpenBLAS and the other affected builds to properly handle the link flags.

The error is in the builds, not with the arm64 gcc branch, but that matters little to anyone when the build is not working. When it might be fixed, and what the fix might be, is TBA.

clang-12 is out, I have it built here -- just thinking about doing the LLVM_ENABLE_PROJECTS approach to it rather than the moving the parts around approach to controlling the build, which means rejigging a bunch of patches and testing out the destrooting. I wanted to use NINJA, but NINJA can't work with our selective destrooting plan, for example.

Working on it.

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:26 Changed 7 months ago by kencu (Ken)

By the way, Iain tells me gcc-10 is 100% good on 10.4 up, and it can bootstrap with gcc-3.4 or later.

The only reason I have not enabled all that is that our modifications to cctools make the gcc build use clang for the assembler sometimes, and that blows up the i386 builds. So I have written up some patches to disable the cctools assembler mods when building gcc using the ENV VARS we have for that, but it is -- messy -- to force gcc to obey ENV VARs during the build as it is specifically written to sanitize them out... so there's that.

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:27 Changed 7 months ago by cjones051073 (Chris Jones)

I just stumbled over a comment in the Homebrew recipe for gcc pointing to a patched gcc10 tarball for arm

https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/gcc.rb

I've updated the gcc10 port to use this

https://github.com/macports/macports-ports/commit/2e4c80e7ec18d1dd3b5d8222531598f653955e9a

I don't have an arm machine to directly test on, but I checked the 'arm' version configured and started building fine on my macOS11(Intel) machine here. So I pushed it to the buildbots to see what it makes of it.

If it works out OK, we could then make gcc10 the default on arm, instead of the current devel port. This port was always intended not for general use, but to test out upcoming versions. It was only used for arm in the absence of anything else...

Last edited 7 months ago by ryandesign (Ryan Schmidt) (previous) (diff)

comment:28 Changed 7 months ago by kencu (Ken)

Homebrew decided to carry a patchset of Iain's patches backported to gcc10 instead of using his current gcc-devel tip.

Those patches may well get the @loader_path treatment soon enough I expect, as Iain is moving that into gcc mainline.

We should fix OpenBLAS -- its link line is wrong and it is messing up the @loader_path line.

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:29 Changed 7 months ago by kencu (Ken)

Oh, BTW -- we have no ARM buildbot any more.

comment:30 in reply to:  28 Changed 7 months ago by cjones051073 (Chris Jones)

Replying to kencu:

Homebrew decided to carry a patchset of Iain's patches backported to gcc10 instead of using his current gcc-devel tip.

Those patches may well get the @loader_path treatment soon enough I expect, as Iain is moving that into gcc mainline.

The build is from a fixed tarball, so Iain might well update things in his GitHub gcc10 branch I suppose, but they won't hit the gcc10 version until we switch to a different tarball.

We should fix OpenBLAS -- its link line is wrong and it is messing up the @loader_path line.

I don't disagree. But there is I think a real advantage to switching back to a stable gcc10 build on arm, rather than the gcc-devel port, which as I said was *never* intended to be used for port building and which we only used because at the time there was no alternative.

I didn't realise there was a back port to the stable gcc10 branch, if I had I would have suggested using it sooner.

Last edited 7 months ago by cjones051073 (Chris Jones) (previous) (diff)

comment:31 in reply to:  29 Changed 7 months ago by cjones051073 (Chris Jones)

Replying to kencu:

Oh, BTW -- we have no ARM buildbot any more.

yes we do....

https://build.macports.org/waterfall

Ryan restored it about a week or so ago.

comment:32 Changed 7 months ago by radarhere (Andrew Murray)

Cc: radarhere added

comment:33 Changed 7 months ago by cjones051073 (Chris Jones)

gcc10 is working fine. I've updated the default compilers PortGroup so that gcc10 is now the default variant on arm, as on Intel. OpenBLAS has rebuilt fine with this new default

https://build.macports.org/builders/ports-11_arm64-builder/builds/19303/steps/install-port/logs/stdio

Note, you may need to manually switch any ports you have installed from the old to the new default variants. Just (force) uninstall any using +gccdevel and then reinstall with the new defaults.
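
Finding the ports that still carry the old variant can be done by filtering the port installed listing. A minimal sketch, assuming each installed entry looks like "name @version+variant1+variant2 (active)"; ports_with_gccdevel is a hypothetical helper name:

```shell
# Read `port installed` output on stdin and print the names of ports
# whose installed variants include +gccdevel, one per line.
ports_with_gccdevel() {
  awk '$2 ~ /^@/ && $2 ~ /\+gccdevel/ { print $1 }' | sort -u
}
```

It would be used as port installed | ports_with_gccdevel. Each name it prints can then be removed with sudo port -f uninstall NAME and reinstalled with sudo port install NAME so that the new default variants are picked up.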

comment:34 Changed 7 months ago by cjones051073 (Chris Jones)

Note, this is not to say the issues OpenBLAS has with gccdevel should not be investigated, and that still can be by just manually selecting the +gccdevel variant. At least though for those that just want a working build the default now provides this.

comment:35 Changed 7 months ago by cjones051073 (Chris Jones)

Owner: set to michaelld
Status: new → assigned

comment:36 Changed 7 months ago by cjones051073 (Chris Jones)

Summary: OpenBLAS fails to build on Apple M1 → OpenBLAS fails to build on Apple M1 using gccdevel

comment:37 Changed 7 months ago by radarhere (Andrew Murray)

Cc: radarhere removed

comment:38 Changed 7 months ago by kencu (Ken)

FX's backport of Iain's changes is really no less experimental, and tbh came weeks after our working gccdevel allowed all our arm ports to build.

The reason we are here is because Iain decided to push his @loader_path fix and we updated to that.

Those @loader_path fixes are planned for all the gcc branches, so stay tuned!

Last edited 7 months ago by kencu (Ken) (previous) (diff)

comment:39 Changed 7 months ago by ryandesign (Ryan Schmidt)

Keywords: arm64 added
Summary: OpenBLAS fails to build on Apple M1 using gccdevel → OpenBLAS @0.3.14: ld: file not found: loader_path

comment:40 Changed 7 months ago by mascguy (Christopher Nielsen)

It looks like this port just failed to build on ARM again, two days ago:

https://build.macports.org/builders/ports-11_arm64-builder/builds/20119

So are we expecting this to build successfully on ARM now, with the switch to gcc10, etc?

comment:41 Changed 7 months ago by kencu (Ken)

We would be expecting that, yes.

Do you have an arm machine? Perhaps you can take a peek at the failing file with otool and see what is going on. Does libgfortran have the proper install name now? If so, why does openblas not use it? If not, -- well why not? It should have the right name, as the @loader_path changes are not (and never were) in libgcc10.

Enquiring minds want to know!
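
The otool check suggested above could look roughly like this. It is a sketch, not a verified procedure: relative_deps is a hypothetical helper that filters the dependency list printed by otool -L and keeps only entries recorded relative to @rpath or @loader_path, which are the ones that depend on the rpath handling discussed in this ticket.

```shell
# Read `otool -L` output on stdin and print the dependencies that are
# referenced through @rpath or @loader_path instead of an absolute path.
relative_deps() {
  awk '$1 ~ /^@(rpath|loader_path)\// { print $1 }'
}
```

On an arm machine it would be used as otool -L /opt/local/lib/libopenblas-r1.dylib | relative_deps. The install name recorded in a library itself can be printed with otool -D followed by the path to that library.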

comment:42 Changed 7 months ago by mascguy (Christopher Nielsen)

I'd happily help if I could, but I'm still chugging along with a 2008 MacPro. Heck, I can't even get a working Big Sur VM going. Lol

comment:43 Changed 7 months ago by mascguy (Christopher Nielsen)

FYI, it looks like ScaleWay is offering time on M1 Mac Mini's for 0.10 euro/hour:

https://www.scaleway.com/en/pricing/#apple-silicon

comment:44 in reply to:  43 Changed 7 months ago by mascguy (Christopher Nielsen)

Replying to mascguy:

FYI, it looks like ScaleWay is offering time on M1 Mac Mini's for 0.10 euro/hour:

https://www.scaleway.com/en/pricing/#apple-silicon

I'll probably take advantage of this soon, to help assist with fixing ARM builds like OpenBLAS.

While the minimum reservation is 24 hours, it's still only a few dollars/day. And boy would it be great to get more of these ports fixed and working!

I'd encourage others to consider it too.

comment:45 Changed 7 months ago by Schamschula (Marius Schamschula)

Cc: Schamschula added

comment:47 Changed 7 months ago by cjones051073 (Chris Jones)

Resolution: fixed
Status: assigned → closed

comment:48 Changed 7 months ago by kencu (Ken)

Well, at least OpenBLAS is now properly fixed, so now we can stop thinking the issue was with the changes in Iain's "reference" gcc-on-arm branch.

He will be pleased about that.

comment:50 Changed 7 months ago by cjones051073 (Chris Jones)

Yet another problem

https://build.macports.org/builders/ports-11_arm64-builder/builds/20479/steps/install-port/logs/stdio

ImportError: dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/numpy/core/_multiarray_umath.cpython-39-darwin.so, 2): Library not loaded: @rpath/libgcc_s.1.1.dylib
  Referenced from: /opt/local/lib/libopenblas-r1.dylib
  Reason: image not found

Something must still be screwy with the OpenBLAS build, as I have no idea how it's getting a reference to libgcc_s.1.1.dylib .... it should be libgcc_s.1.dylib

comment:51 Changed 7 months ago by cjones051073 (Chris Jones)

Hang on, I think I misinterpreted the above error. It looks like the issue is actually in {lib}gcc11 in that libgcc_s.1.1.dylib is a real library that needs to be included...
