Opened 6 months ago

Closed 5 months ago

#68810 closed defect (fixed)

OpenBLAS: libopenblas.0.dylib cannot find symbol _xerbla_

Reported by: erikbs
Owned by: NicosPavlov
Priority: Normal
Milestone:
Component: ports
Version:
Keywords:
Cc: michaelld (Michael Dickens), catap (Kirill A. Korinsky), Dave-Allured (Dave Allured)
Port: OpenBLAS

Description

After the recent major changes to the OpenBLAS Portfile (migration to CMake etc.), libopenblas.0.dylib fails to find the symbol _xerbla_ when loaded. This breaks e.g. py-numpy, which fails on import, and py-scipy, which depends on NumPy being imported successfully.

Here is the output when calling import numpy from the Python 3.11 REPL after installing a py311-numpy version that depends on the new OpenBLAS version:

>>> import numpy
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
ImportError: dlopen(/opt/local/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/numpy/core/_multiarray_umath.cpython-311-darwin.so, 2): Symbol not found: _xerbla_
  Referenced from: /opt/local/lib/libopenblas.0.dylib
  Expected in: flat namespace
 in /opt/local/lib/libopenblas.0.dylib

Change History (11)

comment:1 Changed 6 months ago by erikbs

Port: removed

comment:2 Changed 6 months ago by ryandesign (Ryan Carsten Schmidt)

Cc: michaelld catap added
Owner: set to NicosPavlov
Status: new → assigned

comment:3 Changed 6 months ago by catap (Kirill A. Korinsky)

erikbs, which system do you use? I used py-numpy as a test port when making that migration, and it works well on macOS 13:

Python 3.10.13 (main, Aug 25 2023, 02:21:32) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> 

and the same for py311:

Python 3.11.6 (main, Oct  2 2023, 18:01:19) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>>

comment:4 Changed 6 months ago by jmroot (Joshua Root)

It may be relevant which py-numpy subport is being used, as per #68807.

comment:5 in reply to:  3 Changed 6 months ago by erikbs

I am on 10.9 and I use the default variant configuration for NumPy, which is +openblas +gfortran, I think. It is the OpenBLAS option that is the important one, since it is the OpenBLAS dylib that cannot be loaded.

comment:6 Changed 6 months ago by erikbs

I removed the patch that enables weak linking on older platforms as a test. The build then fails:

...
:info:build [  0%] Building C object driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
:info:build cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16  -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64  -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c
...
:info:build [ 11%] Building C object interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o
:info:build cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/interface && /opt/local/bin/clang-mp-16  -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64  -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o -MF CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o.d -o CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/interface/CMakeFiles/xerbla.c
...
:info:build ar: creating archive libopenblas.a
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build sh -c '/opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o && exit 0'
:info:build /opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
:info:build /opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
:info:build sh -c 'echo "" | /opt/local/bin/gfortran-mp-13 -o dummy.o -c -x f95-cpp-input - '
:info:build f951: Warning: Reading file '<stdin>' as free form
:info:build sh -c '/opt/local/bin/gfortran-mp-13 -fpic -shared -Wl,-all_load -Wl,-force_load,libopenblas.a -Wl,-noall_load dummy.o -o /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/lib/libopenblas.0.3.dylib'
:info:build ld: warning: option -noall_load is obsolete and being ignored
:info:build Undefined symbols for architecture x86_64:
:info:build   "_xerbla_", referenced from:
:info:build       _sgemv_ in libopenblas.a(sgemv.c.o)
:info:build       _sger_ in libopenblas.a(sger.c.o)
:info:build       _strsv_ in libopenblas.a(strsv.c.o)
:info:build       _strmv_ in libopenblas.a(strmv.c.o)
:info:build       _ssyr2_ in libopenblas.a(ssyr2.c.o)
:info:build       _sgbmv_ in libopenblas.a(sgbmv.c.o)
:info:build       _ssbmv_ in libopenblas.a(ssbmv.c.o)
:info:build       ...
:info:build      (maybe you meant: _xerbla_array_)
:info:build ld: symbol(s) not found for architecture x86_64
:info:build collect2: error: ld returned 1 exit status

There are two xerbla.c.o files. Standing in the build directory:

sh-3.2# find . -iname xerbla.c.o
./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
./interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o

The second does not contain any symbols, but the first one does:

sh-3.2# nm -gU ./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o | grep _xerbla_
0000000000000000 T _xerbla_

However, libopenblas.a does not contain the _xerbla_ symbol:

sh-3.2# nm -gU libopenblas.a | grep _xerbla_
no symbols
no symbols
0000000000000000 T _xerbla_array_

Even when I run

/opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o

manually, libopenblas.a still does not contain it. ranlib warns about missing symbols, but that seems to be because the archive already holds the other xerbla.c.o file (and another file without symbols). The command itself does not fail:

sh-3.2# /opt/local/bin/ar -ru libopenblas.a /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o
/opt/local/bin/ranlib: file: libopenblas.a(xerbla.c.o) has no symbols
/opt/local/bin/ranlib: file: libopenblas.a(la_constants.f90.o) has no symbols
sh-3.2# echo $?
0

I get ar/ranlib from cctools:

The following ports are currently installed:
  cctools @949.0.1_3+llvm90 (active)

Per this comment, libopenblas.a should have contained _xerbla_ (“even if as a weak symbol”, but __attribute__((weak)) is #ifdef-ed to only apply to ELF).

I have no idea why, but when I did this:

sh-3.2# cc -o xx.o -I$(pwd) -I ../OpenBLAS-0.3.25/ -c ../OpenBLAS-0.3.25/driver/others/xerbla.c
sh-3.2# /opt/local/bin/ar -ru libopenblas.a xx.o
sh-3.2# chown macports libopenblas.a

followed by

install -o openblas +gcc13 +lapack +native

in the MacPorts shell, the linking succeeds and the build finishes. Even NumPy works.

So why does

cc -o xx.o -I$(pwd) -I ../OpenBLAS-0.3.25/ -c ../OpenBLAS-0.3.25/driver/others/xerbla.c

produce a usable object file when

cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16  -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64  -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c

does not?

My cc is:

sh-3.2# cc --version
clang version 17.0.6
Target: x86_64-apple-darwin13.4.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-17/bin

comment:7 Changed 6 months ago by erikbs

produce a usable object file when

cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others && /opt/local/bin/clang-mp-16  -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25 -I/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build -pipe -O3 -DNDEBUG -I/opt/local/include -arch x86_64  -DHAVE_C11 -Wall -m64 -mavx2 -mavx -msse -msse2 -msse3 -mssse3 -msse4.1 -fPIC -DSMALL_MATRIX_OPT -DNO_AVX512 -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=8 -DMAX_PARALLEL_NUMBER=1 -DMAX_STACK_ALLOC=2048 -DNO_AFFINITY -DVERSION="\"0.3.25\"" -DBUILD_SINGLE -DBUILD_DOUBLE -DBUILD_COMPLEX -DBUILD_COMPLEX16 -arch x86_64 -mmacosx-version-min=10.9 -MD -MT driver/others/CMakeFiles/driver_others.dir/xerbla.c.o -MF CMakeFiles/driver_others.dir/xerbla.c.o.d -o CMakeFiles/driver_others.dir/xerbla.c.o -c /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/OpenBLAS-0.3.25/driver/others/xerbla.c

does not

Turns out that it actually does … If I manually run this command when the build fails (to regenerate xerbla.c.o) and then resume the build using port install -o, everything works just fine. I can even do

sudo port install -s -o openblas +gcc13 +lapack +native
sudo rm /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_math_OpenBLAS/OpenBLAS/work/build/driver/others/CMakeFiles/driver_others.dir/xerbla.c.o*
sudo port install -s -o openblas +gcc13 +lapack +native

to make it complete successfully.

Thinking that there must have been something with the build order that caused things to fail, I copied the object file from both runs to a temporary location to compare them. The result confused me:

md5 */xerbla.c.o
MD5 (verkar/xerbla.c.o) = 57ac55a93b3cde59adeaaccb658f6206
MD5 (verkar_ikkje/xerbla.c.o) = 57ac55a93b3cde59adeaaccb658f6206

The files are identical. And sure enough, a simple touch <..>/xerbla.c.o (instead of rm) also makes the build succeed! In fact, after experimenting with timestamps, it seems that it is enough if ./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o is newer than ./interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o. man ar revealed that this is the expected behaviour for the -u option.
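The timestamp dependence described above can be checked without touching the archive. The following is only a sketch: would_update is a hypothetical helper (not part of OpenBLAS or MacPorts), and it assumes that the member already in libopenblas.a carries the modification time of the file it was created from, so comparing the two files on disk approximates what ar -ru decides.

```shell
#!/bin/sh
# Hypothetical helper: predict whether "ar -ru" would replace an archive
# member built from $2 with the file $1, judging by modification time only.
# Assumption: the archived member kept the mtime of $2.
would_update() {
    # find prints $1 only if it was modified more recently than $2
    if [ -n "$(find "$1" -newer "$2" 2>/dev/null)" ]; then
        echo "update"
    else
        echo "skip"
    fi
}

# Small demonstration with empty stand-in files in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/driver" "$tmp/interface"
: > "$tmp/driver/xerbla.c.o"
: > "$tmp/interface/xerbla.c.o"

# Driver object older than interface object: the failing case.
touch -t 202301010000 "$tmp/driver/xerbla.c.o"
touch -t 202301010001 "$tmp/interface/xerbla.c.o"
would_update "$tmp/driver/xerbla.c.o" "$tmp/interface/xerbla.c.o"

# After touching the driver object it is newer: the working case.
touch -t 202301010002 "$tmp/driver/xerbla.c.o"
would_update "$tmp/driver/xerbla.c.o" "$tmp/interface/xerbla.c.o"

rm -rf "$tmp"
```

In the build directory the same function could be pointed at ./driver/others/CMakeFiles/driver_others.dir/xerbla.c.o and ./interface/CMakeFiles/interface.dir/CMakeFiles/xerbla.c.o to see which case applies.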

My preliminary conclusion is that it is *not* weak linking that is the solution on older Mac OS X versions, but rather one of these options:

  1. Ensure that the correct xerbla.c.o is either linked first or compiled last
  2. Make ar update xerbla.c.o even though the modification time is older than the existing entry. This can be done by using ar -rs instead of ar -ru.

Luckily xerbla.c.o is linked separately and not as part of a bulk operation, so we can safely change ar -ru to ar -rs without consequences for other object files (I found about 75 .o files that are not unique in the build tree).
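For anyone who wants to repeat the count of non-unique object files, here is one possible way to list them. It is a sketch, not the command I actually used: duplicate_objects is a hypothetical helper name, and it assumes file names without embedded newlines.

```shell
#!/bin/sh
# Hypothetical helper: print every object-file basename that occurs more
# than once below the directory given as $1.
duplicate_objects() {
    find "$1" -type f -name '*.o' |
        sed 's|.*/||' |   # strip the directory part, keep the basename
        sort |
        uniq -d           # print each repeated name once
}

# Small demonstration on a scratch tree that mimics the two xerbla.c.o files.
tmp=$(mktemp -d)
mkdir -p "$tmp/driver/others" "$tmp/interface"
: > "$tmp/driver/others/xerbla.c.o"
: > "$tmp/interface/xerbla.c.o"
: > "$tmp/interface/sgemv.c.o"
duplicate_objects "$tmp"
rm -rf "$tmp"
```

Run as duplicate_objects . from the build directory, piping the result through wc -l gives the number of ambiguous names.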

I have submitted a pull request to OpenBLAS: https://github.com/OpenMathLib/OpenBLAS/pull/4353

comment:8 Changed 6 months ago by catap (Kirill A. Korinsky)

erikbs, can you open a PR to OpenBLAS and OpenBLAS-devel ports with backport of this patch?

comment:9 in reply to:  8 Changed 6 months ago by erikbs

Replying to catap:

erikbs, can you open a PR to OpenBLAS and OpenBLAS-devel ports with backport of this patch?

Good idea; https://github.com/macports/macports-ports/pull/21650

Do you have time to test it on a couple of the versions you tested the weak linking solution on (except Mavericks of course)? I think weak linking is no longer necessary, so I removed that.

comment:10 Changed 6 months ago by Dave-Allured (Dave Allured)

Cc: Dave-Allured added

comment:11 Changed 5 months ago by erikbs

Resolution: fixed
Status: assigned → closed

In 349e801fa01b9834148ab4e27d1c199ce0abfef2/macports-ports (master):

OpenBLAS: fix linking on older Mac versions

Tweak ar options instead of using weak linking. Ensure that the
_xerbla_ symbol exists by always writing the correct xerbla.c.o file
to the archive regardless of compilation order.

Fixes: #68810
