Ticket #35508 (closed defect: fixed)
arpack port does not work on Lion with GFortran 4.6.2 due to Accelerate problem
| Reported by: | gcrosswhite@… | Owned by: | mmoll@… |
|---|---|---|---|
| Priority: | Normal | Milestone: | |
| Component: | ports | Version: | 2.1.2 |
| Keywords: | | Cc: | |
| Port: | arpack | | |
Description
I have seen this problem before and thought it had been squashed in this port, but it has appeared again.
ARPACK uses the BLAS routine ZDOTC, which has a different calling convention in Accelerate.framework than the one GFortran uses. This mismatch causes crashes that I have encountered in my code. I know that this was the source of the problem because when I downloaded arpack-ng and patched it manually, replacing
X = ZDOTC(....)
with
call ZDOTC(X,...)
then the problems went away.
I am not sure how people would prefer to see this problem solved, but I could submit a patch making the changes above if you all would like.
Change History
comment:2 Changed 10 months ago by macsforever2000@…
- Owner changed from macports-tickets@… to mmoll@…
comment:3 Changed 10 months ago by mmoll@…
I have Mountain Lion installed and can't reproduce this. I just reinstalled arpack @3.1.1_2+accelerate+gcc46+openmpi. Can you attach your main.log file?
comment:4 Changed 10 months ago by gcrosswhite@…
I didn't see anything in /opt/local/var/macports/logs, but I wasn't expecting to, as the port builds just fine. The problem is that the resulting library is broken: it segfaults at runtime because it uses the wrong calling convention for some BLAS routines such as zdotc.
To create a simple test case that illustrates the problem, I compiled the test program zndrv1.f in EXAMPLES/COMPLEX of the main ARPACK distribution and linked it against the MacPorts build of libarpack.a. The result was:
$ gfortran zndrv1.f /opt/local/lib/libarpack.a -framework Accelerate
$ ./a.out
zsh: segmentation fault ./a.out
We can see where the segmentation fault is coming from by using gdb:
$ gdb ./a.out
GNU gdb 6.3.50-20050815 (Apple version gdb-1752) (Sat Jan 28 03:02:46 UTC 2012)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries ..
(gdb) run
Starting program: /Users/gcross/Downloads/ARPACK/EXAMPLES/COMPLEX/a.out
Reading symbols for shared libraries +++++................................ done
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000015
0x00007fff87f02d9a in zdotc_ ()
(gdb) backtrace
#0 0x00007fff87f02d9a in zdotc_ ()
#1 0x0000000100009f8f in zneupd_ ()
#2 0x0000000100002ac1 in MAIN__ ()
#3 0x0000000100003833 in main ()
So, in conclusion, the crash is related to zdotc. When I linked the test program against my own version of libarpack.a, which includes the replacement I discussed earlier, the program ran just fine:
$ gfortran zndrv1.f /usr/local/lib/libarpack.a -framework Accelerate
$ ./a.out
Ritz values (Real, Imag) and relative residuals
-----------------------------------------------
Col 1 Col 2 Col 3
Row 1: 7.16197D+02 1.02958D+03 6.80426D-15
Row 2: 7.16197D+02 -1.02958D+03 9.03466D-15
Row 3: 6.87583D+02 1.02958D+03 1.11184D-14
Row 4: 6.87583D+02 -1.02958D+03 1.58575D-14
_NDRV1
======
Size of the matrix is 100
The number of Ritz values requested is 4
The number of Arnoldi vectors generated (NCV) is 20
What portion of the spectrum: LM
The number of converged Ritz values is 4
The number of Implicit Arnoldi update iterations taken is 25
The number of OP*x is 392
The convergence criterion is 1.11022302462515654E-016
So, this doesn't quite answer your question, but it is the closest answer I can think of at the moment: it gives you a log that records the problem, as well as an easily available test case that triggers it.
comment:5 Changed 10 months ago by mmoll@…
Ah, I get it now. If you could submit a patch, that'd be great.
Changed 10 months ago by gcrosswhite@…
- Attachment patches.tar.gz added
Patches to change all CDOTC and ZDOTC calls to work with Accelerate.
comment:6 Changed 10 months ago by gcrosswhite@…
I did a grep through the sources and changed every call to either CDOTC or ZDOTC so that it is treated like a subroutine with the return value stored in the first argument rather than like a function. I did some spot checks to make sure that the resulting library is good. The changes made the double-precision complex-valued tests work (e.g., zndrv* in EXAMPLES/COMPLEX), but for some reason lots of other tests, including the single-precision complex tests in COMPLEX/, fail both before and after making the changes. However, they fail with an error message rather than a segfault, so I don't think that their problem is related to this one; in particular, these changes don't seem to be making anything worse.
I have attached the patches for all of the files that I changed; there are 24 in total: 4 base files * 2 precisions * 3 modes (sequential, parallel MPI, parallel BLACS).
VERY IMPORTANT: You were most likely already going to do this, but just to be sure: make sure that this patch is only applied when using Accelerate! Only Accelerate has the weird ABI issue that requires this rather strange form of patch to work around the quirk, so if the patch is applied when using, say, atlas, then it will actually break things rather than fix them.
comment:7 Changed 10 months ago by mmoll@…
I committed a change in the Portfile that applies your patches in r96280. Please give it a try. One of the patches, patch-SRC-cneupd.f.diff, was 0 bytes. Is that correct?
Changed 10 months ago by gcrosswhite@…
- Attachment patch-SRC-cneupd.f.diff added
Corrected patch for the file SRC/cneupd.f
comment:8 Changed 10 months ago by gcrosswhite@…
Ugh, indeed you caught that one of my patches got screwed up somehow; the corrected version has been attached above. As cneupd.f is not used by my own program, I will try out the new port now.
comment:10 Changed 10 months ago by mmoll@…
- Status changed from new to closed
- Resolution set to fixed
Thanks for your patches. The last patch was added in r96338. Closing this issue.


In the future, please fill in the Port field and Cc the port maintainer(s).