
Ticket #22936 (new defect)

Opened 3 months ago

Last modified 7 weeks ago

numpy atlas segmentation fault

Reported by: jjstickel@… Owned by: mcalhoun@…
Priority: Normal Milestone:
Component: ports Version: 1.8.1
Keywords: Cc: jameskyle@…, ram@…
Port: py25-numpy py26-numpy atlas

Description

Not sure if this is a bug in atlas or numpy (I think atlas). If I run the following test script involving a multiplication of two non-square matrices of a reasonable size, I get a segmentation fault:

#!/usr/bin/env python
"""
segfaults with numpy.dot for reasonably sized arrays
"""
import numpy as np
n = 100
# segfault for n > 51
B = np.dot( np.random.rand(n,n), np.random.rand(n,n+1) )
print(B.shape)

A workaround is to use the no_atlas variant of numpy. I tested with both py25-numpy and py26-numpy. I am on Tiger with Intel cpu.
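To pin down the size at which the crash starts, each multiplication can be run in its own interpreter so that a segmentation fault only kills the child process. The following is a minimal sketch, assuming a current Python 3 with `subprocess.run`; the names `run_isolated`, `describe`, and `sweep` are hypothetical helpers, not part of numpy or MacPorts.

```python
import subprocess
import sys

# Same computation as the test script above, parameterised on n.
DOT_TEST = """
import numpy as np
n = {n}
B = np.dot(np.random.rand(n, n), np.random.rand(n, n + 1))
assert B.shape == (n, n + 1)
"""


def run_isolated(code):
    """Run a code string in a fresh interpreter and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


def describe(returncode):
    """Turn an exit status into a short label.

    On POSIX systems a negative status means the child was killed by a
    signal (for example 11 for SIGSEGV).
    """
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


def sweep(sizes):
    """Map each matrix size to the outcome of the dot-product test."""
    return {n: describe(run_isolated(DOT_TEST.format(n=n))) for n in sizes}
```

Calling `sweep(range(45, 60))` with the interpreter that uses the atlas variant of numpy should show where "ok" turns into a signal, if the threshold is as sharp as the comment in the script suggests.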

Change History

  Changed 3 months ago by jjstickel@…

This may be related to this bug:

 http://projects.scipy.org/numpy/ticket/551

But it should be fixed for numpy>=1.3, and so I am not sure.

  Changed 3 months ago by macsforever2000@…

  • cc jameskyle@…, ram@… added
  • keywords numpy atlas removed
  • port set to py25-numpy py26-numpy atlas
  • owner changed from macports-tickets@… to mcalhoun@…

This works fine for me with py26-numpy on Snow Leopard. I even tested with n=500. Can you verify your installed versions?

port installed python25 python26 py25-numpy py26-numpy atlas
python_select -s

  Changed 3 months ago by jjstickel@…

 $ port installed python25 python26 py25-numpy py26-numpy atlas
The following ports are currently installed:
  atlas @3.8.3_1 (active)
  py25-numpy @1.3.0_1
  py25-numpy @1.3.0_1+no_atlas (active)
  py26-numpy @1.3.0_0
  py26-numpy @1.3.0_0+no_atlas (active)
  python25 @2.5.4_9+darwin+darwin_8+macosx (active)
  python26 @2.6.4_0+darwin (active)
$ python_select -s
python25
$ /opt/local/bin/python2.6 numpy_dot_test.py
(100, 101)
$ sudo port deactivate py26-numpy
Password:
--->  Deactivating py26-numpy
$ sudo port activate py26-numpy @1.3.0_0
--->  Activating py26-numpy @1.3.0_0
$ /opt/local/bin/python2.6 numpy_dot_test.py
Segmentation fault
$ uname -mr
8.11.1 i386

Same thing with atlas vs. no_atlas variant of py25-numpy.

  Changed 2 months ago by jjstickel@…

I also found segfaults occurring with py25-scipy on Tiger. Again, using the no_atlas variant provides a workaround. I tried re-installing atlas, but that did not help. Unless someone has suggestions, I guess I can just use "no_atlas" until I upgrade to Snow Leopard sometime this year.

  Changed 2 months ago by jjstickel@…

Another note: same problem in octave, but unfortunately octave does not (yet) have a no-atlas variant (see bug #22997). Atlas is definitely buggy on Tiger, both PPC and Intel. If anyone is motivated to fix atlas on Tiger, that would be great. If not, I would be satisfied for this bug to be closed after #22997 is resolved.

Thanks, Jonathan

follow-up: ↓ 7   Changed 2 months ago by jameskyle@…

I have been unable to reproduce this error on my system.

python2.6, py26-numpy @1.4 10.6, Mac Pro

-james

in reply to: ↑ 6 ; follow-up: ↓ 8   Changed 2 months ago by jjstickel@…

Replying to jameskyle@…:

I have been unable to reproduce this error on my system. python2.6, py26-numpy @1.4 10.6, Mac Pro

Right: it seems that atlas is broken on Tiger (10.4). Are there some simple atlas tests that I can run to prove it?

Thanks, Jonathan

in reply to: ↑ 7   Changed 7 weeks ago by mark.lescroart@…

I have the exact same problem running OSX Leopard (python2.6, py26-numpy@1.4 10.5, Macbook Pro) on two separate systems (mine and my labmate's essentially identical system - his is only a newer Macbook Pro, same OS, same Macports / numpy versions).

Any more progress on this? Is there any way to test ATLAS directly?

Thanks,

Mark
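One possible way to exercise ATLAS without numpy in between is to call the standard CBLAS routine `cblas_dgemm` through `ctypes` and compare it with a pure-Python product. This is only a sketch under assumptions: the library path is passed in by the caller (the listings in this ticket mention `/opt/local/lib/libptcblas.dylib`), the enum values are those of the standard `cblas.h`, and `naive_matmul` and `cblas_matmul` are hypothetical helper names.

```python
import ctypes

# Enum values from the standard cblas.h header.
CBLAS_ROW_MAJOR = 101
CBLAS_NO_TRANS = 111


def naive_matmul(a, b, m, n, k):
    """Reference product of row-major a (m x k) and b (k x n), as a flat list."""
    return [sum(a[i * k + p] * b[p * n + j] for p in range(k))
            for i in range(m) for j in range(n)]


def cblas_matmul(lib_path, a, b, m, n, k):
    """Compute the same product with cblas_dgemm from the given library.

    lib_path is an assumption supplied by the caller; nothing here checks
    that the library really is ATLAS.
    """
    lib = ctypes.CDLL(lib_path)
    dgemm = lib.cblas_dgemm
    dgemm.restype = None
    dptr = ctypes.POINTER(ctypes.c_double)
    # Order, TransA, TransB, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc
    dgemm.argtypes = [ctypes.c_int] * 6 + [
        ctypes.c_double, dptr, ctypes.c_int,
        dptr, ctypes.c_int,
        ctypes.c_double, dptr, ctypes.c_int,
    ]
    a_buf = (ctypes.c_double * (m * k))(*a)
    b_buf = (ctypes.c_double * (k * n))(*b)
    c_buf = (ctypes.c_double * (m * n))()
    # Row-major storage: the leading dimension is the number of columns.
    dgemm(CBLAS_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS,
          m, n, k, 1.0, a_buf, k, b_buf, n, 0.0, c_buf, n)
    return list(c_buf)
```

If ATLAS itself is at fault, a call such as `cblas_matmul("/opt/local/lib/libptcblas.dylib", a, b, 100, 101, 100)` would be expected to crash the interpreter much like `numpy.dot` does; if it returns values matching `naive_matmul`, the numpy build becomes the more likely suspect.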


  Changed 7 weeks ago by jameskyle@…

I worked a bit with Mark above and he provided some good feedback. The problem seems pretty elusive, but I did notice one discrepancy between my install on 10.6 and his install on Tiger.

py26-numpy (and atlas) are built using gcc43 by default.

On my system, besides the core atlas libraries, the py26-numpy libs are only linked to the libSystem.B.dylib library.

/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/_dotblas.so:
        /opt/local/lib/libptf77blas.dylib (compatibility version 0.0.0, current version 0.0.0)
        /opt/local/lib/libptcblas.dylib (compatibility version 0.0.0, current version 0.0.0)
        /opt/local/lib/libatlas.dylib (compatibility version 0.0.0, current version 0.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 124.1.1)

The cblas libraries on 10.4, however, were also linked against the system gcc libraries:

/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/numpy/core/_dotblas.so:
 /opt/local/lib/libptf77blas.dylib (compatibility version 0.0.0, current version 0.0.0)
 /opt/local/lib/libptcblas.dylib (compatibility version 0.0.0, current version 0.0.0)
 /opt/local/lib/libatlas.dylib (compatibility version 0.0.0, current version 0.0.0)
 /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0) <==================
 /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.1.4)

I could easily see cross linking leading to incorrect symbol lookups and the segfault behavior being seen.
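The comparison of the two listings can be automated. The sketch below assumes the listings come from `otool -L` (their format suggests it, but the ticket does not say so) and uses hypothetical helper names `linked_libraries`, `extra_libraries`, and `otool_listing`.

```python
import subprocess


def linked_libraries(listing):
    """Extract library paths from an `otool -L` style listing."""
    libs = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "path/to/object:" header line.
        if not line or line.endswith(":"):
            continue
        # Drop the "(compatibility version ...)" suffix and anything after it.
        libs.append(line.split(" (compatibility", 1)[0])
    return libs


def extra_libraries(reference_listing, suspect_listing):
    """Return libraries linked in the suspect build but not in the reference."""
    reference = set(linked_libraries(reference_listing))
    return sorted(lib for lib in set(linked_libraries(suspect_listing))
                  if lib not in reference)


def otool_listing(path):
    """Run `otool -L` on a file (macOS only) and return its output."""
    result = subprocess.run(["otool", "-L", path],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Fed the two `_dotblas.so` listings above, `extra_libraries` reports only `/usr/lib/libgcc_s.1.dylib`, which is the discrepancy marked with the arrow.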

This would also be in the py26-numpy build process. If I have time this weekend, I'll give it a closer look.

-james

  Changed 7 weeks ago by jameskyle@…

I should add that mark said he did *not* compile with the no_gcc43 variant.
