Opened 11 years ago

Closed 9 years ago

#38766 closed defect (invalid)

Building atlas with clang 3.3 needs excessive memory

Reported by: bgschaid@… Owned by: Veence (Vincent)
Priority: Normal Milestone:
Component: ports Version: 2.1.3
Keywords: Cc: cooljeanius (Eric Gallager), jere@…, theorikbn@…
Port: atlas

Description (last modified by ryandesign (Ryan Carsten Schmidt))

Tried to do a routine upgrade of the installed software. Amongst others:

port outdated
The following installed ports are outdated:
atlas                          3.10.1_2 < 3.10.1_3 

The installed atlas is

port installed atlas
The following ports are currently installed:
  atlas @3.10.1_2+gcc45 (active)

The update started by fetching clang3.3, which made me suspicious because in the past I have had the experience that clang needs much more memory than gcc, but I figured "Hey. Surely the guy who packaged it knows what he's doing". Compilation of atlas took several hours. When I left the computer and returned an hour later, it turned out that the disk, which previously had 20 GB free, had filled up because of bloated swap files. The computer was thus unusable. The only thing I could see from an htop I had running was some program of user macports with 14 GB virtual and 4.5 GB resident memory.

As gcc45 is no longer an option for atlas I did a "port upgrade atlas +gcc46" and the compilation finished after a quarter of an hour. Maybe that has to do with

Warning: GCC compilers on MacOS do not support AVX: downgrading.

Anyway: it would be nice if the default settings for the port were such that it compiles without problems on machines with a moderate amount of memory (mine has 8 GB, the maximum that fits into that model). Those who need the extra speed that clang might provide can always choose that variant, I think.

Change History (14)

comment:1 Changed 11 years ago by ryandesign (Ryan Carsten Schmidt)

Description: modified
Owner: changed from macports-tickets@… to vince@…

It is intentional that atlas now defaults to clang; see r104549.

Yes, it's known that clang can use much more memory than gcc in some circumstances.

MacPorts usually starts multiple compiler processes, and by default it limits this to one process per CPU core or 1 process per GB of memory, whichever is less. But these limits were decided upon before we started using clang. Perhaps we should decrease this to one process per 2 GB of memory when clang is in use.

Individual ports can override this e.g. using use_parallel_build no to turn off parallel building entirely, and to my surprise, the atlas portfile already does this. So either the atlas build system is taking matters into its own hands about how many jobs to start, in which case it should be disabused of that notion, or a single clang process is taking that much memory, in which case that's very unfortunate.
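
The job limit described in this comment can be sketched as a small calculation. This is only an illustration of the rule as stated here (one process per CPU core or one per GB of memory, whichever is less), not MacPorts' actual implementation; the function name `max_build_jobs` and the `gb_per_job` parameter are hypothetical. Setting `gb_per_job=2.0` models the suggested, not implemented, adjustment for clang.

```python
import math


def max_build_jobs(cpu_cores: int, memory_gb: float, gb_per_job: float = 1.0) -> int:
    """Return how many parallel compiler jobs the described rule would allow.

    gb_per_job=1.0 mirrors the default described in the comment;
    gb_per_job=2.0 models the proposed limit when clang is in use.
    """
    if cpu_cores < 1 or memory_gb <= 0 or gb_per_job <= 0:
        raise ValueError("cpu_cores, memory_gb and gb_per_job must be positive")
    # Number of jobs the memory budget allows, rounded down.
    jobs_by_memory = math.floor(memory_gb / gb_per_job)
    # Take the smaller of the two limits, but always allow at least one job.
    return max(1, min(cpu_cores, jobs_by_memory))


if __name__ == "__main__":
    print(max_build_jobs(4, 8))        # 4 cores, 8 GB, 1 GB per job
    print(max_build_jobs(4, 8, 2.0))   # same machine, 2 GB per job
    print(max_build_jobs(8, 10, 2.0))  # 8 cores, 10 GB, 2 GB per job
```

Note that on a 4-core machine with 8 GB the stricter rule would still allow 4 jobs, so it would not by itself explain a single process growing to 14 GB of virtual memory.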

comment:2 Changed 11 years ago by cooljeanius (Eric Gallager)

Cc: egall@… added

Cc Me!

comment:3 Changed 11 years ago by bgschaid@…

Replying to ryandesign@…:

> It is intentional that atlas now defaults to clang; see r104549.

OK. But judging from the version number, Clang 3.3 is still in development. Wouldn't it make sense to only use stable versions (Clang 3.2) as the default? Maybe atlas compiled with reasonable memory usage with a previous version of 3.3. I guess you're not testing every dependent package when a new version of clang 3.3 is uploaded, so that would be a rapidly moving target.

> Yes, it's known that clang can use much more memory than gcc in some circumstances.
>
> MacPorts usually starts multiple compiler processes, and by default it limits this to one process per CPU core or 1 process per GB of memory, whichever is less. But these limits were decided upon before we started using clang. Perhaps we should decrease this to one process per 2 GB of memory when clang is in use.
>
> Individual ports can override this e.g. using use_parallel_build no to turn off parallel building entirely, and to my surprise, the atlas portfile already does this. So either the atlas build system is taking matters into its own hands about how many jobs to start, in which case it should be disabused of that notion, or a single clang process is taking that much memory, in which case that's very unfortunate.

That was something I noticed: in the beginning the build only used one CPU (my first thought: "somebody is feeling uneasy about clang"). Later the CPU indicator was fully "filled". But it's hard to tell whether this was accurate with the machine being almost unresponsive.

comment:4 Changed 11 years ago by Veence (Vincent)

Hallo!

Clang 3.2 was released with a bug in the AVX assembler, so if you try to compile Atlas on recent hardware, it will produce faulty code. The bug was fixed early in the clang3.3 stage, so it is a logical choice to use it as the compiler, even though it is only a development version. If you use clang3.2 or gcc instead, you have to give up using AVX and downgrade to SSE, and you lose 50% bandwidth.

As for the memory use, I’d bet this is not caused directly by Clang, but by the way Atlas probes the machine in order to pick the best parameters for compilation: it tries every possible combination and then evaluates the resulting code by running it in a loop. By the way, Atlas has precompiled values for gcc47, so it does not have to go through this time-consuming phase. However, since clang failed to pass most of the tests up to the 3.3 version, no values are available for it.
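
The probing described above (try each candidate, run it in a loop, keep the fastest) can be illustrated with a toy sketch. This is not ATLAS's search code; the kernel `blocked_sum_of_squares`, the function `tune`, and the candidate block sizes are all hypothetical and only show the general idea of empirical tuning. Precomputed values, where available, would let a build skip a search like this.

```python
import time


def blocked_sum_of_squares(data, block_size):
    """Toy kernel: sum the squares of `data`, one block at a time."""
    total = 0.0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        total += sum(x * x for x in block)
    return total


def tune(kernel, data, candidates, repeats=3):
    """Time the kernel for every candidate and return (best, timings).

    For each candidate the fastest of `repeats` runs is recorded,
    which reduces the influence of one-off system noise.
    """
    timings = {}
    for candidate in candidates:
        best_time = float("inf")
        for _ in range(repeats):
            started = time.perf_counter()
            kernel(data, candidate)
            best_time = min(best_time, time.perf_counter() - started)
        timings[candidate] = best_time
    # The winner is the candidate with the smallest recorded time.
    best_candidate = min(timings, key=timings.get)
    return best_candidate, timings


if __name__ == "__main__":
    sample = [float(i) for i in range(100_000)]
    best, results = tune(blocked_sum_of_squares, sample, [16, 64, 256, 1024])
    print("fastest block size:", best)
```

The cost of such a search grows with the number of candidates and repeats, which is consistent with the long build times reported in this ticket, although it does not by itself account for the memory growth.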

Last edited 11 years ago by Veence (Vincent)

comment:5 Changed 11 years ago by Veence (Vincent)

Resolution: wontfix
Status: changed from new to closed

I’ve compiled Atlas with clang3.3 more than once and never stumbled upon such a phenomenon. Memory utilization seems to be fairly close to that of gcc. It might have been caused by the vectorizing routines, which I have since disabled. So please go ahead and rebuild, if you have not already done so.

comment:6 Changed 11 years ago by tanner@…

Hello. The same problem occurs on my i7 with 10 GB RAM, MacPorts 2.1.3 and atlas-3.10.1. After several hours of compilation, the program build/tune/lapack/xdlanbsrch_pt uses excessive amounts of RAM and CPU, and I had to kill it.

comment:7 Changed 11 years ago by Veence (Vincent)

This is certainly a bug in clang 3.3 when compiling for the Core i7 architecture. It does not happen on my Core i5 machine, on which Atlas builds in ca. 2 hours with 8 GiB RAM. It would probably be worth looking in the LLVM bugzilla to see if something of that ilk has already been reported.

comment:8 Changed 11 years ago by bgschaid@…

Well, I beg to differ: I didn't say it in the original posting, but my machine is a Core i5.

Maybe it is even worse: a previous version of LLVM/clang (the original problem was 6 weeks ago) had that problem on the i5. Now they "fixed" it and something similar pops up on the i7. There seems to be more development going on on the clang side than on the GCC side ... which is not always a good thing.

As long as clang is such a moving target, I'd suggest erring on the side of lower performance (but higher stability) and setting a stable compiler suite as the default. Those who really need the performance of Atlas (and don't just install it because it is a dependency of some other package) can always set that variant.

comment:9 Changed 10 years ago by lpsinger (Leo Singer)

Resolution: wontfix
Status: changed from closed to reopened

I'm seeing some crazy CPU and memory usage on my Core i7 MacBook Pro running Mavericks.

I am installing the port with default variants, i.e. atlas @3.10.1_5 +mpclang33.

The process that is going insane is indeed xdlanbsrch_pt.

comment:10 Changed 10 years ago by regnauld@…

Same here - 10.9, clang3.3, i7 MBA. 3-4 hours so far, no end in sight, temperature on the CPU pegged at 101 C and 320% (!) CPU usage.

comment:11 Changed 10 years ago by Veence (Vincent)

Ok, it seems clang3.3 is buggy with Core i7 processors. What options did you choose?

comment:12 Changed 9 years ago by jere@…

Cc: jere@… added

Cc Me!

comment:13 Changed 9 years ago by theorikbn@…

Cc: theorikbn@… added

Cc Me!

comment:14 Changed 9 years ago by Veence (Vincent)

Resolution: invalid
Status: changed from reopened to closed

This should now be superseded by new versions of clang. Closing.
