Opened 17 months ago

Last modified 17 months ago

#66307 assigned defect

Scalapack: will not configure properly on PPC since mpi PG cannot handle mpich-gcc* but wants mpich-default

Reported by: barracuda156
Owned by: catap (Kirill A. Korinsky)
Priority: Normal
Milestone:
Component: ports
Version: 2.8.0
Keywords: powerpc, leopard, snowleopard
Cc:
Port: scalapack

Description

The existing configure fails with mpich-gcc. To begin with, it tries to install mpich-default, warning that it is going to fail. After removing the mpi specification line, configure fails with:

-- Check for working Fortran compiler: /opt/local/bin/gfortran-mp-12 - skipped
-- Found MPI_C: /usr/lib/libmpi.dylib (found version "2.0") 
-- Could NOT find MPI_Fortran (missing: MPI_Fortran_LIB_NAMES MPI_Fortran_F77_HEADER_DIR MPI_Fortran_MODULE_DIR MPI_Fortran_WORKS) 
-- Could NOT find MPI (missing: MPI_Fortran_FOUND) (found version "2.0")
-- Found MPI_LIBRARY : FALSE 
CMake Error at CMakeLists.txt:74 (message):
  --> MPI Library NOT FOUND -- please set MPI_BASE_DIR accordingly --


-- Configuring incomplete, errors occurred!

Yet mpich is installed with gfortran. The problem is that ${mpi.fc} is not the name of a compiler but only a prefix, so naturally it cannot be found.

This works correctly:

pre-configure {
    configure.args-append \
        -DMPI_C_COMPILER=${mpi.cc}-mpich-gcc12 \
        -DMPI_Fortran_COMPILER=${mpi.fc}-mpich-gcc12 \
        -DMPIEXEC=${prefix}/bin/${mpi.exec} \
        -DLAPACK_LIBRARIES="-L${prefix}/lib ${linalglib}"
}
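To make the prefix-versus-name point concrete, here is a minimal plain-Tcl sketch. The proc name `mpi_wrapper_name` is hypothetical and not part of the mpi PortGroup, and the prefix values `mpicc` and `mpif90` are assumed stand-ins for what ${mpi.cc} and ${mpi.fc} hold in this situation.

```tcl
# Hypothetical helper, NOT part of the mpi PortGroup: join a bare wrapper
# prefix with an MPI flavour suffix to get the name of an installed wrapper.
proc mpi_wrapper_name {prefix flavour} {
    return "${prefix}-${flavour}"
}

# "mpicc" and "mpif90" are assumed prefix values used only for illustration.
foreach prefix {mpicc mpif90} {
    puts [mpi_wrapper_name $prefix mpich-gcc12]
}
```

A general fix would presumably derive the flavour from the selected mpi variant rather than hard-coding `mpich-gcc12` as the local workaround does.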

Obviously, a general fix should be added either to the PG or the port. I just quoted what fixed the failure locally.

Also, this should be added, otherwise gcc-4.2 is picked and it fails on configure due to unsupported flags:

compiler.blacklist  *gcc-4.*
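As a rough illustration of what that pattern would cover, the plain-Tcl sketch below applies glob matching to a handful of MacPorts-style compiler names. It assumes the blacklist entries are matched as Tcl glob patterns, and the list of names is only a sample.

```tcl
# Assumption: blacklist entries behave like Tcl glob patterns.
set pattern {*gcc-4.*}

# Sample compiler names; not an exhaustive list.
foreach name {gcc-4.2 apple-gcc-4.2 llvm-gcc-4.2 macports-gcc-4.9 macports-gcc-12 clang} {
    set verdict [expr {[string match $pattern $name] ? "blacklisted" : "allowed"}]
    puts [format "%-18s %s" $name $verdict]
}
```

Under that assumption every 4.x GCC flavour is excluded, while newer MacPorts GCCs and clang remain eligible. In a Portfile that already carries blacklist entries, `compiler.blacklist-append` would avoid overwriting them.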

Attachments (1)

science-mpich-default-4.0.3-Porttree.zip (2.3 KB) - added by kencu (Ken) 17 months ago.
greatly simplified mpich-default that I use on older systems


Change History (29)

comment:1 Changed 17 months ago by catap (Kirill A. Korinsky)

Sergey, may I ask you to open PR? :)

comment:2 in reply to:  1 Changed 17 months ago by barracuda156

Replying to catap:

Sergey, may I ask you to open PR? :)

Sure, but we need a solution in the mpi PG so that mpich-gcc is correctly supported; otherwise “required” should be dropped from the mpi options.

comment:3 Changed 17 months ago by kencu (Ken)

Not to mention that gcc12 is not presently a supported compiler on MacPorts for systems < SnowLeopard or for PPC.

The mpich ecosystem is quite difficult to approach, and a general fix for older systems is not simple.

In the end, I just wrote up my own mpich-default port in about 10 minutes that builds and runs on every system from 10.4 PPC on up.

comment:4 in reply to:  3 ; Changed 17 months ago by barracuda156

Replying to kencu:

Not to mention that gcc12 is not presently a supported compiler on MacPorts for systems < SnowLeopard or for PPC.

The time has come. I will make a PR today to move Leopard to modern libgcc and enable building gcc10+ with gcc10-bootstrap. I have been using that across three systems ever since catap made gcc10-bootstrap, and I reckon that counts as thorough testing.

The mpich ecosystem is quite difficult to approach, and a general fix for older systems is not simple.

In the end, I just wrote up my own mpich-default port in about 10 minutes that builds and runs on every system from 10.4 PPC on up.

That can be another solution. Could you commit your update to mpich-default or share it, so that we don’t do the work twice?

Last edited 17 months ago by barracuda156

comment:5 Changed 17 months ago by kencu (Ken)

I will post up my mpich-default port here for you to try out. I just use it as a drop-in replacement for the current one in MacPorts, and don't touch anything else in the Portfiles or PortGroups.

Once mpich-default is installed, scalapack configures easily, finds all the correct compilers, and builds through to the end on 10.4 PPC without any alterations to the scalapack Portfile.

There is, however, a linking error at the very final step that I haven't looked into yet...

comment:6 in reply to:  4 ; Changed 17 months ago by kencu (Ken)

Replying to barracuda156:

The time has come. I will make a PR today to move Leopard to modern libgcc and enable building gcc10+ with gcc10-bootstrap. I have been using that across three systems ever since catap made gcc10-bootstrap, and I reckon that counts as thorough testing.

Sounds good! You need to:

  1. make sure it builds on 10.4, 10.5 and 10.6 PPC and 10.4 and 10.5 Intel
  2. control for the possibility that a user has a newer clang installed on 10.5 Intel that will be used as the assembler (this messes up gcc sometimes).
  3. make some kind of consideration for the fact that libgcc8, 9, 10, and 11 will be missing (I presume), so the libgcc Port will need to somehow handle that
  4. have a "force-deactivate" phase such as the one done when we upgraded those systems from libgcc6 to libgcc7 a few years ago, and like the one I did when I updated 10.6 Intel from libgcc7 to libgcc12.
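For point 4, a force-deactivate step is usually written as a pre-activate block along the following lines. This is a sketch of the common Portfile idiom rather than a tested change, and `libgcc7` is only an illustrative choice of the port to deactivate.

```tcl
# Portfile fragment (sketch only): deactivate an older, conflicting port
# before the new one is activated, so the two do not clash over files.
pre-activate {
    # "libgcc7" is an illustrative port name, not a confirmed requirement.
    if {![catch {set installed [lindex [registry_active libgcc7] 0]}]} {
        # Skip the dependency check so dependents do not block deactivation.
        registry_deactivate_composite libgcc7 "" [list ports_nodepcheck 1]
    }
}
```

A real change would normally also compare the installed version, so that the deactivation fires only for the versions that actually conflict.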

You will most likely need some help, as these things are somewhat hard to do right. Catap here has developed many of the needed skills, and there have been a few folks around with strong opinions on things that will be able to dig in and test your proposal.

Good luck!

comment:7 Changed 17 months ago by kencu (Ken)

Summary changed from “Scalapack: configure options breaking build on PPC” to “Scalapack: will not configure properly if mpi is removed from the Portfile”

comment:8 Changed 17 months ago by kencu (Ken)

I changed the title of your ticket here, as scalapack configures 100% fine if it finds mpich-default to configure against.

However, if you remove the mpi specification line, I suppose it is not a big surprise that it won't configure right.

comment:9 in reply to:  8 Changed 17 months ago by barracuda156

Replying to kencu:

I changed the title of your ticket here, as scalapack configures 100% fine if it finds mpich-default to configure against.

However, if you remove the mpi specification line, I suppose it is not a big surprise that it won't configure right.

Well, I did not remove mpi from the portfile, of course. I only removed the “required” option, which wanted mpich-default, which in turn warned that it is broken. As it is, the settings are wrong: either mpich-default has to be fixed for PPC, or mpich-gcc has to be enabled correctly in the mpi PG.

Last edited 17 months ago by barracuda156

comment:10 Changed 17 months ago by barracuda156

Summary changed from “Scalapack: will not configure properly if mpi is removed from the Portfile” to “Scalapack: will not configure properly on PPC since mpi PG cannot handle mpich-gcc* but wants mpich-default”

comment:11 in reply to:  6 Changed 17 months ago by barracuda156

Replying to kencu:

You will most likely need some help, as these things are somewhat hard to do right. Catap here has developed many of the needed skills, and there have been a few folks around with strong opinions on things that will be able to dig in and test your proposal.

Good luck!

Thank you. I will certainly need help with the Intel part: this is not something I am able to do.

I will check on force deactivation (point 4). As for point 3, all libgccs build fine, though I am not sure we need silly ports that do not install anything but take several hours to build – just to delete their build directory, leaving a line in the registry. AFAIR, libgcc8 installs exactly nothing. libgcc9 installs extension dylibs, though those perhaps should be installed by libgcc10, if needed at all (Iain said we can use symlinks instead). libgcc10 installs a Fortran dylib – again, not sure if that is needed. libgcc11 installs nothing again.

comment:12 Changed 17 months ago by barracuda156

Okay, so to be precise, here is what libgccs install:

libgcc6:

libgfortran.3.dylib

libgcc7:

libgfortran.4.dylib

libgcc8: NOTHING

libgcc9:

libgcc_ext.10.4.dylib
libgcc_ext.10.5.dylib

libgcc10: NOTHING

libgcc11: NOTHING

libgcc12: full runtime, of which the following go into libgcc:

libatomic.1.dylib
libatomic.dylib
libgcc_ehs.1.1.dylib
libgcc_ehs.dylib
libgcc_s.1.1.dylib
libgcc_s.1.dylib
libgcc_s.dylib
libgfortran.5.dylib
libgfortran.dylib
libgomp.1.dylib
libgomp.dylib
libitm.1.dylib
libitm.dylib
libobjc-gnu.4.dylib
libobjc-gnu.dylib
libssp.0.dylib
libssp.dylib
libstdc++.6.dylib
libstdc++.dylib

That is, we have three parasitic dependencies that install strictly nothing but still have to be built. On the fastest G5 – I have a G5 Quad with 16 GB RAM and an SSD – it still takes about 3 hours per arch, so if we consider Leopard and universal builds, that translates into 18 hours of useless compilation (3 ports × 2 archs × 3 hours).
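The 18-hour figure can be reproduced with a few lines of plain Tcl. The port list is taken from the entries marked NOTHING above; reading "universal" as two architectures is an assumption.

```tcl
# Ports listed above as installing nothing.
set parasitic_ports {libgcc8 libgcc10 libgcc11}

# Assumption: a universal build means two architectures.
set arch_count 2

# Roughly 3 hours per architecture on the G5 Quad, as stated above.
set hours_per_arch 3

set total [expr {[llength $parasitic_ports] * $arch_count * $hours_per_arch}]
puts "$total hours"
```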

Last edited 17 months ago by barracuda156

comment:13 Changed 17 months ago by kencu (Ken)

I have not seen that all gccs 8-11 build on 10.4 through 10.6 Intel and PPC, especially in the MacPorts environment.

Where does your assertion come from?

OTOH, I also see no reason to support all those either.

Last edited 17 months ago by kencu (Ken)

comment:14 Changed 17 months ago by kencu (Ken)

btw, we already do use symlinks where it works.

And sure, once it was realized that libgcc11, for example, installs nothing, the libgcc11 port could have been set up to skip building it. You’d have to take that up with Chris, who set that up.

Last edited 17 months ago by kencu (Ken)

comment:15 Changed 17 months ago by kencu (Ken)

Oh, I remember why Chris set it up to always build.

Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.

Also, different OS versions and different archs installed different libraries.

So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.
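The "install what is missing" idea can be sketched in plain Tcl as a simple list difference. This is not the actual libgcc Portfile logic; the proc name `missing_dylibs` and the sample library lists are hypothetical.

```tcl
# Hypothetical proc: return the dylibs from `built` that are not already
# in `provided` (i.e. not supplied by a newer libgcc port).
proc missing_dylibs {built provided} {
    set missing {}
    foreach lib $built {
        if {$lib ni $provided} {
            lappend missing $lib
        }
    }
    return $missing
}

# Sample data only: two extension dylibs are missing, one is already provided.
puts [missing_dylibs {libgcc_ext.10.4.dylib libgcc_ext.10.5.dylib libgcc_s.1.dylib} {libgcc_s.1.dylib}]

# When everything is already provided, the result is an empty list.
puts [llength [missing_dylibs {libgcc_s.1.dylib} {libgcc_s.1.dylib}]]
```

The second call shows the "sometimes that is nothing" case: the port still has to be built before the comparison can be made, which is where the build time goes.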

comment:16 in reply to:  13 Changed 17 months ago by barracuda156

Replying to kencu:

I have not seen that all gccs 8-11 build on 10.4 through 10.6 Intel and PPC, especially in the MacPorts environment.

Where does your assertion come from?

I have built those myself on 10.5.8 and 10A190, in the MacPorts environment, of course. I have used gcc10–gcc12; they all work on Leopard, SL PPC and SL Rosetta.

Not sure if we have any use case for gcc8–gcc10 at all, and certainly not for parasitic libgccs (gcc10-bootstrap is the only gcc10 that is essential). We do want to keep gcc11 for the time being, since gcc12 has occasional failures, see discussion here: https://github.com/iains/darwin-toolchains-start-here/discussions/41

comment:17 in reply to:  15 ; Changed 17 months ago by barracuda156

Replying to kencu:

Oh, I remember why Chris set it up to always build.

Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.

Also, different OS versions and different archs installed different libraries.

So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.

Well, you perhaps remember what Iain said in that regard: it should not be necessary at all.

However, I do not want to push this change – that may prove too hard, and can cause PR to be closed unnecessarily. Also, there is no reason to do it in one go. Once the current PR is merged, we can discuss what to do with unneeded libgccs. For gcc12 we only need gcc10-bootstrap and libgcc12, not other versions.

P.S. By the way, there is one issue which I forgot about: blacklisting of gccs in the portfile of gcc12 has the weird effect of causing a dependency cycle. I have no idea why. But I had to remove at least the blacklist of gcc-4.2, even though it is not used for anything at all, in order for gcc10-bootstrap to do its work.

comment:18 Changed 17 months ago by kencu (Ken)

OK, please post up the actual proven build successes you have (in the PR, or some other place than this scalapack ticket).

No committer will be able to take anyone's word for this as there are too many opportunities for errors and too many systems to cover off.

comment:19 in reply to:  17 Changed 17 months ago by kencu (Ken)

Replying to barracuda156:

Replying to kencu:

Oh, I remember why Chris set it up to always build.

Iain changes things around sometimes between version bumps, so gcc11.1 sometimes installed libraries that gcc11.0 did not, for example, and it was hard to keep up with that.

Also, different OS versions and different archs installed different libraries.

So rather than try to keep track of all that nonsense, the portfile just looks at the dylibs and installs what is missing. Sometimes that is nothing, sometimes that is not nothing.

Well, you perhaps remember what Iain said in that regard: it should not be necessary at all.

It is needed though, as it caused build failures otherwise when libraries could not be found.

However, I do not want to push this change – that may prove too hard, and can cause PR to be closed unnecessarily. Also, there is no reason to do it in one go. Once the current PR is merged, we can discuss what to do with unneeded libgccs.

check.

For gcc12 we only need gcc10-bootstrap and libgcc12, not other versions.

There will be some work to do, as currently libgcc7 depends on libgcc8 which depends on ... through to libgcc12. So that has to be sorted out.

P.S. By the way, there is one issue which I forgot about: blacklisting of gccs in the portfile of gcc12 has the weird effect of causing a dependency cycle. I have no idea why. But I had to remove at least the blacklist of gcc-4.2, even though it is not used for anything at all, in order for gcc10-bootstrap to do its work.

No idea why this would happen.

comment:20 Changed 17 months ago by kencu (Ken)

here is my port of mpich-default 4.0.3. Passes all tests.

3.4.2 built on all systems. I have only tried this updated one so far on Tiger PPC, though.

To run the test suite, you have to "sudo port select python3 python310" or similar, as I didn't as yet rewrite a full python into the test files.

Changed 17 months ago by kencu (Ken)

greatly simplified mpich-default that I use on older systems

comment:21 in reply to:  20 Changed 17 months ago by barracuda156

Replying to kencu:

here is my port of mpich-default 4.0.3. Passes all tests.

Thank you! Any idea why it was even disabled?

I looked through the existing mpich port, and it takes little to enable mpich-default from there. Also, it appears that mpich-default is technically no different from mpich-gcc12 (or whatever the default system compiler is). The only thing required is to blacklist old GCCs, like you have done, or otherwise set the C++11 standard (TBH, I did not check if that is required, but given that gcc-4.2 does not build it, failing immediately at configure, I guess yes), so that the correct GCC is used.

P.S. Also, +native needs a fix for PPC, like I did for folly or something similar (-march=native is not supported on PPC).
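One possible shape for such a +native fix is sketched below as a Portfile fragment. It is not the change made for folly; the variant description is made up, and using `-mtune=native` on PowerPC is an assumption about which flag the GCC in use accepts there.

```tcl
# Portfile fragment (sketch only): pick an architecture-appropriate flag.
variant native description {Tune for the build machine (illustrative)} {
    if {${configure.build_arch} in {ppc ppc64}} {
        # Assumption: -mtune=native is an acceptable substitute on PowerPC,
        # where -march=native is rejected.
        configure.cflags-append     -mtune=native
        configure.cxxflags-append   -mtune=native
    } else {
        configure.cflags-append     -march=native
        configure.cxxflags-append   -march=native
    }
}
```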

comment:22 in reply to:  18 ; Changed 17 months ago by barracuda156

Replying to kencu:

OK, please post up the actual proven build successes you have (in the PR, or some other place than this scalapack ticket).

No committer will be able to take anyone's word for this as there are too many opportunities for errors and too many systems to cover off.

In fact, you should be able to see that from the statistics. For example, you can see libgcc9 installed for PPC: https://ports.macports.org/port/libgcc9/stats/ TBH, I do not get why you doubt that so much, given that we know for a fact that gcc11 and gcc12 build and work on PPC across three systems, starting from Leopard. Moreover, all GCCs are supported by upstream, so they must build, unless MacPorts breaks something on its side. There is nothing surprising in that.

comment:23 in reply to:  22 ; Changed 17 months ago by kencu (Ken)

Replying to barracuda156:

TBH, I do not get why you doubt that so much

I doubt it so much because I've been watching well-meaning PRs fail for many years now, and I have in particular seen how fragile building gcc can be, especially in the "non-sterile" MacPorts environment.

How do you think I got to know Iain so well in the first place ;>

Anyway, if it builds so easily, folks should have no trouble showing that. I can't get anything useful off of the stats website for this question as I would have no idea how they built it, if they did build it.

comment:24 in reply to:  23 ; Changed 17 months ago by barracuda156

Replying to kencu:

Anyway, if it builds so easily, folks should have no trouble showing that.

Well, I have no trouble showing that, but what would you consider the evidence? :)

P.S. But again, we do not need any libgccs aside from libgcc12 in order to move to gcc12.

comment:25 Changed 17 months ago by barracuda156

On a side note, I have built mpich-default on 10.6 by simply allowing it in the portfile. I wonder why it was banned in the first place. It does not seem to need any kind of hacks whatsoever.

comment:26 in reply to:  24 Changed 17 months ago by kencu (Ken)

Replying to barracuda156:

P.S. But again, we do not need any libgccs aside from libgcc12 in order to move to gcc12.

P.S. But again, gcc7 and several other older gccs will then be broken :>

Last edited 17 months ago by kencu (Ken)

comment:27 in reply to:  24 Changed 17 months ago by kencu (Ken)

Replying to barracuda156:

Well, I have no trouble showing that, but what would you consider the evidence? :)

Somebody putting their name down beside verification that the given gcc did indeed build on the given system with the given PR.

So that later, if/when the build fails, we know who to ask.

comment:28 Changed 17 months ago by kencu (Ken)

Anyway, I think it's time to stop talking about hypotheticals in this scalapack ticket, and put the effort into the actual PR. So no more responses about libgcc/gcc from me here.

Re: mpich-default -- that was a project I undertook 18 months ago that I eventually gave up on and wrote my own. If you can sell a PR to the maintainers, feel free to float one.
