Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#63031 closed defect (wontfix)

mpich-default @3.4.1_3: seems to be disabled on Leopard -- but installs easily

Reported by: kencu (Ken)
Owned by: mascguy (Christopher Nielsen)
Priority: Normal
Milestone:
Component: ports
Version:
Keywords:
Cc: eborisch (Eric A. Borisch), jmroot (Joshua Root)
Port: mpich-defaul

Description (last modified by kencu (Ken))

I was getting around to upgrading the ports on some of my older systems, and on my main Leopard workhorse, mpich-default would not upgrade.

It said it was not supported on Leopard, which came as a surprise to me, as it installed without any trouble at all a few weeks ago, and the version has not changed.

Things in the mpich-* world have become rather impressively more complicated since the last time I looked, with the port trying to do a lot of figuring out what compilers do and don't work on which systems. That is such a moving target it will be hard to keep that logic current. For example, it says that gcc9 won't build on 10.7, or that clang-11 won't build on 10.6, but they do, AFAIK, and if they don't build today, they will very shortly.

Anyway, after a fair amount of trying to figure out what was going on, I (think I) found the switch that was disabling mpich-default on Leopard, and with this little patch:

$ diff -u Portfile `port file mpich-default`
--- Portfile	2021-06-05 19:31:01.000000000 -0700
+++ /opt/local/var/macports/sources/rsync.macports.org/macports/release/tarballs/ports/science/mpich/Portfile	2021-06-05 19:12:23.000000000 -0700
@@ -82,7 +82,7 @@
 dict set clist gcc7 {macports-gcc-7}
 
 # Only enable default (gcc), and Xcode clang, for MacOS 10.7 and later
-if { ${os.major} >= 11 } {
+if { ${os.major} >= 1 } {
     dict set clist default {}
     dict set clist clang   {clang}
 } else {

all was well in the world once again. mpich-default builds through with the lowly, 15-year-old /usr/bin/gcc-4.2 on Leopard, so its compiler requirements would appear to be quite modest, in the end.

$ port -v installed mpich-default
The following ports are currently installed:
  mpich-default @3.4.1_1+gcc7 requested_variants='+gcc7' platform='darwin 9' archs='i386' date='2021-03-17T14:22:49-0700'
  mpich-default @3.4.2_0+gcc7 (active) requested_variants='' platform='darwin 9' archs='i386' date='2021-06-05T19:28:20-0700'
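
For readers matching the numbers in the patch to system names: `os.major` in a Portfile is the Darwin kernel major version, so `darwin 9` in the output above is Mac OS X 10.5 (Leopard), and the `>= 11` test corresponds to 10.7 (Lion) and later. A minimal shell sketch of that mapping follows. The function name `darwin_to_macos` is made up for illustration, and it only covers Darwin 8 through 19 (10.4 through 10.15), the range of systems mentioned in this ticket.

```shell
#!/bin/sh
# Map a Darwin kernel major version (what a Portfile sees as os.major)
# to the corresponding Mac OS X / macOS 10.x release number.
# Hypothetical helper, limited to Darwin 8..19 (10.4 .. 10.15).
darwin_to_macos() {
    if [ "$1" -ge 8 ] && [ "$1" -le 19 ]; then
        echo "10.$(( $1 - 4 ))"
    else
        echo "unknown"
        return 1
    fi
}

darwin_to_macos 9     # prints 10.5 (Leopard)
darwin_to_macos 11    # prints 10.7 (Lion)
```

On a live system the argument could come from `uname -r | cut -d. -f1`.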

Change History (46)

comment:1 Changed 3 years ago by kencu (Ken)

Description: modified

comment:2 Changed 3 years ago by kencu (Ken)

I know it is often tempting to try to look at buildbot failures and then go into certain Portfiles to disable the building.

But -- it's useful to consider if these buildbot failures are something deep that will never be fixed (like qt15.x not building on Tiger) or something that is quite a bit more likely to be repaired once someone has 10 seconds (like gcc9 not currently building on Lion, apparently -- can't verify that).

For what it is worth:

Every version of gcc builds all the way back to Tiger. I know MacPorts has turned off the builds of gcc8+ on older systems presently. There is no good reason for that, except that I just haven't got round to fixing that yet.

clangs all the way up to clang-10 are running fine on my Leopard Intel system.

$ uname -a
Darwin macpro-as-Leopard.local 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386 i386

$ port -v installed | grep clang
  clang-3.4 @3.4.2_14 (active) requested_variants='' platform='darwin 9' archs='i386' date='2021-03-11T16:01:03-0800'
  clang-3.7 @3.7.1_7 (active) requested_variants='' platform='darwin 9' archs='i386' date='2021-03-11T16:34:54-0800'
  clang-7.0 @7.1.0_0+emulated_tls+libstdcxx (active) requested_variants='+emulated_tls+libstdcxx' platform='darwin 9' archs='i386' date='2021-03-11T21:46:49-0800'
  clang-8.0 @8.0.1_1+emulated_tls+libstdcxx requested_variants='+emulated_tls+libstdcxx' platform='darwin 9' archs='i386' date='2021-03-12T16:28:34-0800'
  clang-9.0 @9.0.1_3+emulated_tls+libstdcxx requested_variants='+emulated_tls+libstdcxx' platform='darwin 9' archs='i386' date='2021-03-12T00:05:33-0800'
  clang-10 @10.0.1_4+emulated_tls+libstdcxx requested_variants='+emulated_tls+libstdcxx' platform='darwin 9' archs='i386' date='2021-03-13T00:47:42-0800'
  clang_select @2.2_0 (active) requested_variants='' platform='darwin 9' archs='noarch' date='2021-03-11T15:00:30-0800'

We shouldn't spend a lot of time migrating the logic for which compilers are currently running on which systems too deeply into the Portfiles.

Figuring out how to turn this off on mpich-default was rather difficult.

Last edited 3 years ago by kencu (Ken)

comment:3 Changed 3 years ago by kencu (Ken)

PS: I just haven't tried to build clang-11 or clang-12 on Leopard yet -- that is actually what I was up to when I went to start upgrading these ports :>

comment:4 in reply to:  2 Changed 3 years ago by mascguy (Christopher Nielsen)

Cc: jmroot added

Well, per issue:62878 and issue:62887, it's desirable to block known build failures. Otherwise the buildbots waste time on ports that we know will fail.

> Every version of gcc builds all the way back to Tiger. I know MacPorts has turned off the builds of gcc8+ on older systems presently. There is no good reason for that, except that I just haven't got round to fixing that yet.

Sounds like you have some work to do then. ;-)

Of note, both openmpi-gcc7 and mpich-gcc7 are supported across-the-board, without restriction. So that's one potential option, if those subports build successfully on 10.5.

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:5 Changed 3 years ago by kencu (Ken)

okey-dokey.

If anyone can see some way to help make this port's attempts to follow the compilers more accurate, please volunteer.

For now, I think my little tweak above just shuts the whole mechanism off, so that works for my personal ports.

Last edited 3 years ago by kencu (Ken)

comment:6 Changed 3 years ago by kencu (Ken)

Resolution: worksforme
Status: assigned → closed

comment:7 Changed 3 years ago by kencu (Ken)

Resolution: worksforme
Status: closed → reopened

might as well leave this open I guess, for whoever else might come along later and want to know how to install this.

how to actually fix this properly I can't say. The portfile has become too obtuse for me to suggest a PR.

how to actually keep this portfile up-to-date with the compilers situation on macports is totally beyond me...

comment:8 in reply to:  7 Changed 3 years ago by mascguy (Christopher Nielsen)

> how to actually keep this portfile up-to-date with the compilers situation on macports is totally beyond me...

You needn't worry, openmaintainer was specifically removed to avoid having too many cooks in the kitchen.

> how to actually fix this properly I can't say. The portfile has become too obtuse for me to suggest a PR.

It's actually quite simple. Look for the following comment, in the "Target Compiler Logic" section of the port:

# Compilers supported across-the-board
dict set clist gcc7 {macports-gcc-7}

Simply put your additions there, without surrounding them with any if-then logic.

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:9 Changed 3 years ago by kencu (Ken)

I'll see if I can wrap my noggin around the portfile logic.

Last edited 3 years ago by kencu (Ken)

comment:10 Changed 3 years ago by kencu (Ken)

I will see if mpich-gcc7 covers the bases on the systems I'm using. Should do.

Last edited 3 years ago by kencu (Ken)

comment:11 Changed 3 years ago by kencu (Ken)

OK -- it looks like sundials, which is what I need, requires mpich-default, and AFAICT it is not satisfied by any other mpi subport.

So it appears we need to change things so that at least that one port can install everywhere.

The mpich portfile already uses the compilers portgroup, and this seems to be updated very carefully with the current MP compiler settings, so I'm not completely sure why we couldn't just stay with that.

But be that as it may, if we can sort out how to get this building on the systems it built on a few weeks ago, that should be OK. All the other gazillion subports and variants of this behemoth can languish I guess, so long as the one everything needs, which seems to be mpich-default, works.

Unless I still don't see how the interaction between sundials and mpich-* works, which is entirely possible, sadly.

Last edited 3 years ago by kencu (Ken)

comment:12 Changed 3 years ago by mascguy (Christopher Nielsen)

Installing via sundials +openmpi +gcc7 or sundials +mpich +gcc7 should be possible, without changes to our MPI ports.

Do those not work? If so, what's failing?
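
One way to discover those variant combinations is to ask `port` directly. A minimal sketch, assuming the variant names from the comment above (`+mpich`, `+openmpi`, `+gcc7`) are still offered by the sundials port; `port variants` lists what is available, and `port -y` requests a dry run, so nothing should be installed:

```shell
#!/bin/sh
# Variants named in comment:12; swap +mpich for +openmpi to use Open MPI instead.
variants="+mpich +gcc7"

if command -v port >/dev/null 2>&1; then
    # List the variants the sundials port offers on this system
    port variants sundials
    # Dry run: report what would be built, without installing anything.
    # $variants is left unquoted on purpose so each variant is a separate argument.
    port -y install sundials $variants
else
    echo "MacPorts 'port' command not found" >&2
fi
```

Dropping `-y` and running the install line with `sudo` performs the actual install.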

comment:13 Changed 3 years ago by kencu (Ken)

how would someone know to do that?

how do you turn on mpich-default again?

Last edited 3 years ago by kencu (Ken)

comment:14 Changed 3 years ago by mascguy (Christopher Nielsen)

The last I checked, mpich-default doesn't build successfully on our buildbots, for 10.6. That's why it was disabled for 10.6 and earlier.

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:15 Changed 3 years ago by mascguy (Christopher Nielsen)

This is the buildbot failure for 10.6:

Undefined symbols:
  "___floatsitf", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___multf3", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___netf2", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___addtf3", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___eqtf2", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___gttf2", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___lttf2", referenced from:
      flat_namespace undefines in libpmpi.dylib
ld: symbol(s) not found

https://build.macports.org/builders/ports-10.6_x86_64-builder/builds/57038
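
The symbols in that log (`___addtf3`, `___multf3`, and so on) are compiler runtime helpers for 128-bit floating-point arithmetic, which GCC normally supplies from libgcc. One plausible reading is that `libpmpi.dylib` contains objects that expect that runtime but was linked without it. That is an inference from the symbol names, not something established in this ticket. A small shell sketch for pulling the unique symbol names out of a saved linker log follows; the function name `undefined_symbols` is hypothetical.

```shell
#!/bin/sh
# Print the unique symbol names from an ld "Undefined symbols:" report.
# Reads the named log file(s), or standard input when none are given.
undefined_symbols() {
    # Symbol lines look like:   "___multf3", referenced from:
    sed -n 's/^[[:space:]]*"\(_[^"]*\)", referenced from:.*$/\1/p' "$@" | sort -u
}

# Example using two entries copied from the buildbot log above
undefined_symbols <<'EOF'
Undefined symbols:
  "___floatsitf", referenced from:
      flat_namespace undefines in libpmpi.dylib
  "___multf3", referenced from:
      flat_namespace undefines in libpmpi.dylib
ld: symbol(s) not found
EOF
```

On a system where the library did build, `nm -u libpmpi.dylib` lists the undefined symbols it still expects to find at load time, which may help compare a working build against the failing one.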

comment:16 Changed 3 years ago by kencu (Ken)

OK, I see. Let me see what I can do about that.

Thanks.

comment:17 Changed 3 years ago by kencu (Ken)

Resolution: wontfix
Status: reopened → closed

I am unable to decipher the tangle here to sort out how to properly re-enable mpich-default. I'm sorry; my failing. I just don't want to spend the time needed to figure out how to follow all the enabling and disabling going on.

I just wrote up my own little mpich-default portfile which installs without issues on all the systems I have using the default compiler on each system and with fortran enabled on all of them.

I will put it up in my repos.

comment:18 Changed 3 years ago by mascguy (Christopher Nielsen)

Ken, if you can provide the repo/branch, I'll see if it's feasible for openmpi/mpich.

comment:19 Changed 3 years ago by kencu (Ken)

I realize my frustration is showing :>

There is a certain elegant beauty to simplicity. Have a look at the formula for mpich on Homebrew.

Here's the Portfile I'm using on my systems to install mpich-default. It installs, with Fortran support, using MacPorts' default compiler, with no fuss. (I have installed it on 10.4, 10.5, 10.6, 10.7, 10.14, and 10.15 so far.)

I haven't verified universal or just PowerPC yet.

https://github.com/kencu/myports/blob/master/science/mpich/Portfile

TBH, I can just wrap my head around this better.

Last edited 3 years ago by kencu (Ken)

comment:20 Changed 3 years ago by mascguy (Christopher Nielsen)

Resolution: wontfix
Status: closed → reopened

Reopening, as there might be an opportunity to improve openmpi/mpich.

comment:21 in reply to:  19 Changed 3 years ago by mascguy (Christopher Nielsen)

Replying to kencu:

> There is a certain elegant beauty to simplicity. Have a look at the formula for mpich on Homebrew.

While many of their formulas are indeed simple, they also limit macOS support to 10.14+. That alone makes things far easier. (Though frankly, I'm glad we support macOS releases back to 10.6. It's a key differentiator, and one that I'm proud to support!)

However, their formula for mpich is anything but simple. Which isn't a surprise, as some hoops are required to properly configure and build precisely what's desired:

https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/mpich.rb

> Here's the Portfile I'm using on my systems to install mpich-default. It installs, with Fortran support, using MacPorts' default compiler, with no fuss. (I have installed it on 10.4, 10.5, 10.6, 10.7, 10.14, and 10.15 so far.)

The existing mpich-default should work as-is, if you simply add the following to the section entitled "Compilers supported across-the-board:"

dict set clist default {}

What's the error/issue you're encountering, using our existing port with the aforementioned change?

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:22 Changed 3 years ago by kencu (Ken)

Chris, skipping their linux support and a bit of boilerplate, their entire formula is:

 args = %W[
      --disable-dependency-tracking
      --enable-fast=all,O3
      --enable-g=dbg
      --enable-romio
      --enable-shared
      --with-pm=hydra
      FC=gfortran-#{Formula["gcc"].any_installed_version.major}
      F77=gfortran-#{Formula["gcc"].any_installed_version.major}
      --disable-silent-rules
      --prefix=#{prefix}
      --mandir=#{man}
    ]

    # Flag for compatibility with GCC 10
    # https://lists.mpich.org/pipermail/discuss/2020-January/005863.html
    args << "FFLAGS=-fallow-argument-mismatch"
    args << "CXXFLAGS=-Wno-deprecated"
    args << "CFLAGS=-fgnu89-inline -Wno-deprecated"

    system "./configure", *args

    system "make"
    system "make", "install"
  end

Which is all I do too, in TCL.

I will see if I can try your suggested fix.

comment:23 Changed 3 years ago by kencu (Ken)

BTW -- there are no adjustments or fixes needed for any older systems -- this works on 10.4 Intel up.

(The darwin PPC build has a few bits of bitrot in it that I fixed by hand and may make a local patch for.)

comment:24 Changed 3 years ago by kencu (Ken)

I am OK closing this again. I will carry my own mpich Port forever, as it is so totally trivial to maintain.

comment:25 Changed 3 years ago by mascguy (Christopher Nielsen)

Okay, but still wondering: What's the error/issue you encountered, using our existing port (with the minor change in comment:21)?

comment:26 Changed 3 years ago by kencu (Ken)

I will be happy to give that a try.

BTW, gcc9 works fine again on 10.7 as of yesterday, and gcc8 remains the recommended gcc version for 10.6 at present. I believe the openmp port has different settings than that at present, so might need some ongoing updates.

comment:27 Changed 3 years ago by mascguy (Christopher Nielsen)

gcc7 is working for both openmpi/mpich on 10.6. gcc8 has been retired, to keep the number of compilers sane. (I believe you were on the discussion thread for the compiler pruning effort.)

comment:28 Changed 3 years ago by kencu (Ken)

Well, I saw you were pruning, but I was not on the pruning thread to the depth that I would have suggested retiring the primary gcc compiler for a system.

But it's fine. So long as mpich-default works everywhere, I think all will be well. For me, mostly it's for Octave, which presently runs on all our supported systems, and I'm trying not to lose that one important feature during this purging process.

Last edited 3 years ago by kencu (Ken)

comment:29 Changed 3 years ago by mascguy (Christopher Nielsen)

To repeat the basic question, from comment:25:

What is the specific error/issue you encountered, after applying the minor change mentioned in comment:21?

comment:30 Changed 3 years ago by mascguy (Christopher Nielsen)

Also, just a quick FYI:

Updated the mpiutil portgroup, to only add clang-9.0 as a build dependency for openmpi/mpich gcc builds on 10.6 through 10.8. That's one less variable in the 10.5 mix.

https://github.com/macports/macports-ports/commit/2745696a941e5c3b949b3efd9f4ba9a3b014813d

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:31 in reply to:  15 Changed 3 years ago by mascguy (Christopher Nielsen)

Ken, any thoughts on the missing symbols issue noted in comment:15?

comment:32 Changed 3 years ago by kencu (Ken)

I think somewhere in all this, mpich-default tries to blacklist all the macports-clang-* compilers on 10.6.

I see this all as being massively over-engineered at present. Why are we making this such a holey mess???

I just fixed my uber-simple mpich to work on PPC Tiger and Leopard as well, trivial patch.

https://github.com/kencu/myports/blob/master/science/mpich/Portfile

So that's every system now with one simple Portfile.

Last edited 3 years ago by kencu (Ken)

comment:33 Changed 3 years ago by kencu (Ken)

No doubt I don't understand the deep deep complexities of mpich, which requires all this on MacPorts (but nowhere else) for some purpose that I can't fathom.

Last edited 3 years ago by kencu (Ken)

comment:34 Changed 3 years ago by mascguy (Christopher Nielsen)

It's not trivial to support numerous target compilers, and universal support complicates it further.

To be fair though, I've also run both openmpi and mpich through my patented obfuscation tool. It's affectionately named, "The Ken Cunningham Infuriator (tm)." In addition to automated portfile refactoring, it also adds those big honking comment blocks which you love. LOL

comment:35 Changed 3 years ago by kencu (Ken)

So...

let's just dump all the target compilers. What on Earth are they for? Just use mpich with the default compiler.

That's all I (or anyone else) do, and it seems just peachy, on every system from Tiger up.

Get rid of 10,000 lines of gobbledygook Tcl, and several Portgroups.

comment:36 Changed 3 years ago by mascguy (Christopher Nielsen)

Well, when using MacPorts target compilers, we can publish binaries. And that provides a better user experience.

As for the complexity, sometimes it's necessary. For example, can you trim the various LLVM-related ports down to 20 lines - without the help of a portgroup - and without losing any of the functionality provided today?

That's a rhetorical question, and doesn't need an answer.

Regardless, I'd still like to assist if possible. But it would help to keep the discussion focused on the issue, sans emotion.

Last edited 3 years ago by mascguy (Christopher Nielsen)

comment:37 in reply to:  36 Changed 3 years ago by kencu (Ken)

Resolution: wontfix
Status: reopened → closed

You have a VM of 10.6 I believe, so you actually don't need me to do anything.

The current mpich-* setup here is incomprehensible to me. It has collapsed in Tcl, portgroups, and obfuscation.

I'm just going to stick with my perfectly wonderful, single-port, mpich that works great on every system and publishes binaries that anyone can use. That is a fantastic user experience!

We should (IMHO) just totally blow up the current mpich setup and use mine, but I'm sure someone has a use case they want for some part of this.

comment:38 Changed 3 years ago by mascguy (Christopher Nielsen)

That's fine, though I'm also seeing the same missing symbols when testing in my 10.6 VM. If we can resolve that, 10.6 could be supported too...

comment:39 Changed 3 years ago by eborisch (Eric A. Borisch)

Ken, please be done with your slights. We get it. We're all volunteers here, and having someone complain about it in every single post gets a bit old.

The background is that once upon a time there was a desire to be able to test how the different compilers would compile the mpich library itself (for performance / correctness testing), in addition to then using the matching compilers (via the generated wrappers and libraries) to then compile "user" MPI code.

Once upon a time, building the library with the compiler you want to use for the "user" code was the recommended (by upstream) process to avoid unexpected / unsupported errors or configurations. I don't think that is still explicitly the case, but there is still a laundry list of things that need to line up between the library compiler and the "user" compiler for things to work right.

We can certainly have the discussion again if it is time to say "things generally work, there is only mpich-default" and move on. I've also considered re-doing the compiler-flavored-subports to just generate new wrapper scripts, but all use the mpich-default libraries. I have not had the time to look at those, and I don't have a schedule for when I will.

In the interim, Christopher has been doing a lot of work to merge (as much as possible) the mpich and openmpi ports such that they stay lined up more consistently. I'm sorry something has broken (potentially during this process; not really sure) but please be patient and supportive — and be done with the caustic comments peppered into your responses. Please add a kens-mpich port and start moving ports over to that if you have a better solution; indeed if you have a better solution that makes enough people happy, it will win; the joy of open source projects!

comment:40 Changed 3 years ago by kencu (Ken)

not slights. that's why i closed this.

used to work, doesn't now. I'm good. Mine works.

Last edited 3 years ago by kencu (Ken)

comment:41 Changed 3 years ago by mascguy (Christopher Nielsen)

I just remembered that I have an old 2007 MacBook Pro (Intel Core Duo x32), with a dual-boot 10.5/10.6 setup. So I'll fire that up with 10.5, and test with the local change mentioned in comment:8.

Stay tuned...

comment:42 Changed 3 years ago by mascguy (Christopher Nielsen)

Given the amount of time necessary to build ports from source, my 10.5 testing will have to wait until I can create a VM for my MacPro. I'll get to it eventually.

comment:43 Changed 3 years ago by kencu (Ken)

I don't think you need to.

Although I have been trying to support our user base that uses these systems (it's a dedicated bunch, see the MacRumors forums), it was never meant to get to the point where it caused reactions like this.

Last edited 3 years ago by kencu (Ken)

comment:44 Changed 3 years ago by mascguy (Christopher Nielsen)

Ken, there's no anger from anyone. It's just that some of the comments were a bit critical, sidetracking us from the actual issue.

I'm planning to move forward with testing on 10.5 regardless. It's simply a matter of getting a VM set up, and waiting for everything to build from source.

comment:45 Changed 3 years ago by kencu (Ken)

my comments were critical of how this port has changed, and how unapproachable it was to try to fix.

they were not critical of you personally.

Last edited 3 years ago by kencu (Ken)

comment:46 Changed 3 years ago by mascguy (Christopher Nielsen)

Nothing was taken personally. The point is simply that, going forward, it would help to keep discussions objective... and focused on the problem at hand.
