Opened 17 years ago

Closed 17 years ago

Last modified 11 years ago

#11920 closed defect (fixed)

BUG: base-1.4.40 install with different variants silently fails to execute (!)

Reported by: gwhitney@… Owned by: macports-tickets@…
Priority: Normal Milestone: MacPorts 1.5
Component: base Version: 1.5
Keywords: variant registry Cc: gwhitneycom1@…, cooljeanius (Eric Gallager)
Port:

Description

The following transcript begins with a pristine installation of SVN trunk/base r24920 (just configured with prefix=/opt/a to not conflict with my "real" installation).

$ port sync
$ port installed
No ports are installed.
$ port install expat +no_static
--->  Fetching expat
--->  Attempting to fetch expat-2.0.0.tar.gz from http://downloads.sourceforge.net/expat
--->  Verifying checksum(s) for expat
--->  Extracting expat
--->  Configuring expat
--->  Building expat with target all
--->  Staging expat into destroot
--->  Installing expat 2.0.0_1+no_static
--->  Activating expat 2.0.0_1+no_static
--->  Cleaning expat

So far, so good. Now suppose I change my mind and decide I want to get rid of that no_static variant after all. I might quite reasonably execute:

$ port install expat -no_static
--->  Cleaning expat

Huh? Does that mean that my +no_static version was removed? What about installing the -no_static version? Better check.

$ port installed
The following ports are currently installed:
  expat @2.0.0_1+no_static (active)

Hmmm, looks like nothing happened. But there were no error messages on my new install request. Maybe -d will show what happened:

$ port -d install expat -no_static
DEBUG: Found port in file:///opt/a/var/db/dports/sources/rsync.rsync.darwinports.org_dpupdate_dports/textproc/expat
DEBUG: Changing to port directory: /opt/a/var/db/dports/sources/rsync.rsync.darwinports.org_dpupdate_dports/textproc/expat
DEBUG: Requested variant powerpc is not provided by port expat.
DEBUG: Requested variant darwin is not provided by port expat.
DEBUG: Requested variant macosx is not provided by port expat.
DEBUG: Skipping com.apple.main (expat) since this port is already installed
DEBUG: Skipping com.apple.fetch (expat) since this port is already installed
DEBUG: Skipping com.apple.checksum (expat) since this port is already installed
DEBUG: Skipping com.apple.extract (expat) since this port is already installed
DEBUG: Skipping com.apple.patch (expat) since this port is already installed
DEBUG: Skipping com.apple.configure (expat) since this port is already installed
DEBUG: Skipping com.apple.build (expat) since this port is already installed
DEBUG: Skipping com.apple.destroot (expat) since this port is already installed
DEBUG: Skipping com.apple.install (expat) since this port is already installed
DEBUG: Skipping com.apple.activate (expat) since this port is already active
DEBUG: Skipping com.apple.main (expat) since this port is already installed
--->  Cleaning expat
DEBUG: Executing com.apple.clean (expat)
--->  Removing workpath for expat
DEBUG: Removing directory: /opt/a/var/db/dports/build/_opt_a_var_db_dports_sources_rsync.rsync.darwinports.org_dpupdate_dports_textproc_expat/work
DEBUG: Removing symlink: /opt/a/var/db/dports/sources/rsync.rsync.darwinports.org_dpupdate_dports/textproc/expat/work

Hmm, somehow port thinks the -no_static version is already installed. Maybe deactivating the +no_static will help:

$ port deactivate expat
--->  Deactivating expat 
$ port install expat -no_static
--->  Activating expat 2.0.0_1+no_static
--->  Cleaning expat

Finally the light goes on: port somehow thinks the name of the port I'm asking for is expat 2.0.0_1+no_static.

(I realize that if I had done port install expat when I first changed my mind, I would have gotten some reasonable behavior: a new build followed by an error about not being able to activate a different version. But the point is that -no_static is supposed to work the same, it's what came to mind since in my head I was "turning the no_static variant off", and in the case where the variant is + by default, it's necessary to specify it this way, but it will still exercise the bug.)

Looking into the code, it was not hard to see what was happening: the code snippet in target_run that computes the portvariants string for accessing the registry pays no attention to the values of the variants, it merely concatenates the names of all the variants listed in this run with '+'s, even if that variant was requested with '-'. This is a serious bug that could have pernicious effects on the integrity of a user's repository. One striking example: suppose port foo has a default variant +bar, and someone installs it via port install foo -bar. That will go into the registry under the name foo @N.NN+bar. Now suppose this is deactivated for whatever reason and then a port install foo occurs. The registry will look for and find a foo @N.NN+bar version, it will install it, port installed will show foo @N.NN+bar installed and active, but it will be the wrong build of this port -- and very hard to detect, unless one goes and looks in the receipt file for the foo @N.NN+bar port and sees (surprisingly) that the variants attribute there is {} as opposed to the expected +bar.

Fortunately, fixing this bug is relatively straightforward; I will post a patch momentarily, with some further comments on my proposed patch.

Note that Ticket #2377 was almost on to this. The poster david.serpa noted that the portvariants string did not heed the value of each variant, and suggested that it might be nice if it did, e.g. to record when a default variant was turned off. All that was missing was a realization that paying attention to values wasn't just nice, it's critical to the integrity of the repository. Also the suggested patch on that ticket would not resolve the problem, since it records the - variants: that strategy can also lead to registry corruption, since if quux is another variant of foo not on by default, then port install foo and port install foo -quux will seem to be about different builds of the port, when really they refer to the same thing. I would have posted all this information to that ticket since it's so close, except that I don't have the authority to raise it back to a high-priority serious bug (I assume Blocker is the highest priority) and I didn't want this to go under the radar, since it's near to certain that there's someone else out there besides me who has experimented with various variants on and off and got their registry all in a kink.

Thanks for addressing this issue as speedily as you can -- I know there are a lot of demands on your time, but this one caused me silently to be using builds different from the ones I intended , which led to real consternation and which very likely is happening to others, perhaps unbeknownst to them. Regards, Glen

Attachments (3)

variant_registry.patch (4.7 KB) - added by gwhitneycom1@… 17 years ago.
proposed bugfix
variant_registry.2.patch (6.3 KB) - added by gwhitney@… 17 years ago.
enhanced patch, accounts for required variants not explicitly requested
variant_registry.3.patch (6.1 KB) - added by gwhitneycom1@… 17 years ago.
update patch to r25303

Download all attachments as: .zip

Change History (13)

Changed 17 years ago by gwhitneycom1@…

Attachment: variant_registry.patch added

proposed bugfix

comment:1 Changed 17 years ago by gwhitney@…

Some explication of the patch: The key portion is in portutil.tcl, where the snippet from target_run which computes the portvariants string is abstracted into a separate function (to highlight its key function) and corrected to include exactly the + variants for that run. Other points:

  1. It turns out this same snippet (as david.serpa noticed) is used in two other places, so eliminate those and move the computation of portvariants into eval_variants. That's a natural location for it, since that's where the final "true" list of variants is computed anyway, so why not record it right then? That way anyplace it's needed later, it will be available.
  2. Point 1 led to nothing left in the prerun for target 'activate', so I eliminated that altogether.
  3. I also took the liberty of changing the "skipping" message on an activate that's already active from ui_debug level to ui_msg, and including the variant string. The rationale is that it changes the output of a second duplicate install from
    $ port install expat +no_static
    ---> cleaning expat
    
    (which is somewhat cryptic: What was there to clean? Did you install something?) to
    $ port install expat +no_static
    Skipping com.apple.activate (expat +no_static) since this port is already active
    --->  Cleaning expat
    
    which IMHO is clearer, and would certainly have helped me find the problem faster had the message already been there. (I would have asked for -no_static and it would have told me it already had +no_static and I would have seen the discrepancy immediately.)

Changed 17 years ago by gwhitney@…

Attachment: variant_registry.2.patch added

enhanced patch, accounts for required variants not explicitly requested

comment:2 Changed 17 years ago by gwhitney@…

Testing the original patch further, I realized that unspecified variants which are required by specified ones are turned on, but that they were not being included in the portvariants registry string. They need to be, so that port knows that e.g. 'port install emacs-devel +gtk' produces the same install as 'port install emacs-devel +x11 +gtk'. The improved patch handles this as well, by computing the exact list of active variants just after they have all been succesfully 'eval'ed. (If any variants fail, port is about to crap out anyway.)

It turns out that this computation is also a convenient place to check for inconsistent variant specifications. Formerly, port install emacs-devel +gtk -x11 would happily proceed, with x11 turned on despite the explicit -x11, since it's required by gtk. Now it fails with:

$ port install emacs-devel +gtk -x11
Error: Inconsistent variant specification: emacs-devel variant +x11 is required by at least one of  +darwin_8 +gtk, but specified -x11
Error: Status 1 encountered during processing.

Changed 17 years ago by gwhitneycom1@…

Attachment: variant_registry.3.patch added

update patch to r25303

comment:3 Changed 17 years ago by jberry@…

Glen nice work. I think we need to get this into place and start working with it, and refine as needed to resolve any of your open questions. I'm note sure anybody actively involved with the project has too much knowledge with how some of that variants/deps stuff works: it's a bit of an archeological expedition. We need to just make it work right as we see it today.

On problem, as I recall, is that not storing the suppressed variants means that they are not reapplied during upgrade. That's probably unrelated to your current quest, but I thought I'd mention it.

comment:4 Changed 17 years ago by gwhitney@…

One problem, as I recall, is that not storing the suppressed variants means that they are not reapplied during upgrade. That's probably unrelated to your current quest, but I thought I'd mention it.

Yes, I agree, if variants.conf has +no_x11, and you install a port -no_x11, then next time you upgrade the port without arguments it will try to upgrade to +no_x11.

My belief is that there must be a canonical string for the registry that lets it tell whether two packages are effectively the same or not, and portvariants seems to be a decent name for that. In order to support preservation of -variants across upgrades, there would have to be an additional place to track the requested variants. The most natural way to do that would be to revive the distinction between "variations" (which are requested) and "variants" (which are applied). They are mostly kept separate, but conflated in a few places where they'd have to be teased apart. Then the requested variations could also be kept separately in each port installation on disk to provide the desired upgrade behavior.

However, that effort seems to me to be beyond the scope of the current project, and moreover, potentially wasted effort at the moment since if I read the relevant website correctly, it will be someone's Google SoC project to revamp the dependency structure. If it hasn't happened by the fall, it shouldn't be any harder to take care of then.

comment:5 Changed 17 years ago by kballard (Lily Ballard)

Resolution: fixed
Status: newclosed

Your patch looks good. I'm committing it with a few minor changes (all new functions should really use consistent whitespace - no mixed tabs/spaces; there's no reason to use [string equal] in a conditional when there's == or eq operators; append and lappend are your friends).

Committed in r26036

comment:6 Changed 17 years ago by gwhitney

Thanks for cleaning that up. I was suffering from not knowing which bits of the code represent the current desired conventions, since you have to admit there is a mixture in there, and also from having just picked up tcl for this, so e.e. the [string equal] was just copied verbatim from the previous code. Glad the patch was nevertheless helpful.

comment:7 Changed 16 years ago by jmpp@…

Milestone: MacPorts base bugs
Priority: HighNormal

comment:8 Changed 15 years ago by tobypeterson

Milestone: MacPorts base bugsMacPorts Future

Milestone MacPorts base bugs deleted

comment:9 Changed 15 years ago by jmroot (Joshua Root)

Milestone: MacPorts FutureMacPorts 1.5

comment:10 Changed 11 years ago by cooljeanius (Eric Gallager)

Cc: egall@… added

Cc Me!

Note: See TracTickets for help on using tickets.