Opened 2 years ago

Last modified 20 months ago

#57464 new enhancement

Buildbot: Don't keep huge ports installed

Reported by: ryandesign (Ryan Schmidt) Owned by: admin@…
Priority: Normal Milestone:
Component: buildbot/mpbb Version:
Keywords: Cc: mojca (Mojca Miklavec)
Port:

Description (last modified by ryandesign (Ryan Schmidt))

Our buildbot setup currently keeps all built ports installed on each worker. This is supposed to speed up builds so that we don't waste time constantly downloading and installing ports that are dependencies of a lot of other ports.

But there are also some inordinately large ports, such as some games, which aren't commonly used as dependencies. Here are some of the larger ports installed on the highsierra buildbot worker:

$ cd /opt/local/var/macports/software
$ du -sch alienarena-data/* geant*-data/* knp/* supertuxkart/* texlive-{fonts,latex}-extra/* typesafe-activator/* wesnoth/*
551M	alienarena-data/alienarena-data-7.66-20130827_0.darwin_17.noarch.tbz2
427M	geant4.10.0-data/geant4.10.0-data-4.10.0_0.darwin_17.noarch.tbz2
428M	geant4.10.1-data/geant4.10.1-data-4.10.1_0.darwin_17.noarch.tbz2
432M	geant4.10.2-data/geant4.10.2-data-4.10.2_1.darwin_17.noarch.tbz2
1005M	geant4.10.3-data/geant4.10.3-data-4.10.3_0.darwin_17.noarch.tbz2
1.1G	geant4.10.4-data/geant4.10.4-data-4.10.4_0.darwin_17.noarch.tbz2
266M	geant4.9.6-data/geant4.9.6-data-4.9.6_0.darwin_17.noarch.tbz2
1.0G	knp/knp-4.14_0.darwin_17.x86_64.tbz2
512M	supertuxkart/supertuxkart-0.9.3_0.darwin_17.x86_64.tbz2
534M	texlive-fonts-extra/texlive-fonts-extra-47288_0+doc.darwin_17.noarch.tbz2
453M	texlive-latex-extra/texlive-latex-extra-47410_0+doc.darwin_17.noarch.tbz2
663M	typesafe-activator/typesafe-activator-1.3.12_0.darwin_17.noarch.tbz2
365M	wesnoth/wesnoth-1.12.6_2.darwin_17.x86_64.tbz2
7.7G	total

It's getting to the point where builds are failing because there's not enough free disk space, and I am manually uninstalling some of these ports on the buildbot workers to free up space. If we could have mpbb automatically uninstall large ports, it would free up a good deal of disk space.
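
As a rough illustration of how a cleanup step might find candidates, the sketch below lists the ports that have an archive above a size threshold. The function name `list_large_archives` and the byte threshold are hypothetical and not part of mpbb. The only thing taken from this ticket is the directory layout shown in the listing above (one directory per port under the software directory).

```shell
#!/bin/sh
# Hypothetical helper (not part of mpbb): print the names of ports that have
# at least one archive larger than the given number of bytes.
# Usage: list_large_archives <software-dir> <min-bytes>
list_large_archives() {
    software_dir=$1
    min_bytes=$2
    # Archives live at <software-dir>/<portname>/<archive>.tbz2, so the
    # parent directory name is taken as the port name.
    find "$software_dir" -type f -name '*.tbz2' -size +"${min_bytes}c" |
    while IFS= read -r archive; do
        basename "$(dirname "$archive")"
    done | sort -u
}
```

For example, `list_large_archives /opt/local/var/macports/software 314572800` would print the ports with archives over 300 MiB. What to do with that list (uninstall, force-uninstall, or skip) is the open question discussed below.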

I'm not sure what to do about dependencies. For example, geant4.10.4 is not large, but it depends on geant4.10.4-data which is.

Do we force the uninstallation of the large ports, leaving the ports depending on them broken, on the assumption that when the buildbot asks to install them again it will reinstall the dependencies and fix them?

Or do we have the buildbot uninstall not only the large ports but also all ports depending on them? That might be ok for geant, but would be bad for texlive-latex-extra, on which lots of things depend. We could base the decision of whether to uninstall on how many ports would be uninstalled. (If it's, say, 5 or fewer ports, uninstall them, otherwise leave them installed.)
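
The count-based rule could be sketched roughly as follows. This is only an illustration: the function name `small_enough_to_uninstall` is hypothetical, the list of dependents is assumed to arrive on standard input (one port name per line, however it is obtained), and counting the large port itself toward the limit is an assumption about how the threshold would be applied.

```shell
#!/bin/sh
# Hypothetical policy helper (not part of mpbb): succeed if uninstalling a
# large port together with its dependents would remove at most <max-ports>
# ports. Dependents are read from standard input, one port name per line.
# Usage: small_enough_to_uninstall <max-ports> < dependents.txt
small_enough_to_uninstall() {
    max_ports=$1
    # Count non-empty lines; grep exits non-zero on zero matches, hence || true.
    dependents=$(grep -c . || true)
    # Assumption: the large port itself counts toward the limit.
    [ $((dependents + 1)) -le "$max_ports" ]
}
```

With a limit of 5, `printf 'geant4.10.4\ngate\n' | small_enough_to_uninstall 5` succeeds (three ports in total), while a port with many dependents, such as texlive-latex-extra, would fail the check and stay installed.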

Or do we instead make a general rule (not based on size) that we don't keep a port installed if nothing depends on it?

Some of these ports might not be distributable. Once we have nondistributable archives available on a private server that won't be a big problem anymore. We can wait until we resolve that ticket before deploying an implementation for this one.

Change History (11)

comment:1 Changed 2 years ago by ryandesign (Ryan Schmidt)

Description: modified (diff)

comment:2 Changed 2 years ago by mojca (Mojca Miklavec)

Just a short note. I'm a maintainer of geant4 and the data files are nothing else but "fetch that gigabyte of data & extract it". There's basically no penalty in building them (there are no build steps) and the files are exactly the same for all OS versions. After you brought this up ... maybe I should also add some restrictive licence just for the sake of decreasing the space we use on our mirror for binaries. This wouldn't solve the issue with storing them on the builder, but it would at least help a tiny tiny bit somewhere else. I would be totally in favour of simply always deleting geant4 data files from the builders, even if I'm not sure about the best way. If you run some kind of a cleanup job ... simply put the geant-data files unconditionally there.

The only dependent port (Gate) is broken as I'm stuck in the upgrade process (the last few times when I tried it failed to build and I didn't spend enough time trying to come up with a proper fix; that said, I'm not aware of anyone else offering Gate as a dmg, so it would be cool to fix it rather than deleting it).

comment:3 Changed 2 years ago by mojca (Mojca Miklavec)

Cc: mojca added

comment:4 Changed 2 years ago by mojca (Mojca Miklavec)

Btw: a rule that says "uninstall the port unless something depends on it" won't help you a tiny bit in case of Geant since:

  • geant4.x depends on geant4.x-data
  • gate depends on geant4.x

So in case of geant: feel free to delete both (times different versions of geant, so probably 12 ports). We should keep geant4.x on the binary package server, but it does absolutely nothing useful on the builder. I may also remove the older versions one day.

comment:5 in reply to:  4 Changed 2 years ago by ryandesign (Ryan Schmidt)

Replying to mojca:

Btw: a rule that says "uninstall the port unless something depends on it" won't help you a tiny bit in case of Geant

I understand, I'm just trying to come up with multiple ideas for saving space on the workers. One idea is to uninstall huge ports. Another idea is to uninstall ports that nothing in MacPorts depends on. I'm not sure which would save more space. We could implement one or the other or both. We're already saving some space by using hfsCompression on the tools prefix. We could save a bit more space by using it on the main prefix as well, though that would probably impact performance.

What would be a good way to identify whether anything in MacPorts depends on a port? I would want "anything" to include ports that are not yet installed. Do we do something like this?

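# Query ports whose dependency fields match ":$port" followed by whitespace
# or end of string.
port=wine-crossover
if [ -z "$(/opt/local/bin/port echo depends:":$port(\s|$)")" ]; then
    # Empty output: no port lists $port as a dependency.
    echo uninstall it
fi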
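
Applied to a list of ports, the same check might look like the sketch below. It is not the mpbb implementation: the function names and the `PORT_CMD` override are hypothetical (the override exists only so the logic can be tried without MacPorts), and the selector is copied unchanged from the snippet above.

```shell
#!/bin/sh
# Hypothetical sketch (not the mpbb implementation): print the ports from a
# list that no port declares as a dependency.
# PORT_CMD is an assumed override so the check can be exercised with a stub.
PORT_CMD=${PORT_CMD:-/opt/local/bin/port}

has_dependents() {
    # Same selector as the snippet above: ":<name>" followed by whitespace
    # or end of string. Names containing regex metacharacters would need
    # escaping, which this sketch does not attempt.
    [ -n "$("$PORT_CMD" echo depends:":$1(\s|\$)")" ]
}

# Usage: ports_without_dependents name...
ports_without_dependents() {
    for name in "$@"; do
        has_dependents "$name" || echo "$name"
    done
}
```

Running one query per port would be slow across thousands of installed ports, so a real implementation would probably want to batch or cache the lookups.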

comment:6 Changed 2 years ago by ryandesign (Ryan Schmidt)

Description: modified (diff)

comment:7 Changed 2 years ago by cjones051073 (Chris Jones)

I think I would be in favour of trying both of the ideas you suggest. So a) don't keep anything installed if larger than some maximum size. Say 100M or so... and b) don't keep anything installed if nothing depends on it (as in this case if a builder needs to install it, then presumably it's because there is an update to it, so the previous installation is not used anyway).

comment:8 Changed 23 months ago by ryandesign (Ryan Schmidt)

I gathered some stats from the buildworkers:

OS         # Installed  # Leaves  % Leaves  Installed Size  Leaves Size  % Leaves Size  Registry Size
10.5 ppc   15749        9498      60.3%     23.6GiB         16.6GiB      70.6%          1.1GiB
10.6 i386  9468         5757      60.8%     23.8GiB         13.2GiB      55.6%          1.0GiB
10.6       17517        10367     59.2%     35.5GiB         20.1GiB      56.7%          1.4GiB
10.7       18108        10524     58.1%     35.3GiB         19.0GiB      53.7%          1.5GiB
10.8       18460        10721     58.1%     38.0GiB         20.7GiB      54.4%          1.6GiB
10.9       18852        10768     57.1%     41.7GiB         22.0GiB      52.6%          1.7GiB
10.10      18989        10802     56.9%     40.3GiB         21.9GiB      54.2%          1.7GiB
10.11      19065        10838     56.8%     40.1GiB         21.6GiB      53.8%          1.7GiB
10.12      10730        4751      44.3%     24.8GiB         11.2GiB      45.0%          1.1GiB
10.13      18795        10653     56.7%     37.4GiB         19.6GiB      52.5%          1.6GiB
10.14      17849        9813      55.0%     35.1GiB         18.8GiB      53.7%          1.5GiB

Looks like 55-60% of the installed ports are leaves, and uninstalling them would save 13–22GiB of disk space. In addition it will reduce the size of the registry, which will not only save a little extra disk space but should also speed up the tasks that have to rewrite the registry, such as installing or uninstalling ports.

(The 10.12 buildworker's numbers are low because its MacPorts registry got corrupted a few months ago and all ports were uninstalled at that time and it hasn't had a need to reinstall all of them yet.)

So it looks like just uninstalling leaves will be a significant savings with little downside.

Note that we cannot use the leaves pseudoport to identify leaves in the buildworker code, because it does not consider ports that are not yet installed. We would have to use a different method, probably the depends: query I proposed above. But when we deploy that, we can do a one-time sudo /opt/local/bin/port uninstall leaves on all the workers to get things started.

comment:9 in reply to:  description Changed 20 months ago by ryandesign (Ryan Schmidt)

Summary: changed from "Builtbot: Don't keep huge ports installed" to "Buildbot: Don't keep huge ports installed"

Replying to ryandesign:

Or do we instead make a general rule (not based on size) that we don't keep a port installed if nothing depends on it?

https://github.com/macports/mpbb/pull/14

comment:10 Changed 20 months ago by Ryan Schmidt <git@…>

In 1392c2706a749a1dc73673c228039f7558e21e6d/mpbb (master):

Uninstall ports without dependents

See: #57464

comment:11 Changed 20 months ago by ryandesign (Ryan Schmidt)

Results of cleanup:

OS         Disk space reclaimed by cleanup  Disk space now free  Remaining installed ports  Cleanup log
10.6 i386  7GiB                             58GiB                5751                       log
10.6       14GiB                            47GiB                8707                       log
10.7       12GiB                            26GiB                8936                       log
10.8       16GiB                            25GiB                9228                       log
10.9       17GiB                            25GiB                9477                       log
10.10      18GiB                            23GiB                9531                       log
10.11      18GiB                            30GiB                9602                       log
10.12      12GiB                            30GiB                7596                       log
10.13      19GiB                            27GiB                9510                       log
10.14      unclear                          20GiB                9101                       log

I am happy with these free disk space numbers. This should prevent most disk space related build failures.
