Opened 5 years ago

Last modified 19 months ago

#57464 new enhancement

Buildbot: Don't keep huge ports installed

Reported by: ryandesign (Ryan Carsten Schmidt)
Owned by: admin@…
Priority: Normal
Milestone:
Component: buildbot/mpbb
Version:
Keywords:
Cc: mojca (Mojca Miklavec)
Port:

Description (last modified by ryandesign (Ryan Carsten Schmidt))

Our buildbot setup currently keeps all built ports installed on each worker. This is supposed to speed up builds so that we don't waste time constantly downloading and installing ports that are dependencies of a lot of other ports.

But there are also some inordinately large ports, such as some games, which aren't commonly used as dependencies. Here are some of the larger ports installed on the highsierra buildbot worker:

$ cd /opt/local/var/macports/software
$ du -sch alienarena-data/* geant*-data/* knp/* supertuxkart/* texlive-{fonts,latex}-extra/* typesafe-activator/* wesnoth/*
551M	alienarena-data/alienarena-data-7.66-20130827_0.darwin_17.noarch.tbz2
427M	geant4.10.0-data/geant4.10.0-data-4.10.0_0.darwin_17.noarch.tbz2
428M	geant4.10.1-data/geant4.10.1-data-4.10.1_0.darwin_17.noarch.tbz2
432M	geant4.10.2-data/geant4.10.2-data-4.10.2_1.darwin_17.noarch.tbz2
1005M	geant4.10.3-data/geant4.10.3-data-4.10.3_0.darwin_17.noarch.tbz2
1.1G	geant4.10.4-data/geant4.10.4-data-4.10.4_0.darwin_17.noarch.tbz2
266M	geant4.9.6-data/geant4.9.6-data-4.9.6_0.darwin_17.noarch.tbz2
1.0G	knp/knp-4.14_0.darwin_17.x86_64.tbz2
512M	supertuxkart/supertuxkart-0.9.3_0.darwin_17.x86_64.tbz2
534M	texlive-fonts-extra/texlive-fonts-extra-47288_0+doc.darwin_17.noarch.tbz2
453M	texlive-latex-extra/texlive-latex-extra-47410_0+doc.darwin_17.noarch.tbz2
663M	typesafe-activator/typesafe-activator-1.3.12_0.darwin_17.noarch.tbz2
365M	wesnoth/wesnoth-1.12.6_2.darwin_17.x86_64.tbz2
7.7G	total

It's getting to the point where builds are failing because there's not enough free disk space, and I am manually uninstalling some of these ports on the buildbot workers to free up space. If we could have mpbb automatically uninstall large ports, it would free up a good deal of disk space.

I'm not sure what to do about dependencies. For example, geant4.10.4 is not large, but it depends on geant4.10.4-data, which is.

Do we force the uninstallation of the large ports, leaving the ports depending on them broken, on the assumption that when the buildbot asks to install them again it will reinstall the dependencies and fix them?

Or do we have the buildbot uninstall not only the large ports but also all ports depending on them? That might be ok for geant, but would be bad for texlive-latex-extra, on which lots of things depend. We could base the decision of whether to uninstall on how many ports would be uninstalled. (If it's, say, 5 or fewer ports, uninstall them, otherwise leave them installed.)
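The "how many ports would be uninstalled" rule could be sketched as a small shell function. This is only an illustration: the function name few_enough_to_uninstall is hypothetical, the default of 5 is the example figure from the paragraph above, and the list of ports that would be removed is assumed to arrive on standard input, one name per line.

```shell
# Hypothetical helper: succeed (exit status 0) when the number of ports that
# would be removed is at or below a threshold, so the caller may go ahead.
# Reads one port name per line on stdin; blank lines are ignored.
few_enough_to_uninstall() {
    max_ports=${1:-5}            # "5 or fewer" is the ticket's example value
    count=$(grep -c . || true)   # grep -c prints 0 (and exits 1) on no input
    [ "$count" -le "$max_ports" ]
}
```

A caller would pipe in the large port plus everything depending on it, and only uninstall the whole set if the function succeeds.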

Or do we instead make a general rule (not based on size) that we don't keep a port installed if nothing depends on it?

Some of these ports might not be distributable. Once we have nondistributable archives available on a private server that won't be a big problem anymore. We can wait until we resolve that ticket before deploying an implementation for this one.

Change History (18)

comment:1 Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

Description: modified (diff)

comment:2 Changed 5 years ago by mojca (Mojca Miklavec)

Just a short note. I'm a maintainer of geant4 and the data files are nothing else but "fetch that gigabyte of data & extract it". There's basically no penalty in building them (there are no build steps) and the files are exactly the same for all OS versions. After you brought this up ... maybe I should also add some restrictive licence just for the sake of decreasing the space we use on our mirror for binaries. This wouldn't solve the issue with storing them on the builder, but it would at least help a tiny tiny bit somewhere else. I would be totally in favour of simply always deleting geant4 data files from the builders, even if I'm not sure about the best way. If you run some kind of a cleanup job ... simply put the geant-data files unconditionally there.

The only dependent port (Gate) is broken as I'm stuck in the upgrade process (the last few times when I tried it failed to build and I didn't spend enough time trying to come up with a proper fix; that said, I'm not aware of anyone else offering Gate as a dmg, so it would be cool to fix it rather than deleting it).

comment:3 Changed 5 years ago by mojca (Mojca Miklavec)

Cc: mojca added

comment:4 Changed 5 years ago by mojca (Mojca Miklavec)

Btw: a rule that says "uninstall the port unless something depends on it" won't help you a tiny bit in case of Geant since:

  • geant4.x depends on geant4.x-data
  • gate depends on geant4.x

So in case of geant: feel free to delete both (times different versions of geant, so probably 12 ports). We should keep geant4.x on the binary package server, but it does absolutely nothing useful on the builder. I may also remove the older versions one day.

comment:5 in reply to:  4 Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to mojca:

Btw: a rule that says "uninstall the port unless something depends on it" won't help you a tiny bit in case of Geant

I understand, I'm just trying to come up with multiple ideas for saving space on the workers. One idea is to uninstall huge ports. Another idea is to uninstall ports that nothing in MacPorts depends on. I'm not sure which would save more space. We could implement one or the other or both. We're already saving some space by using hfsCompression on the tools prefix. We could save a bit more space by using it on the main prefix as well, though that would probably impact performance.

What would be a good way to identify whether anything in MacPorts depends on a port? I would want "anything" to include ports that are not yet installed. Do we do something like this?

port=wine-crossover
if [ -z "$(/opt/local/bin/port echo depends:":$port(\s|$)")" ]; then
    echo uninstall it
fi

comment:6 Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

Description: modified (diff)

comment:7 Changed 5 years ago by cjones051073 (Chris Jones)

I think I would be in favour of trying both of the ideas you suggest. So a) don't keep anything installed if it is larger than some maximum size, say 100M or so, and b) don't keep anything installed if nothing depends on it (in that case, if a builder needs to install it, then presumably it's because there is an update to it, so the previous installation is not used anyway).
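Idea a) could be sketched along these lines. The function name archives_over_limit is hypothetical, and the input is assumed to be `du -k` style lines (size in KiB, whitespace, path); nothing here is taken from mpbb itself.

```shell
# Hypothetical helper: print the paths of archives larger than a size limit.
# Input (stdin): lines of "<size in KiB><whitespace><path>", as `du -k` prints.
# Argument: limit in MiB (default 100, the figure suggested in this comment).
archives_over_limit() {
    limit_mib=${1:-100}
    awk -v limit_kib="$((limit_mib * 1024))" '
        ($1 + 0) > (limit_kib + 0) {
            sub(/^[0-9]+[ \t]+/, "")   # drop the size column, keep the path
            print
        }
    '
}
```

On a worker this might be fed with something like `du -k /opt/local/var/macports/software/*/* | archives_over_limit 100`, after which the matching ports would still have to be mapped back from archive paths to port names.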

comment:8 Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

I gathered some stats from the buildworkers:

OS         # Installed  # Leaves  % Leaves  Installed Size  Leaves Size  % Leaves Size  Registry Size
10.5 ppc   15749        9498      60.3%     23.6GiB         16.6GiB      70.6%          1.1GiB
10.6 i386  9468         5757      60.8%     23.8GiB         13.2GiB      55.6%          1.0GiB
10.6       17517        10367     59.2%     35.5GiB         20.1GiB      56.7%          1.4GiB
10.7       18108        10524     58.1%     35.3GiB         19.0GiB      53.7%          1.5GiB
10.8       18460        10721     58.1%     38.0GiB         20.7GiB      54.4%          1.6GiB
10.9       18852        10768     57.1%     41.7GiB         22.0GiB      52.6%          1.7GiB
10.10      18989        10802     56.9%     40.3GiB         21.9GiB      54.2%          1.7GiB
10.11      19065        10838     56.8%     40.1GiB         21.6GiB      53.8%          1.7GiB
10.12      10730        4751      44.3%     24.8GiB         11.2GiB      45.0%          1.1GiB
10.13      18795        10653     56.7%     37.4GiB         19.6GiB      52.5%          1.6GiB
10.14      17849        9813      55.0%     35.1GiB         18.8GiB      53.7%          1.5GiB

Looks like 55-60% of the installed ports are leaves, and uninstalling them would save 13–22GiB of disk space. In addition it will reduce the size of the registry, which will not only save a little extra disk space but should also speed up the tasks that have to rewrite the registry, such as installing or uninstalling ports.

(The 10.12 buildworker's numbers are low because its MacPorts registry got corrupted a few months ago and all ports were uninstalled at that time and it hasn't had a need to reinstall all of them yet.)

So it looks like just uninstalling leaves will be a significant savings with little downside.

Note that we cannot use the leaves pseudoport to identify leaves in the buildworker code, because it does not consider ports that are not yet installed. We would have to use a different method, such as, I think, the one I proposed above. But when we deploy that, we can do a one-time "sudo /opt/local/bin/port uninstall leaves" on all the workers to get things started.

comment:9 in reply to:  description Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

Summary: "Builtbot: Don't keep huge ports installed" changed to "Buildbot: Don't keep huge ports installed"

Replying to ryandesign:

Or do we instead make a general rule (not based on size) that we don't keep a port installed if nothing depends on it?

https://github.com/macports/mpbb/pull/14

comment:10 Changed 5 years ago by Ryan Schmidt <git@…>

In 1392c2706a749a1dc73673c228039f7558e21e6d/mpbb (master):

Uninstall ports without dependents

See: #57464

comment:11 Changed 5 years ago by ryandesign (Ryan Carsten Schmidt)

Results of cleanup:

OS         Disk space reclaimed by cleanup  Disk space now free  Remaining installed ports  Cleanup log
10.6 i386  7GiB                             58GiB                5751                       log
10.6       14GiB                            47GiB                8707                       log
10.7       12GiB                            26GiB                8936                       log
10.8       16GiB                            25GiB                9228                       log
10.9       17GiB                            25GiB                9477                       log
10.10      18GiB                            23GiB                9531                       log
10.11      18GiB                            30GiB                9602                       log
10.12      12GiB                            30GiB                7596                       log
10.13      19GiB                            27GiB                9510                       log
10.14      unclear                          20GiB                9101                       log

I am happy with these free disk space numbers. This should prevent most disk space related build failures.

comment:12 Changed 3 years ago by ryandesign (Ryan Carsten Schmidt)

In e5ae3c56cb2ba130400e77c9c217c261d6b820a3/mpbb (master):

Also uninstall ports with only 1 dependent

See: #57464

comment:13 Changed 3 years ago by ryandesign (Ryan Carsten Schmidt)

Results of cleanup:

OS           Disk space reclaimed by cleanup  Disk space now free  Cleanup log
10.6 i386    3GiB                             59GiB                log
10.6 x86_64  10GiB                            60GiB                log
10.7         7GiB                             48GiB                log
10.8         7GiB                             47GiB                log
10.9         3GiB                             43GiB                log
10.10        8GiB                             44GiB                log
10.11        8GiB                             28GiB                log
10.12        6GiB                             40GiB                log
10.13        0.25GiB                          28GiB                log
10.14        14GiB                            37GiB                log
10.15        2GiB                             44GiB                log
11 arm64     8GiB                             352GiB               log
11 x86_64    6GiB                             15GiB                log
Last edited 3 years ago by ryandesign (Ryan Carsten Schmidt)

comment:14 in reply to:  12 Changed 3 years ago by jmroot (Joshua Root)

Replying to ryandesign:

Also uninstall ports with only 1 dependent

Unfortunately this can be quite inefficient in certain cases. Port A may have only one dependent B, but B may have a large number of dependents, and A is installed and uninstalled each time one of B's dependents is built. One specific example:

% port echo depends:py39-roman
py39-docutils
% port echo rdepends:py39-docutils | wc -l
114

comment:15 Changed 3 years ago by ryandesign (Ryan Carsten Schmidt)

Thanks, I didn't anticipate that. I'll see if I can improve it.

comment:16 Changed 3 years ago by ryandesign (Ryan Carsten Schmidt)

In c8db5a3f36610a13f3c99915e32a6fa853982be9/mpbb (master):

More selectively uninstalling ports with only 1 dep

Don't uninstall all ports that have only 1 dependency. Only uninstall
those ports with 1 dependency if that dependency also has only 0 or 1
dependencies and so on.

See: #57464
See: #62621
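One way to read the rule in the commit message above, taking "dependency" to mean "dependent" as in comment 14 (an assumption, not confirmed by the ticket), is to follow the chain of single dependents and keep the port as soon as any link has more than one. The sketch below is not mpbb's code: the function name removable and the edge-list file format ("port dependent" per line, direct dependents only) are hypothetical.

```shell
# Hypothetical helper: succeed if PORT has no dependents, or exactly one
# dependent that is itself removable by the same rule (and so on up the chain).
# EDGES is a file with one "<port> <dependent>" pair per line.
removable() {
    current=$1
    edges=$2
    hops=0
    while [ "$hops" -lt 100 ]; do      # guard against dependency cycles
        dependents=$(awk -v p="$current" '$1 == p { print $2 }' "$edges" | sort -u)
        count=$(printf '%s\n' "$dependents" | grep -c . || true)
        case $count in
            0) return 0 ;;                  # end of chain: nothing needs it
            1) current=$dependents ;;       # follow the single dependent
            *) return 1 ;;                  # widely used somewhere up the chain
        esac
        hops=$((hops + 1))
    done
    return 1                           # chain too long or cyclic: keep the port
}
```

With an edge list in which py39-roman has the single dependent py39-docutils and py39-docutils has many dependents, removable reports that py39-roman should stay installed, which is the case raised in comment 14.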

comment:17 Changed 19 months ago by jmroot (Joshua Root)

In c9859c9a155a10924a932f1bd4939bc19a27431a/mpbb (master):

Do extra cleanup on portbuilder if space is low

See: #57464

comment:18 Changed 19 months ago by jmroot (Joshua Root)

In 8f12db95ced35509c22c7d9919220600bf190743/mpbb (master):

cleanup: delete stuff until there's enough space

See: #57464
See: #57869
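The "delete until there's enough space" idea named in the commit subject might look roughly like the following. This is a sketch under stated assumptions rather than the committed implementation: the function name pick_until_enough is hypothetical, candidates are assumed to arrive on standard input as "size in KiB, whitespace, path" in the preferred deletion order, and the function only selects what to delete instead of deleting anything.

```shell
# Hypothetical helper: print candidate paths, in input order, until the sizes
# selected so far add up to at least the amount of space that must be freed.
# Argument: KiB still needed. Input (stdin): "<size in KiB><whitespace><path>".
pick_until_enough() {
    needed_kib=$1
    awk -v needed="$needed_kib" '
        (freed + 0) >= (needed + 0) { exit }   # enough selected: stop reading
        {
            freed += $1                        # count this candidate
            sub(/^[0-9]+[ \t]+/, "")           # drop the size column
            print
        }
    '
}
```

The amount still needed could be derived from the available-space column of `df -Pk` for the relevant volume, and the selected paths would then be handed to whatever uninstall or delete step the worker uses.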
