Opened 5 years ago

Closed 4 years ago

#49498 closed defect (fixed)

Switching to a mirror causes a complete rebuild of the port index

Reported by: lpancescu (Laurențiu Păncescu)
Owned by: admin@…
Priority: Low
Milestone:
Component: server/hosting
Version:
Keywords:
Cc: ryandesign (Ryan Schmidt)
Port:

Description

I switched to a local mirror as a workaround for the network problems described in bug #49452. The rsync itself was very fast, but the lack of an index made switching mirrors pretty annoying: the first "port selfupdate" treats every single port as new and rebuilds the entire index. So mirror users have to wait 30+ minutes for the index to build (see bug #49050) not once, but twice.

I was curious why that happens, so I downloaded base.tar from three different mirrors (sea.us.rsync.macports.org, lil.fr.rsync.macports.org and nue.de.rsync.macports.org). Since MacPorts 2.3.4 was released some time ago, I expected the archives to be identical. They aren't, although the file size is the same: diff reports that each archive differs from the other two as a binary file. The timestamps also differ: 20:02, 23:02 and 22:02 respectively, all from today. After unpacking the archives into separate directories, I used FileMerge to compare the directories directly: every single file is identical, but the tar archives aren't. So I redirected the output of "ls -lR" in each directory to a separate file, then ran "diff -u" to compare them. Here's part of the result:

 -rw-r--r--   1 laur  staff  1922 May 24  2014 pkg_mkindex.sh.in
-drwxr-xr-x   8 laur  staff   272 Oct 28 19:31 port
-drwxr-xr-x  35 laur  staff  1190 Oct 28 19:31 port1.0
-drwxr-xr-x   4 laur  staff   136 Oct 28 19:31 programs
-drwxr-xr-x  36 laur  staff  1224 Oct 28 19:31 registry2.0
+drwxr-xr-x   8 laur  staff   272 Oct 28 22:31 port
+drwxr-xr-x  35 laur  staff  1190 Oct 28 22:31 port1.0
+drwxr-xr-x   4 laur  staff   136 Oct 28 22:31 programs
+drwxr-xr-x  36 laur  staff  1224 Oct 28 22:31 registry2.0
 drwxr-xr-x  11 laur  staff   374 May 24  2014 tclobjc1.0
 drwxr-xr-x  19 laur  staff   646 May 24  2014 thread2.6
--rwxr-xr-x   1 laur  staff  4311 Oct 28 19:30 upgrade_sources_conf_default.tcl
+-rwxr-xr-x   1 laur  staff  4311 Oct 28 22:30 upgrade_sources_conf_default.tcl
 -rwxr-xr-x   1 laur  staff  4265 May 24  2014 upgrade_sources_conf_default.tcl.in

It seems that the timestamps of every single directory and of the few generated files (files like Makefile, probably generated from Makefile.in) are set to the timestamp of the archive on the mirror. Perhaps the "fresh" directory timestamps are the reason the index gets rebuilt, and perhaps each mirror runs some sort of build locally, at least daily, even when nothing has actually changed? To be sure, I should have downloaded ports.tar.gz as well, but it's quite big.
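
The comparison above can also be done without unpacking anything. What follows is only a minimal sketch in Python, assuming two local copies of base.tar from different mirrors; the function names are hypothetical and not part of MacPorts. It lists the archive members whose content is identical in both tarballs but whose modification time differs:

```python
import hashlib
import tarfile


def member_summary(tar_path):
    """Map each member name to (mtime, sha256 of its content, or None for non-files)."""
    summary = {}
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:
            digest = None
            if member.isfile():
                with archive.extractfile(member) as handle:
                    digest = hashlib.sha256(handle.read()).hexdigest()
            summary[member.name] = (member.mtime, digest)
    return summary


def timestamp_only_differences(path_a, path_b):
    """Return names present in both archives with equal content but different mtime."""
    a = member_summary(path_a)
    b = member_summary(path_b)
    return sorted(
        name
        for name in a.keys() & b.keys()
        if a[name][1] == b[name][1] and a[name][0] != b[name][0]
    )
```

Calling timestamp_only_differences("sea/base.tar", "nue/base.tar") (example paths, not real ones) should then return names such as the directories and generated files shown in the diff above, if the observation holds.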

Change History (3)

comment:1 Changed 5 years ago by ryandesign (Ryan Schmidt)

Cc: ryandesign@… added
Priority: Normal → Low

The primary rsync server in California contains the master copy of the data. The master data is created in various ways, depending on the rsync module in question. Distfiles are fetched after each port is committed. Packages are uploaded by the buildbot builders after every build. For ports and base, the mprsyncup script creates a tarball every 30 minutes by getting the latest information from the correct subdirectory of the Subversion repository.

The mirror servers create an identical copy of the master. There's nothing involved in this other than running the appropriate rsync commands to copy the data from the master's various rsync modules. Each mirror server has its own schedule for when and how often it syncs, at the discretion of that mirror's administrator. So although the master updates every half hour, one mirror might only sync every hour, another every four hours. The differences you observed are therefore most likely because the tarball you got from the mirror was older than the tarball on the master.

You've apparently discovered that the timestamps of some of the files in the tarball change every time the tarball is recreated, even if there are no changes to the content. Possibly that could be optimized somehow. I have not read the mprsyncup script before so I haven't yet fully understood all of what it does.
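
One possible direction for such an optimization, purely as an illustration: normalize the metadata when the tarball is written, so that unchanged content produces a byte-identical archive. The sketch below is in Python and is not how mprsyncup works; the function name and the fixed_mtime parameter are assumptions (in practice one might pass something like the time of the last commit instead of 0).

```python
import os
import tarfile


def build_stable_tar(source_dir, tar_path, fixed_mtime=0):
    """Archive source_dir so that unchanged content yields a byte-identical tarball."""

    def normalize(info):
        # Replace the metadata that varies between runs with fixed values.
        info.mtime = fixed_mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    with tarfile.open(tar_path, "w", format=tarfile.GNU_FORMAT) as archive:
        for root, dirs, files in os.walk(source_dir):
            dirs.sort()  # make the traversal order deterministic
            for name in sorted(dirs + files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, source_dir)
                # recursive=False: os.walk already visits every subdirectory.
                archive.add(full, arcname=arcname, recursive=False, filter=normalize)
```

With something along these lines, two runs over the same checkout would differ only if a file's content, name or mode actually changed, so a mirror's copy and the master's copy would compare equal whenever nothing was committed in between.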

The only reason why this appears to be a problem for you is that it causes the portindex to be regenerated. This is only happening because a server-side index for El Capitan does not exist yet. The mprsyncup script has already been modified to generate an index for El Capitan in r140729, but the administrator of the master server needs to apply that change to the server. We have requested this in #49050. Once this is done, this timestamp problem shouldn't really matter.

comment:2 Changed 5 years ago by lpancescu (Laurențiu Păncescu)

That's true, it's not an issue if the server-side indexes exist. I just ran "port selfupdate" without changing the mirror, and it only indexed the few changed ports, not all of them, so it was fast. But if I understood your comments in #49050 correctly, there's been no reaction whatsoever (not even a refusal) from the server admin since October 1st, when Joshua Root sent that email, right?

comment:3 Changed 4 years ago by raimue (Rainer Müller)

Resolution: fixed
Status: new → closed

PortIndex files for 10.11 El Capitan (darwin_15) and 10.12 Sierra (darwin_16) are already on the mirrors.
