Opened 8 years ago

Last modified 2 years ago

#49759 reopened defect

fetch.type {git,bzr} hits buildbot timeout after 20 minutes without output

Reported by: ryandesign (Ryan Carsten Schmidt)
Owned by: macports-tickets@…
Priority: Normal
Milestone:
Component: buildbot/mpbb
Version: 2.3.4
Keywords: buildbot
Cc: jmroot (Joshua Root), mojca (Mojca Miklavec), larryv (Lawrence Velázquez)
Port:

Description

This buildbot build was interrupted after the third of seven ports, because there was no output for 20 minutes.

A git clone command was running at the time, but had not yet completed. Run locally on a fast network, that git clone completes in 2.5 minutes and the clone occupies 372MB. Run on the buildbot builder, the same command has only downloaded 211MB by the 20 minute mark. For whatever reason there is a slow network connection between our builder and this git server, at least right now.

The git clone commands are being run by MacPorts base with the -q flag to suppress all output. Can we change this so that some output is produced while cloning, so that the buildbot knows the build is not stuck? I agree the buildbot should abort builds that don't produce output for a time, but they shouldn't interrupt file transfers which are just progressing slowly, and this shouldn't cause subsequent ports in the portlist not to be tried.
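
As a rough check on the numbers above, the sketch below extrapolates how long the clone would have needed on the builder. It assumes a constant transfer rate and that the 211MB and 372MB figures are directly comparable, neither of which the report confirms; the function name is made up for illustration.

```python
def estimated_total_minutes(done_mb: float, elapsed_min: float, total_mb: float) -> float:
    """Extrapolate the total transfer time, assuming a constant rate."""
    rate_mb_per_min = done_mb / elapsed_min
    return total_mb / rate_mb_per_min


# Figures from the description: 211MB after 20 minutes, 372MB in total.
estimate = estimated_total_minutes(done_mb=211, elapsed_min=20, total_mb=372)
print(f"{estimate:.1f} minutes")  # roughly 35 minutes, well past the 20-minute limit
```

Under those assumptions, a silent clone at that speed could not have finished before the no-output timeout fired.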

Attachments (1)

bzr-git-progress.patch (1.2 KB) - added by ryandesign (Ryan Carsten Schmidt) 6 years ago.

Change History (18)

comment:1 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

I've repeated the build and it finished fetching in about 18 minutes this time, so the build is running. I'd like a more permanent fix though that doesn't rely on network conditions or whatever's behind the slowdown.

comment:2 Changed 8 years ago by mojca (Mojca Miklavec)

Cc: mojca@… added

Cc Me!

comment:3 Changed 8 years ago by ryandesign (Ryan Carsten Schmidt)

bzr fetches should also produce output to prevent slow fetches from causing a timeout; see #49812.

comment:4 Changed 8 years ago by larryv (Lawrence Velázquez)

Cc: larryv@… added

I am not sure whether just omitting -q would suffice. If not, we could also use --progress. Excerpting from the Git 2.10.1 git-clone man page:

       --quiet, -q
           Operate quietly. Progress is not reported to the standard error
           stream.

       --verbose, -v
           Run verbosely. Does not affect the reporting of progress status to
           the standard error stream.

       --progress
           Progress status is reported on the standard error stream by default
           when it is attached to a terminal, unless -q is specified. This
           flag forces progress status even if the standard error stream is
           not directed to a terminal.
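
The flag choice discussed above could be sketched as follows. This is an illustration only, not the MacPorts base code (which is not shown in this ticket); the function name, URL, and destination path are hypothetical, while -q and --progress are the git-clone options quoted from the man page.

```python
from typing import List


def git_clone_command(url: str, dest: str, show_progress: bool = True) -> List[str]:
    """Build an argv list for git clone (hypothetical helper)."""
    cmd = ["git", "clone"]
    # --progress forces progress output even when stderr is not a terminal;
    # -q suppresses progress reporting entirely.
    cmd.append("--progress" if show_progress else "-q")
    cmd += [url, dest]
    return cmd


print(git_clone_command("https://example.org/repo.git", "work/repo"))
```

Because a build log is not a terminal, simply dropping -q would leave git silent, which is why --progress matters here.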

comment:5 Changed 7 years ago by raimue (Rainer Müller)

Keywords: buildbot added
Summary: changed from "git checkout should produce some output" to "fetch.type {git,bzr} hits buildbot timeout after 20 minutes without output"

Can't we just configure a higher timeout on the buildbot for the ShellCommand instead?

comment:6 Changed 7 years ago by ryandesign (Ryan Carsten Schmidt)

That just kicks the can down the road rather than solving the actual problem.

comment:7 Changed 7 years ago by mojca (Mojca Miklavec)

We are hitting the timeout problem all the time at other places as well.

comment:8 Changed 7 years ago by ryandesign (Ryan Carsten Schmidt)

For example?

This ticket is specifically about fetching. Fetching via curl already prints progress info periodically, so a long fetch won't time out. This ticket requests that similar progress info be added for the other fetch methods.

comment:9 Changed 7 years ago by mojca (Mojca Miklavec)

My comment was off-topic and would require a separate ticket; sorry for the confusion.

But here's an example (be warned – it's off-topic):

It took 37 minutes before the build job timed out with

Dependency 'py27-setuptools' with variants '' has previously failed and is required.

I understand that PPC is slow, but I cannot understand how it can be soooooooo awfully slow that it needs 37 minutes just to figure out that a dependency is missing.

To go even further off-topic, py-setuptools failed because of

--->  Attempting to fetch setuptools-28.8.0.tar.gz from http://distfiles.macports.org/py-setuptools
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
DEBUG: Fetching distfile failed: The requested URL returned error: 404
--->  Attempting to fetch setuptools-28.8.0.tar.gz from https://files.pythonhosted.org/packages/source/s/setuptools/

DEBUG: Fetching distfile failed: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
--->  Attempting to fetch setuptools-28.8.0.tar.gz from https://pypi.python.org/packages/source/s/setuptools/

DEBUG: Fetching distfile failed: SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
Error: org.macports.fetch for port py27-setuptools returned: fetch failed

In fact http://distfiles.macports.org/py-setuptools still doesn't contain the latest version.

comment:10 Changed 7 years ago by jmroot (Joshua Root)

Resolution: fixed
Status: changed from new to closed

In bb55225/macports-base:

Enable progress reporting for git and bzr fetch

Fixes: #49759

comment:11 Changed 7 years ago by jmroot (Joshua Root)

Milestone: MacPorts 2.4.0

comment:12 in reply to:  10 Changed 6 years ago by ryandesign (Ryan Carsten Schmidt)

Resolution: fixed deleted
Status: changed from closed to reopened

Replying to jmroot:

In bb55225/macports-base:

Enable progress reporting for git and bzr fetch

Fixes: #49759

This doesn't fix the problem.

For git, removing -q and adding --progress makes it output progress, even when not using a tty, but the progress updates do not end in newlines; git uses carriage returns so that the progress information stays on the same line while updating itself. Either the output is not flushed until a newline is printed, or buildbot is looking for newlines specifically, because buildbot does not count it as output unless it sees a newline at least once every 20 minutes.

For bzr, adding --verbose does nothing to affect the output of the checkout command, at least not that I can see; Dave Evans observed the same in comment:ticket:49812:19. Instead, setting the environment variable BZR_PROGRESS_BAR=text will cause it to print progress, even when not using a tty, but it's the same kind of progress as git outputs, with carriage returns instead of newlines.
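
To illustrate why such progress output may never register as a line, the sketch below uses a made-up progress string (not real git or bzr output) in which each update ends with a carriage return. A reader that waits for newline-terminated lines sees nothing complete until the very end, unless the carriage returns are translated first.

```python
from typing import List

# Made-up progress stream: every update ends in "\r", with a single "\n" at the end.
sample = "10%\r20%\r30%\rdone\n"


def newline_terminated_lines(stream: str) -> List[str]:
    """Return the lines a strictly newline-oriented reader would see."""
    parts = stream.split("\n")
    return parts[:-1]  # the last part is whatever follows the final newline


raw = newline_terminated_lines(sample)
translated = newline_terminated_lines(sample.replace("\r", "\n"))
print(len(raw), len(translated))  # 1 4
```

Whether buildbot itself reads this way is not established in this ticket; the sketch only shows the effect of the two line-ending conventions.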

tr can be used to turn the carriage returns into newlines, and this is what I do in the attached patch, which is a little messy but works for me for both bzr and git. bzr outputs progress once a second; git's output isn't as predictable. This might be a lot of output for ports that fetch huge amounts of data, like inkscape-devel or widelands-devel. It could be reduced by replacing tr -su '\r' '\n' with a custom program that only prints a line if it has been more than, say, 1 minute since the last line was printed. Then again, cvs and svn checkouts print one line per file, which can get pretty large too.
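
The "custom program" idea could look roughly like the sketch below. It is not the attached patch; the function names are made up, and the clock is injectable only so the behavior can be shown without waiting. Updates are split on runs of carriage returns or newlines, and an update is kept only if at least min_interval seconds have passed since the last one kept.

```python
import re
import time
from typing import Callable, Iterable, Iterator, List


def split_progress(text: str) -> List[str]:
    """Split on runs of CR/LF, similar to translating CR to LF and squeezing repeats."""
    return [part for part in re.split(r"[\r\n]+", text) if part]


def throttle(updates: Iterable[str], min_interval: float = 60.0,
             clock: Callable[[], float] = time.monotonic) -> Iterator[str]:
    """Yield an update only if min_interval seconds passed since the last one yielded."""
    last_emit = None
    for update in updates:
        now = clock()
        if last_emit is None or now - last_emit >= min_interval:
            last_emit = now
            yield update


# Demonstration with a fake clock: updates arrive at 0, 10, 61, 70 and 130 seconds.
fake_times = iter([0, 10, 61, 70, 130])
kept = list(throttle(split_progress("a\rb\rc\rd\re\n"), clock=lambda: next(fake_times)))
print(kept)  # ['a', 'c', 'e']
```

One limitation of this simple form is that the final update can be dropped if it arrives too soon after the previous one; a real filter would probably want to flush the last update at end of input.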

If we use tr, do we need to use findBinary to locate it?

I do not know if piping the output through tr or a custom program will hide a possible error exit status from bzr or git.
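
On the exit-status question: in a plain POSIX shell pipeline the reported status is that of the last command, so a filter such as tr would mask an upstream failure unless the statuses are collected separately. How MacPorts base actually runs the command is not shown in this ticket, so the following is only a sketch of collecting both statuses, using stand-in commands and a made-up function name.

```python
import subprocess
import sys
from typing import List, Tuple


def run_through_filter(producer_cmd: List[str], filter_cmd: List[str]) -> Tuple[int, int]:
    """Pipe producer output (stdout and stderr) through a filter; return both exit codes."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
    filt = subprocess.Popen(filter_cmd, stdin=producer.stdout)
    # Close the parent's copy of the pipe so the producer gets SIGPIPE
    # if the filter exits early.
    producer.stdout.close()
    filter_status = filt.wait()
    producer_status = producer.wait()
    return producer_status, filter_status


# Stand-ins: a "fetch" that prints CR-terminated progress and fails with status 3,
# and a filter that rewrites carriage returns as newlines.
fake_fetch = [sys.executable, "-c",
              "import sys; sys.stdout.write('50%\\r100%\\r'); sys.exit(3)"]
cr_to_lf = [sys.executable, "-c",
            "import sys; sys.stdout.buffer.write("
            "sys.stdin.buffer.read().replace(b'\\r', b'\\n'))"]

statuses = run_through_filter(fake_fetch, cr_to_lf)
print(statuses)  # (3, 0): the producer's failure is still visible
```

Checking the first element of the result keeps a failed fetch from being reported as success just because the filter exited cleanly.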

I didn't check whether other version control systems like hg need similar treatment.

Changed 6 years ago by ryandesign (Ryan Carsten Schmidt)

Attachment: bzr-git-progress.patch added

comment:13 Changed 6 years ago by raimue (Rainer Müller)

Let me repeat my question from comment:5, why do we not increase the timeout of the ShellCommand for fetching instead of applying such workarounds? Or is the intention that the fetch should be more verbose in general?

comment:14 Changed 6 years ago by jmroot (Joshua Root)

There isn't a separate ShellCommand for fetching. Occasionally ports genuinely get stuck, and 20 minutes is already a long time to block other builds while waiting for timeout. There's also no guarantee that fetching won't sometimes take longer than whatever arbitrary time we choose to allow.

comment:15 in reply to:  13 Changed 6 years ago by ryandesign (Ryan Carsten Schmidt)

Replying to raimue:

Or is the intention that the fetch should be more verbose in general?

The intention is to make MacPorts print something periodically (to the log or debug/verbose output) when it is doing something. If MacPorts is doing something, like downloading, and hasn't printed anything for several minutes, I consider that wrong. My proposal attempts to fix that.

comment:16 Changed 6 years ago by neverpanic (Clemens Lang)

Component: changed from base to buildbot/mpbb

comment:17 Changed 2 years ago by jmroot (Joshua Root)

Milestone: MacPorts 2.4.0 deleted