Opened 6 years ago

Last modified 3 years ago

#56793 closed defect

macports base: installing a port with many files appears to hang; file registration loop slows dramatically for file counts > 10,000 — at Version 15

Reported by: yan12125 (Chih-Hsuan Yen) Owned by: mascguy (Christopher Nielsen)
Priority: Normal Milestone: MacPorts 2.7.1
Component: base Version: 2.5.3
Keywords: Cc: jmroot (Joshua Root)
Port:

Description (last modified by mascguy (Christopher Nielsen))

I'm trying to install texlive-fonts-extra, which has 87,000 files in its destroot/ folder. port -d install texlive-fonts-extra appears hang - no messages for several minutes and the CPU usage keeps high at around ~98%.

I found that the installation process is running the loop at:

https://github.com/macports/macports-base/blob/8a67087e2194acf3d0037097b0cd7f1e5477e401/src/port1.0/portinstall.tcl#L416

If the loop is inevitable, how about adding some messages saying that the port process is busy doing something? It will improve the user experience greatly.

/cc the author of the aforementioned loop.

Change History (17)

comment:1 Changed 3 years ago by gpolitis (George Politis)

Thank you for reporting this issue, I would have killed the process if I hadn't seen this. Eventually, after several minutes, the installation finished successfully. A message saying that the port process is busy doing something as the OP suggests would have been nice.

comment:2 Changed 3 years ago by mascguy (Christopher Nielsen)

Cc: mascguy added

comment:3 Changed 3 years ago by mascguy (Christopher Nielsen)

I'm currently dealing with this too, for port texlive-fonts-extra. Except in my case, the process has been running for more than 20 minutes.

Since the port installs nearly 90,000 files, it's reasonable for the process to take some time. But ideally not that long.

Ryan/Josh, do you folks know if it would be feasible to migrate the logic in question, to native code? Or are there other potential options to improve the performance of this loop?

I'd gladly help with the development effort, though I'd need some hints/direction.

Last edited 3 years ago by mascguy (Christopher Nielsen) (previous) (diff)

comment:4 Changed 3 years ago by mascguy (Christopher Nielsen)

If there's no way to improve the performance of this, would it be possible to add a progress bar?

comment:5 Changed 3 years ago by mascguy (Christopher Nielsen)

Also, just curious: Have we considered using a dictionary, rather than an array, for the code in question?

I haven't (yet) debugged this enough to understand where the time is being spent. But based on research, dictionaries provide significantly faster performance, vs. an array. And it can't hurt...

comment:6 Changed 3 years ago by jmroot (Joshua Root)

I doubt the array is the performance bottleneck there; seems much more likely to be constructing all the file objects. An interface to set the property for a list of files might help. But let's not act on guesses when it comes to performance: profile!

comment:7 in reply to:  6 Changed 3 years ago by mascguy (Christopher Nielsen)

Replying to jmroot:

I doubt the array is the performance bottleneck there; seems much more likely to be constructing all the file objects. An interface to set the property for a list of files might help. But let's not act on guesses when it comes to performance: profile!

Agreed, profiling is the way to go. And port texlive-fonts-extra makes a great test case!

As a starting point, I added ui_debug statements to the file registration loop, and this is what I'm seeing: Initially the loop was quite fast. However, things gradually start slowing. And as we approach the 30,000th file, the loop has slowed dramatically.

By the time registration hits the 60,000th file, the process has gotten even slower.

In any case, I'll re-run this, and calculate timings. Stay tuned...

Last edited 3 years ago by mascguy (Christopher Nielsen) (previous) (diff)

comment:8 Changed 3 years ago by jmroot (Joshua Root)

I would suggest learning to use the profiler package in tcllib.

comment:9 in reply to:  8 Changed 3 years ago by mascguy (Christopher Nielsen)

Replying to jmroot:

I would suggest learning to use the profiler package in tcllib.

Great idea!

Changed 3 years ago by mascguy (Christopher Nielsen)

Changed 3 years ago by mascguy (Christopher Nielsen)

comment:10 Changed 3 years ago by mascguy (Christopher Nielsen)

Attached timing info, as well as a sample of the MacPorts tclsh process. The file registration work for texlive-fonts-extra took a little over 28 minutes, and total port installation time was approximately 30 minutes.

If you review the timing file (port-install-texlive-fonts-extra-timings.txt), you'll see the performance is very non-linear: It's extremely fast at first, but grows progressively slower. By the end, only a few hundred file registrations are occurring during every 15-second interval.

As for profiling, I started by adding package require Tclx at the top of portinstall.tcl, but that resulted in the following error:

DEBUG: unknown user id: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_texlive-fonts-extra/texlive-fonts-extra

Is the profiler in the Tclx package, the one to go with? And if so, any ideas? If not, what should we be using?

Last edited 3 years ago by mascguy (Christopher Nielsen) (previous) (diff)

comment:11 Changed 3 years ago by mascguy (Christopher Nielsen)

Based on a cursory look at the process sample, I'm wondering if there is a less-than-optimal hashtable implementation somewhere? Or a case where the hashtable (within sqlite3, or wherever) needs to be initialized with a much larger default size?

Last edited 3 years ago by mascguy (Christopher Nielsen) (previous) (diff)

comment:12 Changed 3 years ago by mascguy (Christopher Nielsen)

Cc: mascguy removed
Owner: set to mascguy
Status: newassigned

comment:13 Changed 3 years ago by mascguy (Christopher Nielsen)

Summary: Installing a port with many files appears appears hangmacports base: installing a port with many files appears to hang
Version: 2.5.32.7.0

comment:14 Changed 3 years ago by mascguy (Christopher Nielsen)

Summary: macports base: installing a port with many files appears to hangmacports base: installing a port with many files appears to hang; file registration loop slows dramatically for file counts > 10,000

comment:15 Changed 3 years ago by mascguy (Christopher Nielsen)

Description: modified (diff)
Note: See TracTickets for help on using tickets.