Opened 4 years ago

Closed 11 months ago

#60960 closed defect (fixed)

py-tensorflow @2.3.0: java.io.IOException: Cannot run program… Too many open files

Reported by: essandess (Steve Smith)
Owned by: emcrisostomo (Enrico Maria Crisostomo)
Priority: Normal
Milestone:
Component: ports
Version:
Keywords:
Cc: cjones051073 (Chris Jones), chrstphrchvz (Christopher Chavez), mascguy (Christopher Nielsen)
Port: py-tensorflow

Description

Re: https://github.com/macports/macports-ports/pull/7575

py-tensorflow version 2.3.0 fails to build because:

ERROR: /opt/local/var/macports/build/_opt_local_ports_python_py-tensorflow/py37-tensorflow/work/tensorflow-tensorflow-b36436b/tensorflow/core/common_runtime/BUILD:328:11: C++ compilation of rule '//tensorflow/core/common_runtime:collective_executor_mgr' failed (Exit -1): wrapped_clang failed: error executing command 
  (cd /opt/local/var/macports/build/_opt_local_ports_python_py-tensorflow/py37-tensorflow/work/e3571a779784f9da03a7824d69817047/execroot/org_tensorflow && \
  exec env - \
    APPLE_SDK_PLATFORM=MacOSX \
    APPLE_SDK_VERSION_OVERRIDE=10.15 \
    PATH=/opt/local/bin:/opt/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin \
    XCODE_VERSION_OVERRIDE=11.6.0.11E708 \
  external/local_config_cc/wrapped_clang '-D_FORTIFY_SOURCE=1' -fstack-protector -fcolor-diagnostics -Wall -Wthread-safety -Wself-assign -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG '-std=c++11' -iquote . -iquote bazel-out/host/bin -iquote external/com_google_absl -iquote bazel-out/host/bin/external/com_google_absl -iquote external/eigen_archive -iquote bazel-out/host/bin/external/eigen_archive -iquote external/local_config_sycl -iquote bazel-out/host/bin/external/local_config_sycl -iquote external/nsync -iquote bazel-out/host/bin/external/nsync -iquote external/gif -iquote bazel-out/host/bin/external/gif -iquote external/libjpeg_turbo -iquote bazel-out/host/bin/external/libjpeg_turbo -iquote external/com_google_protobuf -iquote bazel-out/host/bin/external/com_google_protobuf -iquote external/com_googlesource_code_re2 -iquote bazel-out/host/bin/external/com_googlesource_code_re2 -iquote external/farmhash_archive -iquote bazel-out/host/bin/external/farmhash_archive -iquote external/fft2d -iquote bazel-out/host/bin/external/fft2d -iquote external/highwayhash -iquote bazel-out/host/bin/external/highwayhash -iquote external/zlib -iquote bazel-out/host/bin/external/zlib -iquote external/double_conversion -iquote bazel-out/host/bin/external/double_conversion -isystem external/eigen_archive -isystem bazel-out/host/bin/external/eigen_archive -isystem external/nsync/public -isystem bazel-out/host/bin/external/nsync/public -isystem external/gif -isystem bazel-out/host/bin/external/gif -isystem external/com_google_protobuf/src -isystem bazel-out/host/bin/external/com_google_protobuf/src -isystem external/farmhash_archive/src -isystem bazel-out/host/bin/external/farmhash_archive/src -isystem external/zlib -isystem bazel-out/host/bin/external/zlib -isystem external/double_conversion -isystem bazel-out/host/bin/external/double_conversion -MD -MF bazel-out/host/bin/tensorflow/core/common_runtime/_objs/collective_executor_mgr/collective_executor_mgr.d 
-D__CLANG_SUPPORT_DYN_ANNOTATION__ -DEIGEN_MPL2_ONLY '-DEIGEN_MAX_ALIGN_BYTES=64' '-DEIGEN_HAS_TYPE_TRAITS=0' '-frandom-seed=bazel-out/host/bin/tensorflow/core/common_runtime/_objs/collective_executor_mgr/collective_executor_mgr.o' -isysroot __BAZEL_XCODE_SDKROOT__ -F__BAZEL_XCODE_SDKROOT__/System/Library/Frameworks -F__BAZEL_XCODE_DEVELOPER_DIR__/Platforms/MacOSX.platform/Developer/Library/Frameworks '-mmacosx-version-min=10.15' -g0 '-march=x86-64' -g0 '-std=c++14' -DEIGEN_AVOID_STL_ARRAY -Iexternal/gemmlowp -Wno-sign-compare '-ftemplate-depth=900' -fno-exceptions '-DTENSORFLOW_USE_XLA=1' -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c tensorflow/core/common_runtime/collective_executor_mgr.cc -o bazel-out/host/bin/tensorflow/core/common_runtime/_objs/collective_executor_mgr/collective_executor_mgr.o)
Execution platform: @local_execution_config_platform//:platform. Note: Remote connection/protocol failed with: execution failed
Action failed to execute: java.io.IOException: Cannot run program "/opt/local/var/macports/build/_opt_local_ports_python_py-tensorflow/py37-tensorflow/work/install/1eb24b6f9fb447fbef56fd6c7521f126/process-wrapper" (in directory "/opt/local/var/macports/build/_opt_local_ports_python_py-tensorflow/py37-tensorflow/work/e3571a779784f9da03a7824d69817047/execroot/org_tensorflow"): error=24, Too many open files
Target //tensorflow/tools/pip_package:build_pip_package failed to build
macOS 10.15.6 19G73
Xcode 11.6 11E708 

System limit is:

ulimit -n
2560
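For anyone wanting to inspect the same limit programmatically, here is a minimal sketch using Python's standard `resource` module. The helper name `raise_nofile_limit` and the target value 65536 are illustrative assumptions, not part of the port. It only affects the current process and its children, so it is not a substitute for adjusting the limit in the environment that actually runs the build.

```python
import resource


def raise_nofile_limit(target):
    """Try to raise the soft open-files limit to `target`.

    Hypothetical helper for illustration. Returns the soft limit in
    effect afterwards; the limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unlimited soft limit cannot be improved on.
    if soft == resource.RLIM_INFINITY:
        return soft

    # The soft limit may not exceed a finite hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Nothing to do if the current soft limit is already high enough.
    if soft >= target:
        return soft

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some systems reject values above a kernel-defined maximum;
        # in that case keep the existing limit.
        return soft

    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    before = resource.getrlimit(resource.RLIMIT_NOFILE)
    after = raise_nofile_limit(65536)  # 65536 is an assumed target
    print("soft/hard before:", before)
    print("soft after:", after)
```

The first number printed corresponds to what `ulimit -n` reports in the same shell session.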

Change History (7)

comment:1 Changed 4 years ago by ryandesign (Ryan Carsten Schmidt)

Cc: cjones051073 added
Owner: set to emcrisostomo
Port: py-tensorflow added
Status: new → assigned
Summary: py-tensorflow version 2.3.0 fails to build: java.io.IOException: Cannot run program… Too many open files → py-tensorflow @2.3.0: java.io.IOException: Cannot run program… Too many open files

comment:2 Changed 4 years ago by essandess (Steve Smith)

FWIW, I set these system limits and restarted port -vst install py37-tensorflow, but the build now fails later on with the following.

sudo launchctl limit maxfiles 65536 200000
ulimit -n 65536
:info:build tensorflow/python/lib/core/bfloat16.cc:678:8: error: no matching function for call to object of type '(lambda at tensorflow/python/lib/core/bfloat16.cc:637:25)'
:info:build   if (!register_ufunc("less_equal", CompareUFunc<Bfloat16LeFunctor>,
:info:build        ^~~~~~~~~~~~~~
:info:build tensorflow/python/lib/core/bfloat16.cc:637:25: note: candidate function not viable: no overload of 'CompareUFunc' matching 'PyUFuncGenericFunction' (aka 'void (*)(char **, const long *, const long *, void *)') for 2nd argument
:info:build   auto register_ufunc = [&](const char* name, PyUFuncGenericFunction fn,
:info:build                         ^
:info:build tensorflow/python/lib/core/bfloat16.cc:682:8: error: no matching function for call to object of type '(lambda at tensorflow/python/lib/core/bfloat16.cc:637:25)'
:info:build   if (!register_ufunc("greater_equal", CompareUFunc<Bfloat16GeFunctor>,
:info:build        ^~~~~~~~~~~~~~

This PR may be relevant: https://github.com/tensorflow/tensorflow/pull/40654

comment:3 Changed 4 years ago by chrstphrchvz (Christopher Chavez)

As indicated by upstream discussion, the "no matching function for call to object of type" error is due to a breaking change in NumPy 1.19.0, which MacPorts updated to sometime after py-tensorflow was updated to 2.2.0. It is not a problem introduced by the 2.3.0 update. The issue is observed on builders for both py3x-tensorflow and py3x-tensorflow1.

I'm inclined to revise PR 8203 to try addressing that issue; doing so may unintentionally conflict with PR 7575.
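As a rough aid for telling whether a given NumPy release falls on the affected side of that change, the following sketch compares a version string against 1.19. The function name `has_const_ufunc_loop_signature` is hypothetical, and the 1.19 threshold is taken from the discussion above (the `const long *` arguments in the compiler note match it); the function only parses the string and does not inspect an installed NumPy.

```python
import re


def has_const_ufunc_loop_signature(numpy_version):
    """Return True if `numpy_version` is 1.19 or later.

    Hypothetical helper: only the leading "major.minor" part of the
    version string is considered.
    """
    match = re.match(r"(\d+)\.(\d+)", numpy_version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % numpy_version)
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (1, 19)


if __name__ == "__main__":
    for version in ("1.18.5", "1.19.0", "1.19.2"):
        print(version, has_const_ufunc_loop_signature(version))
```

With an installed NumPy, the string to pass would be `numpy.__version__`.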

comment:4 Changed 4 years ago by chrstphrchvz (Christopher Chavez)

Cc: chrstphrchvz added

comment:5 Changed 4 years ago by chrstphrchvz (Christopher Chavez)

In 68d57fc88f02d86ccc7478d738f27c70974868ab/macports-ports (master):

py-(tensorflow|tensorflow1): fix for NumPy 1.19.x

See: #60960#comment:2

Use LTS JDK as fallback
Public updates for openjdk12 ended 2019-09
Public updates for openjdk14 will end 2020-09
Public updates for openjdk11 will continue at least until 2023

Use legacysupport portgroup for clock_gettime()

Remove unused portgroup xcodeversion

Don't set use_mp_clang before using xcode_workaround

comment:6 Changed 11 months ago by mascguy (Christopher Nielsen)

Cc: mascguy added

comment:7 Changed 11 months ago by mascguy (Christopher Nielsen)

Resolution: fixed
Status: assigned → closed

This is no longer an issue with the latest releases. Closing as fixed.
