TensorFlow 1.11 and CUDA 8
TensorFlow 1.11 fails to build with CUDA 8. I opened an issue on GitHub (#23256 [https://github.com/tensorflow/tensorflow/issues/23256]), but the TensorFlow team's response was to either upgrade CUDA to 9 or downgrade TensorFlow to 1.10, which isn't an option for me. I am trying to find a way to get TF 1.11 to work with CUDA 8.
I am attempting to build a Docker container with TF 1.11 and CUDA 8 for a GeForce GTX 1060 3GB GPU, and an error keeps occurring during the build. I looked at GitHub issue #22729, but the workaround there did not work for TF 1.11, which is the version I need. The Dockerfile is below. Any help you can provide would be greatly appreciated.
System information
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): Source
TensorFlow version: TF 1.11
Python version: 2.7
Installed using virtualenv? pip? conda?: Docker
Bazel version (if compiling from source): 0.15.0
GCC/Compiler version (if compiling from source): 7.3.0
CUDA/cuDNN version: 8.0/7
GPU model and memory: GeForce GTX 1060 3GB
Provide the exact sequence of commands / steps that you executed before running into the problem
sudo docker build --no-cache . -f Dockerfile.tf-1.11-py27-gpu.txt -t tf-1.11-py27-gpu
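For anyone reproducing this, a build log like the one attached to the GitHub issue can be captured by sending both output streams through `tee`. This is only a minimal sketch: the helper name `run_logged` is hypothetical, and the log filename is simply chosen to match the attachment name.

```shell
#!/bin/sh
# Hypothetical helper: run a command, print its output to the terminal,
# and save stdout and stderr together in a log file.
# Note: the pipeline's exit status is that of tee, not of the command.
run_logged() {
    log_file=$1
    shift
    "$@" 2>&1 | tee "$log_file"
}

# Usage (requires Docker; not executed here):
# run_logged tf11cuda8.log sudo docker build --no-cache . \
#     -f Dockerfile.tf-1.11-py27-gpu.txt -t tf-1.11-py27-gpu
```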
Thank you,
Kyle
Dockerfile.tf-1.11-py27-gpu
FROM nvidia/cuda:8.0-cudnn7-devel-ubuntu16.04
LABEL maintainer="Craig Citro <craigcitro@google.com>; Modified for Cuda 8 by Jack Harris"
RUN apt-get update && apt-get install -y --allow-downgrades --allow-change-held-packages --no-install-recommends \
        build-essential \
        cuda-command-line-tools-8-0 \
        cuda-cublas-dev-8-0 \
        cuda-cudart-dev-8-0 \
        cuda-cufft-dev-8-0 \
        cuda-curand-dev-8-0 \
        cuda-cusolver-dev-8-0 \
        cuda-cusparse-dev-8-0 \
        curl \
        git \
        libcudnn7=7.2.1.38-1+cuda8.0 \
        libcudnn7-dev=7.2.1.38-1+cuda8.0 \
        libnccl2=2.2.13-1+cuda8.0 \
        libnccl-dev=2.2.13-1+cuda8.0 \
        libcurl3-dev \
        libfreetype6-dev \
        libhdf5-serial-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        python-dev \
        rsync \
        software-properties-common \
        unzip \
        zip \
        zlib1g-dev \
        wget \
        && \
    rm -rf /var/lib/apt/lists/* && \
    find /usr/local/cuda-8.0/lib64/ -type f -name 'lib*_static.a' -not -name 'libcudart_static.a' -delete && \
    rm -f /usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
RUN apt-get update && \
    apt-get install nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda8.0 && \
    apt-get update && \
    apt-get install libnvinfer4=4.1.2-1+cuda8.0 && \
    apt-get install libnvinfer-dev=4.1.2-1+cuda8.0
# Link NCCL libray and header where the build script expects them.
RUN mkdir /usr/local/cuda-8.0/lib && \
    ln -s /usr/lib/x86_64-linux-gnu/libnccl.so.2 /usr/local/cuda/lib/libnccl.so.2 && \
    ln -s /usr/include/nccl.h /usr/local/cuda/include/nccl.h
# TODO(tobyboyd): Remove after license is excluded from BUILD file.
#RUN gunzip /usr/share/doc/libnccl2/NCCL-SLA.txt.gz &&
# cp /usr/share/doc/libnccl2/NCCL-SLA.txt /usr/local/cuda/
# Add External Mount Points
RUN mkdir -p /external_lib
RUN mkdir -p /external_bin
RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py
RUN pip --no-cache-dir install \
        ipykernel \
        jupyter \
        keras_applications==1.0.5 \
        keras_preprocessing==1.0.3 \
        matplotlib \
        numpy \
        pandas \
        scipy \
        sklearn \
        mock \
        && \
    python -m ipykernel.kernelspec
# Set up our notebook config.
#COPY jupyter_notebook_config.py /root/.jupyter/
# Jupyter has issues with being run directly:
# https://github.com/ipython/ipython/issues/7062
# We just add a little wrapper script.
# COPY run_jupyter.sh /
# Set up Bazel.
# Running bazel inside a `docker build` command causes trouble, cf:
# https://github.com/bazelbuild/bazel/issues/134
# The easiest solution is to set up a bazelrc file forcing --batch.
RUN echo "startup --batch" >>/etc/bazel.bazelrc
# Similarly, we need to workaround sandboxing issues:
# https://github.com/bazelbuild/bazel/issues/418
RUN echo "build --spawn_strategy=standalone --genrule_strategy=standalone" \
    >>/etc/bazel.bazelrc
# Install the most recent bazel release.
ENV BAZEL_VERSION 0.15.0
WORKDIR /
RUN mkdir /bazel && \
    cd /bazel && \
    curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -O https://github.com/bazelbuild/bazel/releases/download/$BAZEL_VERSION/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    curl -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" -fSsL -o /bazel/LICENSE.txt https://raw.githubusercontent.com/bazelbuild/bazel/master/LICENSE && \
    chmod +x bazel-*.sh && \
    ./bazel-$BAZEL_VERSION-installer-linux-x86_64.sh && \
    cd / && \
    rm -f /bazel/bazel-$BAZEL_VERSION-installer-linux-x86_64.sh
# Download and build TensorFlow.
RUN git clone http://github.com/tensorflow/tensorflow --branch r1.11 --depth=1
WORKDIR /tensorflow
RUN sed -i 's/^#if TF_HAS_.*$/#if !defined(__NVCC__)/g' tensorflow/core/platform/macros.h
ENV TF_NCCL_VERSION=2
#RUN /bin/echo -e "/usr/bin/pythonnnnnnnnnnnnnnnnnnnnnyn8.0n/usr/local/cudan7.0n/usr/local/cudannnnnnnnn-march=nativennn" | ./configure
RUN /bin/echo -e "/usr/bin/pythonnnnnnnnnnnnnnnnnnnnnnnyn8.0n/usr/local/cudan7.0n/usr/local/cudannnnnnnnnnnnn-march=nativennn" | ./configure
#RUN /bin/echo -e "nnnnnnnnnnnnnnnnnnnnnnn-march=nativennn" | ./configure
# Configure the build for our CUDA configuration.
ENV CI_BUILD_PYTHON python
ENV PATH /external_bin:$PATH
ENV LD_LIBRARY_PATH /external_lib:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
ENV TF_NEED_CUDA 1
ENV TF_NEED_TENSORRT 1
ENV TF_CUDA_COMPUTE_CAPABILITIES=3.0,3.5,5.2,6.0,6.1
ENV TF_CUDA_VERSION=8.0
ENV TF_CUDNN_VERSION=7
# https://github.com/tensorflow/tensorflow/issues/17801
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1 && \
    ln -s /usr/local/cuda/nvvm/libdevice/libdevice.compute_50.10.bc /usr/local/cuda/nvvm/libdevice/libdevice.10.bc && \
    LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:$LD_LIBRARY_PATH \
    tensorflow/tools/ci_build/builds/configured GPU \
    bazel build -c opt --copt=-mavx --config=cuda \
        --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
        tensorflow/tools/pip_package/build_pip_package && \
    rm /usr/local/cuda/lib64/stubs/libcuda.so.1
RUN bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/pip
RUN pip --no-cache-dir install --upgrade /tmp/pip/tensorflow-*.whl && \
    rm -rf /tmp/pip && \
    rm -rf /root/.cache
# Clean up pip wheel and Bazel cache when done.
WORKDIR /root
# TensorBoard
EXPOSE 6006
# IPython
EXPOSE 8888
CMD [ "/bin/bash" ]
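The `./configure` step in the Dockerfile above works by piping one answer per prompt into the script, where an empty line accepts the default. A more readable way to produce that kind of answer stream is `printf '%s\n'` with one argument per answer. The sketch below is illustrative only: the function name `configure_answers` is hypothetical, and the number and order of answers shown are assumptions, not the exact prompt sequence that the r1.11 configure script asks for.

```shell
#!/bin/sh
# Hypothetical helper: emit one line per ./configure prompt.
# An empty string means "accept the default" for that prompt.
# The sequence below is an assumed, shortened example, not the real r1.11 order.
configure_answers() {
    printf '%s\n' \
        "/usr/bin/python" \
        "" \
        "y" \
        "8.0" \
        "/usr/local/cuda" \
        "7.0" \
        "/usr/local/cuda" \
        "" \
        "-march=native"
}

# Usage inside the TensorFlow checkout (not executed here):
# configure_answers | ./configure
```

Listing the answers one per line makes it easier to see which prompt each value is meant for than counting newline escapes in a single string.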
tf11cuda8.log - Log attached to github issue (too long to post here)
python-2.7 docker tensorflow
edited Nov 10 at 20:05 by talonmies
asked Nov 10 at 19:14 by healykys