
Friday 23 April 2021

How to link Google Colab with GitHub

Create a GitHub repository in which you want to keep Colaboratory notebooks.
Go to https://colab.research.google.com/github/ and add your GitHub account.



A GitHub window will open, asking you to grant Colab access to your GitHub account.
Once you accept, that window closes and, back at https://colab.research.google.com/github/, the popup lists all repositories in the connected GitHub account. Select the one you created. Since it is currently empty and contains no notebooks, it will show "No results".

If you create a notebook in Colab (https://colab.research.google.com/) and want to add it to that GitHub repo, go to the Colab tab in the browser and choose File >> Save a copy in GitHub. 




After this, the notebook will appear in the list of notebooks available to be loaded into Colab:






Sunday 14 March 2021

Running NVIDIA DIGITS Docker container on Ubuntu

Installing NVIDIA DIGITS directly on your computer means that you'll:
  • spend a considerable amount of time installing all dependencies and building DIGITS itself
  • pollute your machine with another application and its dependencies
To avoid this, we can run the NVIDIA DIGITS Docker container. Let's first check whether Docker is installed and which version:

$ docker --version
Docker version 20.10.3, build 48d30b5

For reference, I ran the commands listed in this article on Ubuntu 20.04:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal

Ideally, we'd run NVIDIA DIGITS on a machine with one or more GPUs. This speeds up training and inference, but DIGITS can also work on a CPU-only machine. 

I have a GeForce GT 640 graphics card:

$ nvidia-smi -L
GPU 0: GeForce GT 640 (UUID: GPU-f2583df9-404d-2564-d332-e7878a94d087)

$ lspci
...
VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640 OEM] (rev a1)
...

GK107 is the code name for the GeForce GT 640 (GDDR5) (source: GeForce 600 series - Wikipedia) which, according to CUDA GPUs | NVIDIA Developer, has compute capability 3.5 (which is supported, as it has to be greater than 2.1 according to the Installation Guide — NVIDIA Cloud Native Technologies documentation).

To test the local GPU we can run the nvidia-smi application on the local host or inside a Docker container.

If we haven't installed CUDA or nvidia-smi locally, we can run nvidia-smi from the NVIDIA CUDA Docker image:

$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Thu Feb 11 01:02:09 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   31C    P8    N/A /  N/A |    286MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


Let's now follow the instructions from DIGITS | NVIDIA NGC. We first need to download the image to our local host:

$ docker pull nvcr.io/nvidia/digits:20.12-tensorflow-py3
20.12-tensorflow-py3: Pulling from nvidia/digits
6a5697faee43: Pulling fs layer 
ba13d3bc422b: Pulling fs layer 
...
cec6045b0d0e: Pulling fs layer 
cb4aa708e833: Waiting 
235cfa23a5f4: Waiting 
24781a3c82ea: Waiting 
f7c7d47c1a97: Pull complete 
...
b57dde2f2923: Pull complete 
Digest: sha256:7542143bc2292fc48a3874786877815a5ca6a74a69366324aaf66914155cb5a7
Status: Downloaded newer image for nvcr.io/nvidia/digits:20.12-tensorflow-py3
nvcr.io/nvidia/digits:20.12-tensorflow-py3

Let's now run the container. docker run has a --gpus option which instructs Docker to add GPU devices to the container ('all' passes all GPUs).

$ docker run --gpus all -it --rm nvcr.io/nvidia/digits:20.12-tensorflow-py3
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

I hadn't installed the NVIDIA Container Toolkit (nvidia-docker), which enables Docker containers to access the host's GPU. The Installation Guide — NVIDIA Cloud Native Technologies documentation describes how to install it:

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker


$ nvidia-docker version 
NVIDIA Docker: 2.5.0
Client: Docker Engine - Community
 Version:           20.10.3
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        48d30b5
 Built:             Fri Jan 29 14:33:21 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.3
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       46229ca
  Built:            Fri Jan 29 14:31:32 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0


To be on the safe side, I also installed the latest NVIDIA driver.

$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
[sudo] password for bojan: 
Thu Feb 11 01:02:09 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   31C    P8    N/A /  N/A |    286MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


This time running the DIGITS container was successful. The DIGITS HTTP server uses port 5000 by default, and in this example it is mapped to host port 8888.

$ docker run --gpus all -it --rm -p 8888:5000 nvcr.io/nvidia/digits:20.12-tensorflow-py3

============
== DIGITS ==
============

NVIDIA Release 20.12 (build 17912121)
DIGITS Version 6.1.1

Container image Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
DIGITS Copyright (c) 2014-2019, NVIDIA CORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
ERROR: No supported GPU(s) detected to run this container

  ___ ___ ___ ___ _____ ___
 |   \_ _/ __|_ _|_   _/ __|
 | |) | | (_ || |  | | \__ \
 |___/___\___|___| |_| |___/ 6.1.1

Caffe support disabled.
Reason: A valid Caffe installation was not found on your system.
cudaRuntimeGetVersion() failed with error #999
2021-02-11 16:23:54.454747: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/opt/digits/digits/pretrained_model/views.py:32: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if str(files['weights_file'].filename) is '':
/opt/digits/digits/pretrained_model/views.py:38: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if str(files['model_def_file'].filename) is '':
/opt/digits/digits/pretrained_model/views.py:54: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if str(files['weights_file'].filename) is '':
/opt/digits/digits/pretrained_model/views.py:60: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if str(files['model_def_file'].filename) is '':
/opt/digits/digits/pretrained_model/views.py:169: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif str(flask.request.form['job_name']) is '':
/opt/digits/digits/pretrained_model/views.py:177: SyntaxWarning: "is not" with a literal. Did you mean "!="?
  if str(flask.request.files['labels_file'].filename) is not '':
2021-02-11 16:23:56 [INFO ] Loaded 0 jobs.


If we now open a browser on the host and go to http://localhost:8888 we'll be able to see the DIGITS home page:



As DIGITS is a web-based application, we don't need to run it in interactive mode (docker run -it) but can run it in detached mode (docker run -d):

$ docker run \
--gpus all \
-d \
--name digits \
--rm \
-p 8888:5000 \
-v /home/bojan/dev/digits-demo/data:/data \
-v /home/bojan/dev/digits-demo/jobs:/workspace/jobs \
nvcr.io/nvidia/digits:20.12-tensorflow-py3

905f9a8c8e48bc87ae99117eed92b855d45c7d37695c0e94433bd18fab6bfaca

We can verify that the DIGITS container is indeed running:

$ docker ps 
CONTAINER ID   IMAGE                                        COMMAND                  CREATED              STATUS              PORTS                                                  NAMES
905f9a8c8e48   nvcr.io/nvidia/digits:20.12-tensorflow-py3   "/usr/local/bin/nvid…"   About a minute ago   Up About a minute   6006/tcp, 6064/tcp, 8888/tcp, 0.0.0.0:8888->5000/tcp   digits


Why doesn't DIGITS recognize my GPU?



One thing didn't seem right to me though. The upper right corner of the DIGITS home page should show how many GPUs are available. In my case, although I have one GPU, no GPUs were listed. 




I first checked whether the GPU is visible from inside the container:

$ docker exec -it digits bash
root@e58b860504a9:/workspace# 

root@e58b860504a9:/workspace# nvidia-smi
Fri Feb 12 23:33:17 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   32C    P8    N/A /  N/A |    260MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The graphics card was visible. The DIGITS installation also contains a Python script, DIGITS Device Query (digits/device_query.py). When I tried to run it, I got an error:

root@e58b860504a9:/opt/digits/digits# python device_query.py 
cudaRuntimeGetVersion() failed with error #999
No devices found.


Error 999 is cudaErrorUnknown: "This indicates that an unknown internal error has occurred." CUDA itself was installed fine:

root@6cd6c429f20c:/workspace# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
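
We can also reproduce the failing call directly from Python with ctypes, without going through device_query.py. This is just a minimal sketch; it assumes the CUDA runtime library is loadable under the name libcudart.so inside the container (the exact soname may differ, e.g. libcudart.so.11.0):

import ctypes

# Load the CUDA runtime library shipped in the container; adjust the soname if needed
libcudart = ctypes.CDLL("libcudart.so")

# cudaRuntimeGetVersion(int*) returns a cudaError_t: 0 means success, 999 is cudaErrorUnknown
version = ctypes.c_int()
err = libcudart.cudaRuntimeGetVersion(ctypes.byref(version))
print("error code:", err, "runtime version:", version.value)

If the runtime is in the same broken state, this call should return the same error code 999 that device_query.py reported.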


On the host system I checked whether loading the NVIDIA driver gave any errors (NVRM errors are internal to the nvidia kernel module):

$ sudo dmesg |grep NVRM
[sudo] password for bojan: 
[    2.283911] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  460.32.03  Sun Dec 27 19:00:34 UTC 2020
[ 8654.742795] NVRM: GPU at PCI:0000:01:00: GPU-f2583df9-404d-2564-d332-e7878a94d087
[ 8654.742800] NVRM: Xid (PCI:0000:01:00): 31, pid=577, Ch 00000002, intr 10000000. MMU Fault: ENGINE HOST4 HUBCLIENT_HOST faulted @ 0x1_01160000. Fault is of type FAULT_INFO_TYPE_UNSUPPORTED


I could not deduce anything useful from here, but by reading the DIGITS release notes I finally found the reason why DIGITS would not recognize my GPU - it is too old!

The Installation Guide — NVIDIA Cloud Native Technologies documentation specifies compute capability requirements for the NVIDIA Container Toolkit, but the requirements for the DIGITS Docker image are specified per image release. For digits:20.12, DIGITS Release Notes :: NVIDIA Deep Learning DIGITS Documentation states the following:

Release 20.12 supports CUDA compute capability 6.0 and higher.

My GPU has compute capability 3.5, so it does not meet that requirement.
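
If you want to check the compute capability programmatically rather than looking it up in NVIDIA's tables, a minimal sketch like the following can be used. It assumes pycuda is installed (it is not part of the DIGITS image):

import pycuda.driver as cuda

cuda.init()
device = cuda.Device(0)                       # first GPU
major, minor = device.compute_capability()
print("%s: compute capability %d.%d" % (device.name(), major, minor))

# DIGITS 20.12 requires compute capability 6.0 or higher
if (major, minor) < (6, 0):
    print("This GPU is too old for the DIGITS 20.12 container")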









Thursday 17 December 2020

Webcam capture with ffmpeg and OpenCV from Jupyter Notebook

I want to share my experience using OpenCV and ffmpeg to capture webcam output.


Setup:
  • Jupyter notebook running in jupyter-lab
  • Ubuntu 20.04
  • USB web camera
Goal:
  • Capture and display frames from the webcam

The OpenCV Video I/O with OpenCV Overview page says that the cv::VideoCapture class calls video I/O backends (APIs) depending on which ones are available.

To find out which backends (VideoCaptureAPIs) are available, we can use the following code:

import cv2

# cv2.videoio_registry.getBackends() returns a list of all available backends.
availableBackends = [cv2.videoio_registry.getBackendName(b) for b in cv2.videoio_registry.getBackends()]
print(availableBackends)

# Returns a list of available backends which work via cv::VideoCapture(int index)
availableCameraBackends = [cv2.videoio_registry.getBackendName(b) for b in cv2.videoio_registry.getCameraBackends()]
print(availableCameraBackends)

The output in my case was: 

['FFMPEG', 'GSTREAMER', 'CV_IMAGES', 'CV_MJPEG']
['FFMPEG', 'GSTREAMER', 'CV_IMAGES', 'CV_MJPEG']

Let's see what each of these backends is:

• FFMPEG is a multimedia framework which can record, convert and stream audio and video.

It contains the libavcodec, libavutil, libavformat, libavfilter, libavdevice, libswscale and libswresample libraries, which can be used by applications, as well as the ffmpeg, ffplay and ffprobe tools, which end users can use for transcoding and playing.

• GSTREAMER is a pipeline-based multimedia framework with similar capabilities as ffmpeg.

• CV_IMAGES -  OpenCV Image Sequence (e.g. img_%02d.jpg). Matches cv2.CAP_IMAGES API ID.

• CV_MJPEG - Built-in OpenCV MotionJPEG codec (used for reading video files). Matches cv2.CAP_OPENCV_MJPEG video capture API.
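
For completeness, the two built-in backends are meant for reading from files rather than cameras. A minimal sketch (the file names are placeholders):

import cv2

# Read an image sequence (frame_00.jpg, frame_01.jpg, ...) with the CV_IMAGES backend
cap_images = cv2.VideoCapture("frame_%02d.jpg", cv2.CAP_IMAGES)

# Read a Motion-JPEG .avi file with the built-in CV_MJPEG backend
cap_mjpeg = cv2.VideoCapture("video.avi", cv2.CAP_OPENCV_MJPEG)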

I was surprised to see GSTREAMER listed above, as the VideoCaptureAPIs documentation says:

Backends are available only if they have been built with your OpenCV binaries. 

...and the OpenCV package installed in my environment was built only with FFMPEG support:

>>> import cv2
>>> cv2.getBuildInformation()
...
Video I/O:\n    DC1394:                      NO\n    FFMPEG:                      YES\n      avcodec:                   YES (58.35.100)\n      avformat:                  YES (58.20.100)\n      avutil:                    YES (56.22.100)\n      swscale:                   YES (5.3.100)\n      avresample:                YES (4.0.0)\n\n  
...
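
The interpreter shows the string's repr with escaped newlines; printing it renders the Video I/O section in a readable form:

import cv2

# print() renders the embedded newlines in the build information
print(cv2.getBuildInformation())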

The FFMPEG-only support can also be verified by looking at the cmake config in the repository (opencv-feedstock/build.sh at master · conda-forge/opencv-feedstock):

-DWITH_FFMPEG=1     \
-DWITH_GSTREAMER=0  \

Although my conda environment contained all relevant packages:

(my-env) $ conda list | grep 'opencv\|ffmpeg\|gstreamer'
ffmpeg                    4.1.3                h167e202_0    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
opencv                    4.1.0            py36h79d2e43_1    conda-forge

...it is important to know that having the ffmpeg and gstreamer packages installed only means that we have their binaries (executables and .so libraries), not their Python bindings (modules) or their OpenCV plugins. We can launch these applications from the terminal but cannot import them in Python code.
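
A quick way to see this difference from Python (the path will differ on your machine): the ffmpeg executable is on PATH, but there is no importable module with that name in this environment:

import shutil
import importlib.util

# The conda ffmpeg package ships an executable...
print(shutil.which("ffmpeg"))              # e.g. /home/bojan/anaconda3/envs/my-env/bin/ffmpeg

# ...but no Python module of the same name, so this prints None
print(importlib.util.find_spec("ffmpeg"))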

I tried to force using FFMPEG:

import cv2

deviceId = "/dev/video2"

# videoCaptureApi = cv2.CAP_ANY        # autodetect default API
videoCaptureApi = cv2.CAP_FFMPEG
# videoCaptureApi = cv2.CAP_GSTREAMER

cap = cv2.VideoCapture(deviceId, videoCaptureApi)
if not cap.isOpened():
    raise RuntimeError("ERROR! Unable to open camera")

try:
    while True:
        ret, frame = cap.read()
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:        
    cap.release()
    cv2.destroyAllWindows()

...but cell execution would fail with:

RuntimeError: ERROR! Unable to open camera

I checked ($ v4l2-ctl --list-devices) - my webcam was indeed at index 2. As this was failing at the very beginning, I decided to open a Python interpreter and debug only the isolated code snippet which opens the camera:

(my-env) $ export OPENCV_LOG_LEVEL=DEBUG; export OPENCV_VIDEOIO_DEBUG=1

(my-env) $ python 
Python 3.6.6 | packaged by conda-forge | (default, Oct 12 2018, 14:43:46) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cap = cv2.VideoCapture("/dev/video2", cv2.CAP_FFMPEG)
[ WARN:0] VIDEOIO(FFMPEG): trying capture filename='/dev/video2' ...
[ WARN:0] VIDEOIO(FFMPEG): can't create capture

I also tried to force using GStreamer, to no avail (which was expected):

>>> cap = cv2.VideoCapture("/dev/video2", cv2.CAP_GSTREAMER)
[ WARN:0] VIDEOIO(GSTREAMER): trying capture filename='/dev/video2' ...
[ INFO:0] VideoIO pluigin (GSTREAMER): glob is 'libopencv_videoio_gstreamer*.so', 1 location(s)
[ INFO:0]     - /home/bojan/anaconda3/envs/my-env/lib/python3.6/site-packages/../..: 0
[ INFO:0] Found 0 plugin(s) for GSTREAMER
[ WARN:0] VIDEOIO(GSTREAMER): backend is not available (plugin is missing, or can't be loaded due dependencies or it is not compatible)

Indeed, ~/anaconda3/envs/my-env/lib did not contain the ffmpeg plugin (libopencv_videoio_ffmpeg*.so) or the GStreamer plugin (libopencv_videoio_gstreamer*.so).
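
The same check can be done from Python; a small sketch, assuming sys.prefix points at the active conda environment (which it does when the environment's own Python is running):

import glob
import os
import sys

# Look for OpenCV videoio plugin libraries in the environment's lib directory
pattern = os.path.join(sys.prefix, "lib", "libopencv_videoio_*")
print(glob.glob(pattern))    # empty list here - no ffmpeg or gstreamer videoio plugins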

These plugins are installed only if OpenCV is built with the following CMake options:

-DWITH_FFMPEG=1     \
-DVIDEOIO_PLUGIN_LIST=ffmpeg

...or (for Gstreamer):

-DWITH_GSTREAMER=1 \
-DVIDEOIO_PLUGIN_LIST=gstreamer \

...and apart from WITH_FFMPEG, no others were used in the cmake config used to build the OpenCV package installed in my environment.

As I didn't want to compile OpenCV myself but to achieve my goal with what I had, I decided to see if I could run an ffmpeg process that streams the camera output into a pipe, then read the binary data from it and convert it into frames:

import os
import tempfile
import subprocess
import cv2
import numpy as np

# To get this path execute:
#    $ which ffmpeg
FFMPEG_BIN = '/home/bojan/anaconda3/envs/my-env/bin/ffmpeg'


# To find allowed formats for the specific camera:
#    $ ffmpeg -f v4l2 -list_formats all -i /dev/video3
#    ...
#    [video4linux2,v4l2 @ 0x5608ac90af40] Raw: yuyv422: YUYV 4:2:2: 640x480 1280x720 960x544 800x448 640x360 424x240 352x288 320x240 800x600 176x144 160x120 1280x800
#    ...

def run_ffmpeg():
    ffmpg_cmd = [
        FFMPEG_BIN,
        '-video_size', '640x480',   # input option: requested capture resolution
        '-i', '/dev/video2',
        '-pix_fmt', 'bgr24',        # opencv requires bgr24 pixel format
        '-vcodec', 'rawvideo',
        '-an', '-sn',               # disable audio and subtitle processing
        '-f', 'image2pipe',
        '-',                        # output goes to stdout
    ]
    return subprocess.Popen(ffmpg_cmd, stdout=subprocess.PIPE, bufsize=10**8)

def run_cv_window(process):
    while True:
        # read frame-by-frame
        raw_image = process.stdout.read(640*480*3)
        if raw_image == b'':
            raise RuntimeError("Empty pipe")
        
        # transform the bytes read into a numpy array
        frame =  np.frombuffer(raw_image, dtype='uint8')
        frame = frame.reshape((480,640,3)) # height, width, channels
        if frame is not None:
            cv2.imshow('Video', frame)
        
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        process.stdout.flush()
    
    cv2.destroyAllWindows()
    process.terminate()
    print(process.poll())

def run():
    ffmpeg_process = run_ffmpeg()
    run_cv_window(ffmpeg_process)

run()

Et voila! I got the camera capture working from the Python notebook thanks to ffmpeg and OpenCV.



Sunday 4 October 2020

Introduction to JupyterLab

JupyterLab is the new interface for the Jupyter Notebook server. It is a web-based interactive development environment for Jupyter notebooks, code, and data.


Installation via Anaconda


The env.yml file should contain:

name: my-env # arbitrary name
...
dependencies:
  - jupyter=1.0.0
  - jupyterlab=0.34.9
...
  - python=3.6.6 # for running python kernels
...


Let's first check that the Python we'll be using is the right one:

(my-env) $ which python
/home/bojan/anaconda3/envs/my-env/bin/python

(my-env) $ python --version
Python 3.6.6

To launch jupyter lab:

(my-env) $ jupyter-lab
[I 07:37:01.479 LabApp] JupyterLab extension loaded from /home/bojan/anaconda3/envs/my-env/lib/python3.6/site-packages/jupyterlab
[I 07:37:01.479 LabApp] JupyterLab application directory is /home/bojan/anaconda3/envs/my-env/share/jupyter/lab
[I 07:37:01.483 LabApp] Serving notebooks from local directory: /home/bojan/dev/github/jupyterlab-demo
[I 07:37:01.484 LabApp] The Jupyter Notebook is running at:
[I 07:37:01.484 LabApp] http://localhost:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
[I 07:37:01.484 LabApp]  or http://127.0.0.1:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
[I 07:37:01.484 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 07:37:01.491 LabApp] 
    
    To access the notebook, open this file in a browser:
        file:///home/bojan/.local/share/jupyter/runtime/nbserver-20890-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
     or http://127.0.0.1:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
Opening in existing browser session.
[1004/073701.855894:ERROR:nacl_helper_linux.cc(308)] NaCl helper process running without a sandbox!
Most likely you need to configure your SUID sandbox correctly
[I 07:37:24.221 LabApp] Kernel started: 8144639c-36e9-41ec-8e0c-06b9c593a426
[I 07:37:24.780 LabApp] Build is up to date
[I 07:37:46.613 LabApp] 302 GET /?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff (127.0.0.1) 0.66ms
[I 07:37:53.685 LabApp] Starting buffering for 8144639c-36e9-41ec-8e0c-06b9c593a426:3d1e75cf-7a1f-4999-a91a-013da0761314
[I 07:37:55.524 LabApp] Build is up to date
[I 07:38:59.153 LabApp] Saving file at /Untitled.ipynb
...
[I 07:46:51.165 LabApp] Starting buffering for 8144639c-36e9-41ec-8e0c-06b9c593a426:84ef87af-47d6-473a-b8a6-cbe3f2c0da45
[I 07:46:53.082 LabApp] Build is up to date
[I 07:46:55.971 LabApp] Starting buffering for 8144639c-36e9-41ec-8e0c-06b9c593a426:58fc2ee2-42b3-408d-baa9-b9d3fc5ba29c
^C[I 07:47:02.242 LabApp] interrupted
Serving notebooks from local directory: /home/bojan/dev/github/jupyterlab-demo
1 active kernel
The Jupyter Notebook is running at:
http://localhost:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
 or http://127.0.0.1:8888/?token=22151a5520697ab97bfe48f44bcdd6248a0c1bd7bc5100ff
Shutdown this notebook server (y/[n])? y
[C 07:47:03.991 LabApp] Shutdown confirmed
[I 07:47:03.992 LabApp] Shutting down 1 kernel
[I 07:47:04.294 LabApp] Kernel shutdown: 8144639c-36e9-41ec-8e0c-06b9c593a426

We can see that JupyterLab starts a local web server which listens on port 8888. This is the classic Jupyter Notebook server (that's why we needed to include jupyter in the environment packages). We can then access it from a browser at http://localhost:8888 or http://127.0.0.1:8888. 

jupyterlab is an extension of this server:

(my-env) $ jupyter serverextension list
config dir: /home/bojan/anaconda3/envs/my-env/etc/jupyter
    jupyterlab  enabled 
    - Validating...
      jupyterlab 0.34.9 OK

JupyterLab is accessible at the /lab path, so when we run it, it automatically opens http://localhost:8888/lab in a browser.

We can also run jupyter notebook:

(my-env) $ jupyter notebook
[I 08:05:37.951 NotebookApp] JupyterLab extension loaded from /home/bojan/anaconda3/envs/my-env/lib/python3.6/site-packages/jupyterlab
[I 08:05:37.951 NotebookApp] JupyterLab application directory is /home/bojan/anaconda3/envs/my-env/share/jupyter/lab
[I 08:05:37.954 NotebookApp] Serving notebooks from local directory: /home/bojan/dev/github/jupyterlab-demo
[I 08:05:37.954 NotebookApp] The Jupyter Notebook is running at:
[I 08:05:37.954 NotebookApp] http://localhost:8888/?token=90194de8634a4a8e3c0ee9d4ec760382b4aa545d7b9260f6
[I 08:05:37.954 NotebookApp]  or http://127.0.0.1:8888/?token=90194de8634a4a8e3c0ee9d4ec760382b4aa545d7b9260f6
[I 08:05:37.954 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:05:37.960 NotebookApp] 
    
    To access the notebook, open this file in a browser:
        file:///home/bojan/.local/share/jupyter/runtime/nbserver-24138-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=90194de8634a4a8e3c0ee9d4ec760382b4aa545d7b9260f6
     or http://127.0.0.1:8888/?token=90194de8634a4a8e3c0ee9d4ec760382b4aa545d7b9260f6
Opening in existing browser session.
[1004/080538.264120:ERROR:nacl_helper_linux.cc(308)] NaCl helper process running without a sandbox!
Most likely you need to configure your SUID sandbox correctly
[I 08:05:49.878 NotebookApp] Kernel started: 3aa4ef5c-e260-40af-8e54-a4f509223a64
[I 08:05:50.492 NotebookApp] Build is up to date
[I 08:06:31.324 NotebookApp] Saving file at /Untitled.ipynb




...and manually go to /lab:






Saturday 3 October 2020

Managing Python Environments with Conda

At the end of my article How to install Anaconda on Ubuntu I verified that Anaconda was installed successfully by executing conda list. You've probably noticed the (base) that appears in front of the command prompt:

(base) $

This happens because conda is initialized by the block that conda init added to ~/.bashrc:

$ cat ~/.bashrc
...
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/bojan/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/bojan/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/home/bojan/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/bojan/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
...




As conda is also an environment manager, it can activate either a specified environment or the default one. 

Each environment has associated:
  • name
  • collection of packages 
  • resources which are stored in a dedicated directory

When conda is activated, it sets a default environment unless some other is specified. Its name appears in parentheses before the prompt in the terminal. In our case that was the default environment, named base.

To list all available environments with their names and associated directories we can use:

$ conda env list
# conda environments:
#
base                  *  /home/bojan/anaconda3

An asterisk (*) appears next to the currently active environment.

Another way to get the same information is:

$ conda info --envs
# conda environments:
#
base                  *  /home/bojan/anaconda3

Here is the content of the base environment directory:

$ ls -1a /home/bojan/anaconda3
.
..
bin
compiler_compat
condabin
conda-meta
doc
envs
etc
include
lib
libexec
LICENSE.txt
man
mkspecs
phrasebooks
pkgs
plugins
qml
resources
sbin
share
shell
ssl
translations
var
x86_64-conda_cos6-linux-gnu

To deactivate the current environment:

(base) $ conda deactivate
$

To activate default environment again:

$ conda activate
(base) $ 


After installing Conda, the base environment is set to be auto-activated by default:

(base) $ conda config --show | grep auto_activate_base
auto_activate_base: True


To disable this behaviour:

(base) $ conda config --set auto_activate_base False

Let's verify it:

(base) $ conda config --show | grep auto_activate_base
auto_activate_base: False

Next time we open a new terminal, we should not see (base) in front of the prompt.

Note that when we have the base environment activated, the terminal shell uses Python from the Anaconda installation:

(base) $ which python
/home/bojan/anaconda3/bin/python

(base) $ which python3
/home/bojan/anaconda3/bin/python3

When no environment is activated, the terminal shell uses Python from the system installation:

$ which python
/usr/bin/python

$ which python3
/usr/bin/python3

The same applies to pip and pip3:

(base) $ which pip
/home/bojan/anaconda3/bin/pip
(base) $ which pip3
/home/bojan/anaconda3/bin/pip3

$ which pip
/home/bojan/.local/bin/pip
$ which pip3
/home/bojan/.local/bin/pip3


Each environment can be specified with a YAML (.yml) file. This file contains all the information conda needs to recreate the same environment on another computer:
  • name of the environment
  • channels
  • dependencies
A generic example of an environment yaml file:

name: my_env
channels:
  - channel1
dependencies:
  - python=3.6
  - some_package=1.2 # it is possible to add comments
  - some_package2
  - jupyter=1.0.0
  - jupyterlab=0.34.9
  - keras=2.2.2=0
  - matplotlib=2.2.3
  - ...


We can create this yaml file manually, or we can export the currently active environment with a conda command:

$ conda env export > my_environment.yml

To re-create the environment from a given yaml file:

$ conda env create -f my_environment.yml 
Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.8.3
  latest version: 4.8.5

Please update conda by running

    $ conda update -n base -c defaults conda

Downloading and Extracting Packages
gmp-6.1.2            | 751 KB    | ##################################### | 100% 
prometheus_client-0. | 44 KB     | ##################################### | 100% 
libstdcxx-ng-9.3.0   | 4.0 MB    | ##################################### | 100% 
defusedxml-0.6.0     | 22 KB     | ##################################### | 100% 
ipython-7.16.1       | 1.1 MB    | ##################################### | 100% 
kiwisolver-1.2.0     | 87 KB     | ##################################### | 100% 
libgcc-ng-9.3.0      | 7.8 MB    | ##################################### | 100% 
...
Preparing transaction: done
Verifying transaction: done
Executing transaction: \ b'Enabling notebook extension jupyter-js-widgets/extension...\n      - Validating: \x1b[32mOK\x1b[0m\n'
done
#
# To activate this environment, use
#
#     $ conda activate my_environment
#
# To deactivate an active environment, use
#
#     $ conda deactivate


We can verify that this environment has now been added to conda:

(base) $ conda env list
# conda environments:
#
base                  *  /home/bojan/anaconda3
my_environment           /home/bojan/anaconda3/envs/my_environment
test-opencv              /home/bojan/anaconda3/envs/test-opencv


To activate this environment:

(base)$ conda activate my_environment
(my_environment)$


To check which packages are installed in the current environment:

(my_environment) $ conda list
# packages in environment at /home/bojan/anaconda3/envs/my_environment:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
absl-py                   0.10.0           py36h9f0ad1d_0    conda-forge
argon2-cffi               20.1.0           py36h8c4c3a4_1    conda-forge
astor                     0.8.1              pyh9f0ad1d_0    conda-forge
async_generator           1.10                       py_0    conda-forge
atk                       2.25.90           hf2eb9ee_1001    conda-forge
attrs                     20.2.0             pyh9f0ad1d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
...

To install an additional package into the current environment:

(my_environment) $ conda install pillow
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/bojan/anaconda3/envs/my_environment

  added / updated specs:
    - pillow


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2020.10.14 |                0         121 KB
    certifi-2020.6.20          |           py36_0         156 KB
    olefile-0.46               |           py36_0          48 KB
    openssl-1.0.2u             |       h7b6447c_0         2.2 MB
    pillow-7.2.0               |   py36hb39fc2d_0         619 KB
    ------------------------------------------------------------
                                           Total:         3.1 MB

The following NEW packages will be INSTALLED:

  lcms2              pkgs/main/linux-64::lcms2-2.11-h396b838_0
  olefile            pkgs/main/linux-64::olefile-0.46-py36_0
  pillow             pkgs/main/linux-64::pillow-7.2.0-py36hb39fc2d_0

The following packages will be UPDATED:

  ca-certificates    conda-forge::ca-certificates-2020.6.2~ --> pkgs/main::ca-certificates-2020.10.14-0

The following packages will be SUPERSEDED by a higher-priority channel:

  certifi            conda-forge::certifi-2020.6.20-py36h9~ --> pkgs/main::certifi-2020.6.20-py36_0
  openssl            conda-forge::openssl-1.0.2u-h516909a_0 --> pkgs/main::openssl-1.0.2u-h7b6447c_0


Proceed ([y]/n)? y


Downloading and Extracting Packages
openssl-1.0.2u       | 2.2 MB    | ################################################################ | 100% 
certifi-2020.6.20    | 156 KB    | ################################################################ | 100% 
pillow-7.2.0         | 619 KB    | ################################################################ | 100% 
ca-certificates-2020 | 121 KB    | ################################################################ | 100% 
olefile-0.46         | 48 KB     | ################################################################ | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

We can verify that the new package is indeed installed in the new environment:

(my_environment) $ conda list
# packages in environment at /home/bojan/anaconda3/envs/my_environment:
#
# Name                    Version                   Build  Channel
...
pillow                    7.2.0            py36hb39fc2d_0 
...

To install a new package from a specific channel in the current environment:

(my_environment) $ conda install -c channel_name package_name


To install a specific version of a new package from a specific channel in the current environment:

(my_environment) $ conda install -c channel_name package_name=version


To uninstall some package from the current environment:

(my_environment) $ conda remove pillow

To uninstall multiple packages from the current environment:

(my_environment) $ conda remove package1 package2 ...

To update an environment with (new) dependencies in (new) yml file:

(my_environment) $ conda deactivate
$ conda env update --file my_environment.yml
$ conda activate my_environment


To update conda (I've deactivated the current environment):

$ conda update -n base -c defaults conda
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/bojan/anaconda3

  added / updated specs:
    - conda


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    conda-4.8.5                |           py38_0         2.9 MB
    ------------------------------------------------------------
                                           Total:         2.9 MB

The following packages will be UPDATED:

  conda                                        4.8.3-py38_0 --> 4.8.5-py38_0


Proceed ([y]/n)? y


Downloading and Extracting Packages
conda-4.8.5          | 2.9 MB    | ################################################################################################################################################################# | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done


To remove a given environment:

$ conda env remove --name test-opencv

Remove all packages in environment /home/bojan/anaconda3/envs/test-opencv:


We can verify that its directory no longer exists:

$ ls /home/bojan/anaconda3/envs/test-opencv
ls: cannot access '/home/bojan/anaconda3/envs/test-opencv': No such file or directory
