Semantic Segmentation with Fully Convolutional Networks (FCN)
The project model that will be used for transfer learning and domain adaptation is DeepLabv3FineTuning; a minimal sketch of the head-swap idea follows the links below.
- https://expoundai.wordpress.com/2019/08/30/transfer-learning-for-segmentation-using-deeplabv3-in-pytorch/
- https://github.com/msminhas93/DeepLabv3FineTuning
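The core of the fine-tuning approach, as a hedged minimal sketch (assuming torchvision >= 0.3; `create_deeplabv3_finetune` is an illustrative name, not from the repo): load a COCO-pretrained DeepLabv3-ResNet101 and replace its classifier head so it predicts our own classes.

```python
# Minimal sketch, assuming torchvision >= 0.3: fine-tune DeepLabv3 by swapping
# the COCO-pretrained 21-class head for a fresh head with our own class count.
from torchvision import models
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

def create_deeplabv3_finetune(num_classes=1):  # hypothetical helper name
    model = models.segmentation.deeplabv3_resnet101(pretrained=True)
    model.classifier = DeepLabHead(2048, num_classes)  # 2048 = resnet101 output channels
    model.train()  # fine-tuning mode; switch to eval() for inference
    return model
```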
Then we will use that tutorial to run inference on an image with the newly trained model.
Then to save it as an ONNX file and use it, check this tutorial:
- https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html
- https://pytorch.org/docs/stable/onnx.html
It may also be worth taking a look at (NVIDIA) DeepStream.
References:
- https://github.com/dusty-nv/pytorch-segmentation
- https://github.com/pytorch/vision/tree/v0.3.0/references/segmentation
- https://github.com/DeepSceneSeg/AdapNet
- https://github.com/tensorflow/models/tree/master/research/slim
- https://www.jeremyjordan.me/semantic-segmentation/
- https://medium.com/nanonets/how-to-do-image-segmentation-using-deep-learning-c673cc5862ef
- http://www.cvlibs.net/datasets/kitti/eval_road.php
- https://www.researchgate.net/post/What_exactly_is_the_label_data_set_for_semantic_segmentation_using_FCN
- (pdf) https://www.researchgate.net/publication/334290613_Semantic_Segmentation_with_Deep_Learning_A_general_introduction
- https://pytorch.org/blog/torchvision03/
- cheat sheet: vocabulary, tips and tricks
- recipe to train a model
## Preparation
- Download the project from GitHub as a zip
```
wget https://github.com/msminhas93/DeepLabv3FineTuning/archive/master.zip
mv /home/lefv2603/Downloads/master.zip /home/lefv2603/Downloads/DeepLabv3FineTuning-master.zip
```
- Copy the zipped project to the Compute Canada Beluga server
```
scp /home/lefv2603/Downloads/DeepLabv3FineTuning-master.zip vincelf@beluga.computecanada.ca:~
```
- Download the resnet101 and deeplabv3_resnet101_coco models in advance from a machine other than the Compute Canada one, because access to these URLs is blocked from the Compute Canada server.
Error if the model is not in the cache and has to be downloaded from Compute Canada:
```
(env) [vincelf@blg4101 DeepLabv3FineTuning-master-test]$ python $TRAINDIR/main.py $TRAINDIR/CrackForest $TRAINDIR/VLFExp
Downloading: "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth" to /home/vincelf/.cache/torch/checkpoints/resnet101-5d3b4d8f.pth
Traceback (most recent call last):
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 1290, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 1239, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 1406, in connect
    super().connect()
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/http/client.py", line 938, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/socket.py", line 727, in create_connection
    raise err
  File "/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/python/3.7.4/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable
...
```
```
wget https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
wget https://download.pytorch.org/models/deeplabv3_resnet101_coco-586e9e4e.pth
```
- Copy the models to the Compute Canada server
```
scp /home/lefv2603/Downloads/resnet101-5d3b4d8f.pth vincelf@beluga.computecanada.ca:~
scp /home/lefv2603/Downloads/deeplabv3_resnet101_coco-586e9e4e.pth vincelf@beluga.computecanada.ca:~
```
- Move the models into the torch cache
```
mv ~/resnet101-5d3b4d8f.pth /home/vincelf/.cache/torch/checkpoints
mv ~/deeplabv3_resnet101_coco-586e9e4e.pth /home/vincelf/.cache/torch/checkpoints
```
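With the files in place, torchvision resolves `pretrained=True` from the local cache and never touches the network. A hedged sketch of the check (the `TORCH_HOME` detail is an assumption about this torch version's hub layout, consistent with the cache path in the traceback above):

```python
# Hedged sketch: verify the weights are picked up from ~/.cache/torch/checkpoints
# (torch hub's default cache root on this torch version) instead of downloaded.
import os
os.environ.setdefault('TORCH_HOME', os.path.expanduser('~/.cache/torch'))
from torchvision import models

model = models.segmentation.deeplabv3_resnet101(pretrained=True)  # no network needed
print('loaded from cache OK')
```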
- Make sure the right libraries are listed in requirements.txt.
Note: the libraries can be installed into the Python virtualenv by hand with the --no-index option; otherwise there is an error:
```
(env) [vincelf@blg5602 DeepLabv3FineTuning-master-test]$ pip3 install torchvision
Ignoring pip: markers 'python_version < "3"' don't match your environment
Looking in links: /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/avx512, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/avx2, /cvmfs/soft.computecanada.ca/custom/python/wheelhouse/generic
Collecting torchvision
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x2b343e329450>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/torchvision/
```
```
tqdm
scikit-learn
opencv-python
torch
torchvision
```
Use:
```
pip3 install --no-index torchvision
```
- Run the training in an interactive session (salloc)
```
salloc --account=def-germ2201-ab --gres=gpu:1 --cpus-per-task=10 --mem=48000M --time=0-3:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

#module load python/3.6 cuda cudnn
module load python/3.7.4 cuda cudnn

SOURCEDIR=~/gae724/src/DeepLabv3FineTuning-master-test
TRAINDIR=~/DeepLabv3FineTuning-master

# Prepare virtualenv
#virtualenv --no-download $SLURM_TMPDIR/env
python3 -m venv $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
pip3 install --no-index -r $SOURCEDIR/requirements.txt

# References:
# https://expoundai.wordpress.com/2019/08/30/transfer-learning-for-segmentation-using-deeplabv3-in-pytorch/
# https://github.com/msminhas93/DeepLabv3FineTuning

python $TRAINDIR/main.py $TRAINDIR/CrackForest $TRAINDIR/VLFExp
```
* The model trains:
```
(env) [vincelf@blg4101 DeepLabv3FineTuning-master-test]$ python $TRAINDIR/main.py $TRAINDIR/CrackForest $TRAINDIR/VLFExp
Epoch 1/25
----------
100%|██████████████████████████████████████| 24/24 [00:22<00:00, 1.06it/s]
Train Loss: 0.0251
100%|██████████████████████████████████████| 6/6 [00:03<00:00, 1.76it/s]
Test Loss: 0.0556
{'epoch': 1, 'Train_loss': 0.025081725791096687, 'Test_loss': 0.05564142018556595, 'Train_f1_score': 0.050840583267973714, 'Train_auroc': 0.5767874359848608, 'Test_f1_score': 0.010331824851978231, 'Test_auroc': 0.5188828199161867}
Epoch 2/25
----------
100%|██████████████████████████████████████| 24/24 [00:22<00:00, 1.07it/s]
Train Loss: 0.0155
100%|██████████████████████████████████████| 6/6 [00:03<00:00, 1.91it/s]
Test Loss: 0.0385
{'epoch': 2, 'Train_loss': 0.015544182620942593, 'Test_loss': 0.03848109021782875, 'Train_f1_score': 0.1191172723341669, 'Train_auroc': 0.7760033667334525, 'Test_f1_score': 0.08732444335126234, 'Test_auroc': 0.7224661313356657}
...
Epoch 24/25
----------
100%|██████████████████████████████████████| 24/24 [00:22<00:00, 1.08it/s]
Train Loss: 0.0106
100%|██████████████████████████████████████| 6/6 [00:03<00:00, 1.91it/s]
Test Loss: 0.0083
{'epoch': 24, 'Train_loss': 0.010582794435322285, 'Test_loss': 0.00830918364226818, 'Train_f1_score': 0.4236991146887413, 'Train_auroc': 0.9393122372904904, 'Test_f1_score': 0.3157004818295278, 'Test_auroc': 0.8214732776085168}
Epoch 25/25
----------
100%|██████████████████████████████████████| 24/24 [00:22<00:00, 1.08it/s]
Train Loss: 0.0087
100%|██████████████████████████████████████| 6/6 [00:03<00:00, 1.90it/s]
Test Loss: 0.0072
{'epoch': 25, 'Train_loss': 0.008725304156541824, 'Test_loss': 0.007179579231888056, 'Train_f1_score': 0.3882697467070124, 'Train_auroc': 0.9296205158172175, 'Test_f1_score': 0.263064162183016, 'Test_auroc': 0.802269088719596}
Training complete in 10m 57s
Lowest Loss: 0.005794
```
* Save the model
```
# To save the model
torch.save(model, os.path.join(bpath, 'weights.pt'))
```
* Load the model
```
# Load the trained model
model = torch.load('./CFExp/weights.pt')
# Set the model to evaluate mode
model.eval()
```
* For inference, the model must be put in .eval() mode
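As an aside, a hedged sketch of the more portable alternative: saving only the `state_dict` and rebuilding the architecture on load (`createDeepLabv3` is assumed to be the builder from the repo's model.py, as hinted by the export script further down; file names are illustrative):

```python
# Hedged sketch: persist only the weights, then rebuild the architecture and
# load them; this survives code refactors better than pickling the whole model.
import os
import torch
from model import createDeepLabv3  # assumption: the repo's model builder

torch.save(model.state_dict(), os.path.join(bpath, 'weights_state.pt'))

model = createDeepLabv3()  # must rebuild the same architecture
model.load_state_dict(torch.load('./CFExp/weights_state.pt'))
model.eval()
```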
* How can this be extended to a single image having multiple masks (4 masks per image, where each mask represents a class)?
Just use those masks as the labels and use cross-entropy as the loss function; see the sketch below.
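A minimal sketch of that idea, assuming non-overlapping binary masks stacked per class (shapes and names are illustrative, not from the repo):

```python
# Hedged sketch: K binary masks per image -> one HxW map of class indices,
# trained with cross-entropy against N x K x H x W logits.
import torch
import torch.nn as nn

num_classes = 4
logits = torch.randn(2, num_classes, 224, 224)           # model output (N, K, H, W)
masks = torch.randint(0, 2, (2, num_classes, 224, 224))  # 4 binary masks per image
target = masks.argmax(dim=1)                             # (N, H, W) class-index map
loss = nn.CrossEntropyLoss()(logits, target.long())
print(loss.item())
```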
## Image segmentation test with resnet101
This script works well on the Compute Canada server.
* Log in to the Beluga server
* Prepare a directory for the scripts
* Start an interactive session
```salloc --account=def-germ2201-ab --gres=gpu:1 --cpus-per-task=10 --mem=48000M --time=0-3:00```
* Load the python 3.7.4, cuda, and cudnn modules. Note that on the Jetson Nano (L4T, Linux For Tegra), cuda and cudnn already exist; they are installed with the JetPack. However, I do not know how to "load" them so that the Python libraries reference them instead of returning an error.
```
#SBATCH --gres=gpu:1        # Request GPU "generic resources"; number of GPU(s) per node
#SBATCH --cpus-per-task=10  # Cores proportional to GPUs: 6 on Cedar, 16 on Graham, 10 on Beluga; CPU cores/threads
#SBATCH --mem=48000M        # Memory proportional to GPUs: 32000 Cedar, 64000 Graham, 48000M on Beluga; memory per node; requesting --mem=0 is interpreted by Slurm to mean "reserve all the available memory on each node assigned to the job"
#SBATCH --time=0-03:00
#SBATCH --output=%N-%j.out

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

#module load python/3.6 cuda cudnn
module load python/3.7.4 cuda cudnn

SOURCEDIR=~/gae724/src/DeepLabv3FineTuning-master-test
TRAINDIR=~/DeepLabv3FineTuning-master
```
* Create the Python 3 virtual environment
```
#virtualenv --no-download $SLURM_TMPDIR/env
python3 -m venv $SLURM_TMPDIR/env
source $SLURM_TMPDIR/env/bin/activate
```
* Install the required Python libraries
requirements.txt:
```
tqdm
scikit-learn
opencv-python
torch
torchvision
matplotlib
```
```
pip3 install --no-index -r $SOURCEDIR/requirements.txt
```
* Run the script
```
(env) [vincelf@blg4101 DeepLabv3FineTuning-master-test]$ python3 test_segmentation_resnet101.py
```
It will create an image plot.png:
```
(env) [vincelf@blg4101 DeepLabv3FineTuning-master-test]$ cat test_segmentation_resnet101.py
from torchvision import models
from PIL import Image
import matplotlib.pyplot as plt
import torch
import torchvision.transforms as T
import numpy as np

def segment(net, path):
    img = Image.open(path)
    plt.imshow(img)
    plt.axis('off')
    plt.show()
    # Comment the Resize and CenterCrop for better inference results
    trf = T.Compose([T.Resize(256),
                     T.CenterCrop(224),
                     T.ToTensor(),
                     T.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])])
    inp = trf(img).unsqueeze(0)
    out = net(inp)['out']
    om = torch.argmax(out.squeeze(), dim=0).detach().cpu().numpy()
    rgb = decode_segmap(om)
    plt.imshow(rgb)
    plt.axis('off')
    plt.show()
    plt.savefig('plot.png')

# Define the helper function
def decode_segmap(image, nc=21):
    label_colors = np.array([(0, 0, 0),  # 0=background
        # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
        (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
        # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
        (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
        # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
        (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
        # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
        (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

    r = np.zeros_like(image).astype(np.uint8)
    g = np.zeros_like(image).astype(np.uint8)
    b = np.zeros_like(image).astype(np.uint8)

    for l in range(0, nc):
        idx = image == l
        r[idx] = label_colors[l, 0]
        g[idx] = label_colors[l, 1]
        b[idx] = label_colors[l, 2]

    rgb = np.stack([r, g, b], axis=2)
    return rgb

fcn = models.segmentation.fcn_resnet101(pretrained=True).eval()
segment(fcn, './test.jpeg')
```
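One caveat worth noting: on a headless compute node `plt.show()` is a no-op, and saving after showing can yield an empty figure on interactive backends. A hedged tweak (the Agg backend choice is my assumption, not from the original script):

```python
# Hedged tweak for headless runs: force a non-interactive backend before
# importing pyplot, and save the figure rather than relying on show().
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

plt.imshow(rgb)          # rgb as returned by decode_segmap(om)
plt.axis('off')
plt.savefig('plot.png')  # under Agg, plt.show() would do nothing
```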
## Export to ONNX
The export to an onnx model works with the following script.
The tests with onnx and onnxruntime do not work because those libraries do not exist in the Compute Canada repository (email in progress).
The model used is the one from DeepLabv3FineTuning-master-test, which was saved with
`torch.save(model, os.path.join(bpath, 'weights.pt'))` (see file /home/vincelf/DeepLabv3FineTuning-master/main.py).
```
(env) [vincelf@blg4101 DeepLabv3FineTuning-master-test]$ cat export-to-onnx.py
# Some standard imports
import io
import numpy as np

from torch import nn
import torch.utils.model_zoo as model_zoo
import torch.onnx

import torch.nn as nn
import torch.nn.init as init

#from model import createDeepLabv3

# Load pretrained model weights
#model_url = 'fcn-weights.pt'
batch_size = 1    # just a random number

# Initialize model with the pretrained weights
#torch_model = createDeepLabv3()
#torch_model = models.segmentation.deeplabv3_resnet101()
torch_model = torch.load('weights.pt')
#torch_model = torch.load_state_dict('weights')

# Set the model to evaluate mode
torch_model.eval()

#torch_model.cuda()
torch_model.to('cuda')

# Input to the model
# Reference: https://discuss.pytorch.org/t/random-number-on-gpu/9649
# torch.cuda.FloatTensor(batch_size, 3, 224, 224).normal_()
x = torch.randn(batch_size, 3, 224, 224, requires_grad=True).cuda()
#x.cuda()
#x.to('cuda')
torch_out = torch_model(x)

# Export the model
#torch.onnx.export(torch_model,               # model being run
#                  x,                         # model input (or a tuple for multiple inputs)
#                  "test-onnx.onnx",          # where to save the model (can be a file or file-like object)
#                  export_params=True,        # store the trained parameter weights inside the model file
#                  opset_version=10,          # the ONNX version to export the model to
#                  do_constant_folding=True,  # whether to execute constant folding for optimization
#                  input_names=['input'],     # the model's input names
#                  output_names=['output'],   # the model's output names
#                  dynamic_axes={'input': {2048: 'batch_size'},   # variable length axes
#                                'output': {1: 'batch_size'}})
torch.onnx.export(torch_model,        # model being run
                  x,                  # model input (or a tuple for multiple inputs)
                  "test-onnx.onnx",   # where to save the model (can be a file or file-like object)
                  opset_version=11,   # the ONNX version to export the model to
                  input_names=['input'],     # the model's input names
                  output_names=['output'])   # the model's output names

import onnx

onnx_model = onnx.load("test-onnx.onnx")
onnx.checker.check_model(onnx_model)

import onnxruntime

ort_session = onnxruntime.InferenceSession("test-onnx.onnx")

def to_numpy(tensor):
    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

# compute ONNX Runtime output prediction
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)

# compare ONNX Runtime and PyTorch results
np.testing.assert_allclose(to_numpy(torch_out), ort_outs[0], rtol=1e-03, atol=1e-05)

print("Exported model has been tested with ONNXRuntime, and the result looks good!")
```
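A side note on the commented-out first export call: the keys of `dynamic_axes` are axis indices of the named tensor, so for an NCHW input the batch dimension is axis 0; `{2048: 'batch_size'}` is not a valid axis. A hedged corrected form (reusing `torch_model` and `x` from the script above):

```python
# Hedged sketch: dynamic_axes maps axis index -> symbolic name; batch is axis 0.
torch.onnx.export(torch_model, x, "test-onnx.onnx",
                  opset_version=11,
                  input_names=['input'],
                  output_names=['output'],
                  dynamic_axes={'input':  {0: 'batch_size'},
                                'output': {0: 'batch_size'}})
```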
* Copy the generated onnx file to the Jetson Nano, from a terminal on the Jetson Nano
```
scp vincelf@beluga.computecanada.ca:/home/vincelf/gae724/src/DeepLabv3FineTuning-master-test/test-onnx.onnx /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/
```
* test the onnx with the segnet-camera script (python calling cpp)
There is an error: `ERROR: onnx2trt_utils.hpp:347 In function convert_axis: [8] Assertion failed: axis >= 0 && axis < nbDims`. The failing Gather appears to operate on the output of a Shape node, i.e. on shape arithmetic that the exporter emits when the forward pass reads x.shape (as ASPPPooling does); the TRT6 ONNX parser chokes on this, which is what the notes further down try to work around.
* References:
* <https://forums.developer.nvidia.com/t/running-a-pytorch-network-converted-to-onnx-with-tensorrt-on-the-tx2/70685/21>
* <https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html>
* <https://github.com/onnx/onnx-tensorrt/issues/192>
* <https://github.com/onnx/onnx-tensorrt/issues/125>
* <https://github.com/pytorch/pytorch/issues/16908>
* <https://github.com/onnx/onnx-tensorrt/issues/283#issuecomment-545703178>
```
cd projects/dusty-nv/jetson-inference/build/aarch64/bin
lefv2603@lefv2603-jetsonnano:~/projects/dusty-nv/jetson-inference/build/aarch64/bin$ ./segnet-camera --network '/home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx' --labels /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/labels.txt --colors /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/colors.txt --input_blob input --output_blob ouput
[gstreamer] initialized gstreamer, version 1.14.5.0
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVARGUS, camera 0
[gstreamer] gstCamera pipeline string:
nvarguscamerasrc sensor-id=0 ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, framerate=30/1, format=(string)NV12 ! nvvidconv flip-method=0 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera successfully initialized with GST_SOURCE_NVARGUS, camera 0

segnet-camera:  successfully initialized camera device
   width:  1280
  height:  720
   depth:  12 (bpp)

segNet -- loading segmentation network model from:
       -- prototxt:    (null)
       -- model:       /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
       -- labels:      /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/labels.txt
       -- colors:      /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/colors.txt
       -- input_blob   'input'
       -- output_blob  'ouput'
       -- batch_size   1

[TRT]  TensorRT version 6.0.1
[TRT]  loading NVIDIA plugins...
[TRT]  Plugin Creator registration succeeded - GridAnchor_TRT
[TRT]  Plugin Creator registration succeeded - GridAnchorRect_TRT
[TRT]  Plugin Creator registration succeeded - NMS_TRT
[TRT]  Plugin Creator registration succeeded - Reorg_TRT
[TRT]  Plugin Creator registration succeeded - Region_TRT
[TRT]  Plugin Creator registration succeeded - Clip_TRT
[TRT]  Plugin Creator registration succeeded - LReLU_TRT
[TRT]  Plugin Creator registration succeeded - PriorBox_TRT
[TRT]  Plugin Creator registration succeeded - Normalize_TRT
[TRT]  Plugin Creator registration succeeded - RPROI_TRT
[TRT]  Plugin Creator registration succeeded - BatchedNMS_TRT
[TRT]  Could not register plugin creator: FlattenConcat_TRT in namespace:
[TRT]  completed loading NVIDIA plugins.
[TRT]  detected model format - ONNX (extension '.onnx')
[TRT]  desired precision specified for GPU: FASTEST
[TRT]  requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT]  native precisions detected for GPU: FP32, FP16
[TRT]  selecting fastest native precision for GPU: FP16
[TRT]  attempting to open engine cache file /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx.1.1.GPU.FP16.engine
[TRT]  cache file not found, profiling network model on device GPU
[TRT]  device GPU, loading /home/lefv2603/projects/dusty-nv/jetson-inference/build/aarch64/bin/ /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
----------------------------------------------------------------
Input filename:   /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    pytorch
Producer version: 1.5
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
WARNING: ONNX model has a newer ir_version (0.0.6) than this parser was built against (0.0.3).
[TRT]  677:Shape -> (4)
[TRT]  678:Constant ->
WARNING: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Successfully casted down to INT32.
While parsing node number 2 [Gather -> "679"]:
--- Begin node ---
input: "677"
input: "678"
output: "679"
name: "Gather_2"
op_type: "Gather"
attribute {
  name: "axis"
  i: 0
  type: INT
}
--- End node ---
ERROR: onnx2trt_utils.hpp:347 In function convert_axis:
[8] Assertion failed: axis >= 0 && axis < nbDims
[TRT]  failed to parse ONNX model '/home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx'
[TRT]  device GPU, failed to load /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
segNet -- failed to initialize.
segnet-camera:   failed to initialize imageNet
```
* Files that are useful to understand:
* /home/lefv2603/projects/dusty-nv/jetson-inference/build/aarch64/bin/segnet-camera.py
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/segNet.cpp
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/segNet.cu
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/segNet.h
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/tensorNet.cpp
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/tensorNet.h
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/imageNet.cpp
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/imageNet.cu
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/imageNet.cuh
* /home/lefv2603/projects/dusty-nv/jetson-inference/c/imageNet.h
* /home/lefv2603/projects/dusty-nv/jetson-inference/python/training/segmentation/onnx_export.py
* /home/lefv2603/projects/dusty-nv/jetson-inference/python/training/segmentation/onnx_validate.py
* /home/lefv2603/projects/dusty-nv/jetson-inference/python/training/segmentation/train.py
* /home/lefv2603/projects/dusty-nv/jetson-inference/data/networks/FCN-ResNet18-DeepScene-576x320/classes.txt
* /home/lefv2603/projects/dusty-nv/jetson-inference/data/networks/FCN-ResNet18-DeepScene-576x320/colors.txt
* /home/lefv2603/projects/dusty-nv/jetson-inference/python/bindings/PySegNet.cpp
* /home/lefv2603/projects/dusty-nv/jetson-inference/python/bindings/PySegNet.h
* /home/lefv2603/projects/dusty-nv/jetson-inference/examples/segnet-camera/segnet-camera.cpp
* /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/models/segmentation/deeplabv3.py
* /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/models/segmentation/fcn.py
* Example of a very simple model that is converted to onnx in a few lines
Reference: <https://github.com/onnx/onnx-tensorrt/issues/125>
```
#!/usr/bin/env python3
import torch
from torch import onnx

input = torch.rand(2, 64, 4)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, hw, c = x.shape
        # Split the hw dimension into 8*8
        h = 8
        w = hw // h
        x = x.view(n, h, w, c)
        # NHWC -> NCHW is pretty common.
        return x.permute(0, 3, 1, 2)

# Opset version 9 is required by Jetson Xavier, which has TRT6.
torch.onnx.export(Model(), input, "/tmp/barf.onnx", opset_version=9)
```
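A quick hedged follow-up to confirm the exported file is structurally valid (stock onnx checker API only; the file path is the one from the example above):

```python
# Hedged check: load the exported model and validate its graph.
import onnx

m = onnx.load("/tmp/barf.onnx")
onnx.checker.check_model(m)                  # raises if the graph is malformed
print(onnx.helper.printable_graph(m.graph))  # human-readable op listing
```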
* Notes in progress
`x = torch.flatten(x, 3)` to be placed somewhere in the model.
File: ~/DeepLabv3FineTuning-master/models/segmentation/deeplabv3.py
```
class ASPPPooling(nn.Sequential):
    def __init__(self, in_channels, out_channels):
        super(ASPPPooling, self).__init__(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU())

    def forward(self, x):
        size = x.shape[-2:]
        x = torch.flatten(x, 3)
        x = super(ASPPPooling, self).forward(x)
        return F.interpolate(x, size=size, mode='bilinear', align_corners=False)
```
* Retrain in a virtual env on Beluga Compute Canada
```
(env) [vincelf@blg5803 DeepLabv3FineTuning-master-test]$ python $TRAINDIR/main.py $TRAINDIR/CrackForest $TRAINDIR/VLFExp
```
* Export to onnx
```
(env) [vincelf@blg5803 DeepLabv3FineTuning-master-test]$ python export-to-onnx.py
```
* Copy to jetson nano
```
lefv2603@lefv2603-jetsonnano:~/projects/dusty-nv/jetson-inference/build/aarch64/bin$ scp vincelf@beluga.computecanada.ca:/home/vincelf/gae724/src/DeepLabv3FineTuning-master-test/test-onnx.onnx /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/
```
* Run inference using the onnx model on the jetson nano
```
lefv2603@lefv2603-jetsonnano:~/projects/dusty-nv/jetson-inference/build/aarch64/bin$ ./segnet-camera --network '/home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx' --labels /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/labels.txt --colors /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/colors.txt --input_blob input --output_blob ouput
```
- It fails with the same error:
```
[TRT]  device GPU, loading /home/lefv2603/projects/dusty-nv/jetson-inference/build/aarch64/bin/ /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
----------------------------------------------------------------
Input filename:   /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
ONNX IR version:  0.0.6
Opset version:    11
Producer name:    pytorch
Producer version: 1.5
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
WARNING: ONNX model has a newer ir_version (0.0.6) than this parser was built against (0.0.3).
[TRT]  677:Shape -> (4)
[TRT]  678:Constant ->
WARNING: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
Successfully casted down to INT32.
While parsing node number 2 [Gather -> "679"]:
--- Begin node ---
input: "677"
input: "678"
output: "679"
name: "Gather_2"
op_type: "Gather"
attribute {
  name: "axis"
  i: 0
  type: INT
}
--- End node ---
ERROR: onnx2trt_utils.hpp:347 In function convert_axis:
[8] Assertion failed: axis >= 0 && axis < nbDims
[TRT]  failed to parse ONNX model '/home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx'
[TRT]  device GPU, failed to load /home/lefv2603/projects/msminhas93/DeepLabv3FineTuning-master/test-onnx.onnx
segNet -- failed to initialize.
segnet-camera:   failed to initialize imageNet
lefv2603@lefv2603-jetsonnano:~/projects/dusty-nv/jetson-inference/build/aarch64/bin$ ./segnet-camera --network '/home/lefv2603/projects/msminhas93/DeepLabv3FineTuni
```
## Simplify the onnx model with onnx-simplifier
The way to deal "easily" and "quickly" with the error encountered earlier, `[8] Assertion failed: axis >= 0 && axis < nbDims`, and the one recommended by most people who have run into it, is to use the onnx-simplifier package (a hedged usage sketch just below).
Reference: <https://github.com/daquexian/onnx-simplifier>
The problem is that it depends on onnxruntime, and that library is not available in the Compute Canada environment.
The goal is therefore to install onnxruntime on the Jetson Nano.
But the prerequisites are to have cuda and cudnn (which come with the JetPack but must be added to the environment), and a cmake version 3.13 or higher (`CMake 3.13 or higher is required. You are running version 3.10.2`), whereas the Jetson's L4T comes with version 3.10.2.
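For reference, a hedged sketch of what the onnx-simplifier step would look like once onnxruntime is available (API as documented by the onnx-simplifier project; the file names are ours):

```python
# Hedged sketch: fold the Shape/Gather arithmetic into constants so the TRT6
# parser no longer sees them (assumes `pip3 install onnx-simplifier` succeeded).
import onnx
from onnxsim import simplify

model = onnx.load("test-onnx.onnx")
model_simplified, ok = simplify(model)
assert ok, "simplified model failed validation"
onnx.save(model_simplified, "test-onnx-simplified.onnx")
```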
References:
* <https://github.com/onnx/onnx/blob/master/docs/PythonAPIOverview.md#optimizing-an-onnx-model>
* <https://github.com/daquexian/onnx-simplifier>
* <https://github.com/onnx/models#onnx-model-zoo>
* <https://github.com/microsoft/onnxruntime>
* <https://microsoft.github.io/onnxruntime/>
* <https://github.com/microsoft/onnxruntime/blob/master/BUILD.md#TensorRT>
* <https://github.com/microsoft/onnxruntime#official-builds>
* <https://developer.nvidia.com/cudnn>
* <https://smist08.wordpress.com/2019/04/03/playing-with-cuda-on-my-nvidia-jetson-nano/>
* <https://cmake.org/download/>
CUDA (NVIDIA's GPU programming libraries) and its neural-network library, cuDNN.
cuDNN is tied to a specific CUDA version. Install CUDA first, then find out which cuDNN version matches.
Also add the CUDA environment variables.
It is also important to know that some libraries ship GPU-specific packages for this, for example tensorflow-gpu and onnxruntime-gpu.
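A hedged way to confirm, from Python, which CUDA/cuDNN the installed torch actually sees (plain torch API, nothing project-specific):

```python
# Hedged sanity check: report the CUDA / cuDNN versions visible to PyTorch.
import torch

print(torch.__version__)               # e.g. 1.4.0 with the nano wheel below
print(torch.version.cuda)              # CUDA version torch was built against
print(torch.backends.cudnn.version())  # cuDNN version, e.g. 7603
print(torch.cuda.is_available())       # True once the GPU is usable
```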
The cuda installation directory must be added to the environment variables.
* Check whether the CUDA compiler can be invoked
```
nvcc --help
```
* Find where cuda is installed and whether the cuDNN libraries are present
```
lefv2603@lefv2603-jetsonnano:~$ whereis cuda
cuda: /usr/local/cuda
lefv2603@lefv2603-jetsonnano:~$ whereis cudnn
cudnn: /usr/include/cudnn.h
ls -al /usr/local/cuda/bin
ls -al /usr/local/cuda/lib64
```
* Add the cuda path to PATH and LD_LIBRARY_PATH
```
export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
```
* Verify that the cuda compiler can be run
```
nvcc --help
```
* Update the environment variables in the .bashrc file
```
vi .bashrc
. .bashrc
```
* Verify by opening a new console
```
nvcc --help
```
* <https://towardsdatascience.com/installing-tensorflow-with-cuda-cudnn-and-gpu-support-on-windows-10-60693e46e781>
* <https://developer.nvidia.com/cudnn>
* <https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/>
### Installing CMake 3.17.2
* Download the CMake-3.17.2 tar.gz
* Extract it in the Downloads directory
* See the RELEASE.rst file in the tar.gz for cmake installation instructions
* cd into the cmake-3.17.2 directory
* Run bootstrap, make, and make install
```
lefv2603@lefv2603-jetsonnano:~/Downloads/cmake-3.17.2$ ./bootstrap && make && sudo make install
```
- If there is an OpenSSL error when installing cmake, install the OpenSSL development files:
```
-- Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the system variable OPENSSL_ROOT_DIR (missing: OPENSSL_CRYPTO_LIBRARY OPENSSL_INCLUDE_DIR)
CMake Error at Utilities/cmcurl/CMakeLists.txt:454 (message):
  Could not find OpenSSL.  Install an OpenSSL development package or
  configure CMake with -DCMAKE_USE_OPENSSL=OFF to build without OpenSSL.
```
```
sudo apt-get install libssl-dev
```
* Start over with ./bootstrap
* then make
* this takes a very long time; go to bed and come back the next day to continue with sudo make install
* then sudo make install
* this step is very fast
### Installing onnxruntime
* cmake 3.17.2 must be installed first
* the CUDA path must already be added to PATH and LD_LIBRARY_PATH
* Get the code from github
```
git clone --recursive https://github.com/Microsoft/onnxruntime
```
* Run the build command
```
cd projects/onnxruntime
./build.sh --config RelWithDebInfo --build_shared_lib --parallel
```
* This is also very long, maybe even longer than for cmake.
## Installing PyTorch and torchvision on the nano
References:
* <https://elinux.org/Jetson_Zoo>
* <https://gist.github.com/heavyinfo/8d0ef5a3768c5d8d22e164863670330a>
* <https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-5-0-now-available/72048#5324123>
From https://gist.github.com/heavyinfo/8d0ef5a3768c5d8d22e164863670330a

```
#!/bin/bash
# This script will install pytorch, torchvision on nano
# If you have any of these installed already on your machine, you can skip those.

sudo apt-get -y update
sudo apt-get -y upgrade   # takes a really long time to complete

# Dependencies
sudo apt-get install python3-setuptools

# Installing PyTorch
# For latest PyTorch refer to the original Nvidia Jetson Nano thread - https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano/
wget https://nvidia.box.com/shared/static/ncgzus5o23uck9i5oth2n8n06k340l6k.whl -O torch-1.4.0-cp36-cp36m-linux_aarch64.whl
sudo apt-get install python3-pip libopenblas-base
sudo pip3 install Cython
sudo pip3 install numpy torch-1.4.0-cp36-cp36m-linux_aarch64.whl

# Installing torchvision
# For latest torchvision refer to the original Nvidia Jetson Nano thread - https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano/
sudo apt-get install libjpeg-dev zlib1g-dev

# from train.py on https://github.com/dusty-nv/pytorch-segmentation/blob/master/train.py, it is used against this fork of torchvision: https://github.com/dusty-nv/vision/tree/v0.3.0
#git clone --branch v0.5.0 https://github.com/pytorch/vision torchvision   # see below for version of torchvision to download
git clone --branch v0.3.0 https://github.com/dusty-nv/vision torchvision   # see below for version of torchvision to download
cd torchvision
sudo python setup.py install
cd ../   # attempting to load torchvision from build dir will result in import error

# checkout code from https://github.com/dusty-nv/pytorch-segmentation (link from https://github.com/dusty-nv/jetson-inference/tree/master/python/training)

# TODO run a test to make sure all is fine
# find a cheap baseline model to train to test the env
# also the onnx export to test
# and run the onnx
```