Commit 04ab862d authored by Roberto Ugolotti's avatar Roberto Ugolotti

Merge branch 'add_ml_cookbook' into 'master'

Add ML cookbook

See merge request !1
parents 27347bb7 488b642e
# Content
This repository contains files or data that can be of interest to run and personalise JEODPP services.
## jeodpp-text-terminal-service
This folder contains configuration files for `screen`.
## ml-cookbook
This folder contains some examples of using deep learning libraries inside Jupyter Notebooks.
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Jupyter Notebooks
.ipynb_checkpoints
# Image Classification
This folder contains notebooks that perform image classification using PyTorch, Keras, and MXNet.
## How to run the examples
To run a notebook in [JeoLab](https://jeodpp.jrc.ec.europa.eu/jhub/), copy the notebook and the zip file containing the data (`images.zip`) into your folder, then launch it with an environment in which the required library is installed (see the [available environments](https://jeodpp.jrc.ec.europa.eu/jhub/)).
The notebook automatically extracts the data contained in `images.zip` and trains a simple Convolutional Neural Network to distinguish between satellite images of forest and industrial areas.
Each notebook contains some references to the documentation of the package used.
## How to use your own data
If you want to use these scripts as a base to train on your own dataset, the images must be organized into one folder per class:
```
main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
```
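As a standalone illustration of this one-folder-per-class layout (not part of the notebooks), the sketch below shows how image paths and their class labels can be collected from such a directory tree using only the standard library; `list_labeled_images` is a hypothetical helper name chosen here.

``` python
from pathlib import Path

def list_labeled_images(root):
    """Collect (path, class_name) pairs from a one-folder-per-class layout.

    Each immediate subdirectory of `root` is treated as one class, and its
    name is used as the label for every .jpg file it contains.
    """
    root = Path(root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for img in sorted(class_dir.glob("*.jpg")):
            samples.append((img, class_dir.name))
    return samples
```

This mirrors what `tf.keras.preprocessing.image_dataset_from_directory` infers automatically: class names come from folder names, sorted alphabetically.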
%% Cell type:markdown id:d4beb900-5ae3-43b0-8517-c89876a895fa tags:
# Keras Example
This notebook shows a simple image classification example using Keras.
The code is adapted from https://keras.io/examples/vision/image_classification_from_scratch/
The data is a selection of http://madm.dfki.de/files/sentinel/EuroSAT.zip
%% Cell type:code id:f6284ec5-d53f-44d5-8b93-5c2229fa0496 tags:
``` python
# Example taken from https://keras.io/examples/vision/image_classification_from_scratch/
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
```
%% Cell type:code id:1863a3b6-f1f1-4172-b319-dacf28a91538 tags:
``` python
# Check that TensorFlow will use the GPU
from tensorflow.python.client import device_lib
assert 'GPU' in str(device_lib.list_local_devices())
```
%% Cell type:markdown id:ec21d233-06b1-4d75-b4eb-04ff183a897c tags:
Read the data from disk. The data must be structured in this way:
```
main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg
```
%% Cell type:code id:f1b91f72-e6d8-4037-84c4-4ed824d9be1d tags:
``` python
!unzip -qo images
!rm -rf images/.ipynb_checkpoints/  # Otherwise Keras will try to read images from this directory and get the wrong number of classes
image_size = 64, 64
batch_size = 32
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "images/",
    validation_split=0.2,
    subset="training",
    seed=1234,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='categorical',
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "images/",
    validation_split=0.2,
    subset="validation",
    seed=1234,
    image_size=image_size,
    batch_size=batch_size,
    label_mode='categorical',
)
```
%% Cell type:markdown id:298ecb95-e700-43d2-a901-877d8417c827 tags:
Plot some images
%% Cell type:code id:1ea29621-1865-4127-b384-ad043238d220 tags:
``` python
import matplotlib.pyplot as plt
import numpy as np
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(np.argmax(labels[i]))
        plt.axis("off")
```
%% Cell type:markdown id:317c35b1-a4de-447d-9e63-3959a6a6b97f tags:
Augment training data with flipping and rotation
%% Cell type:code id:f75972b2-cdd2-4fe2-9e90-cc5453a8b564 tags:
``` python
data_augmentation = keras.Sequential(
    [
        layers.experimental.preprocessing.RandomFlip("horizontal"),
        layers.experimental.preprocessing.RandomRotation(0.1),
    ]
)
```
%% Cell type:markdown id:d6841041-a27c-44fe-8e07-60e842adbb9c tags:
Plot some augmented data
%% Cell type:code id:bb6ac2af-4194-400d-853c-f30f36bbcf8e tags:
``` python
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")
```
%% Cell type:markdown id:c43e5b56-f98f-4e46-a2db-2aaa5aa4dc57 tags:
Create a deep network for classification. It contains the `data_augmentation` stage created above, a rescaling layer, and a stack of convolutional blocks with residual connections, followed by a softmax layer used for classification.
%% Cell type:code id:dd137900-44bd-4536-bb3b-ecd0602fc9c3 tags:
``` python
def make_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    # Image augmentation block
    x = data_augmentation(inputs)
    # Entry block
    x = layers.experimental.preprocessing.Rescaling(1.0 / 255)(x)
    x = layers.Conv2D(32, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.Conv2D(64, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    previous_block_activation = x  # Set aside residual
    for size in [128, 256, 512, 728]:
        x = layers.Activation("relu")(x)
        x = layers.SeparableConv2D(size, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
        # Project residual
        residual = layers.Conv2D(size, 1, strides=2, padding="same")(
            previous_block_activation
        )
        x = layers.add([x, residual])  # Add back residual
        previous_block_activation = x  # Set aside next residual
    x = layers.SeparableConv2D(1024, 3, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.GlobalAveragePooling2D()(x)
    activation = "softmax"
    units = num_classes
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(units, activation=activation)(x)
    return keras.Model(inputs, outputs)

model = make_model(input_shape=image_size + (3,), num_classes=2)
# keras.utils.plot_model(model, show_shapes=True)  # Requires pydot
```
%% Cell type:code id:fcb0538a-7f79-450a-8f65-a45967561704 tags:
``` python
epochs = 10
model.compile(
    optimizer=keras.optimizers.Adam(1e-3),
    loss="categorical_crossentropy",  # Labels are one-hot because of label_mode='categorical'
    metrics=["accuracy"],
)
model.fit(
    train_ds, epochs=epochs, validation_data=val_ds,
)
```
%% Cell type:code id:c37c970d-b1d7-42d6-a97f-0b040cf8a85e tags:
``` python
img = keras.preprocessing.image.load_img(
    "images/Forest/Forest_1.jpg", target_size=image_size
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # Create batch axis
predictions = model.predict(img_array)
score = predictions[0]
print(
    "This image is %.1f percent Forest and %.1f percent Industrial."
    % (100 * score[0], 100 * score[1])
)
```
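%% Cell type:markdown id:f0a3c1de-9b42-4c6a-8f11-2e5d7a6b3c90 tags:
The final print above can be written as a small reusable helper. The sketch below is a hypothetical addition (not part of the original notebook): it formats a softmax score vector into the same kind of message, assuming class order follows the alphabetical folder names (`Forest`, `Industrial`).
%% Cell type:code id:3b7e9d21-5c84-4f0a-a6d3-8e1f2b4c7a55 tags:
``` python
def describe_prediction(score, class_names=("Forest", "Industrial")):
    """Format a softmax score vector as a human-readable summary.

    `score` is a sequence of class probabilities in the same order as
    `class_names` (alphabetical folder order, matching Keras' inference).
    """
    parts = ["%.1f percent %s" % (100 * s, name)
             for s, name in zip(score, class_names)]
    best = max(range(len(score)), key=lambda i: score[i])
    return "This image is %s (most likely %s)." % (
        " and ".join(parts), class_names[best]
    )
```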