import cv2
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets  ## IPython.html.widgets is deprecated
%matplotlib inline
**cv2.inRange** and **cv2.threshold** are examples of thresholding methods for image binarization.
rose = cv2.imread("public-images/blue_rose.jpg")
plt.imshow(rose[:,:,::-1])
rose_hsv = cv2.cvtColor(rose, cv2.COLOR_BGR2HSV)
lower_blue = (60, 30, 30)
upper_blue = (130, 255, 255) # Hue (0 to 179), Saturation (0 to 255), Value (0 to 255)
mask = cv2.inRange(rose_hsv, lower_blue, upper_blue)
plt.figure()
plt.imshow(mask, cmap = plt.cm.binary)
masked_rose = cv2.bitwise_and(rose, rose, mask = mask)
plt.figure()
plt.imshow(masked_rose[:,:,::-1])
*So how do we find HSV ranges for a color?*
## 1) programmatically: convert the color to HSV to get its hue h,
##    then use (h-10, 100, 100) and (h+10, 255, 255) as the lower/upper
##    bounds for cv2.inRange
## 2) with tools like GIMP (note GIMP reports hue in 0-360; halve it for OpenCV's 0-179)
cv2.cvtColor(np.uint8([[[255, 0, 0]]]), cv2.COLOR_RGB2HSV)  # e.g. red; note the input must be a 3D array
array([[[ 0, 255, 255]]], dtype=uint8)
Geometric transformations:
## translation: warpAffine with the matrix [[1, 0, tx], [0, 1, ty]]
## rotation: getRotationMatrix2D followed by warpAffine
## scaling: resize
*When mixing plt.imshow() with plt.plot(), a very useful trick is to call ax.autoscale(False) right after imshow, which keeps the axes locked to the image's scale*
## translate
beach = cv2.imread("data/images/beach.png")[:,:,::-1]
translated_beach = cv2.warpAffine(beach, np.array([[1, 0, 50], [0., 1., -100]]), (beach.shape[1], beach.shape[0]))
plt.imshow(translated_beach)
## rotate
H, W = beach.shape[:-1]
rot_mat = cv2.getRotationMatrix2D((W/2, H/2), 90, 0.5)
rotated_beach = cv2.warpAffine(beach, rot_mat, (W, H))
plt.imshow(rotated_beach)
## resize - most of time it makes sense to maintain height/width ratio
H, W = beach.shape[:-1]
resized_beach = cv2.resize(beach, (W*2, H*2))
plt.imshow(resized_beach)
*Affine Transformation* In an affine transformation, all parallel lines in the original image remain parallel in the output image. To find the transformation matrix, we need three points from the input image and their corresponding locations in the output image. cv2.getAffineTransform then creates a 2x3 matrix to be passed to cv2.warpAffine.
*Perspective Transformation* A perspective transformation needs a 3x3 transformation matrix. Straight lines remain straight after the transformation. To find this matrix, you need 4 points on the input image and their corresponding points on the output image; among these 4 points, no 3 should be collinear. The matrix can then be found with cv2.getPerspectiveTransform and applied with cv2.warpPerspective.
sudoku = cv2.imread("public-images/sudoku.jpg")[:,:,::-1]
## captured by GIMP - they must be float !!
original_xys = np.float32([[73, 85], [489, 69], [34, 514], [519, 518]])
mapped_xys = np.float32([[0, 0], [500, 0], [0, 500], [500, 500]])
## use perspective transformation
M = cv2.getPerspectiveTransform(original_xys, mapped_xys)
perspective_trans = cv2.warpPerspective(sudoku, M, (500, 500))
## use affine transformation
M = cv2.getAffineTransform(original_xys[:-1, :], mapped_xys[:-1, :])
affine_trans = cv2.warpAffine(sudoku, M, (500, 500))
fig, axes = plt.subplots(1, 3, figsize = (3 * 4, 4))
axes[0].imshow(sudoku)
axes[0].autoscale(False) ## SUPER USEFUL
axes[0].set_title("original")
axes[0].plot(original_xys[:, 0], original_xys[:, 1], "rs")
axes[1].imshow(perspective_trans)
axes[1].autoscale(False)
axes[1].plot(mapped_xys[:, 0], mapped_xys[:, 1], "rs")
axes[1].set_title("perspective transformation")
axes[2].imshow(affine_trans)
axes[2].autoscale(False)
axes[2].plot(mapped_xys[:, 0], mapped_xys[:, 1], "rs")
axes[2].set_title("affine transformation")
OpenCV's thresholding functions: **cv2.inRange** (color segmentation), **cv2.threshold** (basic global thresholding, including Otsu's method via the cv2.THRESH_OTSU flag), and **cv2.adaptiveThreshold** (adaptive thresholding). Note the input to the last two functions should be a grayscale image; cv2.threshold returns both retVal and the thresholded image.
Pass **cv2.THRESH_OTSU** to cv2.threshold to find the threshold automatically (assuming a bimodal histogram) -- so Otsu thresholding is still a global thresholding technique. For cv2.adaptiveThreshold, choosing **blockSize** and **C** well is key to good results.
## cv2.threshold performs global thresholding
## (a single value for the whole image)
## cv2.adaptiveThreshold can be either mean-based or gaussian-based,
## using a different threshold for each region
sudoku_gray = cv2.cvtColor(sudoku, cv2.COLOR_RGB2GRAY)
thr, global_thres = cv2.threshold(sudoku_gray, 127, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # the 127 is ignored when THRESH_OTSU is set
print(thr)
adaptive_mean = cv2.adaptiveThreshold(sudoku_gray, 255,
cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,
7, 6)
adaptive_gauss = cv2.adaptiveThreshold(sudoku_gray, 255,
cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,
11, 6)
fig, axes = plt.subplots(2, 2, figsize = (8, 8))
fig.tight_layout()
axes[0, 0].imshow(sudoku_gray, cmap = plt.cm.gray)
axes[0, 1].imshow(global_thres, cmap = plt.cm.gray)
axes[1, 0].imshow(adaptive_mean, cmap = plt.cm.gray)
axes[1, 1].imshow(adaptive_gauss, cmap = plt.cm.gray)
97.0
**cv2.filter2D()** convolves an image with an arbitrary kernel. As with one-dimensional signals, images can be filtered with various low-pass filters (LPF), high-pass filters (HPF), etc. An LPF helps remove noise and blur images; an HPF helps find edges.
Image blurring is achieved by convolving the image with a low-pass filter kernel. It is useful for removing noise: it removes high-frequency content (e.g. noise, edges) from the image, so edges are blurred a little in the process. (There are also blurring techniques that do not blur edges.) OpenCV provides mainly four blurring techniques:
cv2.blur(): simple averaging.
cv2.GaussianBlur(): highly effective at removing Gaussian noise.
cv2.medianBlur(): highly effective against salt-and-pepper noise.
cv2.bilateralFilter(): highly effective at removing noise while keeping edges sharp, but generally slower than the other methods.
def show_bluring(img, method_f, block_size):
plt.imshow(img)
plt.figure()
if method_f == cv2.blur:
blur_img = cv2.blur(img, (block_size, block_size))
elif method_f == cv2.GaussianBlur:
blur_img = cv2.GaussianBlur(img, (block_size, block_size), 0)
elif method_f == cv2.medianBlur:
blur_img = cv2.medianBlur(img, block_size)
else:
raise ValueError("Unknown method")
plt.imshow(blur_img)
methods = dict(zip(["blur", "gaussian", "median"], [cv2.blur, cv2.GaussianBlur, cv2.medianBlur]))
widgets.interact(show_bluring, img = widgets.fixed(sudoku),
method_f = methods,
block_size = (1, 21, 2))
## bilateralFilter
## cv2.bilateralFilter(src, d, sigmaColor,
## sigmaSpace[, dst[, borderType]]) → dst
## Sigma values: For simplicity, you can set the 2 sigma values to be the same.
## If they are small (< 10), the filter will not have much effect,
## whereas if they are large (> 150), they will have a very strong effect,
## making the image look “cartoonish”.
## Filter size: Large filters (d > 5) are very slow,
## so it is recommended to use d=5 for real-time applications,
## and perhaps d=9 for offline applications that need heavy noise filtering.
from skimage import io
noise_edge = io.imread("public-images/edge_noise.gif")  ## cv2.imread cannot read GIFs, hence skimage
def show_bilateral_filter(img, d, sigmaColor, sigmaSpace):
plt.imshow(img)
plt.figure()
blur_img = cv2.bilateralFilter(img, d, sigmaColor, sigmaSpace)
plt.imshow(blur_img)
widgets.interact(show_bilateral_filter, img = widgets.fixed(noise_edge),
d = (5, 10, 1),
sigmaColor = (10, 150, 10),
sigmaSpace = (10, 150, 10))