Opencv-Python Notes

Uploaded by

Badr Lakhal
Copyright
© © All Rights Reserved

>>>> The notes are taken from: https://youtu.be/oXlwWbU8l2o?si=mYneDeE7SarRADRX

******* OpenCV - Python *******

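NOTE: OpenCV stores images in the BGR format (blue, green, red), but outside of this
library we use the classic RGB. You can visualize this by showing the same image with
cv.imshow('title', img) and with plt.imshow(img) followed by plt.show(): in the
matplotlib window the red and blue channels appear swapped.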

# pip install opencv-contrib-python

# import cv2 as cv

<<<< Reading images and videos >>>>

# img = cv.imread("PICTURE_PATH.jpg")
> Reads the picture into a matrix with values between 0 and 255 and stores it in the
variable img. If the file cannot be read, cv.imread returns None.

# cv.imshow("window_title", img)
> Displays the image in a new window, but the window only stays visible if you call the
next method.

# cv.waitKey(n)
> if n <= 0: the displayed window waits until you press a key.
else if n > 0: the displayed window waits at most n milliseconds for a key press.

# capture = cv.VideoCapture("VIDEO_PATH")
> It is used for reading videos.
If you give an integer instead of the path as the argument, you are capturing video
in real time from a camera.
For example, 0 is your main camera, 1 is the second camera ...

> Reading a video is not like reading an image: you have to read it frame by frame
using a loop.
while True:
    # returns a boolean indicating whether the frame was read successfully, and the frame
    isTrue, frame = capture.read()
    if not isTrue:  # no frame left (end of the video) or the read failed
        break
    cv.imshow('Video', frame)

    # waits for 20 milliseconds; if the key d is pressed we break.
    # Otherwise, we pass to the next frame. Note the bitwise & (not "and").
    if cv.waitKey(20) & 0xFF == ord('d'):
        break

capture.release()  # similar to file.close()
cv.destroyAllWindows()  # closes all the windows opened by cv2
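
> A small sketch of the key test used in the loop, written without any window so it can run anywhere. The helper name is_quit_key is hypothetical (it is not part of OpenCV); it only shows why the test uses the bitwise & and what happens when the word "and" is used instead.

```python
def is_quit_key(key_code, quit_char='d'):
    """Return True when a code returned by cv.waitKey matches quit_char."""
    # cv.waitKey returns -1 when no key was pressed. Masking with 0xFF keeps
    # only the lowest 8 bits, which hold the character code of the key.
    return key_code & 0xFF == ord(quit_char)

print(is_quit_key(ord('d')))  # True
print(is_quit_key(ord('x')))  # False
print(is_quit_key(-1))        # False (no key pressed)

# With "and" the test can never succeed: 0xFF == ord('d') is always False.
print(bool(ord('d') and 0xFF == ord('d')))  # False
```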

<<<< Resize and rescaling frames >>>>

# frame.shape
> Returns a tuple containing (height, width, channels), that is (rows, columns,
channels). For example, if channels=3 then it's BGR.

# cv.resize(frame, dimensions, interpolation=cv.INTER_AREA) # dimensions = (new_width, new_height)
> Resizes the frame to the given dimensions.

# def rescaleFrame(frame, scale=0.75): # it can also be used on images, give img as input in place of frame
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dimensions = (width, height)
    return cv.resize(frame, dimensions, interpolation=cv.INTER_AREA)

# def changeRes(width, height): >> this function is used while working on live
videos. Otherwise, use rescaleFrame().
    capture.set(3, width)
    capture.set(4, height)
> 3 is the identifier for setting the width of the video frame (cv.CAP_PROP_FRAME_WIDTH).
4 is the identifier for setting the height of the video frame (cv.CAP_PROP_FRAME_HEIGHT).
There are many other identifiers for different properties such as brightness,
contrast, etc.

<<<< Drawing shapes & putting text >>>>

# blank = np.zeros((500, 500, 3), dtype='uint8') # (500, 500, 3) = (height, width, channels)
> Creates a black picture. 'uint8' is the data type of an image. Needs import numpy as np.

# blank[:] = 0, 255, 0
> Changes the color of the image to green instead of black.
> To color just a portion of the image, give indexes like this:
blank[start_row:last_row, start_col:last_col]

# cv.rectangle(blank, (0,0), (250, 500), (0, 255, 0), thickness=2)
> Draws a rectangle from (0, 0) to (250, 500) in green. The points are (x, y) tuples.
> If you want to fill the rectangle, give thickness=-1.

# cv.circle(blank, (250, 250), 40, (0, 0, 255), thickness=3)
> Draws a circle with center = (250, 250) and radius = 40, in red, with thickness = 3.
> If you want to fill the circle, give thickness=-1.

# cv.line(blank, point1, point2, (0, 0, 255), thickness=3)
> Draws a line that starts at point1 and ends at point2, in red, with thickness 3. The
points are tuples.

# cv.putText(blank, "Hello, Mom!", point, cv.FONT_HERSHEY_TRIPLEX, 1.0, (0, 255, 0), thickness=2)
> point is a tuple representing the (x, y) coordinate of the bottom-left corner of the
text.
> cv.FONT_HERSHEY_TRIPLEX is the font of the text; there are several other
cv.FONT_HERSHEY_* choices in OpenCV.
> 1.0 is the font scale, a factor applied to the base size of the font: 1.0 means the
original size.

<<<< 5 Essential Functions >>>>

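# gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
> Converts the image to grayscale.

# blur = cv.GaussianBlur(img, (n, m), cv.BORDER_DEFAULT) # blurs the image
> (n, m) is a tuple of two odd numbers: it's the size of the blur kernel. A bigger
kernel gives a stronger blur.
> Careful: the third positional argument of cv.GaussianBlur is sigmaX (the standard
deviation along x), so cv.BORDER_DEFAULT written there is read as a number. To choose
how the function handles the image edges, pass it by keyword:
borderType=cv.BORDER_DEFAULT.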

# canny = cv.Canny(img, 125, 175) # returns the edges of img
> 125 is the lower threshold: where the intensity changes less than this, Canny never
reports an edge.
> 175 is the upper threshold: where the intensity changes more than this, Canny always
reports an edge. Changes between the two thresholds count as an edge only if they are
connected to a sure edge.

# dilated = cv.dilate(canny, (3, 3), iterations=3)
> Dilation makes objects in the image appear thicker or larger.
> (3, 3) stands for the kernel. To be sure of getting a real 3 by 3 kernel, pass an
array such as np.ones((3, 3), np.uint8).

# eroded = cv.erode(dilated, (3, 3), iterations=3)
> Erosion makes objects in the image appear thinner or smaller.
> (3, 3) stands for the kernel, as above.

# resized = cv.resize(img, (500, 100), interpolation=cv.INTER_AREA)
> You can play with the interpolation depending on your needs: cv.INTER_AREA is a good
choice for shrinking, cv.INTER_LINEAR or cv.INTER_CUBIC for enlarging.
> (500, 100) is the width and the height of the new image.

<<<< Image Transformations >>>>

# Translation
def translate(img, x, y):
    transMat = np.float32([[1, 0, x], [0, 1, y]])
    dimensions = (img.shape[1], img.shape[0])
    return cv.warpAffine(img, transMat, dimensions)
> x > 0 shifts the image to the right and x < 0 to the left; y > 0 shifts it down and
y < 0 up.
> "warpAffine" can change the shape or position of a picture.
> np.float32 is a function that takes a matrix as input and changes the type of
this matrix to 32-bit floats.

# Rotation
def rotate(img, angle, rotPoint=None):
    (height, width) = img.shape[:2]

    if rotPoint is None:  # we're going to rotate around the center
        rotPoint = (width//2, height//2)

    # 1.0 is the scaling factor for the rotation
    rotMat = cv.getRotationMatrix2D(rotPoint, angle, 1.0)
    dimensions = (img.shape[1], img.shape[0])
    return cv.warpAffine(img, rotMat, dimensions)
> angle is in degrees; a positive angle rotates counter-clockwise.

# Flipping
flip = cv.flip(img, x)
> if x = 0: flip the image vertically
else if x = 1: flip the image horizontally
else if x = -1: flip the image horizontally and vertically

<<<< Contour detection >>>>

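# contours, hierarchies = cv.findContours(canny, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)
> It gives you a list of the outlines (contours) found in the image and tells you how
they're nested inside each other (hierarchies).
> cv.RETR_LIST returns all the contours; cv.CHAIN_APPROX_NONE keeps every point of
each contour.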

<<<< Color Spaces >>>>

# BGR to HSV
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
> Converts the image to the HSV (hue, saturation, value) color model.

# BGR to L*a*b*
lab = cv.cvtColor(img, cv.COLOR_BGR2LAB)
> Converts the image to the L*a*b* color model.

<<<< Color Channels >>>>

# b, g, r = cv.split(img)
> Splits img into three grayscale images, one per channel. Each image shows lighter
shades where its color (blue, green, or red) is more prevalent and darker shades where
that color is less prominent or absent in that region.

# merged = cv.merge([b, g, r])
> The reverse of cv.split: it takes those gray pictures and merges them into one
picture with the BGR color model.

<<<< Blurring Techniques >>>>


NOTE: window in this section refers to the kernel we use to blur an image, which
is like a window with n rows and m columns. Also the size

# average = cv.blur(img, (3, 3))
> This method calculates the intensity of the middle pixel by taking the average of
the pixels in the window.

# gauss = cv.GaussianBlur(img, (7, 7), 0)
> 0 is sigmaX, the standard deviation of the Gaussian distribution along x. Passing 0
tells OpenCV to compute it from the kernel size.
> Takes a weighted average of the surrounding pixels for each pixel in the image,
with the weights determined by the Gaussian distribution. The Gaussian distribution
assigns weights to these surrounding pixels based on their distance from the center
pixel. Pixels closer to the center have higher weights, while pixels farther away
have lower weights.

# median = cv.medianBlur(img, 3)
> This method calculates the intensity of the middle pixel by taking the median of
the pixels in the window. 3 is the window size (3 by 3); it must be an odd number.

# bilateral = cv.bilateralFilter(img, 5, 15, 15)
> Makes the picture look smoother and removes messy colors, but still keeps the edges
of objects sharp, like the lines in your coloring book.
> 5 is the diameter of the pixel neighborhood used during filtering. In simpler terms,
it determines how far around each pixel the filter looks.
> The first 15 is the sigmaColor: a larger sigmaColor means that neighboring pixels
with more different colors are mixed together.
> The second 15 is the sigmaSpace: a larger sigmaSpace means that pixels further out
from the center pixel will influence the blurring calculations.

<<<< BITWISE Operations >>>>

# bitwise_and = cv.bitwise_and(rectangle, circle) # logical AND
> Returns an image that represents the intersection between rectangle and circle.

# bitwise_or = cv.bitwise_or(rectangle, circle) # logical OR
> Returns an image that represents the union of rectangle and circle.

# bitwise_xor = cv.bitwise_xor(rectangle, circle)
> A XOR B = (A union B) - (A intersection B)

# bitwise_not = cv.bitwise_not(rectangle)
> Inverts the image: white becomes black and black becomes white.

<<<< Masking >>>>

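# mask = cv.circle(blank, center_tuple, radius, color, thickness)
> The blank image must have the same height and width as img, with a single channel,
e.g. blank = np.zeros(img.shape[:2], dtype='uint8').

# masked = cv.bitwise_and(img, img, mask=mask)
> Takes img and cuts out a window that has the same coordinates as the white part of
mask; everything else becomes black.
> img is given two times because cv.bitwise_and needs two source images: img AND img
is just img, and the mask then selects which pixels are kept.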

<<<< Computing Histograms >>>>

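# gray_hist = cv.calcHist(images_list, channels_list, mask, histSize_list, ranges_list)
> histSize is the number of bins.
> Every argument except mask is a list, e.g.
cv.calcHist([gray], [0], None, [256], [0, 256]). Pass None as mask to use the whole image.
plt.figure()
plt.xlabel("Bins")
plt.ylabel('# of pixels')
plt.plot(gray_hist)
plt.xlim([0, 256])
plt.show()

# Color histogram
colors = ('b', 'g', 'r')
plt.figure()
plt.xlabel("Bins")
plt.ylabel('# of pixels')
for i, col in enumerate(colors):
    hist = cv.calcHist([new_img], [i], mask, [256], [0, 256])
    plt.plot(hist, color=col)
    plt.xlim([0, 256])

plt.show()

> This is the code for BGR mode: it plots one curve per channel on the same figure.
The plots need import matplotlib.pyplot as plt.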

<<<< Thresholding >>>>

Thresholding is the binarization of an image (only two colors in the image, black
and white). The main idea is to fix a threshold (seuil in French): if a pixel value is
higher than this threshold we give it 255, else we give it 0.

# threshold, thresh = cv.threshold(gray, seuil, 255, cv.THRESH_BINARY)
> 255 is the value taken by the pixel if its value is higher than seuil. In our
case it's 255 because we want a black and white image; otherwise choose any number you
want.
> The function returns two values: threshold, which is the seuil we gave as a
parameter, and thresh, our binary picture.
> You can do the inverse just by passing cv.THRESH_BINARY_INV.

# Adaptive thresholding, which gives the computer the ability to choose the
threshold itself, separately for each region of the image
adaptive_thresh = cv.adaptiveThreshold(gray, max_value, cv.ADAPTIVE_THRESH_MEAN_C,
cv.THRESH_BINARY, 11, 0)
> cv.ADAPTIVE_THRESH_MEAN_C is the method used to choose the threshold. In this
case, it's using the mean of the neighborhood area.
> 11 is the block size: the threshold of each pixel is computed from the 11 by 11
neighborhood around it. It must be an odd number.
> 0 is a constant value subtracted from the calculated mean or weighted sum of the
neighborhood pixels. Most of the time we set it to 0.

<<<< Edge Detection >>>>

# Laplacian
lap = cv.Laplacian(gray, cv.CV_64F)
lap = np.uint8(np.absolute(lap)) # taking off the negative values
> cv.CV_64F indicates that the output should be a 64-bit floating-point image,
which can hold negative values and allows for more precise calculations.

# Sobel
sobelx = cv.Sobel(gray, cv.CV_64F, 1, 0) # gradient along the x axis
sobely = cv.Sobel(gray, cv.CV_64F, 0, 1) # gradient along the y axis
combined_sobel = cv.bitwise_or(sobelx, sobely)
