3.1 Algorithm
In signature verification systems, the One-Class Support Vector Machine (OC-SVM)
algorithm is often used for anomaly detection. Here is how the OC-SVM algorithm works
in the context of signature verification:
1. Training Phase:
In the training phase of the OC-SVM algorithm, only genuine (authentic) signature
samples are used. The algorithm learns the characteristics of genuine signatures and
defines a decision boundary that separates them from potential outliers.
The training set consists of feature vectors extracted from the genuine signature
samples. These features capture the distinctive properties and patterns of genuine
signatures.
The OC-SVM algorithm constructs a model based on the training data, specifically
designed to identify anomalies or deviations from the learned representation.
The algorithm finds the optimal hyperplane that maximizes the margin between the
hyperplane and the closest genuine signature samples, while still enclosing the majority
of the training samples.
The hyperplane is defined using a set of support vectors, which are the training samples
located closest to the decision boundary.
2. Testing/Verification Phase:
During the testing or verification phase, a new or unseen signature sample is presented
to the trained OC-SVM model.
The feature vector is extracted from the test sample, capturing its relevant properties.
The OC-SVM model evaluates the test sample based on its proximity to the decision
boundary learned during the training phase.
Based on the output of the OC-SVM algorithm, a decision is made regarding the
authenticity of the signature.
If the test sample is classified as within the decision boundary, it is considered a valid
signature.
If the test sample is classified as outside the decision boundary, it indicates a potential
anomaly or forgery, triggering further investigation or additional verification steps.
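Verification then reduces to a sign test on the model's output. A self-contained sketch (again with synthetic stand-in features; the query vectors are invented):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Train on synthetic "genuine" features as in the training-phase sketch
rng = np.random.default_rng(1)
genuine = rng.normal(loc=0.5, scale=0.05, size=(40, 4))
clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(genuine)

query_genuine = np.full((1, 4), 0.5)   # lies in the heart of the training cloud
query_forgery = np.full((1, 4), 0.9)   # far from every genuine sample

# predict returns +1 inside the boundary (accept), -1 outside (flag as forgery)
print(clf.predict(query_genuine))   # [1]
print(clf.predict(query_forgery))   # [-1]
```

The signed distance itself is available via `clf.decision_function`, which is what the soft-thresholding code later in this chapter operates on.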
Image Preprocessing: Preprocessing is performed to enhance the image quality and make it
suitable for further analysis. It may involve operations like resizing, color space conversion, and
contrast adjustment.
Binarization: Binarization is the process of converting a gray scale or color image into a binary
image, where each pixel is either black or white. This step is often used to simplify subsequent
processing and extract meaningful features.
Background Noise Reduction: In many real-world images, there might be unwanted noise or
artifacts in the background that can interfere with object recognition. Background noise reduction
techniques aim to eliminate or reduce such noise, improving the accuracy of subsequent steps.
Extraction of Selected Objects: Once the image has been segmented, this step involves
extracting the objects or regions of interest from the segmented image. Depending on the
application, this could be accomplished by various methods such as connected component
analysis, contour detection, or template matching.
Train the Classifier: To perform recognition or matching, a classifier needs to be trained. This
step involves preparing a labeled dataset where each object or region of interest is associated with
its corresponding class or category. The classifier is then trained using this dataset to learn the
patterns and features that distinguish different classes.
Perform Matching: After the classifier is trained, it can be used to perform matching or
recognition on new images. The matching process involves feeding the extracted objects or
regions of interest into the trained classifier, which will determine the class or category of each
object based on its learned knowledge. The matching results can be further analyzed or used for
decision-making purposes.
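As a toy illustration of the matching step, a nearest-neighbour classifier trained on labelled feature vectors assigns a class to each new object (the features and labels below are invented):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented 2-D feature vectors for two object classes
X = np.array([[0.1, 0.1], [0.2, 0.15], [0.9, 0.8], [0.85, 0.9]])
y = np.array([0, 0, 1, 1])

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# Matching: each extracted object gets the class of its nearest training example
matches = clf.predict(np.array([[0.15, 0.12], [0.8, 0.85]]))
print(matches)   # [0 1]
```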
Implementation
import numpy as np
import cv2
from matplotlib import pyplot as plt
import pandas as pd
import math
from sklearn import preprocessing
from sklearn import svm
# Original image
img = cv2.imread('C:\\Users\\HP\\Downloads\\signature.png',0)
plt.imshow(img,'gray')
plt.show()
# Crop away the empty border (crop_image is defined elsewhere in the source)
cimg=crop_image(img,tol=0)
plt.imshow(cimg,'gray')
plt.show()
# Morphological skeletonisation; the loop body was lost in extraction and is
# reconstructed here from the standard erode/dilate/subtract skeleton pattern
def thinning(img):
    ret,img = cv2.threshold(img,127,255,0)
    skel = np.zeros(img.shape,np.uint8)
    element = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
    done = False
    while not done:
        eroded = cv2.erode(img,element)
        temp = cv2.subtract(img,cv2.dilate(eroded,element))
        skel = cv2.bitwise_or(skel,temp)
        img = eroded
        done = cv2.countNonZero(img)==0
    # plt.imshow(skel,'gray')
    # plt.show()
    return skel
timg=thinning(cimg)
plt.imshow(timg,'gray')
plt.show()
def coords(timg):
    # Split the thinned image into four quadrants and use each quadrant's
    # centre of gravity (COG) as a feature; the original listing had the
    # top-right and bottom-left slices swapped, corrected here
    rows,cols=timg.shape
    img_tl=timg[0:int(rows/2),0:int(cols/2)]             # top-left
    img_tr=timg[0:int(rows/2),int(cols/2)+1:cols]        # top-right
    img_bl=timg[int(rows/2)+1:rows,0:int(cols/2)]        # bottom-left
    img_br=timg[int(rows/2)+1:rows,int(cols/2)+1:cols]   # bottom-right
    tl_x,tl_y=COG(img_tl)
    tr_x,tr_y=COG(img_tr)
    bl_x,bl_y=COG(img_bl)
    br_x,br_y=COG(img_br)
    return tl_x,tl_y,tr_x,tr_y,bl_x,bl_y,br_x,br_y
# Display the four quadrants of the thinned image
rows,cols=timg.shape
for quad in (timg[0:int(rows/2),0:int(cols/2)],timg[0:int(rows/2),int(cols/2)+1:cols],
             timg[int(rows/2)+1:rows,0:int(cols/2)],timg[int(rows/2)+1:rows,int(cols/2)+1:cols]):
    plt.imshow(quad,'gray')
    plt.show()
def COG(img):
    # Centre of gravity of the white pixels, normalised by the image size;
    # the row-wise half of the accumulation was lost in extraction and is
    # reconstructed symmetrically to the surviving column-wise half
    x_cor=y_cor=xrun_sum=yrun_sum=0
    for i in range(img.shape[0]):
        x_cor+=sum(img[i,:])*i/255
        xrun_sum+=sum(img[i,:])/255
    for i in range(img.shape[1]):
        y_cor+=sum(img[:,i])*i/255
        yrun_sum+=sum(img[:,i])/255
    if yrun_sum==0:
        x_pos=0
    else:
        x_pos=y_cor/yrun_sum
    if xrun_sum==0:
        y_pos=0
    else:
        y_pos=x_cor/xrun_sum
    return (x_pos/img.shape[1],y_pos/img.shape[0])
# COG(img_bl)   # example call on a single quadrant
# Feature extraction for the genuine samples; the enclosing loop over
# writer i / sample j and the image load into `image` are omitted in the source
cimg=crop_image(image,tol=0)
area=cv2.countNonZero(cimg)/(cimg.shape[0]*cimg.shape[1])   # Proportion of white pixels
img1=np.invert(cimg)
cc=cv2.connectedComponents(img1)[0]   # Number of connected components
timg=thinning(cimg)   # Thin the cropped image
tl_x,tl_y,tr_x,tr_y,bl_x,bl_y,br_x,br_y=coords(timg)   # Quadrant COG features
x=pd.Series([i,j,area,cc,tl_x,tl_y,tr_x,tr_y,bl_x,bl_y,br_x,br_y],index=
    ["Writer_no","Sample_no","area","connected_comps","tl_x","tl_y","tr_x","tr_y",
     "bl_x","bl_y","br_x","br_y"])
data.append(x)
df=pd.DataFrame(data)
# Same feature extraction for the forged samples; the enclosing loop and
# image load are again omitted in the source
cimg=crop_image(image,tol=0)
area=cv2.countNonZero(cimg)/(cimg.shape[0]*cimg.shape[1])   # Proportion of white pixels
img1=np.invert(cimg)
cc=cv2.connectedComponents(img1)[0]   # Number of connected components
timg=thinning(cimg)   # Thin the cropped image
tl_x,tl_y,tr_x,tr_y,bl_x,bl_y,br_x,br_y=coords(timg)   # Quadrant COG features
x=pd.Series([i,j,area,cc,tl_x,tl_y,tr_x,tr_y,bl_x,bl_y,br_x,br_y],index=
    ["Writer_no","Sample_no","area","connected_comps","tl_x","tl_y","tr_x","tr_y",
     "bl_x","bl_y","br_x","br_y"])
data_f.append(x)
df_f=pd.DataFrame(data_f)
def alt_coords(timg):
    # Same idea as coords(), but each half is split into four octants;
    # tan_i (inverse-tangent feature) is defined elsewhere in the source
    rows,cols=timg.shape
    img_tl1=timg[0:int(rows/2),0:int(cols/4)]
    img_tl2=timg[0:int(rows/2),int(cols/4)+1:int(cols/2)]
    img_tr1=timg[0:int(rows/2),int(cols/2)+1:int(0.75*cols)]
    img_tr2=timg[0:int(rows/2),int(0.75*cols)+1:cols]
    img_bl1=timg[int(rows/2)+1:rows,0:int(cols/4)]
    img_bl2=timg[int(rows/2)+1:rows,int(cols/4)+1:int(cols/2)]
    img_br1=timg[int(rows/2)+1:rows,int(cols/2)+1:int(0.75*cols)]
    img_br2=timg[int(rows/2)+1:rows,int(0.75*cols)+1:cols]
    tl1=tan_i(COG(img_tl1))
    tl2=tan_i(COG(img_tl2))
    tr1=tan_i(COG(img_tr1))
    tr2=tan_i(COG(img_tr2))
    bl1=tan_i(COG(img_bl1))
    bl2=tan_i(COG(img_bl2))
    br1=tan_i(COG(img_br1))
    br2=tan_i(COG(img_br2))
    return tl1,tl2,tr1,tr2,bl1,bl2,br1,br2
alt_coords(timg)
# Octant features for the genuine samples; the enclosing loop and image
# load are omitted in the source
cimg=crop_image(image,tol=0)
area=cv2.countNonZero(cimg)/(cimg.shape[0]*cimg.shape[1])   # Proportion of white pixels
img1=np.invert(cimg)
cc=cv2.connectedComponents(img1)[0]   # Number of connected components
timg=thinning(cimg)   # Thin the cropped image
tl1,tl2,tr1,tr2,bl1,bl2,br1,br2=alt_coords(timg)   # Octant features
x=pd.Series([i,j,area,cc,tl1,tl2,tr1,tr2,bl1,bl2,br1,br2],index=
    ["Writer_no","Sample_no","area","connected_comps","tl1","tl2","tr1","tr2",
     "bl1","bl2","br1","br2"])
alt_data.append(x)
alt_df=pd.DataFrame(alt_data)
# Octant features for the forged samples; the enclosing loop and image
# load are omitted in the source
cimg=crop_image(image,tol=0)
area=cv2.countNonZero(cimg)/(cimg.shape[0]*cimg.shape[1])   # Proportion of white pixels
img1=np.invert(cimg)
cc=cv2.connectedComponents(img1)[0]   # Number of connected components
timg=thinning(cimg)   # Thin the cropped image
tl1,tl2,tr1,tr2,bl1,bl2,br1,br2=alt_coords(timg)   # Octant features
x=pd.Series([i,j,area,cc,tl1,tl2,tr1,tr2,bl1,bl2,br1,br2],index=
    ["Writer_no","Sample_no","area","connected_comps","tl1","tl2","tr1","tr2",
     "bl1","bl2","br1","br2"])
alt_data_f.append(x)
alt_df_f=pd.DataFrame(alt_data_f)
# Choosing dataset: the 14 genuine samples of one writer
# (24 samples per writer; writer index 32 here)
os=24*32
data=alt_df.iloc[0+os:14+os]
print(data.shape)
data=data.drop("Writer_no",axis=1)
data=data.drop("Sample_no",axis=1)
# The grid search over the OC-SVM parameters is omitted in the source;
# only its final lines survive:
#     return best_nu,best_gamma
#print("Optimal parameters are ",best_nu,best_gamma)
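The missing hyper-parameter search presumably resembles the following hedged sketch: fit an OC-SVM for each (nu, gamma) pair and keep the pair that best accepts the genuine training samples. The data and candidate grids below are stand-ins, not the project's actual values:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
train = rng.normal(0.5, 0.05, size=(14, 4))   # stand-in for 14 genuine samples

best_score = -1.0
best_nu = best_gamma = None
for nu in (0.01, 0.05, 0.1, 0.2):
    for gamma in (0.1, 1.0, 10.0):
        clf = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma).fit(train)
        score = np.mean(clf.predict(train) == 1)   # fraction of genuines accepted
        if score > best_score:
            best_score, best_nu, best_gamma = score, nu, gamma
print(best_nu, best_gamma)
```

A real search would score each pair on held-out genuine and forged samples (e.g. by AER, defined next) rather than on training acceptance alone.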
def AER(preds):
    # Average Error Rate: the first 10 predictions are genuine, the rest forged
    ac=preds[0:10]
    FRR=len(ac[ac==-1])/len(ac)   # False Rejection Rate
    fg=preds[10:]
    FAR=len(fg[fg==1])/len(fg)    # False Acceptance Rate
    AER1=(FRR+FAR)/2
    return round(AER1*100,3)
def AER2(preds):
    # Same metric for list inputs; also returns FRR and FAR
    ac=preds[0:10]
    FRR=ac.count(-1)/len(ac)
    fg=preds[10:]
    FAR=fg.count(1)/len(fg)
    AER1=(FRR+FAR)/2
    return round(AER1*100,3),FRR,FAR
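A quick worked example of the metric defined above, with a hypothetical prediction vector of 10 genuine followed by 20 forged samples:

```python
import numpy as np

preds = np.array([1]*9 + [-1]*1 + [-1]*18 + [1]*2)  # hypothetical OC-SVM outputs
ac, fg = preds[:10], preds[10:]
FRR = len(ac[ac == -1]) / len(ac)   # 1 of 10 genuine rejected   -> 0.10
FAR = len(fg[fg == 1]) / len(fg)    # 2 of 20 forgeries accepted -> 0.10
print(round((FRR + FAR) / 2 * 100, 3))   # 10.0
```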
def soft_thres(pdf,pdf1):
    # Search for the threshold m + k*sigma (k in [-3, 3]) on the decision
    # scores that minimises the AER
    m=np.mean(pdf1)
    sig=np.std(pdf1)
    best_thres=0
    best_AER=100
    best_FRR=100
    best_FAR=100
    for k in range(-300,300):
        k_d=k/100
        thres=m+k_d*sig
        preds2=[1 if x >= thres else -1 for x in pdf]
        cur_AER,FRR,FAR=AER2(preds2)
        if cur_AER < best_AER:
            best_AER=cur_AER
            best_thres=thres
            best_FRR=FRR
            best_FAR=FAR
    return best_thres,best_FRR,best_FAR,best_AER
def fixed_soft_thres(pdf,thres):
    preds2=[1 if x >= thres else -1 for x in pdf]
    cur_AER,FRR,FAR=AER2(preds2)
    return cur_AER,FRR,FAR
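The effect of moving the cut-off can be seen on a handful of hypothetical decision-function scores; lowering the threshold accepts more samples, trading a lower FRR for a higher FAR:

```python
import numpy as np

pdf = np.array([0.40, 0.10, -0.05, -0.30])   # hypothetical decision scores
for thres in (0.0, -0.1):
    preds = [1 if x >= thres else -1 for x in pdf]
    print(thres, preds)
```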
# Dataset of 30 writers
master_pdf1=np.array([])
master_pdf2=np.array([])
master_data=[]
for i in li:   # li: list of writer indices, defined earlier in the source
    os=24*i
    data=alt_df.iloc[0+os:14+os]
    data=data.drop("Writer_no",axis=1)
    data=data.drop("Sample_no",axis=1)
    # (the OC-SVM fit, the test-set construction, the hard predictions
    #  `preds`, and the append to master_data are omitted in the source)
    aer_h=AER(preds)
    pdf1=clf.decision_function(test[0:10])
    pdf2=clf.decision_function(test[10:])
    pdf=clf.decision_function(test)
    master_pdf1=np.append(master_pdf1,pdf1)
    master_pdf2=np.append(master_pdf2,pdf2)
    soft_t,frr,far,aer_s=soft_thres(pdf,pdf1)
master_df=pd.DataFrame(master_data)
master_df
# Evaluation on the held-out samples of each writer
testing_data=[]
testing_pdf1=np.array([])
testing_pdf2=np.array([])
for i in li:   # same writer list as above
    os=24*i
    # li1: index list for this writer's samples (definition omitted in the source)
    test_m=alt_df.drop(alt_df.index[li1])
    rand_signs=test_m.sample(n=20)   # 20 random signatures from other writers
    test=alt_df.iloc[14+os:24+os]
    test=pd.concat([test,rand_signs])   # DataFrame.append was removed in pandas 2.x
    test=test.drop("Writer_no",axis=1)
    test=test.drop("Sample_no",axis=1)
    # (the OC-SVM fit and the hard predictions `preds` are omitted in the source)
    aer_h=AER(preds)
    pdf1=clf.decision_function(test[0:10])
    pdf2=clf.decision_function(test[10:])
    pdf=clf.decision_function(test)
    testing_pdf1=np.append(testing_pdf1,pdf1)
    testing_pdf2=np.append(testing_pdf2,pdf2)
    aer_s,frr,far=fixed_soft_thres(pdf,thres)
    x=pd.Series([i+1,aer_h,aer_s,frr,far],index=
        ["Writer_no","Hard_AER","Soft_AER","FRR","FAR"])
    testing_data.append(x)
testing_df=pd.DataFrame(testing_data)
testing_df
print("Average AER with hard thresholding is ",np.mean(testing_df.Hard_AER))
print("Average AER with soft thresholding is ",np.mean(testing_df.Soft_AER))
Fig 4.2 Applying Gaussian Blurring and Otsu Thresholding for Binarization