IoT Lab Report
Submitted By
Dr.VELMATHI.G
Chennai – 600127
May 2021
ACKNOWLEDGEMENT
I wish to express my sincere thanks and deep sense of gratitude to my lab
guide, Dr.VELMATHI.G, Associate Professor, School of Electronics
Engineering, for her consistent encouragement and the valuable
guidance she offered throughout the course of the lab work.
I express my thanks to our Head of the Department, Dr. Vetrivelan. P, for his
support throughout the course of this lab.
I also take this opportunity to thank all the faculty of the School for their
support and their wisdom imparted to us throughout the course.
I thank my parents, family, and friends for bearing with me throughout the
course of this lab and for the opportunity they gave me to undertake this
course in such a prestigious institution.
RADHA MOHAN
TABLE OF CONTENTS
Tableau Sheets:
Result Dashboard:
Experiment 1b
AIM: To visualize data insights using Tableau for traffic-related data.
Dataset: The dataset contains the I-94 westbound traffic volume recorded at
Minnesota Department of Transportation Automatic Traffic Recorder Station 301.
Tableau Sheets:
Result Dashboard:
Experiment 2
AIM: To perform predictive analysis using WEKA on the leaf dataset.
Dataset: The dataset describes each leaf using 15 parameters such as
aspect ratio, eccentricity, and elongation.
WEKA screenshots:
Result:
The following is the analysis of the given dataset using a logistic regression classifier.
Experiment 3
AIM: To perform predictive analysis using WEKA on the weather dataset.
Dataset: The dataset contains weather information described by
parameters such as temperature, humidity, and windy.
WEKA Screenshots:
Result:
The given dataset is analysed using a random forest based classification
method.
Experiment 4
AIM: To perform predictive analysis using the KNIME Analytics Platform on the
IMDb dataset.
Dataset:
KNIME screenshots:
This KNIME workflow focuses on creating a scoring model based on historical
data. As with all data mining modeling activities, it is unclear in advance which
analytic method is most suitable. This workflow therefore trains three different
methods side by side – Decision Trees, Neural Networks and SVM – then
automatically determines which model is most accurate and writes that model
out for further use.
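The same idea can be sketched outside KNIME in Python with scikit-learn. This is only an illustration under stated assumptions: the data is synthetic (generated with `make_classification`, not the lab's dataset), the hyperparameters are arbitrary, accuracy on a held-out split is assumed as the selection criterion, and `pickle` stands in for the step that writes the model out.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the historical data (assumption: not the lab's dataset).
X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# The three model families named in the workflow description.
candidates = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)),
    "svm": make_pipeline(StandardScaler(), SVC()),
}

# Train every candidate and score it on the held-out split.
scores = {}
for name, clf in candidates.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

# Keep the most accurate model and serialise it for later use.
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]
model_bytes = pickle.dumps(best_model)

print(scores, "->", best_name)
```

Scoring on a held-out split rather than on the training data keeps the comparison from simply rewarding the model that memorises best.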
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
import torch
import torch.nn as nn
from google.colab import drive
drive.mount('/content/drive')
filepath = '/content/drive/MyDrive/stock_data/AAPL_2006-01-01_to_2018-01-01.csv'
data = pd.read_csv(filepath)
data = data.sort_values('Date')
data.head()
sns.set_style("darkgrid")
plt.figure(figsize = (15,9))
plt.plot(data[['Close']])
plt.xticks(range(0,data.shape[0],100),data['Date'].loc[::100],rotation=45)
plt.title("Apple Stock Price",fontsize=18, fontweight='bold')  # the file loaded above is AAPL
plt.xlabel('Date',fontsize=18)
plt.ylabel('Close Price (USD)',fontsize=18)
plt.show()
price = data[['Close']].copy()  # copy so the scaled column can be assigned without a pandas warning
print(price)
scaler = MinMaxScaler(feature_range=(-1, 1))
price['Close'] = scaler.fit_transform(price['Close'].values.reshape(-1,1))
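def split_data(stock, lookback):
    data_raw = stock.to_numpy()  # convert to numpy array
    data = []
    # build every window of `lookback` consecutive rows
    for index in range(len(data_raw) - lookback):
        data.append(data_raw[index: index + lookback])
    data = np.array(data)
    test_set_size = int(np.round(0.2 * data.shape[0]))
    train_set_size = data.shape[0] - test_set_size
    # in each window, all rows but the last are inputs and the last row is the target
    x_train = data[:train_set_size, :-1, :]
    y_train = data[:train_set_size, -1, :]
    x_test = data[train_set_size:, :-1, :]
    y_test = data[train_set_size:, -1, :]
    return x_train, y_train, x_test, y_test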
input_dim = 1
hidden_dim = 32
num_layers = 2
output_dim = 1
num_epochs = 100
import math, time
from sklearn.metrics import mean_squared_error
# make predictions
y_test_pred = model(x_test)
# invert predictions
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train_lstm.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test_lstm.detach().numpy())
fig = plt.figure()
fig.subplots_adjust(hspace=0.2, wspace=0.2)
plt.subplot(1, 2, 1)
ax = sns.lineplot(x = original.index, y = original[0], label="Data", color='royalblue')
ax = sns.lineplot(x = predict.index, y = predict[0], label="Training Prediction (LSTM)", color='tomato')
ax.set_title('Stock price', size = 14, fontweight='bold')
ax.set_xlabel("Days", size = 14)
ax.set_ylabel("Cost (USD)", size = 14)
ax.set_xticklabels('', size=10)
plt.subplot(1, 2, 2)
ax = sns.lineplot(data=hist, color='royalblue')
ax.set_xlabel("Epoch", size = 14)
ax.set_ylabel("Loss", size = 14)
ax.set_title("Training Loss", size = 14, fontweight='bold')
fig.set_figheight(6)
fig.set_figwidth(16)
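# shift train predictions so they line up with the original series
trainPredictPlot = np.empty_like(price)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[lookback:len(y_train_pred)+lookback, :] = y_train_pred
# assumption: test predictions are placed directly after the training span
testPredictPlot = np.empty_like(price)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(y_train_pred)+lookback-1:len(price)-1, :] = y_test_pred
original = scaler.inverse_transform(price['Close'].values.reshape(-1,1))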
predictions = np.append(trainPredictPlot, testPredictPlot, axis=1)
predictions = np.append(predictions, original, axis=1)
result = pd.DataFrame(predictions)
import plotly.express as px
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(x=result.index, y=result[0],
                         mode='lines',
                         name='Train prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[1],
                         mode='lines',
                         name='Test prediction'))
fig.add_trace(go.Scatter(x=result.index, y=result[2],
                         mode='lines',
                         name='Actual Value'))
fig.update_layout(
    xaxis=dict(
        showline=True,
        showgrid=True,
        showticklabels=False,
        linecolor='white',
        linewidth=2
    ),
    yaxis=dict(
        title_text='Close (USD)',
        title_font=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
        showline=True,
        showgrid=True,
        showticklabels=True,
        linecolor='white',
        linewidth=2,
        ticks='outside',
        tickfont=dict(
            family='Rockwell',
            size=12,
            color='white',
        ),
    ),
    showlegend=True,
    template='plotly_dark'
)
annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.05,
xanchor='left', yanchor='bottom',
text='Results (LSTM)',
font=dict(family='Rockwell',
size=26,
color='white'),
showarrow=False))
fig.update_layout(annotations=annotations)
fig.show()
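The listing above uses `model`, `hist`, `y_train_pred` and `y_train_lstm` without showing where they are created. A minimal sketch of a model definition and training loop that would produce them is given below. It reuses the hyperparameters from the listing, but the class name `LSTM`, the Adam optimiser, the learning rate of 0.01, the MSE loss and the lookback of 20 are assumptions, and random placeholder tensors stand in for the real training windows.

```python
import torch
import torch.nn as nn

# Hyperparameters as given in the listing.
input_dim, hidden_dim, num_layers, output_dim = 1, 32, 2, 1
num_epochs = 100


class LSTM(nn.Module):
    """Stacked LSTM followed by a linear layer on the last time step."""

    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out, _ = self.lstm(x)           # out: (batch, seq_len, hidden_dim)
        return self.fc(out[:, -1, :])   # predict from the final time step


torch.manual_seed(0)
# Placeholder data (assumption): 64 windows of 19 steps, i.e. lookback = 20.
# The target is the window mean, so there is something learnable.
x_train = torch.randn(64, 19, input_dim)
y_train_lstm = x_train.mean(dim=1)      # shape (64, 1)

model = LSTM(input_dim, hidden_dim, num_layers, output_dim)
criterion = nn.MSELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)

hist = []                               # training loss per epoch
for epoch in range(num_epochs):
    y_train_pred = model(x_train)
    loss = criterion(y_train_pred, y_train_lstm)
    hist.append(loss.item())
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```

With the real data, `x_train` and `y_train_lstm` would be the arrays returned by `split_data`, converted with `torch.from_numpy(...).type(torch.Tensor)`.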
Result:
The following is the analysis of the given dataset.
Experiment 6
AIM: To perform predictive analysis using scikit-learn on the crop
production dataset.
Code:
from __future__ import print_function
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report
from sklearn import metrics
from sklearn import tree
import warnings
warnings.filterwarnings('ignore')
df = pd.read_csv('/content/drive/MyDrive/CropDataset/Crop_recommendation.csv')
df.head()
features = df[['N', 'P','K','temperature', 'humidity', 'ph', 'rainfall']]
target = df['label']
labels = df['label']
from sklearn.model_selection import train_test_split
Xtrain, Xtest, Ytrain, Ytest = train_test_split(features, target, test_size=0.2, random_state=2)
# accuracy scores and model names, collected for the comparison plot
acc = []
model = []
from sklearn.tree import DecisionTreeClassifier
DecisionTree = DecisionTreeClassifier(criterion="entropy", random_state=2, max_depth=5)
DecisionTree.fit(Xtrain,Ytrain)
predicted_values = DecisionTree.predict(Xtest)
x = metrics.accuracy_score(Ytest, predicted_values)
acc.append(x)
model.append('Decision Tree')
print("Decision Tree's Accuracy is: ", x*100)
print(classification_report(Ytest,predicted_values))
#-----------------------------------------
from sklearn.naive_bayes import GaussianNB
NaiveBayes = GaussianNB()
NaiveBayes.fit(Xtrain,Ytrain)
predicted_values = NaiveBayes.predict(Xtest)
x = metrics.accuracy_score(Ytest, predicted_values)
acc.append(x)
model.append('Naive Bayes')
print("Naive Bayes's Accuracy is: ", x)
print(classification_report(Ytest,predicted_values))
#----------------------------------------------------------
from sklearn.svm import SVC
SVM = SVC(gamma='auto')
SVM.fit(Xtrain,Ytrain)
predicted_values = SVM.predict(Xtest)
x = metrics.accuracy_score(Ytest, predicted_values)
acc.append(x)
model.append('SVM')
print("SVM's Accuracy is: ", x)
print(classification_report(Ytest,predicted_values))
#---------------------------------------------------
from sklearn.linear_model import LogisticRegression
LogReg = LogisticRegression(random_state=2)
LogReg.fit(Xtrain,Ytrain)
predicted_values = LogReg.predict(Xtest)
x = metrics.accuracy_score(Ytest, predicted_values)
acc.append(x)
model.append('Logistic Regression')
print("Logistic Regression's Accuracy is: ", x)
print(classification_report(Ytest,predicted_values))
#----------------------------------------------------
from sklearn.ensemble import RandomForestClassifier
RF = RandomForestClassifier(n_estimators=20, random_state=0)
RF.fit(Xtrain,Ytrain)
predicted_values = RF.predict(Xtest)
x = metrics.accuracy_score(Ytest, predicted_values)
acc.append(x)
model.append('RF')
print("RF's Accuracy is: ", x)
print(classification_report(Ytest,predicted_values))
#------------------------------------------------------
plt.figure(figsize=[10,5],dpi = 100)
plt.title('Accuracy Comparison')
plt.xlabel('Accuracy')
plt.ylabel('Algorithm')
sns.barplot(x = acc,y = model,palette='dark')
data = np.array([[104,18, 30, 23.603016, 60.3, 6.7, 140.91]])
prediction = RF.predict(data)
print(prediction)
Result:
The following is the analysis of the given dataset.
Experiment 7
AIM: To perform predictive analysis using the Orange analytics platform on
corn production data for the United States.
Orange screenshots:
Orange is an open source data visualization and analysis tool, where data
mining is done through visual programming or Python scripting. The tool has
components for machine learning, add-ons for bioinformatics and text mining
and it is packed with features for data analytics.
For this dataset we use a linear regression model to predict the values; Orange
provides plenty of built-in ML models to start with. The following are
snapshots of the Orange working window.
Results
Here we select the features and check for any missing or corrupt data
in our dataset.
Here we select the models on which we want to train our dataset. As we
are focused on predictive analysis, we choose Generalized Linear
Model, Decision Tree, Random Forest and Gradient Boosted Trees.
Results
As the screenshot below shows, the Generalized Linear Model is the best
choice here, as it gives the lowest error rate and standard
deviation.
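A comparison of this kind, ranking models by mean error and its standard deviation, can be sketched in Python with scikit-learn. The sketch rests on several assumptions: the data is synthetic and mostly linear (`make_regression`, not the corn production dataset), ordinary linear regression stands in for the Generalized Linear Model, and 5-fold cross-validated RMSE is assumed as the error measure.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic, mostly linear data (assumption: not the corn production dataset).
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

models = {
    "linear_model": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "gradient_boosted_trees": GradientBoostingRegressor(random_state=0),
}

summary = {}
for name, reg in models.items():
    # scikit-learn returns negated RMSE so that higher is better; flip the sign.
    rmse = -cross_val_score(reg, X, y, cv=5,
                            scoring="neg_root_mean_squared_error")
    summary[name] = (rmse.mean(), rmse.std())

# Print the models from lowest to highest mean error.
for name, (mean, std) in sorted(summary.items(), key=lambda kv: kv[1][0]):
    print(f"{name:24s} RMSE = {mean:8.2f} +/- {std:.2f}")
```

On data with a largely linear relationship, the linear model would be expected to show the lowest error, which is consistent with the observation reported above.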
Here we can see how strongly the label depends on the various features of
our dataset.
Here we can simulate various feature values for our model and simultaneously see the
factors on which the prediction depends.
In the graph below, most of the predicted values are close to the
true values, which suggests that the model can predict new values
reasonably well.
BIODATA
Photo
Name : Radha Mohan
E-mail : radha.mohan2018@vitstudent.ac.in