TTNT 0001

TRƯỜNG ĐẠI HỌC BÁCH KHOA
FACULTY OF INFORMATION TECHNOLOGY
EXAM PAPER AND ANSWER SHEET

Course name: Artificial Intelligence
Course code: Exam format: Supervised written exam
Exam no.: 00001 Duration: 70 minutes (excluding time for copying/distributing the exam)
Materials may be used during the exam.
Full name: Nguyễn Thanh Hoàng Class: 20TCLC_DT1 Student ID: 102200049

Students work directly in this file, save it in the format MSSV_HọTên.pdf (student ID_full name), and submit it via
MS Teams:
Question 1 (3 points): Consider the following water-scooping problem:
- There are n ladles; ladle i holds at most a_i liters of water. You must move exactly M liters of water from the riverbank
into a large tank using the fewest operations, with neither more nor less than M liters. You have no other
tool for measuring water. You may discard water you have scooped if needed, and discarding
does not count as an operation.
Write a program that uses the A* algorithm to read the integers n, M and a1, a2, …, an and print the sequence of
scooping operations. If there is no solution, print “No solution”.
Example:
- Input: 2 5 4 3
- Output:
o Transfer/scoop 4 liters of water from the riverbank into ladle 1 (Ladle 1: 4 liters, Ladle 2: 0 liters, Tank: 0 liters)
o Transfer/scoop 4 liters of water from ladle 1 into the tank (Ladle 1: 0 liters, Ladle 2: 0 liters, Tank: 4 liters)
o Transfer/scoop 4 liters of water from the riverbank into ladle 1 (Ladle 1: 4 liters, Ladle 2: 0 liters, Tank: 4 liters)
o Transfer/scoop 3 liters of water from ladle 1 into ladle 2 (Ladle 1: 1 liter, Ladle 2: 3 liters, Tank: 4 liters)
o Transfer/scoop 1 liter of water from ladle 1 into the tank (Ladle 1: 0 liters, Ladle 2: 3 liters, Tank: 5 liters)
# Answer: Paste the code below (1.5 points)
import heapq

class State:
    def __init__(self, water_amounts, path_cost):
        self.water_amounts = water_amounts
        self.path_cost = path_cost

    def __lt__(self, other):
        # The frontier is ordered by path cost g only; no heuristic term is added (h = 0)
        return self.path_cost < other.path_cost

def a_star_search(n, M, water_capacities):
    initial_state = State([0] * n, 0)
    frontier = []
    heapq.heappush(frontier, initial_state)
    explored = set()
    while frontier:
        current_state = heapq.heappop(frontier)
        # Goal test: the water currently held in all ladles sums to M
        if sum(current_state.water_amounts) == M:
            return current_state.water_amounts
        explored.add(tuple(current_state.water_amounts))
        for i in range(n):
            for action in ['pour_in', 'pour_out']:
                new_water_amounts = current_state.water_amounts.copy()
                if action == 'pour_in':
                    # Fill ladle i to its capacity
                    if new_water_amounts[i] < water_capacities[i]:
                        new_water_amounts[i] = water_capacities[i]
                elif action == 'pour_out':
                    # Empty ladle i
                    if new_water_amounts[i] > 0:
                        new_water_amounts[i] = 0
                if tuple(new_water_amounts) not in explored:
                    new_path_cost = current_state.path_cost + 1
                    new_state = State(new_water_amounts, new_path_cost)
                    heapq.heappush(frontier, new_state)
    return None

# Read n, M and the capacities a1..an from one line, e.g. "2 5 4 3"
n, M, *water_capacities = map(int, input().split())
result = a_star_search(n, M, water_capacities)
print(result if result is not None else "No solution")
# Answer: Paste the execution result for the input “3 13 7 8 9” below (1 point)
# Answer: Explain the function h’ (the distance function in the A* algorithm) in the program above. (0.5 points)
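The program above orders its frontier by path cost alone and has no tank or ladle-to-ladle transfer. As a point of comparison, here is a minimal sketch of how the problem as stated might be searched with A*. Everything in it is an assumption made for illustration rather than the exam's reference solution: the function name `solve`, the state layout (ladle contents plus tank content), the rule that a ladle is always poured into the tank in full, and the heuristic h = ceil((M - tank) / max capacity), which counts the fewest pours into the tank that could still be needed.

```python
import heapq
from itertools import count

def solve(M, caps):
    """Return a list of step descriptions that puts exactly M liters in the tank, or None."""
    n, big = len(caps), max(caps)

    def h(tank):
        # One pour adds at most `big` liters to the tank, so at least
        # ceil((M - tank) / big) more pours are needed: never an overestimate.
        return -(-(M - tank) // big)

    start = ((0,) * n, 0)                 # (ladle contents, tank content)
    best_g = {start: 0}                   # cheapest known cost to each state
    parent = {start: None}                # state -> (previous state, step text)
    tie = count()                         # tie-breaker so the heap never compares states
    frontier = [(h(0), next(tie), 0, start)]
    while frontier:
        _, _, g, state = heapq.heappop(frontier)
        if g > best_g[state]:
            continue                      # stale entry, a cheaper path was found later
        ladles, tank = state
        if tank == M:
            steps = []
            while parent[state] is not None:
                state, text = parent[state]
                steps.append(text)
            return steps[::-1]
        moves = []                        # (cost, next state, step text)
        for i, amt in enumerate(ladles):
            if amt < caps[i]:
                filled = ladles[:i] + (caps[i],) + ladles[i + 1:]
                moves.append((1, (filled, tank), f"Fill ladle {i + 1} from the river"))
            if amt == 0:
                continue
            emptied = ladles[:i] + (0,) + ladles[i + 1:]
            # Discarding is free, as the problem statement allows
            moves.append((0, (emptied, tank), f"Discard {amt} l from ladle {i + 1}"))
            if tank + amt <= M:
                moves.append((1, (emptied, tank + amt),
                              f"Pour {amt} l from ladle {i + 1} into the tank"))
            for j in range(n):
                room = caps[j] - ladles[j]
                if j != i and room > 0:
                    moved = min(amt, room)
                    new = list(ladles)
                    new[i] -= moved
                    new[j] += moved
                    moves.append((1, (tuple(new), tank),
                                  f"Pour {moved} l from ladle {i + 1} into ladle {j + 1}"))
        for cost, nxt, text in moves:
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = (state, text)
                heapq.heappush(frontier, (new_g + h(nxt[1]), next(tie), new_g, nxt))
    return None

n, M, *caps = map(int, "2 5 4 3".split())
steps = solve(M, caps)
print("\n".join(steps) if steps is not None else "No solution")
```

Under these assumptions the heuristic changes only when water is poured into the tank, and then by at most 1, so it never exceeds the true remaining cost. Calling `solve(13, [7, 8, 9])` would run the same search on the second test input.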
Question 2 (4 points): Given the dataset input.csv with 90 samples, each having 4 features (sepal length, sepal
width, petal length, petal width) and the corresponding species name.
a) (3 points) Write a flower classification program using Logistic Regression combined with a softmax layer. State
clearly how the classification model in the program works (e.g. how many neurons there are, what each
neuron is responsible for, how classification is done, …).
# Answer: Paste the code below
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

def softmax(z):
    # Subtract the row-wise maximum before exponentiating for numerical stability
    z_exp = np.exp(z - np.max(z, axis=1, keepdims=True))
    z_sum = np.sum(z_exp, axis=1, keepdims=True)
    return z_exp / z_sum

def one_hot(y):
    n_classes = len(np.unique(y))
    one_hot_y = np.zeros((len(y), n_classes))
    one_hot_y[np.arange(len(y)), y] = 1
    return one_hot_y

def logistic_regression(X, y, lr=0.1, n_iter=1000):
    # Prepend a column of ones to X (bias term)
    X = np.hstack((np.ones((X.shape[0], 1)), X))
    # Initialize the parameters w (one column per class)
    w = np.zeros((X.shape[1], y.shape[1]))
    # Gradient-descent iterations
    for i in range(n_iter):
        # Compute the predicted class probabilities
        y_pred = softmax(X.dot(w))
        # Compute the gradient of the cross-entropy loss
        grad = X.T.dot(y_pred - y) / X.shape[0]
        # Update the parameters w
        w -= lr * grad
    return w

# Read the data from input.csv
data = pd.read_csv('input.csv', header=None)
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
# Convert the labels to integers, then to one-hot form
le = LabelEncoder()
y = le.fit_transform(y)
y = one_hot(y)
# Train the Logistic Regression model with gradient descent
w = logistic_regression(X, y)
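# Answer: Describe the classification model with a figure or in words.
x1, x2, x3, x4: the inputs of one sample, i.e. its 4 features.
The classification model has only an input layer and an output layer. The output layer has one neuron per class
(three for the labels Iris-setosa, Iris-versicolor, and Iris-virginica), and its activation function is softmax, so
neuron k outputs the probability that the sample belongs to class k.
Loop:
- Compute the predictions: $\hat{y}_k = \mathrm{softmax}(z)_k$ with $z_k = b_k + w_{1k}x_1 + w_{2k}x_2 + w_{3k}x_3 + w_{4k}x_4$
- Compute the loss (categorical cross-entropy over m samples): $L = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k} y_k^{(i)} \log \hat{y}_k^{(i)}$
- Compute the gradient: $\nabla_W L = \frac{1}{m} X^{\top}(\hat{Y} - Y)$, where X includes the column of ones for the bias
- Update the parameters: $W \leftarrow W - \eta \, \nabla_W L$, where $\eta$ is the learning rate (lr in the code)
Classification:
$\hat{y}_k$ close to 0: the sample is not species k
$\hat{y}_k$ close to 1: the sample is species k; the predicted label is $\arg\max_k \hat{y}_k$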
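The gradient used in the training code, X.T.dot(y_pred - y), rests on the fact that for softmax with cross-entropy the derivative of the loss with respect to each score z_k is the predicted probability minus the one-hot target. The following sketch checks this numerically for a single sample; the scores `z` and target `y` are made-up illustrative values, not data from input.csv.

```python
import math

def softmax(z):
    # Shift by the maximum score for numerical stability
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(z, y):
    # Loss of one sample with one-hot target y
    p = softmax(z)
    return -sum(yk * math.log(pk) for yk, pk in zip(y, p))

z = [2.0, 1.0, 0.1]   # illustrative scores for three classes
y = [0, 1, 0]         # one-hot target: the true class is index 1

p = softmax(z)
analytic = [pk - yk for pk, yk in zip(p, y)]   # claimed gradient dL/dz

# Central finite differences as an independent estimate of dL/dz
eps = 1e-6
numeric = []
for k in range(len(z)):
    z_plus, z_minus = z.copy(), z.copy()
    z_plus[k] += eps
    z_minus[k] -= eps
    numeric.append((cross_entropy(z_plus, y) - cross_entropy(z_minus, y)) / (2 * eps))

max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_diff)
```

Multiplying this per-sample term by the inputs and averaging over the samples gives the matrix form used in the training loop.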
b) (1 point) Run the program and report the labels of the 60 samples in output.csv
# Answer: Paste the code that executed successfully
# Read the data from output.csv
data = pd.read_csv('output.csv', header=None)
X_test = data.values
# Predict the labels for the data in output.csv
X_test = np.hstack((np.ones((X_test.shape[0], 1)), X_test))
predictions = np.argmax(softmax(X_test.dot(w)), axis=1)
predicted_labels = le.inverse_transform(predictions)
# Print the results
for i, label in enumerate(predicted_labels):
    print(f'{i+1}: {label}')
# Answer: Paste the resulting labels for the 60 samples
1: Iris-setosa
2: Iris-setosa
3: Iris-setosa
4: Iris-setosa
5: Iris-setosa
6: Iris-setosa
7: Iris-setosa
8: Iris-setosa
9: Iris-setosa
10: Iris-setosa
11: Iris-setosa
12: Iris-setosa
13: Iris-setosa
14: Iris-setosa
15: Iris-setosa
16: Iris-setosa
17: Iris-setosa
18: Iris-setosa
19: Iris-setosa
20: Iris-setosa
21: Iris-versicolor
22: Iris-versicolor
23: Iris-versicolor
24: Iris-versicolor
25: Iris-versicolor
26: Iris-versicolor
27: Iris-versicolor
28: Iris-versicolor
29: Iris-versicolor
30: Iris-versicolor
31: Iris-versicolor
32: Iris-versicolor
33: Iris-versicolor
34: Iris-versicolor
35: Iris-versicolor
36: Iris-versicolor
37: Iris-versicolor
38: Iris-versicolor
39: Iris-versicolor
40: Iris-versicolor
41: Iris-virginica
42: Iris-virginica
43: Iris-virginica
44: Iris-virginica
45: Iris-virginica
46: Iris-virginica
47: Iris-virginica
48: Iris-virginica
49: Iris-virginica
50: Iris-virginica
51: Iris-virginica
52: Iris-virginica
53: Iris-virginica
54: Iris-virginica
55: Iris-virginica
56: Iris-virginica
57: Iris-virginica
58: Iris-virginica
59: Iris-virginica
60: Iris-virginica
Question 3 (3 points): Given the dataset input.csv with 90 samples as in Question 2, write a clustering program using
the k-means algorithm.
a) (1 point) Write a function that implements the k-means algorithm
# Answer: Paste the code below
import pandas as pd
import numpy as np

def initialize_K_centroids(X, K):
    # Pick K distinct samples at random as the initial centroids
    return X[np.random.choice(range(len(X)), K, replace=False), :]

def find_closest_centroids(X, centroids):
    # Assign each sample to the index of its nearest centroid (Euclidean distance)
    m = len(X)
    c = np.zeros(m, dtype=int)
    for i in range(m):
        distances = np.linalg.norm(X[i] - centroids, axis=1)
        c[i] = np.argmin(distances)
    return c

def compute_means(X, idx, K):
    # Recompute each centroid as the mean of its assigned samples
    # (assumes every cluster has at least one sample)
    _, n = X.shape
    centroids = np.zeros((K, n))
    for k in range(K):
        points_belong_k = X[np.where(idx == k)]
        centroids[k] = np.mean(points_belong_k, axis=0)
    return centroids

def find_k_mean(X, K, max_iter=10):
    # Alternate the assignment and update steps for a fixed number of iterations
    centroids = initialize_K_centroids(X, K)
    for i in range(max_iter):
        idx = find_closest_centroids(X, centroids)
        centroids = compute_means(X, idx, K)
    return centroids, idx
b) (2 points) If the k-means algorithm is used with k = 3, what is the clustering result? (Cluster
centroids, correct-clustering rate, and the criterion used to judge that the clustering is correct.)
# Answer: write the answer below

1. Cluster centroids (print the centroids of the 3 clusters):
[[5.92 2.78 4.41 1.43]
[4.99 3.38 1.48 0.25]
[6.84 3.09 5.68 2.08]]
2. Correct-clustering rate (result in %):

90%
3. Evaluation criterion for the clustering (in words)
The clustering is evaluated against the data labels: each cluster is assigned the label that occurs most often among
its samples, and the correct-clustering rate is the fraction of samples whose label matches their cluster's majority label.
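A minimal sketch of how the majority-label criterion could be computed. The function name `cluster_purity` and the toy assignments below are illustrative assumptions and are not taken from input.csv.

```python
from collections import Counter

def cluster_purity(cluster_ids, labels):
    """Fraction of samples whose label equals the majority label of their cluster."""
    by_cluster = {}
    for c, lab in zip(cluster_ids, labels):
        # Count how often each label appears inside each cluster
        by_cluster.setdefault(c, Counter())[lab] += 1
    # In each cluster, the samples carrying the most common label count as correct
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(labels)

# Toy example: 8 samples in 3 clusters
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["a", "a", "b", "b", "b", "c", "c", "a"]
rate = cluster_purity(cluster_ids, labels)
print(f"{rate:.0%}")
```

With the exam's data, `cluster_ids` would be the `idx` array returned by `find_k_mean` and `labels` the species column of input.csv.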
Đà Nẵng, May 14, 2023

EXAM AUTHOR (LECTURER) HEAD OF DEPARTMENT
(approved)
