K-means Algorithm Implemented in Python

Editor: Python
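import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs


def calc_distance(dataset, centroids):
    # Euclidean distance from every sample to every centroid, shape (n, m)
    n, l = dataset.shape
    m, l = centroids.shape
    dataset = dataset.reshape(n, 1, l)
    centroids = centroids.reshape(1, m, l)
    # Broadcasting (n, 1, l) - (1, m, l) gives an (n, m, l) array of differences
    squared_sum = np.sum(np.square(dataset - centroids), axis=-1)
    distance = np.sqrt(squared_sum)
    return distance


def kmeans(dataset, k):
    data_num = len(dataset)
    # Column 0: index of the cluster each sample belongs to
    # Column 1: distance from the sample to its cluster centroid (the error)
    clusterAssment = np.zeros((data_num, 2))
    # Pick k distinct samples as initial centroids; cast to float so means are not truncated
    centroids = dataset[np.random.choice(data_num, k, replace=False)].astype(float)
    # Start from -1 (not a valid cluster index) so the first pass never looks converged
    last_nearest = np.full((data_num,), -1)
    while True:
        distances = calc_distance(dataset, centroids)
        current_nearest = np.argmin(distances, axis=1)
        # Record assignments before the convergence check so the returned
        # distances are measured against the final centroids
        clusterAssment = np.hstack([np.expand_dims(current_nearest, axis=1),
                                    np.expand_dims(distances[np.arange(data_num), current_nearest], axis=1)])
        # Stop once no sample changes cluster
        if (last_nearest == current_nearest).all():
            break
        # Update each centroid to the mean of its members
        for idx in range(k):
            members = dataset[current_nearest == idx]
            # Leave the centroid of an empty cluster unchanged to avoid a NaN mean
            if len(members) > 0:
                centroids[idx] = np.mean(members, axis=0)
        last_nearest = current_nearest
    return centroids, clusterAssment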
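
To make the broadcasting in the distance calculation easier to follow, here is a small sketch that works through a single assignment and update step on three hand-picked points. The helper name `pairwise_euclidean` and the sample data are illustrative assumptions, not part of the original listing; the values are chosen so the distances come out as whole numbers (3-4-5 triangles).

```python
import numpy as np


def pairwise_euclidean(points, centers):
    # Hypothetical helper for illustration only.
    # (n, 1, l) - (1, m, l) broadcasts to (n, m, l); summing the squared
    # differences over the last axis leaves an (n, m) distance matrix.
    diff = points[:, np.newaxis, :] - centers[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


# Three 2-D samples and two starting centroids (made-up values)
points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
centers = np.array([[0.0, 0.0], [3.0, 4.0]])

# Assignment step: each sample goes to its nearest centroid
distances = pairwise_euclidean(points, centers)
labels = np.argmin(distances, axis=1)

# Update step: each centroid moves to the mean of its assigned samples
new_centers = np.array([points[labels == j].mean(axis=0)
                        for j in range(len(centers))])

print(distances)    # [[ 0.  5.] [ 5.  0.] [10.  5.]]
print(labels)       # [0 1 1]
print(new_centers)  # [[0.  0. ] [4.5 6. ]]
```

The second centroid moves from (3, 4) to (4.5, 6), the mean of the two samples assigned to it. Repeating these two steps until the labels stop changing is what the loop in `kmeans` does.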
Copyright © 程式師世界 All Rights Reserved