標題: 分群法及其應用
Clustering with Applications
作者: 林武杰
Wu-Ja Lin
林志青
Ja-Chen Lin
資訊科學與工程研究所
關鍵字: 分群法;解析式公式;二分法;色彩減用;彩色影像壓縮;禁忌式搜尋法;特徵保留;clustering;analytical formulas;bisection;color quantization;color image compression;tabu search;feature-preserving
公開日期: 1998
摘要: 在本論文中,我們提出了兩個產生群代表點的方法來解決分群的問題。 第一個方法是用來解決二群分群的問題, 第二個則是針對多群分群,尋找能產生接近最小平方誤差和之群代表點。 我們所提的第一個方法可以使產生的兩個群代表點所成系統保留住原始資料的 中心位置、變異度、及平均徑長。事實上,藉由保留這些資料分佈特性, 我們推導出一些公式,可以用來計算欲找尋之群代表點。 這些公式被證明具有下列幾項性質: (一)找出之群代表點始終是實數值, (二)移動之不變性, (三)尺度大小之不變性, (四)二群分佈比例之計算公式不受旋轉影響。 我們所設計的第一個方法其優點是快速,而且公式可適用於任何維度的資料。 我們所設計的第二個方法是用來找尋產生接近最小平方誤差和之群代表點。 在本方法中,我們利用禁忌式搜尋法來引導我們找到較佳之群代表點, 這是因為禁忌式搜尋法具有跳脫局部最佳解的能力。 此外,我們在搜尋過程中是利用具有潛力的解, 而非單就解本身的好壞來決定搜尋的路徑。 在和「k 個代表點法」及「模擬退火法」比較後, 實驗結果顯示,我們的方法可以在找尋好的群代表點和縮短執行時間之間取得一個良好的平衡點。 我們也在本論文中,將所提的第一個找尋群代表點之方法應用在色彩減用及彩色影像壓縮上。 在色彩減用上,我們要求色彩減用後的影像要保留住原圖的平均色、色彩對比、 及色彩活性。實驗結果顯示,利用我們的方法所產生的色彩減用影像其品質不錯, 而且執行時間也非常出色。在彩色影像壓縮上, 我們將我們的分群法搭配我們另外提出的一種新的粗估影像產生法。 這種粗估影像產生法是利用線性模型來粗估影像, 並且要求粗估的影像要保留住原圖的平均色及色彩對比。 實驗結果顯示,我們設計的影像壓縮法與 JPEG法相比頗具競爭性。
In the dissertation, we proposed two clustering methods which generate cluster representatives. The first method handled the bisection problem, and the second method focused on minimizing the sum of square errors for multiple-class clustering. We designed the first method, i.e., the bisection method, in a way that the mean, variance, and average radius of the input data could be preserved. More specifically, by preserving these data distribution features, we derived some formulas which could be used to calculate cluster representatives. The derived formulas were then proved to have the following properties: (1) always yield real value solution, (2) translation invariance, (3) scaling invariance, and (4) the formulas computing population ratios satisfy rotation invariance. The advantage of the first method is that it is fast and the formulas can be applied to data of any dimension. The second method that we designed was in order to minimize the sum of the square distances (in$l_2$ norm) between the data points and the representatives of the clusters to which they belonged. In this method, we used the tabu search (T.S.) to get a good searching path of cluster representatives for multiple-class clustering. The T.S. was used because T.S. could avoid being trapped in a local minimum. In our algorithm, the potential, instead of the error introduced, was used to judge which solution should be used to start a new searching path. We compared our method with the k-means algorithm and the Simulated Annealing algorithm, and found that our method made a good compromise between the quality of the searched cluster representatives and the execution time. Finally, we applied our bisection tool to color palette design and color image compression. In color palette design, we required that the average color, color contrast, and color activity of the original image are all preserved in the quantized image. The experimental results showed that the obtained image quality was good and the computation time was also attractive. In color image compression, we combined our bisection clustering method with a new image approximation method proposed by us which used a linear function to generate a roughly approximated image, and in the roughly approximated image the average color and color contrast of the original image were both preserved. The experimental results showed that our image compression method can compete with the JPEG method.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT870394026
http://hdl.handle.net/11536/64165
顯示於類別:畢業論文