程式扎記: [ ML Foundation ] Section2 : Learning to Answer Yes/No

Thursday, August 13, 2015

[ ML Foundation ] Section2 : Learning to Answer Yes/No - PLA (Part1)

Source From Here
Introduction
The first lecture introduced what Machine Learning is, along with examples of its applications. In essence:

A takes D and H to get g

Here A is the Learning Algorithm, D is the Training Data, H is the Hypothesis Set, and g is the learned result that approximates the target function f:

This lecture (Lecture 2: Learning to Answer Yes/No) introduces PLA (Perceptron Learning Algorithm) and uses this simple algorithm to show how ML works:

* Perceptron Hypothesis Set
* Perceptron Learning Algorithm (PLA)
* Guarantee of PLA
* Non-Separable Data

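Perceptron Hypothesis Set
ML is a toolkit; it carries no meaning by itself, and its value comes from the problems it solves. So consider a classic problem, "Credit Approval Problem Revisited". A bank gives you records of past credit applicants (Training Data - X), and from its history it knows whether each of those customers eventually repaid or defaulted (Training Data - Y). ML (for which we must choose a Hypothesis Set and a Learning Algorithm) uses this data to train a g that tells you, for a future customer x, whether they will be classified as y="will repay" or "will not repay".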

When using ML with known Training Data (X/Y), the first question to face is the choice of "Hypothesis Set". Below is a description of the Hypothesis Set used by the Perceptron:

The formula above can be tidied into a more general form by absorbing the threshold into the weights:
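The two formulas this passage refers to are not reproduced in this copy. In the standard perceptron notation they would read roughly as follows, assuming d features, a constant x_0 = +1, and w_0 = -threshold (these symbols are my assumption, not taken from the missing figure):

```latex
% Threshold form: approve (+1) when the weighted score exceeds the threshold
h(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=1}^{d} w_i x_i - \text{threshold} \right)

% General form: fold the threshold in as w_0 = -\text{threshold}, with x_0 = +1
h(\mathbf{x}) = \operatorname{sign}\!\left( \sum_{i=0}^{d} w_i x_i \right)
              = \operatorname{sign}\!\left( \mathbf{w}^{T} \mathbf{x} \right)
```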

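Here X can be viewed as a customer's feature vector (income, age, etc.), and W is the weight vector that determines Y (+1 or -1) from those features. During training, X and Y are known and W is the result of training; afterwards W is used to predict the output Y for a given X. If X is two-dimensional, h(X) looks like a line separating Y = +1 from Y = -1: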
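To make the role of W and X concrete, here is a minimal sketch of the perceptron hypothesis in Python. It is not the GML implementation; the function name, the example weights, and the choice to map a score of exactly 0 to -1 are all assumptions made for illustration.

```python
from typing import Sequence


def perceptron_predict(w: Sequence[float], x: Sequence[float]) -> int:
    """Return +1 or -1 for feature vector x; w[0] is the bias (-threshold)."""
    score = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    # Convention assumed here: a score of exactly 0 counts as -1.
    return 1 if score > 0 else -1


# Hypothetical weights for two features: bias -1, then +1 and -1.
w = [-1.0, 1.0, -1.0]
print(perceptron_predict(w, [5.0, 2.0]))  # score = -1 + 5 - 2 = 2
print(perceptron_predict(w, [1.0, 3.0]))  # score = -1 + 1 - 3 = -3
```

In two dimensions the points with score 0 form the separating line mentioned above.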

Perceptron Learning Algorithm (PLA)
So how do we actually use X and Y in training to find g? The basic idea is to start from an initial g0 and then learn from mistakes until no better g can be found:

Training therefore uses each wrong prediction to correct W, repeating until g misclassifies as few training points as possible:
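The correction usually used by PLA is w ← w + y·x for a misclassified sample (x, y). The sketch below applies one such step and shows that it raises y·(w·x) by exactly ‖x‖², which is why the update pushes the boundary toward the right side of that point. The helper names and the sample values are assumptions for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def correct(w, x, y):
    """One PLA correction: w <- w + y * x (x already includes the leading 1)."""
    return [wi + y * xi for wi, xi in zip(w, x)]


w = [0.0, 1.0, -1.0]
x, y = [1.0, 1.0, 3.0], 1       # w.x = 0 + 1 - 3 = -2, but the label is +1

before = y * dot(w, x)          # negative: the point is misclassified
w_new = correct(w, x, y)
after = y * dot(w_new, x)       # grows by dot(x, x)

print(before, after, dot(x, x))
```

A single correction is not guaranteed to fix the point it was triggered by; it only moves the score in the right direction.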

One simple and intuitive implementation is called Cyclic PLA:
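As a rough sketch of the idea (not GML's `PLA.cyclic`, whose internals are not shown here), Cyclic PLA visits the samples in a fixed order, corrects W on every mistake, and stops once a full cycle passes with no mistake. The function name, the zero initial weights, and the toy data are assumptions; the loop also assumes the data is linearly separable, otherwise it never returns.

```python
def cyclic_pla(X, y):
    """Cyclic PLA sketch. Returns (w, updates); w[0] is the bias weight.

    Assumes linearly separable data; there is no iteration cap.
    """
    data = [[1.0] + [float(v) for v in xi] for xi in X]  # prepend x0 = 1
    w = [0.0] * len(data[0])
    updates = 0
    clean_streak = 0   # consecutive samples classified correctly
    i = 0
    while clean_streak < len(data):
        xi, yi = data[i], y[i]
        score = sum(wj * xj for wj, xj in zip(w, xi))
        if yi * score <= 0:            # wrong side, or exactly on the boundary
            w = [wj + yi * xj for wj, xj in zip(w, xi)]
            updates += 1
            clean_streak = 0
        else:
            clean_streak += 1
        i = (i + 1) % len(data)        # move on to the next sample, cyclically
    return w, updates


# Toy data, made up for this sketch.
X = [[2, 2], [3, 1], [-1, -2], [-2, -1]]
y = [1, 1, -1, -1]
w, updates = cyclic_pla(X, y)
print(w, updates)
```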

An example learning run looks like this: first pick an initial line g0, then check each point in turn. Whenever a point is misclassified, update g0 to g':

If the data is linearly separable, the proof later on shows that PLA is guaranteed to find a g that correctly classifies every point of the training data:

That raises the following questions:

* If the training data is not linearly separable, when should PLA stop?
* If the training data is linearly separable, how many corrections does PLA need before it halts?
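The first question can be seen in a small experiment. On XOR-style labels no line separates the two classes, so plain PLA keeps correcting forever. Capping the number of passes is one simple way to force a stop; this is an assumption of this sketch, not the answer the lecture gives in its Non-Separable Data part. All names below are hypothetical.

```python
def pla_with_cap(X, y, max_passes=1000):
    """Run PLA passes over the data; give up after max_passes.

    Returns (w, converged), where converged is True only if some full
    pass produced no mistakes.
    """
    data = [[1.0] + [float(v) for v in xi] for xi in X]  # prepend x0 = 1
    w = [0.0] * len(data[0])
    for _ in range(max_passes):
        mistakes = 0
        for xi, yi in zip(data, y):
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
        if mistakes == 0:
            return w, True
    return w, False


# XOR-style labels: no straight line separates them.
xor_X = [[0, 0], [1, 1], [0, 1], [1, 0]]
xor_y = [-1, -1, 1, 1]
_, xor_converged = pla_with_cap(xor_X, xor_y)

# A trivially separable pair, for contrast.
_, easy_converged = pla_with_cap([[2, 2], [-2, -2]], [1, -1])

print(xor_converged, easy_converged)
```

When the cap is hit, the returned w is simply whatever PLA held at that moment, with no guarantee about its quality.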

Readers who are interested can continue to the "Guarantee of PLA" section; many of the proofs and explanations are skipped here.

Experiment Lab
Next, let's see how to use PLA in code. The example below uses the PLA class from the GML package:
- Step 1: Prepare the training data

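// 1) Prepare Training Data
// (the imports for DataInXYChart and RefineryUtilities are not shown in the original post)
// x: 2D feature vectors; y: the label (+1 or -1) of each vector, in the same order
def x = [[1,7], [1,2], [1,4], [-1,3], [-4,-2], [-3,2], [3,-2], [-2, -11], [2.5, -15], [-1, -12], [1, 22]]
def y = [1,1,1,-1,-1,-1,1, -1, 1, 1, -1]
DataInXYChart demo = new DataInXYChart("Training Data", x, y)

// Size the window, center it on screen, and show it
demo.pack()
RefineryUtilities.centerFrameOnScreen(demo)
demo.setVisible(true)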

The DataInXYChart object in the code above plots the training data in the 2D plane; the result looks like this:

(Red points mark the positions in the plane of the samples x whose label is y=1.)
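Since the chart is not reproduced here, a quick check in Python can stand in for it. The sketch below tests whether one hand-picked line separates the training data above. The weight vector is my own choice for illustration, not the one GML learns. If every margin is positive, the data is linearly separable, so the PLA guarantee discussed earlier applies to it.

```python
# Training data copied from the Groovy snippet above.
X = [[1, 7], [1, 2], [1, 4], [-1, 3], [-4, -2], [-3, 2], [3, -2],
     [-2, -11], [2.5, -15], [-1, -12], [1, 22]]
y = [1, 1, 1, -1, -1, -1, 1, -1, 1, 1, -1]

# Hand-picked for illustration (NOT GML's output): the line 10*x1 - x2 = 0.
w = [0.0, 10.0, -1.0]

# margin > 0 means the point lies on the side matching its label
margins = [yi * (w[0] + w[1] * x1 + w[2] * x2) for (x1, x2), yi in zip(X, y)]
separable_by_w = all(m > 0 for m in margins)

print(separable_by_w)
print(min(margins))
```

The smallest margin belongs to the point closest to the line, which is the point a learning algorithm would find hardest to get right.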

- Step 2: Run PLA training

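// 2) Train a classifier with cyclic PLA on the data from step 1
PLA pla = new PLA()
PLAClassify cfy = pla.cyclic(x, y)
// Print cfy.loop (presumably the number of iterations) and each learned weight
printf("\t[Info] Weighting Matrix(%d):\n", cfy.loop)
cfy.w.eachWithIndex{v, i->printf("\t\tw[%d]=%s\n", i, v)}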

The output is:

- Step 3: Predict on the testing data
We can then use testing data to check the classifier that PLA just trained:

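// 3) Classify the testing data and plot it together with the training data
def t = [[1,3], [-4,1], [2,2], [3,6], [-1,9], [3, 39]]   // testing points
def r = [1, -1, 1, 1, -1, -1]                            // expected labels
def p = []                                               // labels to plot
t.eachWithIndex{ v, i->
    def e = cfy.classify(v)
    printf("\t[Info] %s is classified as %d\n", v, e)
    if(e==r[i]) p.add(e) // Correct
    else p.add(3) // Miss
}
// Append the training data so both sets appear in the same chart
t.addAll(x)
p.addAll(y)

// Named "result" so it does not clash with "demo" from step 1 in the same script
DataInXYChart result = new DataInXYChart("Training Data", t, p, cfy.w)
result.pack()
RefineryUtilities.centerFrameOnScreen(result)
result.setVisible(true)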

The output is as follows:

The chart below shows clearly that the resulting w (green line) separates most of the predictions correctly:

(Yellow points are the mispredicted ones.)
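The miss-marking logic in step 3 can be sketched outside GML as well. The Python below reuses the testing points and expected labels from the post and marks each wrong prediction with 3, as the Groovy snippet does. The weight vector is hypothetical and chosen only so that one miss shows up; it is not the w that GML learned, and the accuracy it yields says nothing about the real run.

```python
MISS = 3  # same marker value the Groovy snippet uses for a wrong prediction


def classify(w, x):
    """Perceptron prediction; w[0] is the bias weight."""
    score = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if score > 0 else -1


def mark_predictions(w, points, expected):
    """Keep the label where the prediction is right, MISS where it is wrong."""
    return [e if classify(w, p) == e else MISS
            for p, e in zip(points, expected)]


t = [[1, 3], [-4, 1], [2, 2], [3, 6], [-1, 9], [3, 39]]
r = [1, -1, 1, 1, -1, -1]

w = [0.0, 1.0, 0.0]  # hypothetical weights: predict +1 whenever x1 > 0
marks = mark_predictions(w, t, r)
accuracy = sum(m != MISS for m in marks) / len(marks)

print(marks)
print(accuracy)
```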
