Week 1 The Learning Problem
What is machine learning
- learning : acquiring skill with experience accumulated from observations
- machine learning : acquiring skill with experience accumulated/computed from data
- skill : improve some performance measure
Key Essence of Machine Learning
- exists some underlying pattern to be learned - so performance measure can be improved
- but no programmable (easy) definition - so ML is needed
- somehow there is data about the pattern - so ML has some inputs to learn
Machine Learning vs Data Mining
- ML : use data to compute hypothesis g that approximates target f
- DM: use (huge) data to find property that is interesting
if the interesting property is the same as the hypothesis that approximates the target, ML = DM
if the interesting property is related to the hypothesis that approximates the target, ML helps DM or DM helps ML
Machine Learning vs Artificial Intelligence
ML can realize AI
- AI : compute something that shows intelligent behavior
- e.g., Go-playing AI: AlphaGo is just one of several means of realizing AI
Machine Learning vs Statistics
statistics can be used to achieve ML
- Statistics : use data to make inference about an unknown process
- In ML , g is an inference outcome
Week 2 Learning to Answer Yes/No
Perceptron Learning Algorithm
- the perceptron is a linear classifier for binary (yes/no, right/wrong) problems
- treating the weights as a vector, find a line that separates the two regions; keep testing and correcting the line until no mistakes remain (a minimal sketch follows below)
- on a linearly separable problem a solution is guaranteed to be found, but we cannot tell in advance how long it will take
- in practice we usually don't know whether the problem is linearly separable
- http://beader.me/2013/12/21/perceptron-learning-algorithm
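A minimal NumPy sketch of PLA under the usual conventions (labels in {-1, +1}, a constant feature x0 = 1 for the threshold); the function name and iteration cap are my own illustration:

```python
import numpy as np

def pla(X, y, max_iters=10000):
    """Perceptron Learning Algorithm: correct one mistake at a time."""
    X = np.hstack([np.ones((len(X), 1)), X])  # prepend x0 = 1 (bias)
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        mistakes = np.nonzero(np.sign(X @ w) != y)[0]
        if len(mistakes) == 0:     # no mistakes left: data separated
            return w
        i = mistakes[0]            # pick a misclassified point
        w = w + y[i] * X[i]        # the PLA update: w <- w + y_n * x_n
    return w                       # cap reached: may not have converged
```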
Pocket Algorithm
- a modification of PLA (see the sketch after this list)
- on each iteration, first measure how badly the current weights err on the whole data set
- keep the best weights seen so far (in the "pocket")
- "ride the donkey while looking for the horse": use the best solution found so far while still searching for a better one
- adds computational cost, since the error must be re-evaluated after every update
- better suited to problems that are not linearly separable
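A sketch of the pocket idea, reusing the PLA conventions above (the random-mistake choice and iteration cap are my own assumptions):

```python
import numpy as np

def pocket(X, y, max_iters=1000, seed=0):
    """Pocket algorithm: run PLA updates, but remember the best weights."""
    rng = np.random.default_rng(seed)
    X = np.hstack([np.ones((len(X), 1)), X])  # prepend x0 = 1 (bias)
    w = np.zeros(X.shape[1])
    best_w = w.copy()
    best_err = np.mean(np.sign(X @ w) != y)
    for _ in range(max_iters):
        mistakes = np.nonzero(np.sign(X @ w) != y)[0]
        if len(mistakes) == 0:
            return w                       # separable after all
        i = rng.choice(mistakes)           # correct a random mistake
        w = w + y[i] * X[i]
        err = np.mean(np.sign(X @ w) != y)
        if err < best_err:                 # pocket any improvement
            best_w, best_err = w.copy(), err
    return best_w                          # best weights seen overall
```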
Week 3 Types of Learning
Learning with different output space
- binary classification
- multiclass classification
- regression : the output is a numeric value, possibly within some range
- structured learning : the output is a structured sequence, e.g., the grammatical structure of a sentence
Learning with different labels
- supervised learning : all data is labeled
- unsupervised learning : no labels
- semi-supervised learning : only part of the data is labeled, often because labeling is too costly; the partial labels speed up the otherwise unsupervised learning
- reinforcement learning : learn from reward and punishment
Learning with different protocol
- batch : learn from all the data that is already available
- online : update the model dynamically as the newest data arrives
- active learning : the machine asks a human for labels, e.g., while you browse the web Google may suddenly ask you what an image shows
Learning with different input space
- concrete : explicit features built from domain knowledge, e.g., color, size, corner points
- raw : the simplest physical meaning, e.g., the pixel values of an image
- abstract : no physical meaning, e.g., a data ID or serial number that merely denotes an ordering
Week 4 Feasibility of Learning
Learning is impossible
- without assumptions or conditions, a problem admits infinitely many explanations, and the "learning" is illusory
- e.g.: f(5,3,2) = 151022 , f(7,2,5) = ? there is in fact no standard answer; the question-setter has an explanation to rebut (or accept) any answer you give
- No free lunch
- no method works universally; each suits only certain conditions and assumptions, and fails in other situations
Probability to the Rescue
- random sampling
- Hoeffding's Inequality : the result measured on a random sample is close to the true population value, and the guarantee strengthens with sample size; the larger the sample, the closer the two become (see the inequality below)
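In the lecture's bin-model notation, with sample frequency ν estimating population frequency μ from N draws, the inequality reads:

```latex
P\bigl[\,|\nu - \mu| > \epsilon\,\bigr] \le 2\exp\bigl(-2\epsilon^{2}N\bigr)
```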
Connection to Learning
- probability shows that when there is enough data, the g that ML produces is approximately equal to f (its in-sample error is close to its out-of-sample error)
Connection to Real Learning
- iterating produces many candidate g
- a g can still be wrong when you are unlucky or the data sample is bad
- e.g.: if you flip a coin 5 times and get heads every time, your sample estimate is 100% heads, yet the true probability of heads is 50% (a simulation follows below)
- Hoeffding's Inequality shows that among all the candidate g, the misleading ones are rare
- when there is enough data and the hypothesis set is finite, learning is possible
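A small simulation of the coin example (the trial counts are my own choice): with many fair coins, some coin almost surely looks like a 100%-heads coin by luck alone, which is why the number of hypotheses matters.

```python
import numpy as np

rng = np.random.default_rng(0)
trials, coins, flips = 10000, 150, 5

# Flip 150 fair coins 5 times each; check how often at least one coin
# comes up 5/5 heads, i.e., looks perfect despite being a fair coin.
heads = rng.integers(0, 2, size=(trials, coins, flips))
any_all_heads = (heads.sum(axis=2) == flips).any(axis=1)
print(any_all_heads.mean())   # ~1 - (1 - 1/32)**150, about 0.99
```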
Week 5 Training versus Testing
Effective Number of Lines
- in the PLA example there are infinitely many hypotheses, but many lines classify the data identically, so the number of effectively different hypotheses can be reduced
- the number of effective hypotheses follows a growth function of N; the amount of data determines how many effectively different hypotheses there are (see the definition below)
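In the lecture's notation, the growth function counts the maximum number of dichotomies a hypothesis set H can produce on N points:

```latex
m_{\mathcal{H}}(N) = \max_{x_1,\dots,x_N}\bigl|\{(h(x_1),\dots,h(x_N)) : h \in \mathcal{H}\}\bigr| \le 2^{N}
```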
Break Point
(using PLA on 2D data with binary classification as the example)
- the theoretical maximum number of dichotomies is 2^N
- a break point is an N at which the growth function falls below this 2^N maximum
e.g., for 2D PLA with 4 points, the theoretical count is 2^4 = 16
but because we are restricted to straight-line splits, the actual count is smaller than 16 (only 14 dichotomies are achievable), so 4 is a break point
that is, from 4 points onward the growth function is restrained, and doing ML well becomes possible
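A quick check of the 14-of-16 claim by random search over lines (the four chosen points and the sampling approach are my own illustration; an exact feasibility test would use linear programming, but random search suffices for this easy case):

```python
import numpy as np

# Four points in general position (corners of the unit square).
pts = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
X = np.hstack([np.ones((4, 1)), pts])       # prepend x0 = 1 (bias)

rng = np.random.default_rng(0)
ws = rng.normal(size=(200_000, 3))          # random lines (w0, w1, w2)
signs = np.sign(ws @ X.T)                   # each row is one dichotomy
achieved = {tuple(row) for row in signs if 0 not in row}
print(len(achieved))                        # 14, not 2**4 = 16:
# the two XOR labelings (diagonal pairs sharing a sign) are unreachable
```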
Week 6 Theory of Generalization
- from the break point, derive a bounding function B(N, k)
- once the break point is found, the growth function can be bounded from it
- Vapnik-Chervonenkis bound
- the bounding function brings the effective number of hypotheses into the bound (formulas below)
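The two key formulas, in standard notation (B(N, k) is the bounding function for a break point k; the VC bound holds for any ε > 0):

```latex
B(N, k) \le \sum_{i=0}^{k-1} \binom{N}{i}, \qquad
P\bigl[\exists h \in \mathcal{H} : |E_{\mathrm{in}}(h) - E_{\mathrm{out}}(h)| > \epsilon\bigr]
\le 4\, m_{\mathcal{H}}(2N) \exp\!\Bigl(-\tfrac{1}{8}\epsilon^{2} N\Bigr)
```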