2015年11月29日 星期日

[ Algorithm ] Radix Sorts - LSD Radix Sort

Source From Here 
Introduction 
Least-significant-digit-first string sort (LSD Sort) is a fast stable sorting algorithm which can be used to sort keys in integer representation order. Keys may be a string of characters, or numerical digits in a given 'radix'. The processing of the keys begins at the least significant digit (i.e., the rightmost digit), and proceeds to the most significant digit (i.e., the leftmost digit). The sequence in which digits are processed by an LSD radix sort is the opposite of the sequence in which digits are processed by a most significant digit (MSD) radix sort. An LSD radix sort operates in O(nk) time, where n is the number of keys, and k is the average key length. 

Key-indexed Counting 
So far we have known below sorting algorithms which have frequency of operations = key compares. 



which has Lower bound. ~ N lg N compares required by any compare-based algorithm. Can we do better (despite the lower bound)? Yes, if we don't depend on key compares. Before we talk about Key-indexed counting, we have below assumption and implication: 

Assumption. Keys are integers between 0 and R - 1.
Implication. Can use key as an array index.
Remark. Keys may have associated data.

Below is a sorting example: 



Key-indexed counting demo 
Goal. Sort an array a[] of N integers between 0 and R - 1. The algorithm includes below steps: 

* Count frequencies of each letter using key as index.
* Compute frequency cumulates which specify destinations.
* Access cumulates using key as index to move items.
* Copy back into original array.

Below is the pseudo code and the testing data used in this demonstration: 



Firstly we count frequencies of each letter using key as index: 


Then we compute frequency cumulatively which specify destinations which help us in moving element from original list to sorted list: 


Next we start to move element from original list a to temple list aux


Finally, replace original list with sorted list: 


Analysis 
Proposition. Key-indexed counting uses ~ 11 N + 4 R array accesses to sort N items whose keys are integers between 0 and R - 1. Key-indexed counting uses extra space proportional to N + R and it is a stable sorting algorithm. 

LSD string (radix) sort 
Here explain how to use LSD algorithm to sorting string: 

* Consider characters from right to left.
* Stably sort using dth character as the key (using key-indexed counting).

Below is a simple example: 



Java Implementation 


What if strings do not have same length? Create a customized charAt API to handle out-of-boundary access. For example: 

  1. /** 
  2. * Return character as Integer of input String at specific position 
  3.  
  4. * @param val: String value 
  5. * @param d: Position of character 
  6.  
  7. * @return character at specific position of input string as integer. 
  8. */  
  9. int charAt(String val, d)  
  10. {  
  11.     if( d >= val.length()) return 0  
  12.     else return (int)val[d]  
  13. }  
Groovy Implementation 
The groovy implementation is from Git hub project AlgorithmCA
- LSDSortStr.groovy 
  1. package sort  
  2.   
  3. import java.util.List;  
  4.   
  5. class LSDSortStr extends SortAlgImp {  
  6.     int R = 10  
  7.        
  8.     /** 
  9.      * Return character as Integer of input String at specific position 
  10.      *  
  11.      * @param val: String value 
  12.      * @param d: Position of character 
  13.      *  
  14.      * @return character at specific position of input string as integer. 
  15.      */  
  16.     int charAt(String val, d)  
  17.     {  
  18.         if( d >= val.length()) return 0  
  19.         else return (int)val[d]  
  20.     }  
  21.       
  22.     void sort(List list, int mlen)  
  23.     {  
  24.         int N = list.size()  
  25.         def aux = []; N.times{aux[it]=null}  
  26.         def count = new int[R+1] as int[]; (R+1).times{count[it]=0}  
  27.         mlen.times{ d->  
  28.             d = (mlen-d-1)  
  29.             // Count frequencies of each letter using key as index.  
  30.             for(String str:list) {                
  31.                 count[charAt(str, d)+1]++                 
  32.             }  
  33.               
  34.             // Compute frequency cumulates which specify destinations.  
  35.             int p=0  
  36.             R.times{ r->               
  37.                 count[r+1]+=count[r]                  
  38.             }  
  39.               
  40.             // Access cumulates using key as index to move items.  
  41.             N.times{ n->               
  42.                 aux[count[charAt(list[n], d)]++] = list[n]  
  43.             }  
  44.               
  45.             // Copy back into original array.  
  46.             N.times{ n->  
  47.                 list[n] = aux[n]  
  48.             }  
  49.               
  50.             // Clean  
  51.             (R+1).times{count[it]=0}  
  52.         }  
  53.     }  
  54.   
  55.     @Override  
  56.     public void sort(List list) {  
  57.         int mlen=0  
  58.           
  59.         // Find the longest length among all strings -> mlen  
  60.         list.each{if(it.length()>mlen) mlen=it.length()}  
  61.           
  62.         // Start sorting with LSD  
  63.         sort(list, mlen)      
  64.     }  
  65.   
  66.     static void main(args)  
  67.     {  
  68.         def data = ["John""Joseph""Ken""Mary""Peter""johnson",  
  69.                     "Tim""Tom""Anderson""Andy""Bob""Forry",  
  70.                     "Carter""Derek"]  
  71.         // Java Character has 16 bits (2 bytes) = 2^16 = 0~65535   
  72.         LSDSortStr lsdSort = new LSDSortStr(R:65536)  
  73.         lsdSort.test(data)  
  74.     }  
  75. }  
Summary of the performance of sorting algorithms 



Supplement 
[ 資料結構 小學堂 ] 排序 : 基數排序法

[Python 學習筆記] 進階議題 : 修飾器 (函式修飾器)

轉載自 這裡 
前言 : 
你設計了一個點餐程式,目前主餐有炸雞,價格為49元 : 
  1. def friedChicken():  
  2.     return 49.0  
  3.   
  4. print(friedChicken()) # 49.0  
之後在幾個地方都 呼叫了 friedChicken() 函式來計算餐點價格,現在你打算增加附餐,但又不想直接修改 friedChicken() 函式,也不想另外增加一個 friedChickenside1() 函式,然後到處修改先前使用到 friedchicken() 函式的地方,則你可以這麼撰寫 : 
  1. def friedChicken():  
  2.     return 49.0  
  3.   
  4. print('Fried Chicken=', friedChicken()) # 49.0  
  5.   
  6. def sideDish1(meal):  
  7.     return lambda: meal() + 30  
  8.   
  9. friedChicken = sideDish1(friedChicken)  
  10. print('Fried Chicken Dish=',friedChicken()) # 79.0  
sideDish1() 接受函式物件,函式中使用 lambda 建立一個函式物件,該函式物件執行傳入的函式取得主餐價格,再加上附餐價格,sideDish1() 傳回所建立的函式物件給friedchicken參考,所以之後執行的 friedChicken(),就會是主餐加附餐的價格. 

函式修飾器 : 
以上是傳遞函式的一個應用. 在 Python 中,你還可以使用以下的語法 : 
- test2.py : 
  1. def sideDish1(meal):  
  2.     return lambda: meal() + 30  
  3.   
  4. @sideDish1  
  5. def friedChicken():  
  6.     return 49.0  
  7.   
  8. print('Fried Chicken Dish=', friedChicken()) # 79  
@ 之後所接上的名稱,實際上就是個函式@sideDish1 這樣的標注方式,讓 @sideDish1 更像是個修飾器(Decorator),將 friedChicken() 函式加以修飾,增加附餐價格! 

你可以堆疊修飾器,例如 : 
- test3.py : 
  1. def sideDish1(meal):  
  2.     return lambda: meal() + 30  
  3.   
  4. def sideDish2(meal):  
  5.     return lambda: meal() + 40  
  6.   
  7. @sideDish2  
  8. @sideDish1  
  9. def friedChicken():  
  10.     return 49.0  
  11.   
  12. print(friedChicken()) # 119.0  
實際上, 上面代碼等同 : 
  1. def sideDish1(meal):  
  2.     return lambda: meal() + 30  
  3.       
  4. def sideDish2(meal):  
  5.     return lambda: meal() + 40  
  6.   
  7. def friedChicken():  
  8.     return 49.0  
  9.   
  10. friedchicken = sideDish2(sideDish1(friedChicken))  
  11.   
  12. print(friedchicken())   # 119.0  
如果你的修飾器語法需要帶有參數,則記得,會先以參數執行一次修飾器,傳回函式物件再修飾指定的函式. 例如 : 
  1. @deco('param')  
  2. def func():  
  3.     pass  
實際上等於 : 
  1. func = deco('param')(func)  
所以若要讓點餐程式更有彈性一些,你可以這麼設計 : 
- test4.py : 
  1. def sideDish(number):  
  2.     return {  
  3.         1 : lambda meal: (lambda: meal() + 30),  
  4.         2 : lambda meal: (lambda: meal() + 40),  
  5.         3 : lambda meal: (lambda: meal() + 50),  
  6.         4 : lambda meal: (lambda: meal() + 60)  
  7.         }.get(number, lambda meal: (lambda: meal()))  
  8.   
  9. @sideDish(2)  
  10. @sideDish(3)  
  11. def friedChicken():  
  12.     return 49.0  
  13.   
  14. print('Side Dish3+2=', friedChicken()) # 139.0  
以上的程式都是使用 lambda 建立傳回的函式,若不易理解,以下這個是個較清楚的版本 : 
  1. def sidedish(number):  
  2.     def dish1(meal):  
  3.         return lambda: meal() + 30  
  4.           
  5.     def dish2(meal):  
  6.         return lambda: meal() + 40  
  7.           
  8.     def dish3(meal):  
  9.         return lambda: meal() + 50  
  10.           
  11.     def dish4(meal):  
  12.         return lambda: meal() + 60  
  13.           
  14.     def nodish(meal):  
  15.         return lambda: meal()  
  16.           
  17.     return {  
  18.         1 : dish1,  
  19.         2 : dish2,  
  20.         3 : dish3,  
  21.         4 : dish4  
  22.     }.get(number, nodish)  
  23.   
  24. @sidedish(2)  
  25. @sidedish(3)  
  26. def friedchicken():  
  27.     return 49.0  
  28.      
  29. print(friedchicken()) # 139.0  
Supplement 
[Python 學習筆記] 函式、類別與模組 : 函式 (lambda 運算式)

[Git 常見問題] error: The following untracked working tree files would be overwritten by merge

  Source From  Here 方案1: // x -----删除忽略文件已经对 git 来说不识别的文件 // d -----删除未被添加到 git 的路径中的文件 // f -----强制运行 #   git clean -d -fx 方案2: 今天在服务器上  gi...