2015年11月29日 星期日

[ Algorithm ] Radix Sorts - LSD Radix Sort

Source From Here
Introduction
Least-significant-digit-first string sort (LSD Sort) is a fast stable sorting algorithm which can be used to sort keys in integer representation order. Keys may be a string of characters, or numerical digits in a given 'radix'. The processing of the keys begins at the least significant digit (i.e., the rightmost digit), and proceeds to the most significant digit (i.e., the leftmost digit). The sequence in which digits are processed by an LSD radix sort is the opposite of the sequence in which digits are processed by a most significant digit (MSD) radix sort. An LSD radix sort operates in O(nk) time, where n is the number of keys, and k is the average key length.

Key-indexed Counting
So far we have known below sorting algorithms which have frequency of operations = key compares.

which has Lower bound. ~ N lg N compares required by any compare-based algorithm. Can we do better (despite the lower bound)? Yes, if we don't depend on key compares. Before we talk about Key-indexed counting, we have below assumption and implication:

Assumption. Keys are integers between 0 and R - 1.
Implication. Can use key as an array index.
Remark. Keys may have associated data.

Below is a sorting example:

Key-indexed counting demo
Goal. Sort an array a[] of N integers between 0 and R - 1. The algorithm includes below steps:

* Count frequencies of each letter using key as index.
* Compute frequency cumulates which specify destinations.
* Access cumulates using key as index to move items.
* Copy back into original array.

Below is the pseudo code and the testing data used in this demonstration:

Firstly we count frequencies of each letter using key as index:

Then we compute frequency cumulatively which specify destinations which help us in moving element from original list to sorted list:

Next we start to move element from original list a to temple list aux

Finally, replace original list with sorted list:

Analysis
Proposition. Key-indexed counting uses ~ 11 N + 4 R array accesses to sort N items whose keys are integers between 0 and R - 1. Key-indexed counting uses extra space proportional to N + R and it is a stable sorting algorithm.

LSD string (radix) sort
Here explain how to use LSD algorithm to sorting string:

* Consider characters from right to left.
* Stably sort using dth character as the key (using key-indexed counting).

Below is a simple example:

Java Implementation

What if strings do not have same length? Create a customized charAt API to handle out-of-boundary access. For example:

1. /**
2. * Return character as Integer of input String at specific position
3.
4. * @param val: String value
5. * @param d: Position of character
6.
7. * @return character at specific position of input string as integer.
8. */
9. int charAt(String val, d)
10. {
11.     if( d >= val.length()) return 0
12.     else return (int)val[d]
13. }
Groovy Implementation
The groovy implementation is from Git hub project AlgorithmCA
- LSDSortStr.groovy
1. package sort
2.
3. import java.util.List;
4.
5. class LSDSortStr extends SortAlgImp {
6.     int R = 10
7.
8.     /**
9.      * Return character as Integer of input String at specific position
10.      *
11.      * @param val: String value
12.      * @param d: Position of character
13.      *
14.      * @return character at specific position of input string as integer.
15.      */
16.     int charAt(String val, d)
17.     {
18.         if( d >= val.length()) return 0
19.         else return (int)val[d]
20.     }
21.
22.     void sort(List list, int mlen)
23.     {
24.         int N = list.size()
25.         def aux = []; N.times{aux[it]=null}
26.         def count = new int[R+1] as int[]; (R+1).times{count[it]=0}
27.         mlen.times{ d->
28.             d = (mlen-d-1)
29.             // Count frequencies of each letter using key as index.
30.             for(String str:list) {
31.                 count[charAt(str, d)+1]++
32.             }
33.
34.             // Compute frequency cumulates which specify destinations.
35.             int p=0
36.             R.times{ r->
37.                 count[r+1]+=count[r]
38.             }
39.
40.             // Access cumulates using key as index to move items.
41.             N.times{ n->
42.                 aux[count[charAt(list[n], d)]++] = list[n]
43.             }
44.
45.             // Copy back into original array.
46.             N.times{ n->
47.                 list[n] = aux[n]
48.             }
49.
50.             // Clean
51.             (R+1).times{count[it]=0}
52.         }
53.     }
54.
55.     @Override
56.     public void sort(List list) {
57.         int mlen=0
58.
59.         // Find the longest length among all strings -> mlen
60.         list.each{if(it.length()>mlen) mlen=it.length()}
61.
62.         // Start sorting with LSD
63.         sort(list, mlen)
64.     }
65.
66.     static void main(args)
67.     {
68.         def data = ["John""Joseph""Ken""Mary""Peter""johnson",
69.                     "Tim""Tom""Anderson""Andy""Bob""Forry",
70.                     "Carter""Derek"]
71.         // Java Character has 16 bits (2 bytes) = 2^16 = 0~65535
72.         LSDSortStr lsdSort = new LSDSortStr(R:65536)
73.         lsdSort.test(data)
74.     }
75. }
Summary of the performance of sorting algorithms

Supplement
[ 資料結構 小學堂 ] 排序 : 基數排序法

[Python 學習筆記] 進階議題 : 修飾器 (函式修飾器)

1. def friedChicken():
2.     return 49.0
3.
4. print(friedChicken()) # 49.0

1. def friedChicken():
2.     return 49.0
3.
4. print('Fried Chicken=', friedChicken()) # 49.0
5.
6. def sideDish1(meal):
7.     return lambda: meal() + 30
8.
9. friedChicken = sideDish1(friedChicken)
10. print('Fried Chicken Dish=',friedChicken()) # 79.0
sideDish1() 接受函式物件，函式中使用 lambda 建立一個函式物件，該函式物件執行傳入的函式取得主餐價格，再加上附餐價格，sideDish1() 傳回所建立的函式物件給friedchicken參考，所以之後執行的 friedChicken()，就會是主餐加附餐的價格.

- test2.py :
1. def sideDish1(meal):
2.     return lambda: meal() + 30
3.
4. @sideDish1
5. def friedChicken():
6.     return 49.0
7.
8. print('Fried Chicken Dish=', friedChicken()) # 79
@ 之後所接上的名稱，實際上就是個函式@sideDish1 這樣的標注方式，讓 @sideDish1 更像是個修飾器（Decorator），將 friedChicken() 函式加以修飾，增加附餐價格!

- test3.py :
1. def sideDish1(meal):
2.     return lambda: meal() + 30
3.
4. def sideDish2(meal):
5.     return lambda: meal() + 40
6.
7. @sideDish2
8. @sideDish1
9. def friedChicken():
10.     return 49.0
11.
12. print(friedChicken()) # 119.0

1. def sideDish1(meal):
2.     return lambda: meal() + 30
3.
4. def sideDish2(meal):
5.     return lambda: meal() + 40
6.
7. def friedChicken():
8.     return 49.0
9.
10. friedchicken = sideDish2(sideDish1(friedChicken))
11.
12. print(friedchicken())   # 119.0

1. @deco('param')
2. def func():
3.     pass

1. func = deco('param')(func)

- test4.py :
1. def sideDish(number):
2.     return {
3.         1 : lambda meal: (lambda: meal() + 30),
4.         2 : lambda meal: (lambda: meal() + 40),
5.         3 : lambda meal: (lambda: meal() + 50),
6.         4 : lambda meal: (lambda: meal() + 60)
7.         }.get(number, lambda meal: (lambda: meal()))
8.
9. @sideDish(2)
10. @sideDish(3)
11. def friedChicken():
12.     return 49.0
13.
14. print('Side Dish3+2=', friedChicken()) # 139.0

1. def sidedish(number):
2.     def dish1(meal):
3.         return lambda: meal() + 30
4.
5.     def dish2(meal):
6.         return lambda: meal() + 40
7.
8.     def dish3(meal):
9.         return lambda: meal() + 50
10.
11.     def dish4(meal):
12.         return lambda: meal() + 60
13.
14.     def nodish(meal):
15.         return lambda: meal()
16.
17.     return {
18.         1 : dish1,
19.         2 : dish2,
20.         3 : dish3,
21.         4 : dish4
22.     }.get(number, nodish)
23.
24. @sidedish(2)
25. @sidedish(3)
26. def friedchicken():
27.     return 49.0
28.
29. print(friedchicken()) # 139.0
Supplement
[Python 學習筆記] 函式、類別與模組 : 函式 (lambda 運算式)

[Git 文章收集] Differences between git merge and git rebase

Source From  Here Preface Merging and rebasing are the two most popular way to applying changes from one branch into another one. They bot... 