程式扎記: [ Algorithm ] Maze problem

Question:
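In AI, when the state space of your problem is unknown and the actions available in each state become known only upon reaching that state, you cannot, as in an offline search problem, rehearse the problem in advance (build a search tree) and then apply a search algorithm (DFS, BFS, A*, etc.) to find the best path. Problems of this kind, where you can only take one step at a time, are classified as online search problems. Below is a further description of online search problems: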

Offline search algorithms compute a complete solution before setting foot in the real world and then execute the solution. In contrast, an online search agent interleaves computation and action: first it takes an action, then it observes the environment and computes the next action. Online search is a good idea in dynamic or semidynamic domains - domains where there is a penalty for sitting around and computing too long. Online search is also helpful in nondeterministic domains because it allows the agent to focus its computational efforts on the contingencies that actually arise rather than those that might happen but probably won't. Of course, there is a tradeoff: the more an agent plans ahead, the less often it will find itself up the creek without a paddle.

Assuming our problem has a deterministic and fully observable environment, the agent knows the following:
- ACTIONS(s)

Which returns a list of actions allowed in state s.

- The step-cost function c(s,a,s')

Note that this cannot be used until the agent knows that s' is the outcome.

- GOAL-TEST(s)

Used to verify the goal state.
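
The three pieces of information above can be sketched as a small Java interface. This is only an illustration: the names `OnlineProblem`, `Corridor` and `walk` are hypothetical and do not appear in the post's code. The toy corridor has cells 0 to 4 with the goal at cell 4, and the loop shows the agent acting first and only then charging the step cost, once the outcome s' has been observed.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical interface: what an online agent may query about its environment. */
interface OnlineProblem<S, A> {
    List<A> actions(S s);              // ACTIONS(s): actions allowed in state s
    double stepCost(S s, A a, S next); // c(s, a, s'): usable only after s' is observed
    boolean goalTest(S s);             // GOAL-TEST(s)
}

/** Toy corridor with cells 0..4; the goal is cell 4. Actions are +1 / -1. */
class Corridor implements OnlineProblem<Integer, Integer> {
    public List<Integer> actions(Integer s) {
        if (s == 0) return Arrays.asList(1);
        if (s == 4) return Arrays.asList(-1);
        return Arrays.asList(1, -1);
    }

    public double stepCost(Integer s, Integer a, Integer next) { return 1.0; }

    public boolean goalTest(Integer s) { return s == 4; }

    /** Interleaves deciding, acting and observing; always takes the first listed action. */
    static double walk(Corridor p, int start) {
        int s = start;
        double cost = 0;
        while (!p.goalTest(s)) {
            int a = p.actions(s).get(0);    // decide using only the current state
            int next = s + a;               // act, then observe the outcome s'
            cost += p.stepCost(s, a, next); // the cost is known only now
            s = next;
        }
        return cost;
    }
}
```

Calling `Corridor.walk(new Corridor(), 0)` walks from cell 0 to cell 4 and accumulates one unit of cost per step.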

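At this point we can see that a maze looks very much like an online search problem. At first we know nothing about the maze (unknown state space), and only after walking in do we learn which next steps are available from each position (ACTIONS(s) is known only for the current state and previously visited states). Below we introduce Online DFS Search to solve the maze problem. Consider the following maze: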

Online DFS Agent:
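Below is the pseudo code of Online DFS Search. The function takes the current state and returns the next action to take. States already visited are pushed onto the unbacktracked stack; on reaching a dead end, the previous state is popped from unbacktracked and exploration continues with that state's untried action(s). untried can be thought of as a map from each state to the actions not yet tried there, while result records the outcome of each action taken, i.e. the path walked so far: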
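
The description above can be sketched in Java roughly as follows. This is a simplified sketch, not the post's ODFSAgent: the class name `OnlineDfsSketch` and its helpers are hypothetical, actions are plain integer indices, and the unbacktracked stack holds the forward actions taken (rather than states), which assumes every move in the grid can be undone by its opposite move.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OnlineDfsSketch {
    static final String[] NAMES = {"UP", "DOWN", "LEFT", "RIGHT"};
    static final int[] DX = {0, 0, -1, 1};
    static final int[] DY = {1, -1, 0, 0};

    private final int[][] grid;   // grid[x][y] > 0 means the cell is walkable
    private final Map<Integer, Deque<Integer>> untried = new HashMap<Integer, Deque<Integer>>();
    private final Deque<Integer> unbacktracked = new ArrayDeque<Integer>();
    private int prev = -1;        // previous action index, -1 before the first move

    OnlineDfsSketch(int[][] grid) { this.grid = grid; }

    private static int reverse(int a) { return a ^ 1; }  // UP<->DOWN, LEFT<->RIGHT

    private boolean open(int x, int y) {
        return x >= 0 && x < grid.length && y >= 0 && y < grid[0].length && grid[x][y] > 0;
    }

    /** Returns the index of the next action at (x, y), or -1 for STOP. */
    int next(int x, int y, int gx, int gy) {
        if (x == gx && y == gy) return -1;               // goal test
        int key = x * grid[0].length + y;
        Deque<Integer> acts = untried.get(key);
        if (acts == null) {                              // first visit: list the actions
            acts = new ArrayDeque<Integer>();
            for (int a = 0; a < 4; a++) {
                boolean undoesPrev = prev >= 0 && a == reverse(prev);
                if (!undoesPrev && open(x + DX[a], y + DY[a])) acts.add(a);
            }
            untried.put(key, acts);
        }
        if (acts.isEmpty()) {                            // dead end or fully explored
            if (unbacktracked.isEmpty()) return -1;
            prev = reverse(unbacktracked.pop());         // undo the last forward move
        } else {
            prev = acts.poll();                          // try a new action
            unbacktracked.push(prev);
        }
        return prev;
    }

    /** Runs the agent from (sx, sy) and returns the names of the actions it took. */
    static List<String> solve(int[][] grid, int sx, int sy, int gx, int gy) {
        OnlineDfsSketch agent = new OnlineDfsSketch(grid);
        List<String> path = new ArrayList<String>();
        int x = sx, y = sy;
        for (int a = agent.next(x, y, gx, gy); a >= 0; a = agent.next(x, y, gx, gy)) {
            path.add(NAMES[a]);
            x += DX[a];
            y += DY[a];
        }
        return path;
    }
}
```

For example, on the three-column grid `{{1,1},{1,0},{1,0}}` with start (0,0) and goal (2,0), the agent first walks up into the dead end at (0,1), backtracks, and then heads right: `solve` returns `[UP, DOWN, RIGHT, RIGHT]`.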

Coding and definition:
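Before solving any search problem, a few things must be defined. First is the state representation:

Here a tuple (x,y) represents the current position, i.e. the state. In the maze above, the initial state is (0,0) and the goal state is (4,3).

Next are the actions:

Because the maze is two-dimensional, the action list contains UP, DOWN, LEFT, RIGHT and STOP. STOP is usually returned on reaching the goal state or when no move is left.

Then comes the definition of cost:

Optimality is not required here, so every action taken simply costs 1 unit.

All that remains is the coding. First, the class Maze represents the maze. Its constructor takes the layout of the maze (int[][] mb), the initial state s and the goal state g: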

package c4.maze;  
  
import java.util.LinkedList;  
import java.util.Queue;  
  
public class Maze {  
    public int[][] mazeBody; // 0 is obstacle, 1 is road  
    public State curState=null;  
    public State goalState=null;  
      
    public Maze(int[][] mb, State s, State g){this.mazeBody=mb; this.curState=s; this.goalState=g;}  
      
    public boolean move(){return false;}  
    public boolean goalTest(){return curState.equals(goalState);}  
    public boolean goalTest(State s){ return s.equals(goalState);}  
      
    public State result(Action a)  
    {  
        if(a.equals(Action.RIGHT)) curState.x++;  
        else if(a.equals(Action.LEFT)) curState.x--;  
        else if(a.equals(Action.UP)) curState.y++;  
        else if(a.equals(Action.DOWN)) curState.y--;  
        return curState;  
    }  
      
    public State result(State s, Action a)  
    {  
        State nextState = s.cloneself();  
        if(a.equals(Action.RIGHT)) nextState.x++;  
        else if(a.equals(Action.LEFT)) nextState.x--;  
        else if(a.equals(Action.UP)) nextState.y++;  
        else if(a.equals(Action.DOWN)) nextState.y--;  
        return nextState;  
    }  
      
    public Queue<Action> actions()
    {
        return actions(curState);
    }

    public Queue<Action> actions(State state)
    {
        Queue<Action> actions = new LinkedList<Action>();
        int moveRight = state.x+1;
        int moveLeft = state.x-1;
        int moveDown = state.y-1;
        int moveUp = state.y+1;
        // Each move must stay inside the grid and land on a road cell (>0)
        if(moveRight<mazeBody.length && mazeBody[moveRight][state.y]>0) actions.add(Action.RIGHT);
        if(moveLeft>=0 && mazeBody[moveLeft][state.y]>0) actions.add(Action.LEFT);
        if(moveDown>=0 && mazeBody[state.x][moveDown]>0) actions.add(Action.DOWN);
        if(moveUp<mazeBody[0].length && mazeBody[state.x][moveUp]>0) actions.add(Action.UP);
        System.out.printf("\t[Test] %s has %d actions...\n", state, actions.size());
        return actions;
    }
      
    /**
     * BD : Skip the action that would lead back to the previous state.
     * @param prvAct action that led to the given state (may be null)
     * @param state  state whose available actions are requested
     * @return actions available in the state, excluding the reverse of prvAct
     */
    public Queue<Action> actions(Action prvAct, State state)
    {
        Queue<Action> actions = new LinkedList<Action>();
        int moveRight = state.x+1;
        int moveLeft = state.x-1;
        int moveDown = state.y-1;
        int moveUp = state.y+1;
        if(moveUp<mazeBody[0].length &&
           mazeBody[state.x][moveUp]>0 &&
           !Action.DOWN.equals(prvAct)) actions.add(Action.UP);
        if(moveDown>=0 &&
           mazeBody[state.x][moveDown]>0 &&
           !Action.UP.equals(prvAct)) actions.add(Action.DOWN);
        if(moveRight<mazeBody.length &&
           mazeBody[moveRight][state.y]>0 &&
           !Action.LEFT.equals(prvAct)) actions.add(Action.RIGHT);
        if(moveLeft>=0 &&
           mazeBody[moveLeft][state.y]>0 &&
           !Action.RIGHT.equals(prvAct)) actions.add(Action.LEFT);

        System.out.printf("\t[Test] %s has %d actions...\n", state, actions.size());
        return actions;
    }
      
    @Override  
    public String toString()  
    {  
        StringBuffer strBuf = new StringBuffer("");  
        for(int j=mazeBody[0].length-1; j>=0; j--)  
        {  
            for(int i=0; i<mazeBody.length; i++)
            {  
                if(i==curState.x && j==curState.y) strBuf.append("S");  
                else if(i==goalState.x && j==goalState.y) strBuf.append("G");  
                else if(mazeBody[i][j]>0) strBuf.append("R");  
                else strBuf.append("#");                  
            }  
            strBuf.append("\r\n");  
        }  
        return strBuf.toString();  
    }  
}  
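
The overload `actions(Action prvAct, State state)` in the listing above drops the one move that would undo the previous move. A minimal, self-contained sketch of that filter is shown here; the class `ReverseFilterSketch` is hypothetical and uses plain strings for the actions instead of the post's Action enum.

```java
import java.util.ArrayList;
import java.util.List;

class ReverseFilterSketch {
    /** Lists the open neighbours of (x, y) in grid[x][y], skipping the move that undoes prev. */
    static List<String> actions(int[][] grid, String prev, int x, int y) {
        List<String> out = new ArrayList<String>();
        if (y + 1 < grid[0].length && grid[x][y + 1] > 0 && !"DOWN".equals(prev)) out.add("UP");
        if (y - 1 >= 0 && grid[x][y - 1] > 0 && !"UP".equals(prev)) out.add("DOWN");
        if (x + 1 < grid.length && grid[x + 1][y] > 0 && !"LEFT".equals(prev)) out.add("RIGHT");
        if (x - 1 >= 0 && grid[x - 1][y] > 0 && !"RIGHT".equals(prev)) out.add("LEFT");
        return out;
    }
}
```

With the maze layout used in this post, position (2,0) has open neighbours above, to the right and to the left. Arriving there with `prev = "RIGHT"` yields `[UP, RIGHT]`, because LEFT would simply walk back; with `prev = null` all three moves are returned.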

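What needs special mention here is the function actions(Action prvAct, State state). The parameter state is the current state, and prvAct is the action that was executed in the previous state to arrive at the current one; the return value is the set of actions available in the current state. As the code shows, the action that would take the current state back to the previous state is removed from the returned actions, in order to avoid loops. Next is the class State, which represents the current state: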

package c4.maze;  
  
public class State {  
    public int x=-1;  
    public int y=-1;  
      
      
    public State(int x, int y){this.x = x; this.y = y;}  
      
    @Override  
    public boolean equals(Object o)  
    {  
        if(o!=null && o instanceof State)  
        {  
            State s = (State)o;  
            if(s.x == x && s.y == y) return true;  
        }  
        return false;  
    }  
      
      
      
    public State cloneself(){return new State(x,y);}  
      
    @Override  
    public String toString()  
    {  
        return String.format("(%d,%d)", x,y);  
    }  
}  
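
The State class above overrides equals but not hashCode. That is harmless as long as states are only compared directly, but if a state is ever used as a key in a HashMap or HashSet, lookups with an equal but distinct object will generally miss. A minimal sketch of the usual remedy, using a hypothetical stand-in class `Cell` rather than the post's State:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for State with both equals and hashCode defined. */
class Cell {
    final int x, y;

    Cell(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Cell)) return false;
        Cell c = (Cell) o;
        return c.x == x && c.y == y;
    }

    // Equal cells must produce equal hash codes, otherwise hash lookups fail
    @Override
    public int hashCode() { return 31 * x + y; }

    /** Looks up a value stored under an equal but distinct key object. */
    static String demo() {
        Map<Cell, String> visited = new HashMap<Cell, String>();
        visited.put(new Cell(2, 0), "seen");
        return visited.get(new Cell(2, 0));
    }
}
```

Here `Cell.demo()` returns "seen" because the two `Cell(2, 0)` objects are equal and hash to the same bucket.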

Next is the enum Action, which represents a direction of movement or the termination condition:

package c4.maze;  
  
public enum Action {  
    DOWN,UP,LEFT,RIGHT,STOP;  
      
    public Action Reverse()  
    {  
        if(this.equals(DOWN)) return UP;  
        else if(this.equals(UP)) return DOWN;  
        else if(this.equals(LEFT)) return RIGHT;  
        else if(this.equals(RIGHT)) return LEFT;  
        return null;  
    }  
      
    @Override  
    public String toString()  
    {  
        if(this.equals(DOWN)) return "Move Down";  
        else if(this.equals(UP)) return "Move Up";  
        else if(this.equals(LEFT)) return "Move Left";  
        else if(this.equals(RIGHT)) return "Move Right";  
        else if(this.equals(STOP)) return "Stop";  
        return "Unknown";  
    }  
}  
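
As a possible alternative to the if/else chains, each direction could carry its own coordinate offset. The enum `Move` below is a hypothetical variant, not the post's Action enum; note that, as an assumption of this sketch, STOP is treated as its own reverse, whereas the post's Reverse() returns null for STOP.

```java
/** Hypothetical variant of the Action enum that stores each move's offset. */
enum Move {
    UP(0, 1), DOWN(0, -1), LEFT(-1, 0), RIGHT(1, 0), STOP(0, 0);

    final int dx, dy;

    Move(int dx, int dy) { this.dx = dx; this.dy = dy; }

    /** Opposite direction; STOP is its own reverse in this sketch. */
    Move reverse() {
        switch (this) {
            case UP: return DOWN;
            case DOWN: return UP;
            case LEFT: return RIGHT;
            case RIGHT: return LEFT;
            default: return STOP;
        }
    }

    /** Applies the move to (x, y) and returns the new coordinates as {x, y}. */
    int[] apply(int x, int y) { return new int[] {x + dx, y + dy}; }
}
```

With this layout, computing a successor state becomes a single call such as `Move.RIGHT.apply(2, 0)`, which gives the coordinates (3, 0).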

Finally, the implementation of Online DFS Search, represented by the class ODFSAgent:

package c4.maze;  
  
import java.util.HashMap;  
import java.util.Queue;  
import java.util.Stack;  
  
/** 
* BD : Online DFS Agents 
* @author John 
* 
*/  
public class ODFSAgent {  
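    private Maze maze;
    private State s=null;   // Previous state
    private Action a=null;  // Previous action
    // Untried actions per state, keyed by x*100+y (assumes y < 100)
    private HashMap<Integer, Queue<Action>> untried = new HashMap<Integer, Queue<Action>>();
    // States left by a forward move; popped when backtracking out of a dead end
    private Stack<State> unbacktracked = new Stack<State>();
    private HashMap<String, State> result = new HashMap<String, State>();

    public ODFSAgent(Maze m){maze = m;}

    public Action odfs(State cs)
    {
        if(maze.goalTest()) return Action.STOP;
        int key = cs.x*100+cs.y;
        // Look up with the same Integer key used by put, so each state is initialized only once
        if(!untried.containsKey(key)) untried.put(key, maze.actions(a, cs));
        // Store copies: Maze.result(Action) mutates the current State object in place
        if(s!=null) result.put(String.format("%s:%s", s,a), cs.cloneself());
        if(untried.get(key).isEmpty())
        {
            if(unbacktracked.isEmpty()) return Action.STOP;
            // Dead end: pop the state we came from and pick the action that leads back to it
            State back = unbacktracked.pop();
            a = Action.STOP;
            for(Action b : Action.values())
                if(!b.equals(Action.STOP) && maze.result(cs, b).equals(back)) a = b;
        }
        else
        {
            a = untried.get(key).poll();
            unbacktracked.push(cs.cloneself());   // Remember where to backtrack to
            System.out.printf("\t[Test] %s->%s\n", cs, a);
        }
        s = cs.cloneself();
        return a;
    }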
      
    public void run() throws Exception  
    {  
        State is = maze.curState;  
        Action a = null;  
        do  
        {  
            a = odfs(is);           // Retrieve action based on current state  
            System.out.printf("\t[Info] %s...\n", a);             
            is = maze.result(a);    // Execute action and get result state.  
            System.out.printf("\t[Info] Maze:\n%s\n", maze);  
            Thread.sleep(1000);  
        }while(!a.equals(Action.STOP));  
    }  
      
    /** 
     * @param args 
     */  
    public static void main(String[] args) throws Exception{  
        State cs = new State(0,0);  
        State gs = new State(4,3);  
        int mb[][] = {{1,1,0,1},{1,0,0,1},{1,1,1,1},{1,0,0,0},{1,1,1,1}};  
        Maze maze = new Maze(mb, cs, gs);  
        System.out.printf("\t[Info] Maze:\n%s\n", maze);  
        ODFSAgent odfsAgent = new ODFSAgent(maze);  
        odfsAgent.run();  
    }  
}  

The execution result is as follows:

...(omitted)...
[Test] (4,2) has 1 actions...
[Test] (4,2)->Move Up
[Info] Move Up...
[Info] Maze:
RRR#S
##R#R
R#R#R
RRRRR

[Info] Stop...
[Info] Maze:
RRR#S
##R#R
R#R#R
RRRRR

The complete code can be downloaded here.


Sunday, October 21, 2012

[ Algorithm ] Maze problem - Using Online DFS Search
