Sunday, April 12, 2015

[ Java FAQ ] How to encode a URL to avoid special characters in Java

Source From Here 
Question 
I need Java code to encode a URL so that special characters such as spaces, % and & are escaped.

How-To 
URL construction is tricky because different parts of the URL have different rules for what characters are allowed. For example, the plus sign is reserved in the query component of a URL because, under the HTML form-encoding convention, it represents a space; in the path component, a plus sign has no special meaning and spaces are encoded as "%20".
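
To make the per-component rules concrete, here is a minimal sketch using java.net.URI. The host example.com, the class name and the sample strings are placeholders chosen for illustration. It passes the same text as a path and as a query and prints the raw (escaped) form of each. Note that java.net.URI writes a space as "%20" in both components, while a question mark is escaped only in the path, where it would otherwise be read as the start of the query.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch: the class name, host and inputs are assumptions.
class PathVsQueryDemo {

    // Builds a URI from separate components so each one is escaped by its own rules.
    static URI build(String path, String query) throws URISyntaxException {
        // URI(String scheme, String authority, String path, String query, String fragment)
        return new URI("http", "example.com", path, query, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = build("/docs/what is this?.txt", "q=what is this?");

        // '?' is not legal inside a path, so it is escaped there.
        System.out.println(uri.getRawPath());   // /docs/what%20is%20this%3F.txt

        // '?' is legal inside a query, so it is left alone.
        System.out.println(uri.getRawQuery());  // q=what%20is%20this?
    }
}
```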

RFC 2396 explains (in section 2.4.2) that a complete URL is always in its encoded form: you take the strings for the individual components (scheme, authority, path, etc.), encode each according to its own rules, and then combine them into the complete URL string. Trying to build a complete unencoded URL string and then encode it separately leads to subtle bugs, like spaces in the path being incorrectly changed to plus signs (which an RFC-compliant server will interpret as real plus signs, not encoded spaces). 

In Java, the correct way to build a URL is with the URI class. Use one of the multi-argument constructors that take the URL components as separate strings, and it will escape each component according to that component's rules. The toASCIIString() method gives you a properly escaped and encoded string that you can send to a server. To decode a URL, construct a URI object using the single-string constructor and then use the accessor methods (such as getPath()) to retrieve the decoded components.
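
The decoding direction can be sketched as follows. This is only an illustration: the class name is made up, and the input is an already-encoded URL string of the kind a server would receive. The getRaw...() accessors return each component as it appears in the string, while the plain accessors return it with the escapes decoded.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch: the class name and the input URL are assumptions.
class UriDecodeSketch {

    // The single-string constructor expects a URL that is already encoded.
    static URI parse(String encodedUrl) throws URISyntaxException {
        return new URI(encodedUrl);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = parse("http://john@www.google.com.tw:80/abc%20/test/index.html"
                + "?name=(%23john%23)&password=123");

        System.out.println(uri.getRawPath());  // /abc%20/test/index.html
        System.out.println(uri.getPath());     // /abc /test/index.html
        System.out.println(uri.getQuery());    // name=(#john#)&password=123
        System.out.println(uri.getUserInfo()); // john
        System.out.println(uri.getHost());     // www.google.com.tw
        System.out.println(uri.getPort());     // 80
    }
}
```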

Don't use the URLEncoder class! Despite the name, that class actually does HTML form encoding, not URL encoding. It's not correct to concatenate unencoded strings to make an "unencoded" URL and then pass it through a URLEncoder. Doing so will result in problems (particularly the aforementioned one regarding spaces and plus signs in the path). 
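
A small comparison shows the problem. In this sketch the class name, the host example.com and the file name are placeholders. Running a whole "unencoded" URL through URLEncoder escapes the scheme and path separators and turns the space into a plus sign, whereas building the same URL with the URI class leaves the structure intact and writes the space as "%20".

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Illustrative sketch: the class name, host and file name are assumptions.
class EncoderComparison {

    // What NOT to do: form-encode a complete URL string.
    static String formEncode(String wholeUrl) throws UnsupportedEncodingException {
        return URLEncoder.encode(wholeUrl, "UTF-8");
    }

    // Component-wise construction: only the path is passed, unencoded.
    static String buildWithUri(String path) throws URISyntaxException {
        // URI(String scheme, String host, String path, String fragment)
        return new URI("http", "example.com", path, null).toASCIIString();
    }

    public static void main(String[] args) throws Exception {
        // ':' and '/' are escaped, the space becomes '+', and the real '+' becomes %2B.
        System.out.println(formEncode("http://example.com/my file+v1.txt"));
        // http%3A%2F%2Fexample.com%2Fmy+file%2Bv1.txt

        // The space becomes %20 and the '+' in the path is kept as a literal plus.
        System.out.println(buildWithUri("/my file+v1.txt"));
        // http://example.com/my%20file+v1.txt
    }
}
```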

Sample code: 
package howto;

import java.net.URI;

public class URIDemo {

    public static void main(String[] args) throws Exception {
        // Constructor used below:
        // URI(String scheme, String userInfo, String host, int port, String path, String query, String fragment)
        String scheme = "http";
        String host = "www.google.com.tw";
        String userInfo = "john";
        String path = "/abc /test/index.html";        // the space is escaped as %20
        String query = "name=(#john#)&password=123";  // each '#' is escaped as %23
        String fragment = "";                         // empty rather than null, so a bare '#' is appended
        int port = 80;
        URI uri = new URI(scheme, userInfo, host, port, path, query, fragment);

        // toASCIIString() returns the fully escaped form that is safe to send to a server
        System.out.printf("\t[Info] Encoded URL: %s\n", uri.toASCIIString());
    }
}
Execution result: 
[Info] Encoded URL: http://john@www.google.com.tw:80/abc%20/test/index.html?name=(%23john%23)&password=123#
