Tuesday, January 13, 2015

[Linux Article Collection] UNIX / Linux Convert DOS Newlines CR-LF to Unix/Linux Format

Source From Here 
Preface 
How do I convert DOS newlines (CR/LF) to Unix/Linux format? To convert text files between DOS and Unix formats, you need a special utility called dos2unix. DOS text files traditionally use a carriage return (\r) followed by a line feed (\n) as their newline sequence, while Unix text files use a single line feed. 
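
To make the difference concrete, here is a minimal Python sketch that counts which newline styles a chunk of bytes contains. The function name count_line_endings is hypothetical and is not part of dos2unix or any other tool mentioned here.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs, lone CRs and lone LFs in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,                     # DOS/Windows style
        "cr": data.count(b"\r") - crlf,   # CRs that are not part of a pair
        "lf": data.count(b"\n") - crlf,   # LFs that are not part of a pair
    }


# Example: one Unix line followed by two DOS lines
print(count_line_endings(b"unix\ndos\r\ndos\r\n"))
```

A file that reports only "crlf" is a pure DOS file; a mix of styles usually means the file was edited on more than one platform.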

How-To 
UNIX/Linux Commands 
You can use the following tools: 
dos2unix (also known as fromdos) - converts text files from the DOS format to the Unix format
unix2dos (also known as todos) - converts text files from the Unix format to the DOS format.
sed - the stream editor can perform the same conversion
tr - deletes the carriage return characters
perl - a Perl one-liner

Task: Convert DOS file to UNIX format 
Suppose we have a file called myfile.txt that was created on Windows. Transfer it to a Linux machine, open it with "vi -b myfile.txt", and use ":set list" to show the line feeds and carriage returns:

("$" -> line feed; "^M" -> carriage return)
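
If vi is not at hand, the same view can be approximated in Python. This is only a sketch; show_line_endings is a hypothetical helper that imitates the "^M" and "$" markers and nothing else from vi's list mode.

```python
def show_line_endings(data: bytes) -> str:
    """Render CR as ^M and mark each LF with $, similar to vi's ':set list'."""
    # latin-1 maps every byte to exactly one character, so decoding never fails
    text = data.decode("latin-1")
    return text.replace("\r", "^M").replace("\n", "$\n")


print(show_line_endings(b"first\r\nsecond\r\n"), end="")
```

For a DOS file, every line printed this way ends in "^M$"; after a successful conversion only the "$" remains.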

Type the following command to convert the file: 
$ dos2unix myfile.txt

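The command above converts the file in place and does not keep a copy of the original myfile.txt. To keep one, add the -b option: the original file is preserved under its original filename plus a .bak extension. Option letters differ between implementations (some newer dos2unix releases use -b for keeping the byte order mark instead), so check the man page on your system first. Type the following command: 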
$ dos2unix -b myfile.txt
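
The backup-then-convert idea can be sketched in Python as follows. The function name dos2unix_with_backup is hypothetical, and the backup name (original filename plus ".bak", for example myfile.txt.bak) is an assumption about how the extension is attached.

```python
import shutil
from pathlib import Path


def dos2unix_with_backup(path: str) -> Path:
    """Copy path to path + '.bak', then convert CR LF to LF in place."""
    src = Path(path)
    backup = src.with_name(src.name + ".bak")  # assumed naming scheme
    shutil.copy2(src, backup)                  # keep the untouched original
    src.write_bytes(src.read_bytes().replace(b"\r\n", b"\n"))
    return backup
```

Calling dos2unix_with_backup("myfile.txt") leaves the converted text in myfile.txt and the original bytes in the returned backup path.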

Task: Convert UNIX file to DOS format 
Type the first command below to convert a file called myfile_unix.txt. As with dos2unix, the -b option in the second command keeps a backup of the original file:
$ unix2dos myfile_unix.txt
$ unix2dos -b myfile.txt
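
The reverse direction can be sketched in Python as well. This is an illustration rather than a description of how unix2dos is implemented, and unix_to_dos is a hypothetical name. The lookbehind leaves existing CR LF pairs alone, so running it twice does not produce "\r\r\n".

```python
import re


def unix_to_dos(data: bytes) -> bytes:
    """Turn every LF that is not already preceded by CR into CR LF."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


converted = unix_to_dos(b"one\ntwo\n")
print(converted)
```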

Task: Convert Dos TO Unix Using tr Command 
Type the following command: 
$ tr -d '\r' < input.file > output.file
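
Note that tr -d '\r' deletes every carriage return, including any that are not part of a CR/LF pair. A rough Python equivalent, with the hypothetical name strip_cr, shows the effect:

```python
def strip_cr(data: bytes) -> bytes:
    """Delete every CR byte, like tr -d '\\r'."""
    return data.translate(None, b"\r")


# The lone CR between "b" and "c" is removed too, joining the two parts
print(strip_cr(b"a\r\nb\rc\n"))
```

This is harmless for ordinary DOS files, but old Mac-style files that use a lone CR as the newline would end up as one long line.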

Task: Convert Dos TO Unix Using Perl One Liner 
Type the following command: 
$ perl -p -e 's/\r\n/\n/g' < input.txt > output.txt
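
The Perl one-liner works as a line-by-line filter from standard input to standard output. A comparable Python sketch is shown below; convert_stream is a hypothetical name, and both arguments are assumed to be binary file objects.

```python
def convert_stream(src, dst) -> None:
    """Copy src to dst line by line, replacing a trailing CR LF with LF."""
    for line in src:  # binary files are split on LF when iterated
        if line.endswith(b"\r\n"):
            line = line[:-2] + b"\n"
        dst.write(line)


# Possible use as a shell filter:
#   import sys
#   convert_stream(sys.stdin.buffer, sys.stdout.buffer)
```

Unlike the tr approach, this only touches carriage returns that sit directly before a line feed.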

Task: Convert UNIX to DOS format using sed command 
Type the following command if you are using the bash shell: 
$ sed 's/$'"/`echo \\\r`/" input.txt > output.txt

Task: Convert DOS newlines (CR/LF) to Unix format using sed command 
Type the following command if you are using the bash shell: 
$ sed 's/^M//g' myfile.txt > output.txt

(The ^M is a literal carriage return, not a caret followed by M: press Ctrl-V then Ctrl-M to type it.)

Toolkit: Transfer To Win/Mac/Lin EOL 
Below is a small tool that converts a text file to Win/Mac/Lin EOL regardless of which EOL style the input file uses. Let's check the usage: 
$ ./seof.py
Usage: ./seof.py input output type(win|mac|lin)

So if you have a file called myfile.txt created on Windows, you can use the commands below to switch it to a different type of EOL: 
$ ./seof.py myfile.txt linout.txt lin # Switch to Linux like EOL
[Info] Newline=[10]...
[Info] Process 30 byte(s) from myfile.txt...
[Info] Output 25 byte(s) to linout.txt...

$ vi -b linout.txt # Use ":set list" to show tabs (^I) and ends of lines ($)


$ ./seof.py myfile.txt macout.txt mac # Switch to Mac EOL
[Info] Newline=[13]...
[Info] Process 30 byte(s) from myfile.txt...
[Info] Output 25 byte(s) to macout.txt...

$ vi -b macout.txt


$ ./seof.py myfile.txt winout.txt win # Switch to Win EOL
[Info] Newline=[13, 10]...
[Info] Process 30 byte(s) from myfile.txt...
[Info] Output 30 byte(s) to winout.txt...

$ vi -b winout.txt
 

Below is the source code of seof.py
#!/usr/bin/env python3
import os
import sys

# Input validation
newline = None
if len(sys.argv) < 4:
    print("Usage: {0} input output type(win|mac|lin)".format(sys.argv[0]))
    sys.exit(1)

# Byte values of each EOL style: CR = 0x0d (13), LF = 0x0a (10)
eol_types = {"win": [0x0d, 0x0a], "mac": [0x0d], "lin": [0x0a]}
if not os.path.exists(sys.argv[1]):
    print("Error: Inputfile={0} doesn't exist!".format(sys.argv[1]))
    sys.exit(1)

if sys.argv[3] not in eol_types:
    print("Error: Wrong type='{0}'!".format(sys.argv[3]))
    sys.exit(1)
else:
    newline = eol_types[sys.argv[3]]
print("\t[Info] Newline={0}...".format(newline))

# Analyze input
# http://www.ascii-code.com/
outbuf = []  # output bytes, stored as integers
bc = 0       # number of input bytes processed
with open(sys.argv[1], "rb") as inf:
    b = inf.read(1)
    while b != b"":
        bc += 1
        if ord(b) == 10:
            # Lone LF
            outbuf.extend(newline)
            b = inf.read(1)
        elif ord(b) == 13:
            # CR or CR LF: emit one newline, then swallow the LF of a pair
            outbuf.extend(newline)
            b = inf.read(1)
            if b == b"\n":
                bc += 1
                b = inf.read(1)
        else:
            outbuf.append(ord(b))
            b = inf.read(1)
print("\t[Info] Process %d byte(s) from %s..." % (bc, sys.argv[1]))
print("\t[Info] Output %d byte(s) to %s..." % (len(outbuf), sys.argv[2]))

# Output
with open(sys.argv[2], "wb") as outf:
    outf.write(bytearray(outbuf))

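
The byte-by-byte loop in seof.py can also be written as a single regular-expression substitution. The sketch below is an alternative formulation, not the script's own code; the names EOL and convert_eol are hypothetical. Listing "\r\n" first in the pattern makes a CR/LF pair count as one line break instead of two.

```python
import re

# Target newline sequence for each type accepted by seof.py
EOL = {"win": b"\r\n", "mac": b"\r", "lin": b"\n"}


def convert_eol(data: bytes, kind: str) -> bytes:
    """Replace every CR LF, lone CR or lone LF with the chosen newline."""
    target = EOL[kind]
    return re.sub(rb"\r\n|\r|\n", lambda match: target, data)


sample = b"abcd\r\n" * 5                  # 30 bytes of DOS text
print(len(convert_eol(sample, "lin")))    # 25 bytes after conversion
```

This reads the whole file into memory, which is fine for typical text files; the one-byte-at-a-time loop avoids that at the cost of more code.
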
Supplement 
Advanced Vi Cheat Sheet
