程式扎記: [Linux Article Collection] 10 good shell scripting practices

Source From Here

Preface
Everybody working in UNIX can do a decent level of shell scripting. Depending on your expertise, the commands you use, the way you look at a problem, and the way you arrive at a solution could all be different. For people in the early stages of shell scripting, it is good to follow certain practices that help you learn the art faster, cleaner and better. We discussed the 5 important things to follow to become good at shell scripting in one of our earlier articles. Along the same lines, the following are 10 good practices.

Learn less, do more
You want to do some scripting in UNIX, be it in shell, Perl or Python. So you pick up a book and start studying. Some people study the entire thing first and only then start practising. This might be a good method for some, but I do not believe in it much. Instead, study just the basics, the very basics, enough to get started. Once done with the basics, start writing simple programs. Gradually build up your requirement and develop your program further. If you get stuck due to lack of knowledge, go back to the book and read what you need. Come back and continue developing. Build the requirement further. Read a little more if stuck. Carry on. I prefer this to reading everything first because, however much we read, unless we start practising we cannot correlate a lot of things, and the reading does not add much value either. This learn-practise-get-stuck method helps me a lot.

Try at the command prompt
Sometimes we get an error in our script. We fix the error, run the script, and it errors again. This fix-and-error cycle goes on for some time. At times, shell script errors are a little misleading, in the sense that the actual location of the error could be somewhere else. First, we need to zero in on the line or command that is causing the problem. This can easily be done by putting debug statements before and after the suspect statement. Once the statement is found, prepare the requisite inputs and try to execute the same command at the command prompt. Once it works properly at the prompt, you can easily figure out why it does not work in the script: it could be due to incorrect input, a mismatch in environment variables, a binary being picked up from a different location, etc. This makes debugging a lot easier because it is easier to fix an issue at the prompt than in the script, where the command is surrounded by many other things.
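A minimal sketch of the bracketing idea, assuming a made-up log file and a `grep -c` call standing in for the suspect command (neither comes from the original article). The debug lines on either side show the input the command receives and the status it returns; `set -x` and `set +x` switch the shell's own tracing on and off around the same region.

```shell
#!/bin/sh
# Sample log file created just for this sketch.
logfile=$(mktemp)
printf 'INFO start\nERROR disk full\nERROR timeout\n' > "$logfile"

# Debug statement before the suspect command: show the input it will get.
echo "DEBUG: about to count errors in $logfile" >&2

# The suspect command (a stand-in chosen for illustration).
errors=$(grep -c 'ERROR' "$logfile")
status=$?

# Debug statement after it: show the exit status and the result.
echo "DEBUG: grep exit status=$status, errors=$errors" >&2

# Alternative: let the shell trace just this region.
set -x
errors=$(grep -c 'ERROR' "$logfile")
set +x

echo "errors: $errors"
rm -f "$logfile"
```

Once the debug output points at one command, that same command can be pasted at the prompt with the same input file to see how it behaves outside the script.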

Keep big files in mind
An issue comes to you. You provide a solution that addresses just the issue at hand without looking at the big picture or, in other words, the big files. Say, for example, we want the first line of a file, the header line:

# sed -n '1p' file

This will, of course, give the first line you are looking for. What if the file being worked upon is huge, with millions of records? The above sed command, even though it prints only the first line, ends up reading the entire file, which can cause performance problems with big files. The solution is to make sed quit right after printing the first line:

# time sed -n '1p' verified_online.csv.20160520110305
...
real 0m0.014s
user 0m0.010s
sys 0m0.004s
# time sed -n '1p;1q' verified_online.csv.20160520110305   # prints the first line, then quits
...
real 0m0.002s
user 0m0.000s
sys 0m0.002s
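The same idea can be tried on a generated file. This is a sketch under assumptions: the file contents and the row count are arbitrary, and `head -n 1` is shown as one more early-exit option that the original does not mention.

```shell
#!/bin/sh
# Build a sample file: one header line followed by many rows.
big=$(mktemp)
{
    echo "id,name"
    awk 'BEGIN { for (i = 1; i <= 200000; i++) print i }'
} > "$big"

# Prints line 1 but keeps reading until the end of the file.
first_slow=$(sed -n '1p' "$big")

# Quits at line 1; without -n, sed prints the line before exiting.
first_fast=$(sed '1q' "$big")

# head also stops after the requested number of lines.
first_head=$(head -n 1 "$big")

echo "$first_slow | $first_fast | $first_head"
rm -f "$big"
```

All three return the same header line. Only the amount of input read differs, which can be checked by prefixing each command with `time` on a large enough file.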

Try different ways always
There is more than one way to do anything in scripting. You get a requirement, and you provide a solution in a particular way. The next time you come across a requirement of almost the same type, do not do it the same way you did earlier. Try doing it some other way. Some other day, try something newer still. The more options you become aware of, the better your grip on things will be, and the more varied your thinking will be. For example:

if [ $? -eq 0 ]   
then     
    echo "Success"   
fi  

This can also be written as:

[ $? -eq 0 ] && echo "Success"  

Now you know why we have so many articles in this blog titled "Different ways of ...": different ways of deleting the Ctrl-M character, different ways of doing arithmetic in UNIX, etc. All these articles are meant to help blog subscribers keep their options open and not stick to one particular way of doing things.
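In the same spirit, here is a sketch of four ways to act on a command's success. The first two are the forms from the article; the last two are additional variants added here for comparison. The sample file and the `way1` to `way4` variable names are made up for the illustration.

```shell
#!/bin/sh
# Sample file so that grep has something to search.
f=$(mktemp)
echo "hello world" > "$f"

# Way 1: run the command, then test $? explicitly.
grep -q hello "$f"
if [ $? -eq 0 ]
then
    way1="Success"
fi

# Way 2: test $? with the short-circuit && operator.
grep -q hello "$f"
[ $? -eq 0 ] && way2="Success"

# Way 3: let if test the command's exit status directly.
if grep -q hello "$f"
then
    way3="Success"
fi

# Way 4: chain the command itself with &&.
grep -q hello "$f" && way4="Success"

echo "$way1 $way2 $way3 $way4"
rm -f "$f"
```

Ways 3 and 4 avoid the gap between running the command and reading `$?`, where an extra command slipped in between would overwrite the status.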

Do it. Do it faster
Scripting is done to save time, to improve our productivity, to make things faster. But don't we take a lot of time to write and test a script? Say we want to write a script. We open a file, write stuff, save the file. Run the script. Get errors. Open the file again. Correct the errors. Save it. Run it. Open the file again. Correct more errors. And this process goes on. In one of our earlier articles, Shell script to do shell scripting faster, we saw how to considerably reduce the time taken to write and test a shell script on the fly (as we write) without going back to the command prompt. Use methods like these to write or design your scripts faster. I always use this script, and I can definitely say that it has saved me a lot of time.

Use Internal commands a lot
Wherever possible, prefer internal commands to external commands. In one of our earlier articles, Internal vs External commands, we saw the differences between the two. An internal command runs inside the shell itself, without starting a new process, so it usually costs less. Depending on the size of the input being processed and how often the command runs, internal commands can save you a lot in performance. You do not always get to choose internal over external, but in some scenarios one can definitely take the better option.
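A small sketch of the trade-off, assuming a made-up path that is used only as a string. `basename`, `dirname` and `expr` are external programs; the parameter expansions and `$(( ))` are evaluated by the shell itself. These pairings are illustrations chosen here, not examples taken from the referenced article.

```shell
#!/bin/sh
path="/var/log/app/server.log"   # hypothetical path, never opened

# External commands: each call starts a new process.
name_ext=$(basename "$path")
dir_ext=$(dirname "$path")
sum_ext=$(expr 2 + 3)

# Shell features: no extra process is created.
name_int=${path##*/}     # strip the longest prefix ending in "/"
dir_int=${path%/*}       # strip the shortest suffix starting with "/"
sum_int=$((2 + 3))       # arithmetic expansion

echo "$name_ext $name_int"
echo "$dir_ext $dir_int"
echo "$sum_ext $sum_int"
```

The expansions are not drop-in replacements in every case (a path with a trailing slash, for instance, behaves differently), so they fit best where the input format is known. Running `type` on a command name reports whether the shell treats it as a builtin.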

Useless use of cat (UUC)
This is one of the things we witness frequently in forums. The term "useless use of cat" refers to using the cat command when there is actually no need for it. In fact, many users are very much in the habit of UUC. UUC makes your program ugly and hurts performance, because it starts an extra process and an extra pipe, even though you still get the expected result. UUC example:

$ cat /etc/passwd | grep guru

Right way:

$ grep guru /etc/passwd

As shown above, cat was not needed at all. Many users are so used to starting a command with "cat" that they are not able to give it up :). Never use cat if there is no need for it.
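The comparison above can be reproduced on a throwaway file. The sketch below uses made-up data, and it also covers a case the article does not: commands such as `tr` that read only standard input, where input redirection replaces cat.

```shell
#!/bin/sh
# Sample data; the entries are made up.
f=$(mktemp)
printf 'guru:x:502\nroot:x:0\n' > "$f"

# Useless use of cat: an extra process and a pipe just to feed grep.
with_cat=$(cat "$f" | grep guru)

# grep can open the file itself.
direct=$(grep guru "$f")

# tr reads only standard input, so redirect the file into it.
upper=$(tr 'a-z' 'A-Z' < "$f")

echo "$with_cat"
echo "$direct"
echo "$upper"
rm -f "$f"
```

Both grep forms print the same line; the second simply does it with one process fewer.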

Read error messages
One common mistake: when we type a command and it results in an error, most of us just glance at the error without actually reading the message. Most of the time, the error message itself contains the solution. More importantly, say we are working to fix an error. After a fix, we run again only to get what looks like the same error. Fix again, error again. This goes on for some time. In between, it may well have happened that the original error went away and a new error started appearing, which we overlooked. And we keep wondering why the fix is not working. So always read the error messages very, very carefully.

Do not make big commands
You are trying to filter out a particular component of a big output. While doing this, we might end up using a lot of commands in sequence, with each command piping its output to the next one. Though we might get the expected end result, it does not look good and, more importantly, it is pretty tough for people to understand. Having said this, there are situations where this cannot be avoided. Still, a user should try to avoid piping a lot of commands together. The following is one genuine case where piping multiple commands can be avoided.
- Ex: To retrieve the username for the user-id 502.
Not a good approach:

$ grep 502 /etc/passwd | cut -d: -f1

$ grep 502 /etc/passwd | awk -F":" '{print $1}'

A better approach:

$ awk -F":" '$3==502{print $1}' /etc/passwd

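As shown above, the requirement can be met with a single awk command. It is also more precise: awk compares only the third field (the user ID), whereas grep matches 502 anywhere on the line.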
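To see the difference without touching the real /etc/passwd, here is a sketch on a sample file in the same format. The three accounts are invented so that 502 appears in a different field on each line.

```shell
#!/bin/sh
# Sample data in /etc/passwd format; the accounts are made up.
pw=$(mktemp)
cat > "$pw" <<'EOF'
alice:x:502:502:Alice:/home/alice:/bin/sh
bob:x:1502:100:Bob:/home/bob:/bin/sh
carol:x:600:502:Carol:/home/carol:/bin/sh
EOF

# grep matches "502" anywhere on the line, so all three accounts come back.
loose=$(grep 502 "$pw" | cut -d: -f1)

# awk compares only the third field (the user ID), so only alice matches.
exact=$(awk -F: '$3 == 502 { print $1 }' "$pw")

# $loose is left unquoted on purpose so the names print on one line.
echo "grep+cut: $(echo $loose)"
echo "awk:      $exact"
rm -f "$pw"
```

With this data the pipeline returns alice, bob and carol, while the single awk command returns only alice.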

Always use comments to give a brief
A script is written. After a week or two, you open the script and go over it. If there are no comments in it, you take a little while to understand it, in spite of being the author, which is a small waste of time. Now imagine somebody else opening it and trying to understand it: more wasted time. Scripts are written to save time; we should not end up wasting time understanding something whose sole purpose is to save time. Always make it a habit to put comments in your script to make it more readable. The comments need not be very detailed, just enough for a person reading the script to understand it. It always helps.
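As one possible shape for such comments, here is a sketch with a short header block and a line of explanation at each non-obvious step. The `line_count` function and the header layout are illustrative choices, not a convention from the original article.

```shell
#!/bin/sh
# Purpose : report how many lines a file contains.
# Usage   : line_count FILE
# Returns : 0 on success, 1 if FILE is missing or unreadable.
line_count() {
    # Fail early with a message on stderr rather than let wc complain.
    if [ ! -r "$1" ]; then
        echo "line_count: cannot read '$1'" >&2
        return 1
    fi
    # Redirect the file into wc so that only the number is printed.
    wc -l < "$1"
}

# Try the function on a throwaway two-line file.
sample=$(mktemp)
printf 'one\ntwo\n' > "$sample"
count=$(line_count "$sample")
echo "lines:" $count
rm -f "$sample"
```

The comments say why each step is there, which is the part a reader cannot get from the commands alone.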

Supplement
* VBird - The sed tool: adding/deleting lines, replacing/displaying lines, search and replace, editing files in place
* A Shell Script to do shell scripting faster

Saturday, May 21, 2016