Tuesday, July 31, 2018

[ Python FAQ ] Formatting "yesterday's" date in Python

Source From Here 
Question 
I need to find "yesterday's" date in the format MMDDYY in Python. For instance, today's date would be represented as 111009. I can easily do this for today, but I have trouble doing it automatically for "yesterday".

How-To 
Check sample code below: 
>>> from datetime import date, timedelta
>>> yesterday = date.today() - timedelta(1)
>>> print(yesterday.strftime('%m%d%y'))
110909

>>> import time
>>> print("Time stamp={}".format(int(time.mktime(yesterday.timetuple()))))
Time stamp=1532966400
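
The same idea can be wrapped in a small helper so that the reference date can be passed in, which makes the result reproducible. This is only a sketch: the function name format_yesterday and its parameters are made up for illustration and are not part of the original answer.

```python
from datetime import date, timedelta


def format_yesterday(today=None, fmt="%m%d%y"):
    """Return the day before `today` formatted with `fmt` (MMDDYY by default)."""
    if today is None:
        # Fall back to the current local date when no reference date is given.
        today = date.today()
    return (today - timedelta(days=1)).strftime(fmt)


# Fixed reference dates keep the output stable.
print(format_yesterday(date(2009, 11, 10)))  # 110909
print(format_yesterday(date(2018, 1, 1)))    # 123117 (crosses the year boundary)
```

Subtracting a timedelta handles month and year boundaries, so no manual day arithmetic is needed.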

Supplement 
Python time, date, and timestamp handling

Friday, July 27, 2018

[ Python FAQ ] requests - How to download a large file in Python?

Source From Here 
Question 
Requests is a really nice library. I'd like to use it to download big files (>1GB). The problem is that it's not possible to keep the whole file in memory, so I need to read it in chunks. That is where the following code runs into trouble:
import requests

def DownloadFile(url):
    local_filename = url.split('/')[-1]
    r = requests.get(url)
    f = open(local_filename, 'wb')
    for chunk in r.iter_content(chunk_size=512 * 1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
    f.close()
    return
For some reason it doesn't work this way: it still loads the whole response into memory before saving it to a file.

How-To 
It's much easier if you use Response.raw and shutil.copyfileobj():
import requests
import shutil

def download_file(url):
    local_filename = url.split('/')[-1]
    r = requests.get(url, stream=True)
    with open(local_filename, 'wb') as f:
        shutil.copyfileobj(r.raw, f)

    return local_filename
This streams the file to disk without using excessive memory, and the code is simple. Alternatively, for a large file you can write the content chunk by chunk with iter_content() to avoid running out of memory:
import requests

def download_file(url):
    local_filename = url.split('/')[-1]
    # NOTE the stream=True parameter
    r = requests.get(url, stream=True)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                #f.flush() commented by recommendation from J.F.Sebastian
    return local_filename
See http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow for further reference.
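
To see why chunked writing keeps memory usage bounded, here is a minimal sketch of a read/write loop, conceptually similar to what shutil.copyfileobj does. It uses in-memory io.BytesIO objects instead of a real HTTP response so that it runs without network access; the helper name copy_in_chunks and the 64 KiB chunk size are assumptions chosen for illustration.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy from file-like `src` to `dst`, holding at most one chunk in memory."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            # An empty read means the source is exhausted.
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Stand-ins for a streamed response body and an output file.
source = io.BytesIO(b"x" * 200_000)
target = io.BytesIO()
copied = copy_in_chunks(source, target)
print(copied)  # 200000
```

With a streamed response, the source would be a file-like object such as r.raw and the destination a file opened in 'wb' mode.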

[ Python FAQ ] requests - How to "log in" to a website?

Source From Here
Question
I am trying to post a request to log in to a website using the Requests module in Python, but it's not really working.

How-To
If the information you want is on the page you are directed to immediately after login, the code is simple:
import requests

payload = {'inUserName': 'USERNAME/EMAIL', 'inUserPass': 'PASSWORD'}
url = 'http://www.locationary.com/home/index2.jsp'
requests.post(url, data=payload)
Otherwise, use a session. Assuming your login attempt was successful, you can simply use the session instance to make further requests to the site. The cookie that identifies you will be used to authorize the requests:
import requests

# Fill in your details here to be posted to the login form.
payload = {
    'inUserName': 'username',
    'inUserPass': 'password'
}

# Use 'with' to ensure the session context is closed after use.
with requests.Session() as s:
    p = s.post('LOGIN_URL', data=payload)
    # print the html returned or something more intelligent to see if it's a successful login page.
    print(p.text)

    # An authorised request.
    r = s.get('A protected web page url')
    print(r.text)
    # etc...

Wednesday, July 25, 2018

[ Python FAQ ] Pandas - How do you filter pandas DataFrames by multiple columns?

Source From Here 
Question 
To filter a dataframe (df) by a single column, if we consider data with males and females, we might write:
males = df[df['Gender'] == 'Male']
Question 1 - But what if the data spanned multiple years and I wanted to see only males for 2014? In other languages I might do something like:
if A = "Male" and if B = "2014" then
(except I want to do this and get a subset of the original dataframe in a new dataframe object)

Question 2 - How do I do this in a loop, and create a dataframe object for each unique combination of year and gender (i.e. a df for each of 2013-Male, 2013-Female, 2014-Male, and 2014-Female)?

How-To 
Use the & operator, and don't forget to wrap each sub-condition in parentheses, because & binds more tightly than ==:
males = df[(df['Gender'] == 'Male') & (df['Year'] == 2014)]
To store your dataframes in a dict using a for loop:
dic = {}
for g in ['Male', 'Female']:
    dic[g] = {}
    for y in [2013, 2014]:
        # store the DataFrames in a dict of dicts, keyed by gender and then by year
        dic[g][y] = df[(df['Gender'] == g) & (df['Year'] == y)]
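
As an alternative to looping over hard-coded values, groupby can produce one sub-frame per (year, gender) pair that actually occurs in the data. The following is only a sketch on a small made-up frame: the Score column and its values are invented for illustration, and the column names Gender and Year are taken from the question.

```python
import pandas as pd

# Made-up sample data; only the Gender and Year columns come from the question.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Male"],
    "Year": [2013, 2013, 2014, 2014, 2014],
    "Score": [1, 2, 3, 4, 5],
})

# Grouping by two columns yields (year, gender) tuples as keys.
subsets = {key: group for key, group in df.groupby(["Year", "Gender"])}

males_2014 = subsets[(2014, "Male")]
print(males_2014)
```

Unlike the nested loop, this only creates entries for combinations present in df, so a missing pair raises KeyError instead of returning an empty frame.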


Tuesday, July 24, 2018

[ Python FAQ ] Print in terminal with colors?

Source From Here 
Question 
How can I output colored text to the terminal, in Python? What is the best Unicode symbol to represent a solid block? 

How-To 
An easy option is to use the cprint function from the termcolor package:
>>> from termcolor import cprint
>>> cprint('john', 'red')
john
>>> cprint('ken', 'green')
ken
>>> cprint('%s is %d years-old' % ('john', 37), 'blue')
john is 37 years-old

Or you can define a string that starts a color and a string that ends the color, then print your text with the start string at the front and the end string at the end: 
CRED = '\033[91m'
CEND = '\033[0m'
print(CRED + "Error, does not compute!" + CEND)
This produces the following in bash, in urxvt with a Zenburn-style color scheme: 
 

Through experimentation, we can get more colors:
 

This way we can create a full color collection: 
CEND      = '\33[0m'
CBOLD     = '\33[1m'
CITALIC   = '\33[3m'
CURL      = '\33[4m'
CBLINK    = '\33[5m'
CBLINK2   = '\33[6m'
CSELECTED = '\33[7m'

CBLACK  = '\33[30m'
CRED    = '\33[31m'
CGREEN  = '\33[32m'
CYELLOW = '\33[33m'
CBLUE   = '\33[34m'
CVIOLET = '\33[35m'
CBEIGE  = '\33[36m'
CWHITE  = '\33[37m'

CBLACKBG  = '\33[40m'
CREDBG    = '\33[41m'
CGREENBG  = '\33[42m'
CYELLOWBG = '\33[43m'
CBLUEBG   = '\33[44m'
CVIOLETBG = '\33[45m'
CBEIGEBG  = '\33[46m'
CWHITEBG  = '\33[47m'

CGREY    = '\33[90m'
CRED2    = '\33[91m'
CGREEN2  = '\33[92m'
CYELLOW2 = '\33[93m'
CBLUE2   = '\33[94m'
CVIOLET2 = '\33[95m'
CBEIGE2  = '\33[96m'
CWHITE2  = '\33[97m'

CGREYBG    = '\33[100m'
CREDBG2    = '\33[101m'
CGREENBG2  = '\33[102m'
CYELLOWBG2 = '\33[103m'
CBLUEBG2   = '\33[104m'
CVIOLETBG2 = '\33[105m'
CBEIGEBG2  = '\33[106m'
CWHITEBG2  = '\33[107m'
Here is the code to generate the test: 
x = 0
for i in range(24):
    colors = ""
    for j in range(5):
        code = str(x + j)
        # Show each escape code literally, rendered in its own style, then reset.
        colors = colors + "\33[" + code + "m\\33[" + code + "m\033[0m "
    print(colors)
    x += 5
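
The start-string/end-string pattern above can also be folded into a small helper so the reset code is never forgotten. This is a minimal sketch: the name colorize and the constant RESET are made up for illustration, and whether a given code renders as expected depends on the terminal.

```python
RESET = "\033[0m"


def colorize(text, code):
    """Wrap `text` in the escape sequence for numeric `code`, then reset styling."""
    return "\033[{}m{}{}".format(code, text, RESET)


# 91 is the bright red code used for CRED in the first example above.
print(colorize("Error, does not compute!", 91))

# repr() shows the raw escape sequences instead of rendering them.
print(repr(colorize("ok", 32)))  # '\x1b[32mok\x1b[0m'
```

Passing the numeric code rather than a prebuilt string keeps the helper usable with any of the codes listed in the collection.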


Supplement 
Custom color schemes for URxvt and Bash

[ Py DS ] Ch3 - Data Manipulation with Pandas (Part5)

Source From  Here   Pivot Tables   We have seen how the  GroupBy  abstraction lets us explore relationships within a dataset. A pivot ta...