Sunday, August 19, 2018

[ Python FAQ ] Pandas read_csv from URL

Source From Here 
Question 
I am using Python 3.4 with IPython and have the following code, which fails to read a CSV file from the given URL:
    import pandas as pd
    import requests

    url = "https://github.com/cs109/2014_data/blob/master/countries.csv"
    s = requests.get(url).content
    c = pd.read_csv(s)
I get the following error:
"Expected file path name or file-like object, got <class 'bytes'> type"

How can I fix this? 

How-To 
As the error suggests, pandas.read_csv expects a file path or a file-like object as its first argument, but requests.get(url).content returns raw bytes. To read CSV data that is already in memory, decode the bytes to a string and wrap it in io.StringIO (Python 3.x) or StringIO.StringIO (Python 2.x). There is a second problem: the URL https://github.com/cs109/2014_data/blob/master/countries.csv returns the GitHub HTML page for the file, not the CSV itself. Use the URL behind the Raw link on the GitHub page instead, which is https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv
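    import io

    import pandas as pd
    import requests

    # Use the raw URL so the response body is CSV text, not an HTML page
    url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
    s = requests.get(url).content  # response body as bytes
    # Decode the bytes to str and wrap it in a file-like object for read_csv
    c = pd.read_csv(io.StringIO(s.decode('utf-8')))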
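The blob-to-raw URL rewrite described above can also be done in code. The following is a minimal sketch, not an official GitHub API: the function name `github_blob_to_raw` is hypothetical, and the URL pattern is inferred only from the two URLs shown in this post, so it may not cover every kind of GitHub link.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Rewrite a github.com "blob" page URL to its raw.githubusercontent.com form.

    Assumes the path looks like <user>/<repo>/blob/<branch>/<file path>.
    URLs that do not match this shape are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    # Drop the "blob" segment and keep user, repo, branch, and file path
    raw_path = "/" + "/".join(segments[:2] + segments[3:])
    return urlunsplit((parts.scheme, "raw.githubusercontent.com", raw_path, "", ""))
```

Calling `github_blob_to_raw` on the URL from the question should give the raw URL used in the answer, which can then be fetched as CSV.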
As of pandas 0.19.2, you can pass the URL directly to pd.read_csv, so requests and io.StringIO are no longer needed.
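A minimal sketch of the direct approach, assuming a pandas version that accepts https URLs. The names `RAW_URL` and `load_csv` are hypothetical and used only for illustration; the raw URL is the one given in this post.

```python
import pandas as pd

# Raw CSV URL from this post (not the github.com "blob" page URL)
RAW_URL = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"


def load_csv(source: str = RAW_URL) -> pd.DataFrame:
    """Load a CSV into a DataFrame from a URL or a local file path."""
    # read_csv fetches and parses the URL itself; no requests or StringIO step
    return pd.read_csv(source)
```

Calling `load_csv()` with no arguments downloads the file from `RAW_URL`, so it needs network access. The same function also accepts a local file path, which is handy for trying it offline.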
