Wednesday, September 16, 2020

[ Python FAQ ] Shuffling Several DataFrames Together

 Source From Here

Question
Is it possible to shuffle several DataFrames together?

For example I have a DataFrame df1 and a DataFrame df2. I want to shuffle the rows randomly, but for both DataFrames in the same way.

Example
    import pandas as pd

    # Three feature columns; the column names must be separated by commas
    df1 = pd.DataFrame(data=[[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['f1', 'f2', 'f3'])
    # One label per row of df1
    df2 = pd.DataFrame({'label': [0, 1, 0]})


HowTo
One approach is to reindex both DataFrames with the same permutation, obtained by applying numpy.random.permutation to the index. This requires that both DataFrames have the same length and the same unique index values:
    import numpy as np

    # One random ordering of the index labels, applied to both frames
    new_idx = np.random.permutation(df1.index)
    df1 = df1.reindex(new_idx)
    df2 = df2.reindex(new_idx)
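
To check that the rows really stay paired, here is a minimal self-contained sketch of the same reindex approach. The seeded generator and the names rng, shuffled1 and shuffled2 are assumptions added for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['f1', 'f2', 'f3'])
df2 = pd.DataFrame({'label': [0, 1, 0]})

# Seeded generator so the shuffle is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
new_idx = rng.permutation(df1.index.to_numpy())

# Apply the same label order to both frames.
shuffled1 = df1.reindex(new_idx)
shuffled2 = df2.reindex(new_idx)

# Both frames now share the same index order, so joining them
# side by side shows each feature row next to its original label.
print(shuffled1.join(shuffled2))
```

Keeping the results in new variables leaves df1 and df2 untouched, which makes it easy to compare the order before and after the shuffle.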


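Alternative with reindex_axis (older pandas only: reindex_axis was deprecated in pandas 0.21 and removed in 1.0, so use reindex on current versions):
    # new_idx is the permutation of df1.index computed above
    print(df1.reindex_axis(new_idx, axis=0))
    print(df2.reindex_axis(new_idx, axis=0))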
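
If the two DataFrames have the same number of rows but do not share index values, one option is to shuffle by position rather than by label. The sketch below assumes that row i of each frame belongs together; the helper name shuffle_together and its seed parameter are hypothetical, introduced only for this example.

```python
import numpy as np
import pandas as pd


def shuffle_together(*frames, seed=None):
    """Reorder the rows of every frame with one shared random permutation."""
    n = len(frames[0])
    if any(len(f) != n for f in frames):
        raise ValueError("all frames must have the same number of rows")
    # A permutation of row positions 0..n-1, independent of the index labels.
    perm = np.random.default_rng(seed).permutation(n)
    return [f.iloc[perm] for f in frames]


# The frames deliberately use different index labels here.
df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   columns=['f1', 'f2', 'f3'], index=['a', 'b', 'c'])
df2 = pd.DataFrame({'label': [0, 1, 0]})

s1, s2 = shuffle_together(df1, df2, seed=42)
print(s1)
print(s2)
```

Because iloc selects by position, this works regardless of the index labels, and each frame keeps its own labels after the shuffle.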




