Source From Here
Question
Is it possible to shuffle several DataFrames together?
For example I have a DataFrame df1 and a DataFrame df2. I want to shuffle the rows randomly, but for both DataFrames in the same way.
Example
- import pandas as pd
- df1 = pd.DataFrame(data=[[1,2,3],[4,5,6], [7,8,9]], columns=['f1', 'f2', 'f3'])
- df2 = pd.DataFrame({'label':[0,1,0]})
HowTo
I think you can reindex both DataFrames with the same permutation of the index, generated with numpy.random.permutation. This requires both DataFrames to have the same length and the same unique index values:
- import numpy as np
- new_idx = np.random.permutation(df1.index)
- df1 = df1.reindex(new_idx)
- df2 = df2.reindex(new_idx)
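The reindex approach relies on unique, matching index labels. Where that cannot be guaranteed, a positional shuffle is one possible workaround. The following is a minimal sketch, not part of the original answer: the helper name shuffle_together and its seed parameter are hypothetical, and it assumes only that all DataFrames have the same number of rows.

```python
import numpy as np
import pandas as pd


def shuffle_together(*frames, seed=None):
    """Hypothetical helper: shuffle the rows of several DataFrames
    with one shared positional permutation."""
    if not frames:
        return []
    n_rows = len(frames[0])
    if any(len(frame) != n_rows for frame in frames):
        raise ValueError("all DataFrames must have the same number of rows")
    rng = np.random.default_rng(seed)
    # Permute row positions (0..n_rows-1), not index labels,
    # so duplicate or mismatched labels do not matter.
    order = rng.permutation(n_rows)
    return [frame.iloc[order] for frame in frames]


df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["f1", "f2", "f3"])
df2 = pd.DataFrame({"label": [0, 1, 0]})

shuffled1, shuffled2 = shuffle_together(df1, df2, seed=0)
print(shuffled1)
print(shuffled2)
```

Passing a seed makes the shuffle reproducible. The original index labels are kept, so the rows of both results can still be matched up afterwards.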
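Alternative with reindex_axis (older pandas only: the method was deprecated in pandas 0.21 and removed in 1.0, where reindex(new_idx, axis=0) is the equivalent call):
- print (df1.reindex_axis(new_idx, axis=0))
- print (df2.reindex_axis(new_idx, axis=0))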
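On pandas versions where reindex_axis is no longer available, a similar result can probably be reached with DataFrame.sample. This is a sketch rather than the original author's method: it shuffles the first DataFrame with sample(frac=1) and then selects the rows of the second one in the same label order. It makes the same assumption as above, that both DataFrames share the same unique index values. The names shuffled1 and shuffled2 and the random_state value are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["f1", "f2", "f3"])
df2 = pd.DataFrame({"label": [0, 1, 0]})

# frac=1 returns every row exactly once, in random order.
shuffled1 = df1.sample(frac=1, random_state=42)

# Select df2's rows by the shuffled labels so both frames stay aligned.
shuffled2 = df2.loc[shuffled1.index]

print(shuffled1)
print(shuffled2)
```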