Reindexing in Pandas DataFrame - GeeksforGeeks (2024)

Reindexing in Pandas conforms a DataFrame to a new set of row or column labels: it can reorder existing labels, drop labels that are left out, and introduce new ones. The index is the label structure attached to a pandas Series or DataFrame. Let us see how to reindex the rows and columns of a Pandas DataFrame.

Reindexing the Rows

One or more rows can be reindexed with the reindex() method. Labels in the new index that are not present in the DataFrame get rows filled with NaN by default.

Example #1:

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a dataframe of random values of array
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

print(df1)

print('\n\nDataframe after reindexing rows: \n',
      df1.reindex(['B', 'D', 'A', 'C', 'E']))

Output:

a b c d e
A 0.129087 0.445892 0.898532 0.892862 0.760018
B 0.635785 0.380769 0.757578 0.158638 0.568341
C 0.713786 0.069223 0.011263 0.166751 0.960632
D 0.913553 0.676715 0.141932 0.202201 0.346274
E 0.050204 0.132140 0.371349 0.633203 0.791738

Dataframe after reindexing rows:
a b c d e
B 0.635785 0.380769 0.757578 0.158638 0.568341
D 0.913553 0.676715 0.141932 0.202201 0.346274
A 0.129087 0.445892 0.898532 0.892862 0.760018
C 0.713786 0.069223 0.011263 0.166751 0.960632
E 0.050204 0.132140 0.371349 0.633203 0.791738
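
The output above shows each row keeping its own values while the order changes. A minimal sketch of the same idea with fixed values, which also checks that reindex() returns a new DataFrame and leaves the original untouched. The frame `df` and its values are illustrative assumptions, not taken from the example above.

```python
import pandas as pd

# Small deterministic frame (illustrative values, not from the article)
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]}, index=['A', 'B', 'C'])

# reindex() returns a new DataFrame arranged in the requested label order
reordered = df.reindex(['C', 'A', 'B'])
print(reordered)

# The original frame keeps its index; only the returned copy is reordered
print(list(df.index))         # ['A', 'B', 'C']
print(list(reordered.index))  # ['C', 'A', 'B']

# Data travels with its label: row 'C' still holds 3 and 6
print(reordered.loc['C'].tolist())  # [3, 6]
```

To keep the reordered result, assign it to a variable as done here with `reordered`.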

Example #2:

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']
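
To illustrate the NaN behaviour described for row reindexing, here is a minimal sketch that reuses the `column` and `index` lists. The new labels 'U' and 'Z', the name `new_index`, and the seeded random generator are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# Seeded generator so the run is repeatable (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.random((5, 5)), columns=column, index=index)

# 'U' and 'Z' are hypothetical labels that do not exist in df1
new_index = ['U', 'A', 'B', 'C', 'Z']
result = df1.reindex(new_index)
print(result)

# Rows for unknown labels are entirely NaN
print(result.loc['U'].isna().all())          # True

# Rows for known labels keep their original values
print(result.loc['A'].equals(df1.loc['A']))  # True
```

Rows 'D' and 'E' are absent from `new_index`, so they do not appear in the result.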

Reindexing the columns using the axis keyword

One or more columns can be reindexed with the reindex() method by specifying the axis to reindex. Labels in the new index that are not present in the DataFrame get columns filled with NaN by default.

Example #1:

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a dataframe of random values of array
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create the new index for columns
column = ['e', 'a', 'b', 'c', 'd']

print(df1.reindex(column, axis='columns'))

Output:

 e a b c d
A 0.592727 0.337282 0.686650 0.916076 0.094920
B 0.235794 0.030831 0.286443 0.705674 0.701629
C 0.882894 0.299608 0.476976 0.137256 0.306690
D 0.758996 0.711712 0.961684 0.235051 0.315928
E 0.911693 0.436031 0.822632 0.477767 0.778608

Example #2:

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a dataframe of random values of array
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns'))

Output:

 a b c g h
A 0.390460 0.795073 0.369077 NaN NaN
B 0.855556 0.856980 0.132092 NaN NaN
C 0.662565 0.230554 0.215567 NaN NaN
D 0.712128 0.424346 0.813452 NaN NaN
E 0.543142 0.847750 0.168018 NaN NaN
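
Besides the axis keyword, reindex() also accepts the labels through its `columns` and `index` keywords. A minimal sketch, assuming a small frame `df` with illustrative values; the labels 'C' and 'g' are hypothetical and not present in it.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]}, index=['A', 'B'])

# Two equivalent ways to reindex the columns
by_axis = df.reindex(['c', 'a'], axis='columns')
by_keyword = df.reindex(columns=['c', 'a'])
print(by_axis.equals(by_keyword))  # True

# Rows and columns can be reindexed in one call;
# 'C' (row) and 'g' (column) are hypothetical labels not present in df
both = df.reindex(index=['B', 'A', 'C'], columns=['a', 'g'])
print(both)
```

The keyword form can read more clearly when both axes change at once, since each list is named for the axis it applies to.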

Replacing the missing values

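Code #1: The values for labels introduced by reindexing can be set by passing a value to the fill_value keyword. It is used in place of the default NaN for the new labels; NaN values already present in the original data are left as they are.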

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a dataframe of random values of array
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns', fill_value=1.5))

Output:

 a b c g h
A 0.945594 0.492603 0.705738 1.5 1.5
B 0.794345 0.068308 0.017898 1.5 1.5
C 0.622142 0.880565 0.035528 1.5 1.5
D 0.577288 0.934063 0.824655 1.5 1.5
E 0.636026 0.316232 0.244597 1.5 1.5
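
One detail worth checking is what fill_value does to missing values that were in the data before reindexing. A minimal sketch, assuming a small frame `df` whose column 'a' already contains a NaN; the column label 'g' is hypothetical.

```python
import numpy as np
import pandas as pd

# Column 'a' already contains a missing value before reindexing
df = pd.DataFrame({'a': [1.0, np.nan], 'b': [3.0, 4.0]}, index=['A', 'B'])

# 'g' is a hypothetical column label that does not exist in df
result = df.reindex(['a', 'g'], axis='columns', fill_value=1.5)
print(result)

# The newly introduced column is filled with 1.5
print(result['g'].tolist())              # [1.5, 1.5]

# The NaN that was already in column 'a' is not replaced
print(result.loc['B', 'a'])              # nan

# fillna() handles NaN values that were already in the data
print(result.fillna(1.5).loc['B', 'a'])  # 1.5
```

In this sketch fill_value only applies to the cells that reindex() creates, so pre-existing gaps need a separate step such as fillna().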

Code #2: Replacing the missing data with a string.

Python3

# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a dataframe of random values of array
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns', fill_value='data missing'))

Output:

 a b c g h
A 0.227380 0.809179 0.879175 data missing data missing
B 0.212493 0.335610 0.306006 data missing data missing
C 0.406346 0.852985 0.422182 data missing data missing
D 0.145821 0.648285 0.004842 data missing data missing
E 0.002305 0.694541 0.657602 data missing data missing
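
A string placeholder has a side effect on column types: the new columns hold text rather than floats. A minimal sketch, assuming a small frame `df` with illustrative values and a hypothetical new column 'g'; the exact dtype name printed for 'g' may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.3, 0.4]}, index=['A', 'B'])

# 'g' is a hypothetical column label that does not exist in df
result = df.reindex(['a', 'g'], axis='columns', fill_value='data missing')

# Inspect the dtypes: 'a' stays float64, 'g' now holds strings
print(result.dtypes)

# The placeholder column is no longer numeric; converting it back
# with errors='coerce' turns the text into NaN again
as_numbers = pd.to_numeric(result['g'], errors='coerce')
print(as_numbers.isna().all())  # True
```

A numeric fill_value keeps the new columns usable in arithmetic, so a string placeholder is best kept for display purposes.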

Last Updated : 31 Jul, 2023
