How to Select Rows by Column Value in Pandas?

In a Pandas DataFrame, rows can be selected by column value either with a boolean mask passed to df.loc or with an expression string passed to df.query.

The examples below use the following DataFrame:

import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

Select using df.loc:

df2 = df.loc[df['colB'] == 4]
print(df2)
'''Output
   colA  colB
1     2     4
'''

df2 = df.loc[df['colB'] >= 4]
print(df2)
'''Output
   colA  colB
1     2     4
2     3     8
'''
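
df.loc also accepts a column selector after the row mask, which helps when only some columns of the matching rows are needed. The sketch below reuses the example DataFrame; the variable names col_a and subset are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

# Rows where colB >= 4, keeping only colA (a single label returns a Series)
col_a = df.loc[df['colB'] >= 4, 'colA']
print(col_a)

# Same rows, but a list of labels returns a DataFrame
subset = df.loc[df['colB'] >= 4, ['colA']]
print(subset)
```

Passing a single label gives back a Series, while a list of labels keeps the result as a DataFrame.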

# Multiple Boolean Conditions
df2 = df.loc[(df['colB'] == 2) | (df['colA'] == 2)]
print(df2)
'''Output
   colA  colB
0     1     2
1     2     4
'''
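
Besides | (or), masks can be combined with & (and) and inverted with ~ (not). A minimal sketch on the same DataFrame follows; the names both and not_four are illustrative. Each comparison is wrapped in its own parentheses because these operators bind more tightly than == or >=.

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

# AND: both conditions must hold for a row to be kept
both = df.loc[(df['colA'] >= 2) & (df['colB'] < 8)]
print(both)

# NOT: ~ flips the mask, keeping rows where colB is not 4
not_four = df.loc[~(df['colB'] == 4)]
print(not_four)
```

Python's own and, or, and not keywords do not work element-wise on a Series and raise a ValueError here, so the symbolic operators are the ones to use.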

# Use isin to check if value is in a list of multiple allowed values
df2 = df.loc[df['colB'].isin([2,3,4])]
print(df2)
'''Output
   colA  colB
0     1     2
1     2     4
'''

Alternatively, use the df.query method:

df2 = df.query('colB==2')
print(df2)
'''Output
   colA  colB
0     1     2
'''

df2 = df.query('colB==2 | colB==8')
print(df2)
'''Output
   colA  colB
0     1     2
2     3     8
'''
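
The query string can also refer to Python variables by prefixing them with @, which avoids building the expression by hand. A short sketch, assuming two hypothetical variables named threshold and allowed:

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

threshold = 4          # hypothetical cutoff value
allowed = [2, 3, 4]    # hypothetical list of accepted values

# @name looks up a variable from the surrounding Python scope
above = df.query('colB >= @threshold')
print(above)

# 'in' inside a query string behaves like Series.isin
in_list = df.query('colB in @allowed')
print(in_list)
```

The first query keeps the rows at index 1 and 2, and the second keeps the rows at index 0 and 1, matching the equivalent df.loc examples above.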
