How to Select Rows by Column Value in Pandas?
In a Pandas DataFrame, you can select rows based on column values either with a boolean mask passed to df.loc or with the df.query method.
The examples below use the following DataFrame:
import pandas as pd
df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})
Select using df.loc:
df2 = df.loc[df['colB'] == 4]
print(df2)
'''Output
colA colB
1 2 4
'''
df2 = df.loc[df['colB'] >= 4]
print(df2)
'''Output
colA colB
1 2 4
2 3 8
'''
# Multiple Boolean Conditions
df2 = df.loc[(df['colB'] == 2) | (df['colA'] == 2)]
print(df2)
'''Output
colA colB
0 1 2
1 2 4
'''
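The OR example above has an AND counterpart. The following is a minimal sketch, assuming the same DataFrame; the variable name `both` is illustrative only. Each comparison needs its own parentheses because `&` and `|` bind more tightly than `==` or `>=` in Python.

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

# AND: keep rows where both conditions hold.
# Parentheses around each comparison are required.
both = df.loc[(df['colA'] >= 2) & (df['colB'] < 8)]
print(both)
```

With this data, only the row at index 1 satisfies both conditions.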
# Use isin to keep rows whose value appears in a list of allowed values
df2 = df.loc[df['colB'].isin([2,3,4])]
print(df2)
'''Output
colA colB
0 1 2
1 2 4
'''
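To exclude a set of values instead, the mask returned by isin can be inverted with `~`. This is a small sketch on the same DataFrame; the name `excluded` is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

# ~ flips the boolean mask, so this keeps rows whose colB is NOT 2 or 4
excluded = df.loc[~df['colB'].isin([2, 4])]
print(excluded)
```

Here only the row at index 2 (colB equal to 8) remains.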
Alternatively, use the df.query method, which takes the condition as a string:
df2 = df.query('colB==2')
print(df2)
'''Output
colA colB
0 1 2
'''
df2 = df.query('colB==2 | colB==8')
print(df2)
'''Output
colA colB
0 1 2
2 3 8
'''
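When the comparison value lives in a Python variable, df.query can reference it by prefixing the name with `@`. The sketch below assumes the same DataFrame; `threshold`, `allowed`, `above`, and `listed` are hypothetical names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [2, 4, 8]})

threshold = 4      # hypothetical cutoff
allowed = [2, 8]   # hypothetical list of accepted values

# @name pulls a local Python variable into the query string
above = df.query('colB >= @threshold')

# 'in' inside a query plays the same role as isin
listed = df.query('colB in @allowed')

print(above)
print(listed)
```

This avoids building the query string by hand with string formatting, which is easy to get wrong for strings and lists.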