
How can I remove rows or columns containing None from a pandas DataFrame?
Pandas is very useful for handling tabular data.
Tabular data sometimes contains None values.
In that case, we often want to remove the rows that have None in a specific column.
So how can we remove None?
Today I will introduce "How to remove None from a pandas DataFrame".
How to remove None from a pandas DataFrame


To remove None data, use the dropna() method.
As its name suggests, dropna() drops missing values; pandas treats None as missing and shows it as NaN in numeric columns.
We can use it as shown below.
import pandas as pd
data_list1 = [
[1,2,None],
[2,None,4],
[None,4,5],
[4,5,6]
]
col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)
# c1 c2 c3
# 0 1.0 2.0 NaN
# 1 2.0 NaN 4.0
# 2 NaN 4.0 5.0
# 3 4.0 5.0 6.0
df2 = df1.dropna()
print(df2)
# c1 c2 c3
# 3 4.0 5.0 6.0
Using dropna(), we extracted only the rows that do not contain None.
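Before dropping anything, it can help to see how many values are missing and to confirm that dropna() does not change the original DataFrame. The following is a minimal sketch under that assumption; the variable names (df, missing_per_column, cleaned) are made up for illustration and are not part of the example above.

```python
import pandas as pd

# Same data as the example above; None is stored as NaN in numeric columns.
df = pd.DataFrame(
    [[1, 2, None], [2, None, 4], [None, 4, 5], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# Count the missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# dropna() returns a new DataFrame; df itself keeps all four rows.
cleaned = df.dropna()
print(len(df), len(cleaned))
```

Because the result is a new object, assign it to a variable (as df2 = df1.dropna() does above) if you want to keep working with the cleaned data.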
Then how can we handle more complex cases?
None in specific columns


So far, we removed every row that has None in any column.
Then how can we check for None only in specific columns?
To restrict dropna() to certain columns, use the subset parameter.
Pass the column names to subset as shown below.
df3 = df1.dropna(subset=["c1","c2"])
print(df3)
# c1 c2 c3
# 0 1.0 2.0 NaN
# 3 4.0 5.0 6.0
Now it removed the rows that contain None in column c1 or c2.
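One way to picture what subset does is to build the same filter by hand with a boolean mask. This is only a sketch for comparison, not how pandas implements dropna() internally; the names mask, by_mask, and by_subset are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [2, None, 4], [None, 4, 5], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# True for rows where both c1 and c2 hold a value (are not NaN).
mask = df[["c1", "c2"]].notna().all(axis=1)
by_mask = df[mask]

# The dropna() call from the example above.
by_subset = df.dropna(subset=["c1", "c2"])

# Both approaches keep the same rows (index 0 and 3).
print(by_mask.equals(by_subset))
```

The subset form is shorter, while the mask form is handy when the condition needs to be combined with other filters.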
None in all columns


So how can we remove only the rows that have None in all columns?
In this case, use how="all".
With how="all", dropna() removes only the rows in which every column is None; the default, how="any", removes rows with at least one None.
data_list1 = [
[1,2,None],
[2,None,4],
[None,None,None],
[4,5,6]
]
col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)
# c1 c2 c3
# 0 1.0 2.0 NaN
# 1 2.0 NaN 4.0
# 2 NaN NaN NaN
# 3 4.0 5.0 6.0
df2 = df1.dropna()
print(df2)
# c1 c2 c3
# 3 4.0 5.0 6.0
df4 = df1.dropna(how="all")
print(df4)
# c1 c2 c3
# 0 1.0 2.0 NaN
# 1 2.0 NaN 4.0
# 3 4.0 5.0 6.0
Remove columns that have None


Using dropna(), we removed rows that have None.
Then how can we drop columns?
To remove columns, use the axis=1 option.
data_list1 = [
[1,2,None],
[2,None,4],
[3,4,5],
[4,5,6]
]
col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)
# c1 c2 c3
# 0 1 2.0 NaN
# 1 2 NaN 4.0
# 2 3 4.0 5.0
# 3 4 5.0 6.0
df5 = df1.dropna(axis=1)
print(df5)
# c1
# 0 1
# 1 2
# 2 3
# 3 4
Now it removed the columns that contain None.
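The how option also applies when dropping columns. As a small sketch with made-up data (the DataFrame and the names any_missing and all_missing are assumptions for illustration), the following compares the default behavior with how="all", and uses axis="columns", which pandas accepts as an alias for axis=1.

```python
import pandas as pd

# c1 is complete, c2 has one None, and c3 is entirely None.
df = pd.DataFrame(
    [[1, 2, None], [2, None, None], [3, 4, None], [4, 5, None]],
    columns=["c1", "c2", "c3"],
)

# Default (how="any"): drop every column that has at least one None.
any_missing = df.dropna(axis="columns")
print(list(any_missing.columns))

# how="all": drop only the columns that are None in every row.
all_missing = df.dropna(axis="columns", how="all")
print(list(all_missing.columns))
```

Here the first call keeps only c1, and the second keeps c1 and c2 because c2 still has some values.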
Conclusion


Today I described "How to remove None from a pandas DataFrame".
To remove None, we can use dropna().
We can also use these options.
- Filter by specific columns: subset=["column name"]
- Remove rows that have None in all columns: how="all"
- Remove columns: axis=1
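The options above can also be combined. The sketch below uses hypothetical data and variable names to show subset together with how="all", which drops a row only when every listed column is None.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [None, None, 4], [None, 4, 5], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# Drop rows where c1 OR c2 is None (default how="any").
either_missing = df.dropna(subset=["c1", "c2"])
print(either_missing.index.tolist())

# Drop rows only where c1 AND c2 are both None.
both_missing = df.dropna(subset=["c1", "c2"], how="all")
print(both_missing.index.tolist())
```

With this data, the first call keeps rows 0 and 3, and the second keeps rows 0, 2, and 3.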



It is useful, so I'd like to remember it.










