How to remove None from a pandas DataFrame

How can I remove rows or columns that contain None from a pandas DataFrame?

pandas is very useful for handling tabular data.

Tabular data sometimes contains None values.

In that case, we often want to remove the rows that have None in specific columns.

So how can we remove None?

Today I will explain how to remove None from a pandas DataFrame.

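How to remove None from a pandas DataFrame

To remove None data, use the dropna() method.

As its name suggests, dropna() drops missing values (NA). pandas stores None in a numeric column as NaN, which is why the output below shows NaN.

We can use it as shown below.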

import pandas as pd

data_list1 = [
[1,2,None],
[2,None,4],
[None,4,5],
[4,5,6]
]
col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)

#     c1   c2   c3
# 0  1.0  2.0  NaN
# 1  2.0  NaN  4.0
# 2  NaN  4.0  5.0
# 3  4.0  5.0  6.0

df2 = df1.dropna()

print(df2)

#     c1   c2   c3
# 3  4.0  5.0  6.0

Using dropna(), we extracted only the rows that contain no None.

Then how can we handle more complex cases?
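Two details of the example above are easy to miss, so here is a small sketch of them: dropna() returns a new DataFrame and leaves the original unchanged, and the surviving rows keep their original index labels. The names df_clean and df_reindexed are made up for this illustration, and renumbering with reset_index() is an optional extra step, not something the example above requires.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [2, None, 4], [None, 4, 5], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# dropna() returns a new DataFrame; df itself is not modified.
df_clean = df.dropna()
print(len(df), len(df_clean))  # 4 1

# The remaining row keeps its original index label (3).
print(list(df_clean.index))  # [3]

# Optional: renumber the rows from 0 (hypothetical follow-up step).
df_reindexed = df_clean.reset_index(drop=True)
print(list(df_reindexed.index))  # [0]
```

If the old index labels are still needed later (for example, to trace rows back to the source data), skip the reset_index() step.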

None in specific columns

We could remove every row that has None.

Then how can we check for None only in specific columns?

To restrict dropna() to certain columns, use the subset parameter.

Pass a list of column names to subset as shown below.

df3 = df1.dropna(subset=["c1","c2"])
print(df3)

#     c1   c2   c3
# 0  1.0  2.0  NaN
# 3  4.0  5.0  6.0

Now the rows that contain None in column c1 or c2 have been removed. Row 0 remains because its only None is in c3, which is not listed in subset.
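One way to read subset is as a filter that keeps only the rows where every listed column has a value. The sketch below compares dropna(subset=...) with an explicit boolean mask built from notna(). The mask version is only an illustration of the idea, not how pandas implements it, and the names by_subset, mask and by_mask are made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [2, None, 4], [None, 4, 5], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# Drop rows that have a missing value in c1 or c2.
by_subset = df.dropna(subset=["c1", "c2"])

# Same idea written as a mask: keep rows where c1 AND c2 are both present.
mask = df[["c1", "c2"]].notna().all(axis=1)
by_mask = df[mask]

print(list(by_subset.index))      # [0, 3]
print(by_subset.equals(by_mask))  # True
```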

None in all columns

So how can we remove only the rows that have None in every column?

In this case, use how="all".

With how="all", a row is dropped only when all of its values are None. The default, how="any", drops a row if any of its values is None.

data_list1 = [
[1,2,None],
[2,None,4],
[None,None,None],
[4,5,6]
]

col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)

#     c1   c2   c3
# 0  1.0  2.0  NaN
# 1  2.0  NaN  4.0
# 2  NaN  NaN  NaN
# 3  4.0  5.0  6.0

df2 = df1.dropna()
print(df2)

#     c1   c2   c3
# 3  4.0  5.0  6.0

df4 = df1.dropna(how="all")
print(df4)

#     c1   c2   c3
# 0  1.0  2.0  NaN
# 1  2.0  NaN  4.0
# 3  4.0  5.0  6.0
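As a side-by-side sketch, the following compares how="any" (the default), how="all", and the thresh parameter, which sits between the two by keeping rows that have at least a given number of non-missing values. The data here is a made-up variation of the example above, chosen so that the three calls give different results.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [None, None, 3], [None, None, None], [4, 5, 6]],
    columns=["c1", "c2", "c3"],
)

# how="any" (the default): drop a row if any value is missing.
any_missing = df.dropna(how="any")
print(list(any_missing.index))  # [3]

# how="all": drop a row only if every value is missing.
all_missing = df.dropna(how="all")
print(list(all_missing.index))  # [0, 1, 3]

# thresh=2: keep rows that have at least 2 non-missing values.
at_least_two = df.dropna(thresh=2)
print(list(at_least_two.index))  # [0, 3]
```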

Remove columns that have None

Using dropna(), we could remove rows that have None.

Then how can we drop columns instead?

To remove columns, use the axis=1 option.

data_list1 = [
[1,2,None],
[2,None,4],
[3,4,5],
[4,5,6]
]
col_list1 = ["c1","c2","c3"]
df1 = pd.DataFrame(data=data_list1, columns=col_list1)
print(df1)

#    c1   c2   c3
# 0   1  2.0  NaN
# 1   2  NaN  4.0
# 2   3  4.0  5.0
# 3   4  5.0  6.0

df5 = df1.dropna(axis=1)
print(df5)

#    c1
# 0   1
# 1   2
# 2   3
# 3   4

Now the columns that contain None have been removed. Only c1 remains because c2 and c3 each contain at least one None.
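The axis=1 option can be combined with how="all" in the same way as for rows. The sketch below uses made-up data in which c3 is entirely None, so the difference between the two settings is visible. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, None], [2, None, None], [3, 4, None], [4, 5, None]],
    columns=["c1", "c2", "c3"],
)

# axis=1 with the default how="any": drop every column that has a missing value.
cols_any = df.dropna(axis=1)
print(list(cols_any.columns))  # ['c1']

# axis=1 with how="all": drop only the columns that are entirely missing.
cols_all = df.dropna(axis=1, how="all")
print(list(cols_all.columns))  # ['c1', 'c2']

# axis="columns" is an alias for axis=1.
cols_alias = df.dropna(axis="columns")
print(cols_alias.equals(cols_any))  # True
```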

Conclusion

Today I described how to remove None from a pandas DataFrame.

To remove None, we can use dropna().

We can also use these options.

  • Filter by specific columns: subset=["column name"]
  • Remove rows that have None in all columns: how="all"
  • Remove columns: axis=1

pandas.DataFrame.dropna — pandas 1.2.4 documentation

It is useful, so it is worth remembering.

Author

karasan, System engineer

Mid-career engineer (AI, Data science, Salesforce, etc.).
Good at Python and SQL.
