
Pandas - How to find which columns of a DataFrame have null values

 

import pandas as pd

df = pd.read_csv('housing.csv')
df.info()

 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13580 entries, 0 to 13579
Data columns (total 21 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Suburb         13580 non-null  object
 1   Address        13580 non-null  object
 2   Rooms          13580 non-null  int64
 3   Type           13580 non-null  object
 4   Price          13580 non-null  float64
 5   Method         13580 non-null  object
 6   SellerG        13580 non-null  object
 7   Date           13580 non-null  object
 8   Distance       13580 non-null  float64
 9   Postcode       13580 non-null  float64
 10  Bedroom2       13580 non-null  float64
 11  Bathroom       13580 non-null  float64
 12  Car            13518 non-null  float64
 13  Landsize       13580 non-null  float64
 14  BuildingArea   7130 non-null   float64
 15  YearBuilt      8205 non-null   float64
 16  CouncilArea    12211 non-null  object
 17  Lattitude      13580 non-null  float64
 18  Longtitude     13580 non-null  float64
 19  Regionname     13580 non-null  object
 20  Propertycount  13580 non-null  float64
dtypes: float64(12), int64(1), object(8)
memory usage: 2.2+ MB

 

df.isnull().any()

 

Suburb           False
Address          False
Rooms            False
Type             False
Price            False
Method           False
SellerG          False
Date             False
Distance         False
Postcode         False
Bedroom2         False
Bathroom         False
Car               True
Landsize         False
BuildingArea      True
YearBuilt         True
CouncilArea       True
Lattitude        False
Longtitude       False
Regionname       False
Propertycount    False
dtype: bool
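
To make the call above easier to follow, here is a minimal sketch on a tiny made-up frame (the name `demo` and its values are assumptions for illustration, not part of housing.csv). `isnull()` returns a boolean frame of the same shape, and `any()` then reduces it down each column by default, or across each row with `axis=1`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, not the housing data
demo = pd.DataFrame({"a": [1.0, 2.0], "b": [np.nan, 3.0]})

mask = demo.isnull()            # boolean frame, True where a value is missing
per_column = mask.any()         # default axis=0: one flag per column
per_row = mask.any(axis=1)      # axis=1: one flag per row

print(per_column)
print(per_row)
```

The per-column result is the same kind of boolean Series shown in the output above, which is why it can be used directly as a mask on `df.columns`.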

 

df.columns

 

Index(['Suburb', 'Address', 'Rooms', 'Type', 'Price', 'Method', 'SellerG',
       'Date', 'Distance', 'Postcode', 'Bedroom2', 'Bathroom', 'Car',
       'Landsize', 'BuildingArea', 'YearBuilt', 'CouncilArea', 'Lattitude',
       'Longtitude', 'Regionname', 'Propertycount'],
      dtype='object')

 

has_null_cols = df.columns[df.isnull().any()].tolist()
has_null_cols
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']

 

df.isnull().any()[df.isnull().any() == True].index.tolist()
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']

 

df.isnull().sum()[df.isnull().sum() > 0].index.tolist()
# ['Car', 'BuildingArea', 'YearBuilt', 'CouncilArea']
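
The `sum()` variant also tells you how many values are missing, not just whether any are. As a rough sketch of how that could be turned into a small summary, the example below uses a made-up three-row frame (`toy`, `null_cols`, and `summary` are hypothetical names, and the values are invented) rather than the real housing.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for housing.csv
toy = pd.DataFrame({
    "Rooms": [2, 3, 4],
    "Car": [1.0, np.nan, 2.0],
    "YearBuilt": [np.nan, np.nan, 1990.0],
})

null_counts = toy.isnull().sum()           # missing values per column
null_cols = null_counts[null_counts > 0]   # keep only columns with at least one null

# Count and percentage of missing values for the affected columns
summary = pd.DataFrame({
    "null_count": null_cols,
    "null_pct": (null_cols / len(toy) * 100).round(1),
})
print(summary)
```

Computing `null_counts` once and reusing it also avoids calling `isnull().sum()` twice, as the one-liner above does.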