I was given a dataset by my professor, and one of the questions is: "Find the number of missing values, 99999, in each column and list them." How would I do this in Python? I have multiple columns, all with numerical data. The missing values in the dataset are denoted by 99999 instead of NA as usual.
I don't have much experience with Python and have tried many things to no avail.
Use a lambda function to flag every occurrence of 99999 in a column, then use sum() to get the total number of occurrences per column:
# import pandas package
import pandas as pd
# load dataset with pandas, for example if you have a csv:
df = pd.read_csv("YOUR_FILEPATH_HERE")
# print out the number of occurrences of 99999 in each column
print(df.apply(lambda x: (x == 99999).sum()))
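The same count can be tried without a file. The following is a minimal sketch on a made-up DataFrame (the column names `a`, `b`, `c` and the values are placeholders, not taken from the question's dataset). Comparing the whole frame to the sentinel gives a boolean frame, and summing it counts the True cells per column:

```python
import pandas as pd

MISSING = 99999  # sentinel used for missing data in the question

# Hypothetical sample data; the column names are placeholders
df = pd.DataFrame({
    "a": [1, MISSING, 3],
    "b": [MISSING, MISSING, 6],
    "c": [7, 8, 9],
})

# Boolean mask of sentinel cells, summed column-wise
missing_counts = (df == MISSING).sum()
print(missing_counts)

# List only the columns that contain at least one missing value
columns_with_missing = missing_counts[missing_counts > 0]
print(columns_with_missing)
```

This should give the same counts as the `apply` version above; the lambda is simply not needed when the comparison is applied to the whole frame.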
A non-pandas answer:
NA = 99999
data = [
  [  1, NA, 3 ],
  [ NA, NA, 6 ],
]
NAs = [0] * len(data[0])  # create an array of counters; 1 for each column
for row in data:
  for x,value in enumerate(row):
    if value == NA:
      NAs[x] += 1
print( NAs )
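A more compact variant of the loop above, as a sketch: `zip(*data)` transposes the rows into columns, and each column tuple can count the sentinel directly. It assumes every row has the same length, and the sample data is the same made-up data as above.

```python
NA = 99999  # sentinel used for missing data

# Same hypothetical sample data as above: two rows, three columns
data = [
    [1, NA, 3],
    [NA, NA, 6],
]

# zip(*data) yields one tuple per column; count the sentinel in each
na_counts = [column.count(NA) for column in zip(*data)]
print(na_counts)
```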
Another pandas option is to convert 99999 to NaN first and then count the NaN values per column:
import numpy as np
# Replace the missing-value code 99999 with pandas' default missing value, NaN
df = df.replace(99999, np.nan)
# Flag the missing values in each cell (True where the value is NaN)
missing_values = df.isnull()
# Count the missing values in each column
print(missing_values.sum())
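If the data comes from a CSV file, a possible shortcut is to let `read_csv` do the conversion while loading, through its `na_values` parameter. The sketch below uses an in-memory string in place of the real file; the contents and the column names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the real file
csv_text = "a,b,c\n1,99999,3\n99999,99999,6\n"

# na_values tells read_csv to treat cells reading "99999" as missing
df = pd.read_csv(io.StringIO(csv_text), na_values=["99999"])

# Count the NaN cells in each column
na_counts = df.isna().sum()
print(na_counts)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path. Converting to NaN this way also lets other pandas tools such as `dropna` and `fillna` recognize the missing values.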