About
Kodeclik is an online coding academy for kids and teens to learn real-world programming. Kids are introduced to coding in a fun and exciting way and are challenged to reach higher levels with engaging, high-quality content.
Copyright @ Kodeclik 2025. All rights reserved.
When working with NumPy arrays in Python, some elements of an array can end up as NaN (i.e., Not a Number). For instance, suppose you are given an array of numbers and would like to find the average of the square roots of its elements. We might write a program such as:
Note that this program creates a NumPy array my_array containing six numbers, [1, 4, 9, 16, 25, -36], where we have deliberately included a negative number to illustrate the idea. It then applies the square root function np.sqrt() to each element of the array, storing the results in my_array_sqrts. For the last element (-36), np.sqrt() issues a RuntimeWarning and returns NaN, since the square root of a negative number is not a real number.
The output will be:
The program first prints the array of square roots: the values 1, 2, 3, 4, and 5 for the five positive inputs, followed by a NaN. When it then calculates the mean of all values in my_array_sqrts using np.mean(), the result is also NaN, because NaN propagates through arithmetic: any sum that includes a NaN is itself NaN.
Remember that -36 is itself a perfectly valid number; it is the computation of its square root that produces the NaN.
So not only do we get a NaN, that NaN also causes further problems down the line.
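To make the "problems down the line" concrete, here is a minimal sketch of how a single NaN spreads through later calculations. The array `values` and the other variable names are made up for illustration.

```python
import numpy as np

# Illustrative data: one NaN among otherwise ordinary numbers.
values = np.array([1.0, 2.0, np.nan])

# Any arithmetic that touches NaN yields NaN.
total = np.sum(values)      # the sum is NaN ...
average = np.mean(values)   # ... so sum / count is NaN as well
print(total, average)

# NaN is not equal to anything, including itself,
# so use np.isnan() rather than == to find it.
nan_equals_itself = (np.nan == np.nan)
nan_positions = np.isnan(values)
print(nan_equals_itself)
print(nan_positions)
```

Because comparing with == never finds a NaN, np.isnan() is the usual way to locate these values.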
What we really need is a way to conveniently discard these values when we are computing this average.
The np.nanmean() function is specifically designed to handle arrays containing NaN (Not a Number) values by excluding them from the calculation.
Consider the updated code:
The output is:
Note that we still get the warning in the console and the NaN in the array, but the average computation is no longer affected! The program correctly outputs 3.0, which is the average of the five valid square roots, ignoring the NaN.
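If the console warning is unwanted, one option (not part of the original program, just a sketch) is to wrap the square root in NumPy's np.errstate context manager. This only hides the warning; the NaN is still produced, so np.nanmean() is still needed.

```python
import numpy as np

my_array = np.array([1, 4, 9, 16, 25, -36])

# np.errstate temporarily changes how NumPy reports floating-point errors.
# invalid="ignore" silences the "invalid value encountered" warning,
# but only inside this with-block.
with np.errstate(invalid="ignore"):
    my_array_sqrts = np.sqrt(my_array)

print(my_array_sqrts)               # the last element is still nan
print(np.nanmean(my_array_sqrts))   # average of the five valid roots
```

Silencing the warning is a judgment call: in real data, it is often a useful hint that some inputs were not what you expected.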
Here is a second example:
This example creates an array with explicit NaN values using the “np.nan” notation, and then computes the average by ignoring the NaN values.
The output will be:
In this example we calculate row-wise averages in a 2D array using the axis=1 parameter:
The output is:
This example shows column-wise average calculation using axis=0, which is particularly useful when dealing with datasets that have missing values in different columns.
The output will be:
In all of the above examples, the np.nanmean() function automatically adjusts the denominator in the mean calculation to account for only the non-NaN values, ensuring accurate averages.
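The adjusted denominator can be checked by hand. The sketch below reuses the 1D array from the second example and rebuilds the average from a NaN-ignoring sum and a count of valid entries; the variable names are illustrative.

```python
import numpy as np

arr = np.array([10, np.nan, 20, 30, np.nan, 40])

# np.nansum treats NaN as zero: 10 + 20 + 30 + 40
total = np.nansum(arr)

# Count only the entries that are not NaN (4 of the 6 here).
count = np.count_nonzero(~np.isnan(arr))

manual_mean = total / count
print(total, count, manual_mean)
print(np.nanmean(arr))   # same value as manual_mean
```

Dividing by 4 rather than 6 is what separates this from simply replacing the NaN values with zeros and calling np.mean().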
You might be wondering: how would you find the mean of these arrays without using the np.nanmean() function? Here are a couple of ideas for that!
The first idea uses boolean indexing with ~np.isnan() to create a mask that filters out NaN values. Here is some example code:
This is the most straightforward approach and works well for simple calculations.
The output will be:
The second method uses NumPy's masked array functionality, which is particularly useful in cases like this where we wish to mask (i.e., ignore) NaN values:
The output will be:
For 2D arrays, you can combine a list comprehension with boolean indexing to calculate the mean of each row:
The output will be:
In summary, these methods will produce the same results as np.nanmean() but offer more explicit control over how missing values are handled.
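As a quick sanity check of that claim, the sketch below computes the average of the same 1D array three ways and compares the results. The variable names are illustrative.

```python
import numpy as np

arr = np.array([10, np.nan, 20, 30, np.nan, 40])

reference = np.nanmean(arr)

# Method 1: drop the NaN entries with a boolean mask, then average.
by_indexing = np.mean(arr[~np.isnan(arr)])

# Method 2: hide the NaN entries behind a masked array, then average.
by_masking = np.ma.masked_array(arr, np.isnan(arr)).mean()

print(np.isclose(by_indexing, reference))
print(np.isclose(by_masking, reference))
```

np.isclose() is used instead of == because floating-point results computed along different paths can differ in the last few digits.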
If you liked this blogpost, check out our other NumPy-related blogposts, such as numpy.isnan(), numpy.unique(), and numpy.sum().
import numpy as np
my_array = np.array([1, 4, 9, 16, 25, -36])
my_array_sqrts = np.sqrt(my_array)
print(my_array_sqrts)
print(np.mean(my_array_sqrts))
main.py:4: RuntimeWarning: invalid value encountered in sqrt
my_array_sqrts = np.sqrt(my_array)
[ 1. 2. 3. 4. 5. nan]
nan
import numpy as np
my_array = np.array([1, 4, 9, 16, 25, -36])
my_array_sqrts = np.sqrt(my_array)
print(my_array_sqrts)
print(np.nanmean(my_array_sqrts))
main.py:4: RuntimeWarning: invalid value encountered in sqrt
my_array_sqrts = np.sqrt(my_array)
[ 1. 2. 3. 4. 5. nan]
3.0
import numpy as np
# Example 2: Simple 1D array
arr2 = np.array([10, np.nan, 20, 30, np.nan, 40])
result2 = np.nanmean(arr2)
print(f"1D Array average: {result2}")
1D Array average: 25.0
import numpy as np
# Example 3: 2D array with row-wise average
arr3 = np.array([[10, 20, np.nan],
                 [40, 50, np.nan],
                 [np.nan, 6, np.nan]])
result3 = np.nanmean(arr3, axis=1)
print(f"2D Array row-wise averages: {result3}")
2D Array row-wise averages: [15. 45. 6.]
import numpy as np
arr4 = np.array([[24, 32, 85],
                 [57, np.nan, 16],
                 [8, 17, np.nan],
                 [43, 78, 39]])
result4 = np.nanmean(arr4, axis=0)
print(f"Column-wise averages: {result4}")
Column-wise averages: [33. 42.33333333 46.66666667]
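One edge case worth knowing about: if a whole column (or row) is NaN, there is nothing left to average. The sketch below uses a small made-up array to show that np.nanmean() then returns NaN for that column and issues a RuntimeWarning, which is captured here with Python's warnings module.

```python
import warnings
import numpy as np

# Illustrative data: the first column contains only NaN values.
arr = np.array([[np.nan, 1.0],
                [np.nan, 3.0]])

# Record warnings instead of printing them, so we can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    col_means = np.nanmean(arr, axis=0)

print(col_means)   # NaN for the all-NaN column, 2.0 for the other
print([w.category.__name__ for w in caught])
```

So np.nanmean() does not guarantee a NaN-free result; it only ignores NaN values when there is something else to average.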
import numpy as np
# Create sample array with NaN values
arr = np.array([10, np.nan, 20, 30, np.nan, 40])
# Method 1: Using boolean indexing
mean1 = np.mean(arr[~np.isnan(arr)])
print("Method 1 (Boolean indexing):", mean1)
Method 1 (Boolean indexing): 25.0
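It can help to look at the intermediate values in Method 1. This sketch splits the one-liner into steps; the names is_nan, keep, and valid are illustrative.

```python
import numpy as np

arr = np.array([10, np.nan, 20, 30, np.nan, 40])

is_nan = np.isnan(arr)   # True where the element is NaN
keep = ~is_nan           # ~ flips the mask: True where the element is valid
valid = arr[keep]        # boolean indexing keeps only the True positions

print(is_nan)
print(valid)
print(np.mean(valid))
```

The filtered array valid has four elements, so np.mean() divides by 4 without any special handling.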
import numpy as np
# Create sample array with NaN values
arr = np.array([10, np.nan, 20, 30, np.nan, 40])
# Method 2: Using masked arrays
masked_arr = np.ma.masked_array(arr, np.isnan(arr))
mean2 = np.mean(masked_arr)
print("Method 2 (Masked array):", mean2)
Method 2 (Masked array): 25.0
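As a shorter variant of Method 2, NumPy also provides np.ma.masked_invalid(), which builds the mask for you. Note that it masks infinities as well as NaN values, which may or may not be what you want. A minimal sketch:

```python
import numpy as np

arr = np.array([10, np.nan, 20, 30, np.nan, 40])

# masked_invalid masks every NaN (and inf) entry in one step.
masked = np.ma.masked_invalid(arr)

print(masked.count())   # number of unmasked (valid) entries
print(masked.mean())    # mean over the unmasked entries only
```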
import numpy as np
# For 2D arrays
arr2d = np.array([[10, 20, np.nan],
                  [40, np.nan, 60],
                  [70, 80, np.nan]])
# Row-wise mean using boolean indexing
row_means = np.array([np.mean(row[~np.isnan(row)]) for row in arr2d])
print("Row-wise means:", row_means)
Row-wise means: [15. 50. 75.]
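The list comprehension above loops over the rows in Python. For larger arrays, the same row-wise means can be computed without an explicit loop. The following is one possible sketch, not the only way to do it: replace NaN with 0 for the sum, then divide by the number of valid entries in each row.

```python
import numpy as np

arr2d = np.array([[10, 20, np.nan],
                  [40, np.nan, 60],
                  [70, 80, np.nan]])

valid = ~np.isnan(arr2d)                             # True where a value is present
row_sums = np.where(valid, arr2d, 0.0).sum(axis=1)   # NaN contributes 0 to each sum
row_counts = valid.sum(axis=1)                       # valid entries per row
row_means = row_sums / row_counts

print("Row-wise means:", row_means)   # Row-wise means: [15. 50. 75.]
```

A row that contains only NaN values would have a count of 0, so this sketch would divide 0 by 0 and produce NaN for that row.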