📊 Day 22 : NumPy Advanced

🎯 Enterprise Objective

To manipulate data efficiently, you must master the art of selecting exactly what you need. Today we cover Slicing matrices, filtering data with Boolean Masks, and applying mathematical transformations across different shapes using the magic of Broadcasting.

📋 Strategic Overview

#  Topic         Concept
1  Slicing       arr[0:2, :], Views
2  Masking       arr[arr > 5], Filtering
3  Broadcasting  Shape stretching (+)

1. Slicing & Indexing : Navigating Matrices

🔍 What is it?

NumPy slicing is similar to Python lists but extended to multiple dimensions. You access elements using arr[row, col]. Slices return Views (not copies!), meaning if you modify a slice, the original array changes.

# [start_row:stop_row, start_col:stop_col]
sub_matrix = matrix[0:2, 1:3]
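
To see the [rows, cols] pattern on real values, here is a minimal sketch. The 3x4 matrix and the variable names are illustrative assumptions, not part of the lesson's dataset.

```python
import numpy as np

# Hypothetical 3x4 matrix, used only for illustration:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
matrix = np.arange(12).reshape(3, 4)

element = matrix[1, 2]          # single element at row 1, column 2
sub_matrix = matrix[0:2, 1:3]   # rows 0-1 and columns 1-2
first_row = matrix[0, :]        # the whole first row
last_col = matrix[:, -1]        # the whole last column, as a 1D array

print(element)      # 6
print(sub_matrix)   # [[1 2]
                    #  [5 6]]
print(first_row)    # [0 1 2 3]
print(last_col)     # [ 3  7 11]
```

The stop index is exclusive, just like with Python lists, so 0:2 selects rows 0 and 1 only.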

💼 Why Data Analysts Care

• Image Cropping: An image is just a 3D NumPy array (Height, Width, RGB). Cropping is just slicing: img[100:200, 100:200]

• Data Sampling: Extracting every 10th row using a step slice: data[::10, :]
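
Both use cases can be sketched without loading any files. The array sizes below are assumptions chosen for the demo; a blank array stands in for a real image.

```python
import numpy as np

# Stand-in for a real image: 300 px high, 400 px wide, 3 color channels.
img = np.zeros((300, 400, 3), dtype=np.uint8)

# Crop a 100x100 region; the channel axis is not sliced, so it is kept whole.
crop = img[100:200, 100:200]
print(crop.shape)        # (100, 100, 3)

# Stand-in dataset: 100 rows, 2 columns.
data = np.arange(200).reshape(100, 2)

# Keep every 10th row (rows 0, 10, 20, ..., 90) and all columns.
sample = data[::10, :]
print(sample.shape)      # (10, 2)
print(sample[:3])        # [[ 0  1]
                         #  [20 21]
                         #  [40 41]]
```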

⚠️ The View Trap

Slicing a NumPy array does not copy data; it creates a 'View'. For example, view = arr[:2]; view[0] = 99 WILL change the original array. Use arr[:2].copy() if you need an independent copy. (Avoid naming the variable slice, since that shadows Python's built-in slice.)
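
A minimal sketch of the trap and its fix. np.shares_memory is used here only to confirm whether two arrays point at the same underlying buffer; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(5)              # [0 1 2 3 4]

view = arr[:2]                  # a view: no data is copied
view[0] = 99                    # writes through to arr
print(arr)                      # [99  1  2  3  4]
print(np.shares_memory(arr, view))          # True

independent = arr[:2].copy()    # an explicit copy owns its own data
independent[0] = -1             # arr is not affected
print(arr)                      # [99  1  2  3  4]
print(np.shares_memory(arr, independent))   # False
```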

In [ ]:

🧪 Concept Checks: Slicing

Q1. Create arr = np.arange(10). Slice the first 5 elements.

In [ ]:

Q2. Create a 4x4 matrix using np.arange(16).reshape((4,4)). Print it.

In [ ]:

Q3. From the 4x4 matrix, extract the 2x2 square in the top-right corner. Print it.

In [ ]:

Q4. Demonstrate the view trap: extract the first row, change its first element to 999, and print the original matrix to see it changed.

In [ ]:

Q5. Extract the second column from the matrix as a 1D array using matrix[:, 1].

In [ ]:

2. Boolean Masking : Filtering Data

🔍 What is it?

Boolean Masking is how we filter arrays. When you apply a condition to an array (e.g., arr > 5), it returns an array of Booleans (True/False). You can use this Boolean array inside the brackets to select only the True elements.

mask = arr > 5       # [False, True, ...]
filtered = arr[mask] # Keeps only the True values
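
Here is the same two-step pattern on concrete numbers. The sample values are an assumption made for the demo.

```python
import numpy as np

arr = np.array([3, 8, 1, 9, 5, 6])   # illustrative values

mask = arr > 5          # element-wise comparison produces a boolean array
print(mask)             # [False  True False  True False  True]
print(mask.dtype)       # bool

filtered = arr[mask]    # keeps only the positions where mask is True
print(filtered)         # [8 9 6]

# True counts as 1, so summing the mask counts the matching elements.
print(mask.sum())       # 3
```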

💼 Why Data Analysts Care

• Outlier Removal: clean_data = data[(data > -3) & (data < 3)] to remove extreme z-scores

• Conditional Assignment: arr[arr < 0] = 0 to instantly cap all negative numbers to zero
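
Both patterns in a small sketch. The z-score values and the second array are made up for illustration.

```python
import numpy as np

# Hypothetical z-scores; anything outside (-3, 3) is treated as an outlier.
data = np.array([-4.2, -1.0, 0.3, 2.9, 3.5])
clean_data = data[(data > -3) & (data < 3)]
print(clean_data)       # [-1.   0.3  2.9]

# Mask assignment modifies the array in place.
arr = np.array([4, -2, 7, -9])
arr[arr < 0] = 0
print(arr)              # [4 0 7 0]
```

Note the difference: filtering returns a new, shorter array, while mask assignment keeps the shape and overwrites the selected positions.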

🧠 Pro Tip

When combining conditions, you MUST use bitwise operators & (and) / | (or) instead of Python's and/or. You MUST also wrap conditions in parentheses: (arr > 2) & (arr < 8).
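
The parentheses matter because & binds more tightly than the comparison operators. The sketch below shows the working forms and what happens when the parentheses are dropped; the array is an illustrative assumption.

```python
import numpy as np

arr = np.arange(10)     # [0 1 2 ... 9]

# Correct: each comparison is parenthesized, then combined element-wise.
between = arr[(arr > 2) & (arr < 8)]
print(between)          # [3 4 5 6 7]

outside = arr[(arr < 2) | (arr > 7)]
print(outside)          # [0 1 8 9]

# Without parentheses, Python reads this as arr > (2 & arr) < 8,
# a chained comparison that needs a single True/False from a whole array.
raised = False
try:
    arr[arr > 2 & arr < 8]
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```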

In [ ]:

🧪 Concept Checks: Boolean Masking

Q1. Create arr = np.array([10, 50, 30, 80, 20]). Create a mask for values > 40 and print the mask.

In [ ]:

Q2. Use the mask from Q1 to extract and print the values greater than 40.

In [ ]:

Q3. Use a combined mask (arr > 20) & (arr < 60) to filter the array. Print the result.

In [ ]:

Q4. Try to use the Python and keyword instead of & for the combined mask. Catch the ValueError.

In [ ]:

Q5. Replace all values in the array that are < 30 with -1 using a mask assignment. Print the updated array.

In [ ]:

3. Broadcasting : Shape Alignment

🔍 What is it?

Broadcasting is NumPy's way of doing math between arrays of different shapes. If the shapes are compatible, NumPy 'stretches' the smaller array to match the larger one without actually making copies in memory.

Array A  Array B   Result  Works?
(3, 3)   Scalar 5  (3, 3)  Yes (Scalar stretches)
(3, 3)   (3,)      (3, 3)  Yes (Row stretches down)
(3, 3)   (4,)      Error   No (Dimensions mismatch)
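
The three rows of the table can be reproduced in a few lines. This is a sketch with made-up values; only the shapes matter.

```python
import numpy as np

a = np.ones((3, 3))

# Row 1: a scalar is stretched to every element.
plus_scalar = a + 5
print(plus_scalar.shape)    # (3, 3)

# Row 2: a (3,) array is treated as one row and repeated down the matrix.
plus_row = a + np.array([10, 20, 30])
print(plus_row)             # every row is [11. 21. 31.]

# Row 3: a (4,) array cannot line up with 3 columns.
message = ""
try:
    a + np.arange(4)
except ValueError as exc:
    message = str(exc)
    print("ValueError:", message)
```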

💼 Why Data Analysts Care

• Standardization: Subtracting the mean of each column from a large matrix: matrix - means_array

• Color Adjustments: Multiplying an RGB image matrix (1080, 1920, 3) by a brightness vector (3,)
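
A small sketch of both use cases. The matrix values, the tiny 4x6 stand-in image, and the per-channel factors are all assumptions chosen to keep the output readable.

```python
import numpy as np

# Mean-centering: means_array has shape (2,), one mean per column.
matrix = np.array([[1.0, 10.0],
                   [3.0, 20.0],
                   [5.0, 30.0]])
means_array = matrix.mean(axis=0)       # [ 3. 20.]
centered = matrix - means_array         # (3, 2) - (2,) broadcasts row by row
print(centered.mean(axis=0))            # [0. 0.]

# Color adjustment: a tiny image stands in for a (1080, 1920, 3) one.
img = np.full((4, 6, 3), 100.0)
brightness = np.array([1.0, 0.5, 2.0])  # hypothetical R, G, B factors
adjusted = img * brightness             # (4, 6, 3) * (3,) scales each channel
print(adjusted[0, 0])                   # [100.  50. 200.]
```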

🧠 Pro Tip

Broadcasting compares shapes starting from the trailing (rightmost) dimension and works leftward. Each pair of dimensions must be equal, or one of them must be 1; a dimension that is missing on the left is treated as 1. If not, you get a ValueError: operands could not be broadcast together.
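
You can test shape compatibility without building any arrays. The sketch below assumes NumPy 1.20 or newer, where np.broadcast_shapes is available; can_broadcast is a hypothetical helper written for this example.

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: True if the two shapes are broadcast-compatible."""
    try:
        np.broadcast_shapes(shape_a, shape_b)
        return True
    except ValueError:
        return False

# Trailing dimensions match (4 and 4); the missing leading one acts as 1.
print(np.broadcast_shapes((5, 4), (4,)))     # (5, 4)

# Each pair contains a 1, so both arrays stretch.
print(np.broadcast_shapes((5, 1), (1, 4)))   # (5, 4)

# Trailing dimensions are 4 and 5: neither equal nor 1.
print(can_broadcast((5, 4), (5,)))           # False
```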

In [ ]:

🧪 Concept Checks: Broadcasting

Q1. Create a 3x2 matrix of ones. Multiply it by the scalar 10. Print the result.

In [ ]:

Q2. Create a 3x3 matrix of zeros. Add a 1D array [1, 2, 3] to it. Observe how it broadcasts across rows.

In [ ]:

Q3. Reshape [1, 2, 3] into a column (3, 1). Add it to the zeros matrix. Observe how it broadcasts across columns.

In [ ]:

Q4. Try to add a 1D array of length 4 to a 3x3 matrix. Catch the ValueError and print the error message.

In [ ]:

Q5. Explain why (4, 3) broadcasts with (3,) but fails with (4,). (Hint: Right-to-left dimension matching).

In [ ]:

🛠️ Professional Practice Tasks

Theory is useless without muscle memory. Complete these tasks to solidify your understanding.

Task 1 (Matrix Borders): Create a 5x5 array of zeros. Use slicing to set the outer border (first row, last row, first col, last col) to 1. Print the result.

In [ ]:

Task 2 (Checkerboard): Create an 8x8 array of zeros. Use step slicing [::2] to create a checkerboard pattern of 1s and 0s (like a chess board).

In [ ]:

Task 3 (Outlier Capping): Create an array of 50 random numbers from a standard normal distribution (np.random.randn). Use boolean masking to cap any values > 2 to 2, and any values < -2 to -2. Print the min and max to verify.

In [ ]:

Task 4 (Column Standardization): Create a 10x3 matrix of random integers. Calculate the mean of each column (.mean(axis=0)). Subtract this mean array from the matrix using broadcasting. Each column of the new matrix should have a mean of 0 (up to floating-point rounding).

In [ ]:

Task 5 (Distance Matrix): Create a 1D array x = np.arange(5). Use broadcasting to create a 5x5 matrix where each element M[i,j] = abs(x[i] - x[j]). (Hint: reshape one x to column).

In [ ]:

💻 Pure Coding Interview Questions

Q1.

What is an array View in NumPy? How does it differ from a Copy?

In [ ]:

Q2.

How do you forcefully create a copy of a slice instead of a view?

In [ ]:

Q3.

Explain Boolean Masking. What type of array is generated as the mask?

In [ ]:

Q4.

Why does NumPy require & and | instead of and and or for boolean arrays?

In [ ]:

Q5.

Explain the broadcasting rules in NumPy. What does 'trailing dimensions' mean?

In [ ]:

Q6.

Write code to add a 1D array of length 3 to a 4x3 matrix so that each column receives its own value.

In [ ]:

Q7.

How do you add a 1D array of length 4 to a 4x3 matrix so that each row receives its own value? (Hint: np.newaxis or reshape).

In [ ]:

Q8.

What does np.where(condition, x, y) do? Write an example replacing negatives with 0.

In [ ]:

Q9.

How do you select specific arbitrary rows from a matrix using a list of indices? (Fancy Indexing).

In [ ]:

Q10.

Explain the difference between Slicing (arr[1:3]) and Fancy Indexing (arr[[1,2]]) in terms of Views vs Copies.

In [ ]:

Q11.

Write a one-liner to reverse the rows of a 2D matrix.

In [ ]:

Q12.

How do you find the unique elements and their counts in a NumPy array? (np.unique).

In [ ]:

Q13.

Write code to stack two 1D arrays horizontally and vertically (np.hstack, np.vstack).

In [ ]:

Q14.

Explain the axis parameter. What does .sum(axis=0) do on a 2D matrix?

In [ ]:

Q15.

Write a boolean mask to filter out all np.nan values from an array.

In [ ]:

Q16.

How do you concatenate two 2D matrices along the column axis?

In [ ]:

Q17.

Explain np.argmax() and np.argmin(). What do they return?

In [ ]:

Q18.

Write code to sort a 2D array by the values in its second column using np.argsort().

In [ ]:

Q19.

What is the difference between arr.flatten() and arr.ravel()? (Hint: one always returns a copy; the other returns a view when possible.)

In [ ]:

Q20.

Write a broadcasting operation that computes the outer product of two vectors [1,2,3] and [4,5,6].

In [ ]:

Q21.

How do you use np.clip()? Compare it to using boolean mask assignments.

In [ ]:

Q22.

Explain how memory layout (C-order vs Fortran-order) affects NumPy performance.

In [ ]:

Q23.

Write code to extract the diagonal elements of a matrix without using a loop.

In [ ]:

Q24.

What is the Ellipsis (...) used for in NumPy slicing?

In [ ]:

Q25.

How does NumPy handle operations between arrays of different dtypes? (Type Promotion).

In [ ]:

📊 Day 22 Executive Summary

#  Topic      Key Takeaway
1  Slices     Slices are Views! Modifying a slice alters the original data
2  Masks      Use & and | with parentheses: (arr > 1) & (arr < 5)
3  Broadcast  NumPy automatically stretches dimensions to align math operations

✅ Instructor's End-of-Day Checklist

• [ ] I can slice rows and columns from a 2D matrix.

• [ ] I can filter an array using a boolean condition.

• [ ] I understand how scalar broadcasting works.