โณ Loading Python Engine...

๐Ÿ“Š Day 11 : Modules

๐ŸŽฏ Enterprise Objective

To build large-scale applications, you must master code organization. Today we learn how to modularize code into separate files, leverage Python's powerful Standard Library, and isolate project dependencies using virtual environments. These are essential software engineering practices.

๐Ÿ“‹ Strategic Overview

#TopicConcept
1Modulesimport, code splitting
2StdLibos, json, collections
3Venvspip, isolation

1. Modules & Packages : Organizing Code

๐Ÿ” What is it?

A module is just a .py file containing Python code. A package is a directory of modules with an init.py file. Instead of writing monolithic 5000-line scripts, we split code into logical modules and import them.

# 1. Import whole module
import math
math.sqrt(16)

# 2. Import specific function
from math import sqrt
sqrt(16)

# 3. Import with alias
import pandas as pd

๐Ÿ’ผ Why Data Analysts Care

โ€ข Code Reuse: Import the same data-cleaning functions across multiple notebooks

โ€ข Namespace Management: math.pi and numpy.pi can coexist without overwriting each other

โš ๏ธ Circular Imports

If Module A imports Module B, and Module B imports Module A, Python will crash with an ImportError. Always structure code hierarchically to avoid circular dependencies.

In [ ]:

๐Ÿงช Concept Checks: Modules

Q1. Import the random module. Use random.randint(1, 10) to print a random number.

In [ ]:

Q2. Import math using an alias m. Print the value of m.pi.

In [ ]:

Q3. Import ONLY the choice function from random. Use it to pick a random color from ["Red", "Blue", "Green"].

In [ ]:

Q4. Try to import a module that doesn't exist (e.g., import fake_module). Catch the ModuleNotFoundError and print a friendly message.

In [ ]:

Q5. Explain why import pandas as pd is preferred over from pandas import *.

In [ ]:

2. The Standard Library : Batteries Included

๐Ÿ” What is it?

Python's philosophy is 'Batteries Included'. It comes with a massive standard library covering file systems, mathematics, network protocols, and data formats without needing pip install.

ModulePurposeExample Use
os / sysOperating SystemFile paths, command-line args
jsonData FormatParsing API responses
collectionsData StructuresCounter, defaultdict
datetimeTime & DatesTimestamps, durations

๐Ÿ’ผ Why Data Analysts Care

โ€ข No Dependencies: Code relies only on built-in tools, making it easy to deploy on servers

โ€ข Reliability: Standard library modules are heavily tested and rarely introduce breaking changes

๐Ÿง  Pro Tip

Before installing a third-party package for a simple task, check the standard library. For example, use pathlib for file paths instead of string manipulation.

In [ ]:

๐Ÿงช Concept Checks: Standard Lib

Q1. Import json. Parse '{"name": "Alice"}' into a dictionary using json.loads(). Print the dictionary.

In [ ]:

Q2. Import collections.defaultdict. Create a dict that defaults to 0. Increment a key "x" and print it.

In [ ]:

Q3. Import sys. Print the list of command-line arguments using sys.argv.

In [ ]:

Q4. Import math. Calculate the factorial of 5 using math.factorial(). Print it.

In [ ]:

Q5. Import time. Use time.sleep(1) to pause execution for 1 second, then print "Done".

In [ ]:

3. Virtual Environments & pip : Dependency Isolation

๐Ÿ” What is it?

Different projects need different versions of packages. A Virtual Environment (venv) is an isolated folder containing a specific Python interpreter and its own installed packages. pip is the package installer used to download libraries from PyPI (Python Package Index).

# 1. Create a virtual environment
python -m venv myenv

# 2. Activate it (Windows)
myenv\Scripts\activate

# 3. Install a package
pip install requests==2.31.0

# 4. Save dependencies
pip freeze > requirements.txt

๐Ÿ’ผ Why Data Analysts Care

โ€ข Reproducibility: Ensures your code runs on your coworker's machine exactly as it does on yours

โ€ข Conflict Prevention: Project A uses Pandas 1.0, Project B uses Pandas 2.0 without crashing each other

โš ๏ธ Global Installs

Never use pip install globally on your machine (outside a venv). It will eventually corrupt your system Python installation.

In [ ]:

๐Ÿงช Concept Checks: Pip & Venv

Q1. What command creates a new virtual environment named env?

In [ ]:

Q2. What command activates the environment on Windows? On Mac/Linux?

In [ ]:

Q3. Write the command to install exactly version 1.5.3 of the pandas library.

In [ ]:

Q4. Explain what pip freeze > requirements.txt does and why it is important for version control.

In [ ]:

Q5. Why should you exclude the virtual environment folder (e.g., venv/) from GitHub using .gitignore?

In [ ]:

๐Ÿ› ๏ธ Professional Practice Tasks

Theory is useless without muscle memory. Complete these tasks to solidify your understanding.

Task 1 (Math Operations): Create a file my_math.py with an add function. In your main cell, import it and use it. (Simulate this by writing both blocks).

In [ ]:

Task 2 (JSON Config): Import json. Create a dictionary, save it to a file config.json using json.dump(), then read it back using json.load().

In [ ]:

Task 3 (Collections Counter): Write a function that takes a long string, removes punctuation, and uses collections.Counter to find the 3 most common words.

In [ ]:

Task 4 (Datetime Math): Import datetime. Calculate how many days have passed since January 1st, 2000. Print the result.

In [ ]:

Task 5 (Requirements): Write a Python script that reads a simulated requirements.txt file and prints out only the package names, stripping out the version numbers (==X.Y.Z).

In [ ]:

๐Ÿ’ป Pure Coding Interview Questions

Q1.

Write a Python script that creates a folder mypack/ with an init.py containing version = '1.0'. Then import it and print mypack.version.

In [ ]:

Q2.

Write code that imports math two ways: import math and from math import . Show how can shadow a local variable named pow by printing pow before and after.

In [ ]:

Q3.

Write code that appends a custom directory to sys.path using sys.path.insert(0, '/custom'). Print sys.path to show the change.

In [ ]:

Q4.

Write a script that reads sys.path and prints each path entry on its own line, numbered 1 through N.

In [ ]:

Q5.

Write a Python script that reads a requirements.txt string 'pandas==2.0.0\nnumpy==1.24.0' and prints each package name and version as a formatted table.

In [ ]:

Q6.

Write code to parse 'pandas==2.0.0\nnumpy==1.24.0' and extract just the package names ['pandas', 'numpy'] into a list.

In [ ]:

Q7.

Write code that creates a file called math.py with print('Custom math!'), then shows what happens when you try import math. Explain the shadowing.

In [ ]:

Q8.

Write code using os.path.join('data', 'users', 'file.csv') and compare it with manual string concatenation 'data' + '/' + 'users'. Print both results.

In [ ]:

Q9.

Write code using collections.defaultdict(list) to group words by their first letter from ['apple', 'banana', 'avocado', 'blueberry'].

In [ ]:

Q10.

Write code using urllib.request.urlopen() to fetch the HTML of http://example.com and print the first 200 characters.

In [ ]:

Q11.

Write a module-like script with a function and an if name == 'main': block. Show what runs when imported vs executed directly.

In [ ]:

Q12.

Write the pip command to upgrade a package to the latest version. Then write Python code using subprocess.run() to execute that same command.

In [ ]:

Q13.

Write code using importlib.metadata.version('pip') to print the installed version of pip.

In [ ]:

Q14.

Write a function using datetime.strptime() to convert '2023-12-25' into a datetime object. Print the day of the week.

In [ ]:

Q15.

Write code that uses both os.path.join() and pathlib.Path() / 'subdir' / 'file.txt' to build the same path. Print both.

In [ ]:

Q16.

Write code using the json module: try to load a config file, catch FileNotFoundError, and return a default {'debug': False} dict.

In [ ]:

Q17.

Write code using importlib.metadata.version('pip') to print the installed version of pip programmatically.

In [ ]:

Q18.

Write the pip commands (as strings) to: install, uninstall, and reinstall all packages from a requirements file.

In [ ]:

Q19.

Write code that uses sys.exit(1) to terminate a script if a required environment variable is missing. Use os.environ.get().

In [ ]:

Q20.

Write code that prints the absolute path of the site-packages directory using import site; print(site.getsitepackages()).

In [ ]:

Q21.

Write a function parse_toml_deps(text) that extracts dependency names from a TOML-style string like '[dependencies]\npandas = "^2.0"\nnumpy = "^1.24"'.

In [ ]:

Q22.

Write code that imports a module, then modifies it, and uses importlib.reload() to reload the updated version. Demonstrate with print statements.

In [ ]:

Q23.

Write code that measures the time taken to import json vs pandas using time.perf_counter(). Print both durations.

In [ ]:

Q24.

Write code that checks if a package name exists on PyPI by trying to import it and catching ModuleNotFoundError.

In [ ]:

Q25.

Write a Python script using os.walk() to recursively list all .py files in the current directory. Print each file path.

In [ ]:

๐Ÿ“Š Day 11 Executive Summary

#TopicKey Takeaway
1ImportsAvoid circular and * imports
2LibraryUse built-ins before pip installing
3VenvsNever install globally. Always use requirements.txt

โœ… Instructor's End-of-Day Checklist

โ€ข [ ] I can import specific functions from a module.

โ€ข [ ] I know how to use os and collections.

โ€ข [ ] I can create and activate a virtual environment.