Advanced Python I/O

Published October 8, 2024

1. File Handling with the with Statement: Error Handling and Resource Management

The with statement provides a clean and efficient way to handle files, automatically closing them even if errors occur. This prevents resource leaks and simplifies error handling.

try:
    with open("my_file.txt", "r") as f:
        contents = f.read()
        # Process the file contents
        print(contents)
except FileNotFoundError:
    print("File not found!")
except Exception as e:
    print(f"An error occurred: {e}")

This code elegantly manages the file, ensuring it’s closed regardless of success or failure.
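The same pattern extends beyond a single file. A minimal sketch, assuming hypothetical source.txt and copy.txt files, opens two files in one with statement so both are closed automatically:

# Copy one file to another; both handles are closed even if an error occurs
with open("source.txt", "r") as src, open("copy.txt", "w") as dst:
    for line in src:
        dst.write(line)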

2. Working with Different File Modes

Python offers various file modes beyond the common “r” (read) and “w” (write). Understanding these modes is essential for flexible I/O operations:

  • "a" (append): Adds data to the end of an existing file.
  • "x" (exclusive creation): Creates a new file and fails if the file already exists.
  • "b" (binary): Used for working with binary files (images, audio, etc.).
  • "t" (text): Used for working with text files (default). This mode handles text encoding.
  • Combining modes (e.g., "r+", "w+b"): Allows both reading and writing (see the sketch after the examples below).

# Append to a file
with open("my_file.txt", "a") as f:
    f.write("\nThis line is appended.")

# Read a file in binary mode
with open("image.jpg", "rb") as f:
    image_data = f.read()
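The remaining modes follow the same pattern. Here is a short sketch, using the hypothetical file names below, that shows exclusive creation and combined read/write access:

# Exclusive creation: raises FileExistsError if the file already exists
try:
    with open("new_file.txt", "x") as f:
        f.write("Created exactly once.")
except FileExistsError:
    print("new_file.txt already exists")

# "r+" opens an existing file for both reading and writing
with open("my_file.txt", "r+") as f:
    first_line = f.readline()
    f.seek(0, 2)  # seek to the end of the file
    f.write("\nRead, then appended.")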

3. Efficient I/O with Buffers

For large files, reading and writing in buffered chunks can significantly improve performance. A buffer holds data in memory between disk operations, reducing the number of individual disk accesses.

import io

buffer_size = 4096  # Adjust as needed

with open("large_file.txt", "r") as f:
    while True:
        chunk = f.read(buffer_size)
        if not chunk:
            break
        # Process the chunk
        print(f"Processing chunk: {len(chunk)} bytes")

# Using io.BufferedReader (a BufferedIOBase subclass) for explicit control over buffering
with open("large_file.txt", "rb", buffering=0) as raw:
    buffered_file = io.BufferedReader(raw, buffer_size=buffer_size)
    chunk = buffered_file.read(buffer_size)
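Note that open() itself accepts a buffering argument, so explicit wrapping is rarely required. A minimal sketch, assuming a hypothetical output.bin:

# Small binary writes are coalesced in a 64 KB buffer before hitting disk
with open("output.bin", "wb", buffering=65536) as f:
    for _ in range(1000):
        f.write(b"x" * 100)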

4. Object Serialization and Deserialization (Pickling)

Python’s pickle module allows you to serialize Python objects (convert them into a byte stream) and deserialize them (convert them back into objects). This is extremely useful for saving and loading complex data structures.

import pickle

data = {"name": "John Doe", "age": 30, "city": "New York"}

#Serialization
with open("data.pickle", "wb") as f:
    pickle.dump(data, f)

#Deserialization
with open("data.pickle", "rb") as f:
    loaded_data = pickle.load(f)
    print(loaded_data)

Remember that pickle is not secure: never load pickled data from an untrusted source, since unpickling can execute arbitrary code.
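pickle also round-trips instances of ordinary classes and nested structures. A minimal sketch, using a hypothetical Point class and an in-memory byte string instead of a file:

import pickle

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Serialize to bytes and back without touching disk
payload = pickle.dumps(Point(3, 4), protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)
print(restored.x, restored.y)  # 3 4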

5. Working with CSV Files

The csv module provides tools for easily reading and writing CSV (Comma Separated Values) files.

import csv

data = [["Name", "Age", "City"], ["John", "30", "New York"], ["Jane", "25", "London"]]

with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)

with open("data.csv", "r") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
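When rows map to named fields, csv.DictWriter and csv.DictReader are often more readable than plain lists. A minimal sketch reusing the same data, written to a hypothetical data_dict.csv:

import csv

rows = [{"Name": "John", "Age": "30", "City": "New York"},
        {"Name": "Jane", "Age": "25", "City": "London"}]

with open("data_dict.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["Name", "Age", "City"])
    writer.writeheader()
    writer.writerows(rows)

with open("data_dict.csv", "r", newline="") as f:
    for row in csv.DictReader(f):
        print(row["Name"], row["City"])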

6. Handling Large Files with Generators

For extremely large files that don’t fit into memory, using generators is crucial. Generators yield data piece by piece, avoiding memory overload.

def read_large_file(filename, chunk_size=1024):
    with open(filename, 'r') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

for chunk in read_large_file("massive_file.txt"):
    #process chunk
    pass
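For line-oriented text, the file object is itself a lazy iterator, so a generator can filter lines without managing chunks at all. A minimal sketch, assuming the same massive_file.txt and a hypothetical "ERROR" keyword:

def matching_lines(filename, keyword):
    # The file is read lazily, one line at a time
    with open(filename, "r") as f:
        for line in f:
            if keyword in line:
                yield line.rstrip("\n")

for line in matching_lines("massive_file.txt", "ERROR"):
    print(line)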