Chapter 6: File Input/Output
In real-world applications, you rarely type all your data directly into code. Instead, you:
Read data from files (sensor logs, experimental results, configuration files)
Write results to files (analysis outputs, reports, processed data)
Save and load data between sessions
File I/O (Input/Output) is essential for:
Loading experimental data
Saving calculation results
Reading configuration parameters
Creating data analysis pipelines
Sharing data with other programs
In this chapter, we’ll cover:
Reading files
Writing files
File paths and directories
6.1 Reading Text Files
Python provides built-in functions to read text files. The basic process is:
Open the file
Read the contents
Close the file
Basic syntax:
file = open('filename.txt', 'r') # 'r' means read mode
content = file.read()
file.close()
File modes:
'r' - Read (default)
'w' - Write (overwrites existing file)
'a' - Append (adds to end of file)
'r+' - Read and write
# Basic File Reading - Method 1: Read entire file
print("=== Reading Entire File ===")
# Open file in read mode
file = open('06_temperature_data.txt', 'r')
# Read entire contents
content = file.read()
# Close the file (important!)
file.close()
content
# print(content)
# print(f"\nType: {type(content)}")
# print(f"Length: {len(content)} characters")
=== Reading Entire File ===
'Time,Temperature\n0,25.0\n1,27.5\n2,30.2\n3,32.8\n4,35.1\n'
# Basic File Reading - Method 2: Read line by line
print("=== Reading Line by Line ===")
file = open('06_temperature_data.txt', 'r')
# Read lines into a list
lines = file.readlines()
file.close()
#print(lines)
# print(f"results: {lines}\n")
# print(f"Number of lines: {len(lines)}\n")
for i, line in enumerate(lines):
    print(f"Line {i}: {repr(line)}")
=== Reading Line by Line ===
Line 0: 'Time,Temperature\n'
Line 1: '0,25.0\n'
Line 2: '1,27.5\n'
Line 3: '2,30.2\n'
Line 4: '3,32.8\n'
Line 5: '4,35.1\n'
6.1.1 Using Context Managers (Best Practice)
Context managers automatically close files, even if errors occur. This is the recommended way to work with files!
Syntax:
with open('filename.txt', 'r') as file:
    content = file.read()
# File automatically closes here
Advantages:
Automatically closes the file
Prevents resource leaks
Cleaner code
Handles errors gracefully
# Using Context Manager (Recommended!)
print("=== Reading with Context Manager ===")
# File opens and automatically closes
with open('06_temperature_data.txt', 'r') as f:
    content = f.read()
    print(content)
# File is already closed here
print("\nFile has been automatically closed")
=== Reading with Context Manager ===
Time,Temperature
0,25.0
1,27.5
2,30.2
3,32.8
4,35.1
File has been automatically closed
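The claim that a context manager closes the file even when an error occurs can be checked directly with the file object's closed attribute. The following is a minimal sketch, not part of the original notebook: the file name demo.txt and the temporary directory are assumptions chosen so the example leaves nothing behind.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.txt"  # hypothetical file name
    demo_path.write_text("Time,Temperature\n0,25.0\n")

    # Normal case: the file is open inside the block and closed after it.
    with open(demo_path, "r") as f:
        open_inside = not f.closed
    closed_after = f.closed

    # Error case: an exception raised inside the block still closes the file.
    try:
        with open(demo_path, "r") as g:
            float(g.readline())  # the header is not a number, so this raises ValueError
    except ValueError:
        closed_after_error = g.closed

print(f"Open inside block:  {open_inside}")
print(f"Closed after block: {closed_after}")
print(f"Closed after error: {closed_after_error}")
```

All three values should print as True, which is the behavior that manual open()/close() code only reproduces with an explicit try/finally.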
# Processing File Line by Line
print("=== Processing Data Line by Line ===")
with open('06_temperature_data.txt', 'r') as file:
    # Skip header line
    header = file.readline()
    print(f"Header: {header.strip()}\n")
    # .strip() removes extra whitespace/newline
    print("Data:")
    for line in file:
        # Remove whitespace and split by comma
        line = line.strip()
        if line:  # Skip empty lines
            time, temp = line.split(',')
            print(f" Time: {time} min, Temperature: {temp}°C")
=== Processing Data Line by Line ===
Header: Time,Temperature
Data:
Time: 0 min, Temperature: 25.0°C
Time: 1 min, Temperature: 27.5°C
Time: 2 min, Temperature: 30.2°C
Time: 3 min, Temperature: 32.8°C
Time: 4 min, Temperature: 35.1°C
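In the example above, time and temp are still strings, because everything read from a text file arrives as text. Before doing any arithmetic, the values need to be converted. The sketch below illustrates one way to do that; it writes a copy of the sample data to a temporary file (the name temperature_copy.txt is an assumption) so that it runs on its own.

```python
import tempfile
from pathlib import Path

# Same contents as the sample temperature file used in this section.
sample = "Time,Temperature\n0,25.0\n1,27.5\n2,30.2\n3,32.8\n4,35.1\n"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "temperature_copy.txt"  # hypothetical file name
    path.write_text(sample)

    times = []
    temps = []
    with open(path, "r") as file:
        file.readline()  # skip the header line
        for line in file:
            line = line.strip()
            if not line:
                continue  # skip empty lines
            time_str, temp_str = line.split(",")
            times.append(int(time_str))     # text -> integer
            temps.append(float(temp_str))   # text -> floating point

avg_temp = sum(temps) / len(temps)
print(f"Times: {times}")
print(f"Average temperature: {avg_temp:.2f}°C")
print(f"Maximum temperature: {max(temps):.1f}°C")
```

Once the values are numbers, the usual built-ins such as sum(), min(), and max() work as expected.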
6.2 Writing Text Files
Writing files is just as important as reading them. You can:
Save calculation results
Create data logs
Generate reports
Export data for other programs
Write modes:
'w' - Write (creates new file or overwrites existing)
'a' - Append (adds to end of existing file)
'x' - Exclusive creation (fails if file exists)
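Of the three modes listed, 'x' is the one that protects existing results from being overwritten by accident. The following sketch, with a hypothetical file name report.txt inside a temporary directory, shows the second attempt to create the same file failing with FileExistsError while the first version stays intact.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "report.txt"  # hypothetical file name

    # First write: the file does not exist yet, so 'x' creates it.
    with open(report_path, "x") as file:
        file.write("first version\n")

    # Second write: the file now exists, so 'x' refuses to open it.
    try:
        with open(report_path, "x") as file:
            file.write("second version\n")
        overwritten = True
    except FileExistsError:
        overwritten = False

    final_text = report_path.read_text()

print(f"Overwritten: {overwritten}")
print(f"File still contains: {final_text!r}")
```

With 'w' in place of 'x', the second block would have silently replaced the first version.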
# Writing to a File
print("=== Writing Data to File ===")
# Data to write
reactor_data = [
("R-101", 85.0, 2.5),
("R-102", 92.0, 2.8),
("R-103", 78.5, 2.3),
]
# Write to file
with open('06_reactor_report.txt', 'w') as file:
    # Write header
    file.write("Reactor Status Report\n")
    file.write("=" * 40 + "\n\n")
    # Write data
    for reactor_id, temp, pressure in reactor_data:
        line = f"{reactor_id}: Temp={temp}°C, Pressure={pressure} bar\n"
        file.write(line)
print("Report written to 06_reactor_report.txt")
# Read it back to verify
print("\nFile contents:")
with open('06_reactor_report.txt', 'r') as file:
    print(file.read())
=== Writing Data to File ===
Report written to 06_reactor_report.txt
File contents:
Reactor Status Report
========================================
R-101: Temp=85.0°C, Pressure=2.5 bar
R-102: Temp=92.0°C, Pressure=2.8 bar
R-103: Temp=78.5°C, Pressure=2.3 bar
# Appending to a File
print("=== Appending to Existing File ===")
# Add more data
new_data = ("R-104", 88.5, 2.6)
with open('06_reactor_report.txt', 'a') as file:
    reactor_id, temp, pressure = new_data
    line = f"{reactor_id}: Temp={temp}°C, Pressure={pressure} bar\n"
    file.write(line)
print("Data appended to 06_reactor_report.txt")
# Read updated file
print("\nUpdated file contents:")
with open('06_reactor_report.txt', 'r') as file:
    print(file.read())
=== Appending to Existing File ===
Data appended to 06_reactor_report.txt
Updated file contents:
Reactor Status Report
========================================
R-101: Temp=85.0°C, Pressure=2.5 bar
R-102: Temp=92.0°C, Pressure=2.8 bar
R-103: Temp=78.5°C, Pressure=2.3 bar
R-104: Temp=88.5°C, Pressure=2.6 bar
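The examples above call .write() once per line. When the lines are already collected in a list, .writelines() writes them in one call. Note that it does not add newline characters, so each string must already end with "\n". A minimal sketch, using a shortened version of the reactor data and a hypothetical file name readings.txt:

```python
import tempfile
from pathlib import Path

# Shortened reactor data: (reactor id, temperature in °C)
readings = [("R-101", 85.0), ("R-102", 92.0), ("R-103", 78.5)]

# Build every line first; writelines() will not append "\n" for us.
lines = [f"{reactor_id},{temp}\n" for reactor_id, temp in readings]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "readings.txt"  # hypothetical file name
    with open(path, "w") as file:
        file.writelines(lines)

    with open(path, "r") as file:
        written = file.read()

print(written)
```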
6.3 (Optional) Working with CSV Files
CSV (Comma-Separated Values) files are one of the most common data formats in science and engineering.
Why CSV?
Simple, universal format
Works with Excel, Google Sheets, MATLAB, etc.
Human-readable
Easy to parse
Python has a built-in csv module for working with CSV files.
In addition, other Python packages can also be used to read and write CSV files, depending on your needs:
pandas → best for data analysis, tables, and large datasets
numpy → useful for numerical data stored in CSV format
openpyxl/xlsxwriter → when working with Excel files that include CSV-like data
For simple tasks, the built-in csv module is often sufficient.
For more complex data processing and analysis, pandas is commonly preferred.
import csv
# Writing CSV Files
print("=== Writing CSV File ===")
# Sample experimental data
experimental_data = [
["Time (min)", "Temperature (C)", "Pressure (bar)", "Flow Rate (L/min)"],
[0, 25.0, 1.0, 50.0],
[5, 45.2, 1.5, 55.3],
[10, 65.8, 2.0, 60.1],
[15, 85.3, 2.5, 65.7],
[20, 95.1, 2.8, 70.2],
]
# Write to CSV
with open('06_experiment_data.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(experimental_data)
print("Data written to 06_experiment_data.csv")
# Display what we wrote
print("\nCSV contents:")
with open('06_experiment_data.csv', 'r') as file:
    print(file.read())
=== Writing CSV File ===
Data written to 06_experiment_data.csv
CSV contents:
Time (min),Temperature (C),Pressure (bar),Flow Rate (L/min)
0,25.0,1.0,50.0
5,45.2,1.5,55.3
10,65.8,2.0,60.1
15,85.3,2.5,65.7
20,95.1,2.8,70.2
# Reading CSV Files
print("=== Reading CSV File ===")
with open('06_experiment_data.csv', 'r') as file:
    reader = csv.reader(file)
    # Read header
    header = next(reader)
    print("Header:", header)
    print()
    # Read data rows
    print("Data:")
    for row in reader:
        time, temp, pressure, flow = row
        print(f" t={time} min: T={temp}°C, P={pressure} bar, F={flow} L/min")
=== Reading CSV File ===
Header: ['Time (min)', 'Temperature (C)', 'Pressure (bar)', 'Flow Rate (L/min)']
Data:
t=0 min: T=25.0°C, P=1.0 bar, F=50.0 L/min
t=5 min: T=45.2°C, P=1.5 bar, F=55.3 L/min
t=10 min: T=65.8°C, P=2.0 bar, F=60.1 L/min
t=15 min: T=85.3°C, P=2.5 bar, F=65.7 L/min
t=20 min: T=95.1°C, P=2.8 bar, F=70.2 L/min
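The csv module returns every field as a string, so numeric columns must be converted before they are used in calculations. The sketch below illustrates this with csv.DictReader, which looks up values by column name instead of position. It uses a reduced two-column version of the experiment data and a hypothetical file name experiment_copy.csv so that it runs on its own.

```python
import csv
import tempfile
from pathlib import Path

# Reduced version of the experimental data from this section.
rows = [
    ["Time (min)", "Temperature (C)"],
    [0, 25.0],
    [5, 45.2],
    [10, 65.8],
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "experiment_copy.csv"  # hypothetical file name
    with open(path, "w", newline="") as file:
        csv.writer(file).writerows(rows)

    temps = []
    raw_first = None
    with open(path, "r", newline="") as file:
        reader = csv.DictReader(file)  # uses the first row as column names
        for row in reader:
            if raw_first is None:
                raw_first = row["Temperature (C)"]  # still a string here
            temps.append(float(row["Temperature (C)"]))

print(f"First raw value: {raw_first!r} ({type(raw_first).__name__})")
print(f"Converted values: {temps}")
print(f"Average: {sum(temps) / len(temps):.1f} C")
```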
# read csv with pandas
import pandas as pd
df = pd.read_csv('06_experiment_data.csv')
print("\nData read with pandas:")
print(df)
Data read with pandas:
   Time (min)  Temperature (C)  Pressure (bar)  Flow Rate (L/min)
0           0             25.0             1.0               50.0
1           5             45.2             1.5               55.3
2          10             65.8             2.0               60.1
3          15             85.3             2.5               65.7
4          20             95.1             2.8               70.2
6.4 File Paths and Directories
Understanding file paths is crucial for working with files in different locations.
Path types:
Absolute path: Full path from root directory
/Users/username/data/experiment.csv (Mac/Linux)
C:\Users\username\data\experiment.csv (Windows)
Relative path: Path relative to current directory
data/experiment.csv
../data/experiment.csv (.. means parent directory)
Assume your current working directory is:
project/
├── main.py
├── data/
│   ├── experiment.csv
│   └── results.csv
└── output/
data/experiment.csv
→ accesses the file inside the data directory
output/
→ refers to the output folder in the current directory
../project/data/experiment.csv
→ moves up one directory, then navigates into project/data
/Users/username/project/data/experiment.csv
→ absolute path that works regardless of the current directory
Python’s pathlib module provides a modern, cross-platform way to work with paths.
from pathlib import Path
import os
print("=== Working with Paths ===")
# Get current working directory
current_dir = Path.cwd()
print(f"Current directory: {current_dir}")
=== Working with Paths ===
Current directory: /Users/hoon/CHME212/cheme_comp_book/docs/chapter06
# Create a Path object
data_file = Path('batch_data')  # a directory in this run; try '06_temperature_data.txt' to inspect a file
print(f"\nFile path: {data_file}")
print(f"Absolute path: {data_file.absolute()}")
print(f"File exists: {data_file.exists()}")
print(f"Is file: {data_file.is_file()}")
File path: batch_data
Absolute path: /Users/hoon/CHME212/cheme_comp_book/docs/chapter06/batch_data
File exists: True
Is file: False
# Get file information
if data_file.exists():
    print(f"File name: {data_file.name}")
    print(f"File stem (no extension): {data_file.stem}")
    print(f"File extension: {data_file.suffix}")
    print(f"File size: {data_file.stat().st_size} bytes")
File name: batch_data
File stem (no extension): batch_data
File extension:
File size: 192 bytes
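The difference between relative and absolute paths described earlier in this section can also be inspected in code. The sketch below uses PurePosixPath, which only manipulates path text and never touches the disk, so it gives the same result on any operating system. The directory /Users/username/project is the illustrative location from the text, not a real folder.

```python
from pathlib import PurePosixPath

# Illustrative locations taken from the path examples in this section.
project = PurePosixPath("/Users/username/project")
relative = PurePosixPath("data/experiment.csv")

# The / operator joins path pieces with the correct separator.
full = project / relative

print(f"Relative is absolute? {relative.is_absolute()}")
print(f"Joined is absolute?   {full.is_absolute()}")
print(f"Joined path: {full}")
print(f"Parent:      {full.parent}")
print(f"Name:        {full.name}")
print(f"Stem:        {full.stem}")
print(f"Suffix:      {full.suffix}")
print(f"Parts:       {full.parts}")
```

For real files, Path behaves the same way and additionally offers disk operations such as .exists() and .stat().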
# Creating Directories and Organizing Files
print("=== Creating Directory Structure ===")
# Create a data directory
data_dir = Path('project_data')
data_dir.mkdir(exist_ok=True) # exist_ok=True won't error if exists
print(f"Created directory: {data_dir}")
=== Creating Directory Structure ===
Created directory: project_data
# Create subdirectories
raw_dir = data_dir / 'raw'
processed_dir = data_dir / 'processed'
raw_dir.mkdir(exist_ok=True)
processed_dir.mkdir(exist_ok=True)
print(f"Created: {raw_dir}")
print(f"Created: {processed_dir}")
Created: project_data/raw
Created: project_data/processed
# Write a file in the subdirectory
output_file = processed_dir / 'results.txt'
output_file.write_text("Analysis complete: All tests passed\n")
print(f"\nWrote file: {output_file}")
print(f"Contents: {output_file.read_text()}")
Wrote file: project_data/processed/results.txt
Contents: Analysis complete: All tests passed
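The steps above create the parent directory first and then each subdirectory. As an alternative sketch, mkdir() also accepts parents=True, which creates any missing intermediate directories in a single call. The nested folder name 2024 and the temporary directory are assumptions made so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Three levels deep; none of them exist yet.
    nested = Path(tmp) / "project_data" / "processed" / "2024"

    # parents=True creates project_data/ and processed/ along the way.
    nested.mkdir(parents=True, exist_ok=True)

    # Calling it again is harmless because of exist_ok=True.
    nested.mkdir(parents=True, exist_ok=True)
    is_dir = nested.is_dir()

    # Write a file into the new directory and list the directory contents.
    output_file = nested / "results.txt"
    output_file.write_text("Analysis complete\n")
    found = sorted(p.name for p in nested.iterdir())

print(f"Directory created: {is_dir}")
print(f"Contents: {found}")
```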
6.5 (Optional) Error Handling with Files
File operations can fail for many reasons:
File doesn’t exist
No permission to read/write
Disk full
File is locked by another program
Good practice: Always handle potential errors!
# Example 1: File doesn't exist
print("\n1. Trying to read non-existent file:")
try:
    with open('nonexistent_file.txt', 'r') as file:
        content = file.read()
except FileNotFoundError:
    print(" Error: File not found!")
    print(" Creating a default file instead...")
    with open('nonexistent_file.txt', 'w') as file:
        file.write("Default content\n")
    print(" Default file created.")
1. Trying to read non-existent file:
# Example 2: Check if file exists before reading
print("\n2. Checking file existence first:")
filename = 'data_file.txt'
if Path(filename).exists():
    with open(filename, 'r') as file:
        content = file.read()
    print(f" File read successfully: {len(content)} characters")
else:
    print(f" File '{filename}' does not exist")
2. Checking file existence first:
File 'data_file.txt' does not exist
# print("\n3. Comprehensive error handling:")
# def safe_read_file(filename):
#     """Safely read a file with error handling"""
#     try:
#         with open(filename, 'r') as file:
#             return file.read()
#     except FileNotFoundError:
#         print(f" Error: '{filename}' not found")
#         return None
#     except PermissionError:
#         print(f" Error: No permission to read '{filename}'")
#         return None
#     except Exception as e:
#         print(f" Unexpected error: {e}")
#         return None
# content = safe_read_file('experiment_data.csv')
# if content:
#     print(f" Successfully read {len(content)} characters")
6.6 Practical Applications
Let’s put everything together with realistic examples.
# Comprehensive Example 1: Processing Sensor Log Files
print("=" * 60)
print("SENSOR DATA ANALYSIS PIPELINE")
print("=" * 60)
# Step 1: Create sample sensor log
print("\n[1/4] Creating sample sensor log...")
sensor_log = """timestamp,sensor_id,temperature,pressure,status
2024-01-15 08:00:00,S001,25.3,1.01,OK
2024-01-15 08:05:00,S001,28.7,1.05,OK
2024-01-15 08:10:00,S001,95.2,2.85,WARNING
2024-01-15 08:15:00,S001,105.3,3.15,CRITICAL
2024-01-15 08:20:00,S001,98.1,2.95,WARNING
2024-01-15 08:25:00,S001,85.4,2.50,OK
"""
with open('sensor_log.csv', 'w') as f:
    f.write(sensor_log)
print(" sensor_log.csv created")
# Step 2: Read and analyze data
print("\n[2/4] Analyzing sensor data...")
warnings = []
critical = []
temperatures = []
with open('sensor_log.csv', 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        temp = float(row['temperature'])
        temperatures.append(temp)
        if row['status'] == 'WARNING':
            warnings.append(row)
        elif row['status'] == 'CRITICAL':
            critical.append(row)
print(f" Total readings: {len(temperatures)}")
print(f" Warnings: {len(warnings)}")
print(f" Critical alerts: {len(critical)}")
print(f" Avg temperature: {sum(temperatures)/len(temperatures):.1f}°C")
print(f" Max temperature: {max(temperatures):.1f}°C")
# Step 3: Generate report
print("\n[3/4] Generating analysis report...")
with open('sensor_analysis_report.txt', 'w') as file:
    file.write("SENSOR ANALYSIS REPORT\n")
    file.write("=" * 50 + "\n\n")
    file.write(f"Total readings: {len(temperatures)}\n")
    file.write(f"Average temperature: {sum(temperatures)/len(temperatures):.1f}°C\n")
    file.write(f"Maximum temperature: {max(temperatures):.1f}°C\n")
    file.write(f"Minimum temperature: {min(temperatures):.1f}°C\n\n")
    file.write(f"Warnings: {len(warnings)}\n")
    file.write(f"Critical alerts: {len(critical)}\n\n")
    if critical:
        file.write("CRITICAL EVENTS:\n")
        for event in critical:
            file.write(f" {event['timestamp']}: "
                       f"T={event['temperature']}°C, "
                       f"P={event['pressure']} bar\n")
print(" Report saved to sensor_analysis_report.txt")
# Step 4: Display the report
print("\n[4/4] Report contents:")
print("=" * 60)
with open('sensor_analysis_report.txt', 'r') as file:
    print(file.read())
============================================================
SENSOR DATA ANALYSIS PIPELINE
============================================================
[1/4] Creating sample sensor log...
sensor_log.csv created
[2/4] Analyzing sensor data...
Total readings: 6
Warnings: 2
Critical alerts: 1
Avg temperature: 73.0°C
Max temperature: 105.3°C
[3/4] Generating analysis report...
Report saved to sensor_analysis_report.txt
[4/4] Report contents:
============================================================
SENSOR ANALYSIS REPORT
==================================================
Total readings: 6
Average temperature: 73.0°C
Maximum temperature: 105.3°C
Minimum temperature: 25.3°C
Warnings: 2
Critical alerts: 1
CRITICAL EVENTS:
2024-01-15 08:15:00: T=105.3°C, P=3.15 bar
# Comprehensive Example 2: Batch Processing Multiple Files
print("=" * 60)
print("BATCH DATA QUALITY CONTROL SYSTEM")
print("=" * 60)
# Step 1: Create sample batch data files
print("\n[1/3] Creating sample batch data...")
batch_dir = Path('batch_data')
batch_dir.mkdir(exist_ok=True)
# Create multiple batch files
batches = {
'batch_001.csv': [["purity", "yield"], [0.965, 0.88], [0.962, 0.89]],
'batch_002.csv': [["purity", "yield"], [0.948, 0.91], [0.945, 0.90]],
'batch_003.csv': [["purity", "yield"], [0.978, 0.85], [0.975, 0.86]],
}
for filename, data in batches.items():
    filepath = batch_dir / filename
    with open(filepath, 'w', newline='') as file:
        writer = csv.writer(file)
        writer.writerows(data)
    print(f" Created: {filename}")
# Step 2: Process all batch files
print("\n[2/3] Processing all batches...")
min_purity = 0.95
min_yield = 0.85
results = []
for batch_file in sorted(batch_dir.glob('batch_*.csv')):
    print(f"\n Processing {batch_file.name}...")
    with open(batch_file, 'r') as file:
        reader = csv.DictReader(file)
        batch_purities = []
        batch_yields = []
        for row in reader:
            batch_purities.append(float(row['purity']))
            batch_yields.append(float(row['yield']))
    avg_purity = sum(batch_purities) / len(batch_purities)
    avg_yield = sum(batch_yields) / len(batch_yields)
    # Quality control check
    if avg_purity >= min_purity and avg_yield >= min_yield:
        status = "PASS"
    else:
        status = "FAIL"
    print(f" Avg purity: {avg_purity:.1%}")
    print(f" Avg yield: {avg_yield:.1%}")
    print(f" Status: {status}")
    results.append({
        'batch': batch_file.stem,
        'purity': avg_purity,
        'yield': avg_yield,
        'status': status
    })
# Step 3: Write summary report
print("\n[3/3] Generating summary report...")
summary_file = batch_dir / 'quality_control_summary.csv'
fieldnames = ['batch', 'purity', 'yield', 'status']
with open(summary_file, 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(results)
print(f" Summary saved to {summary_file}")
# Display summary
print("\nQUALITY CONTROL SUMMARY:")
print("=" * 60)
passed = sum(1 for r in results if r['status'] == 'PASS')
failed = len(results) - passed
print(f"Total batches: {len(results)}")
print(f"Passed: {passed}")
print(f"Failed: {failed}")
print(f"\nPass rate: {passed/len(results):.1%}")
============================================================
BATCH DATA QUALITY CONTROL SYSTEM
============================================================
[1/3] Creating sample batch data...
Created: batch_001.csv
Created: batch_002.csv
Created: batch_003.csv
[2/3] Processing all batches...
Processing batch_001.csv...
Avg purity: 96.4%
Avg yield: 88.5%
Status: PASS
Processing batch_002.csv...
Avg purity: 94.6%
Avg yield: 90.5%
Status: FAIL
Processing batch_003.csv...
Avg purity: 97.6%
Avg yield: 85.5%
Status: PASS
[3/3] Generating summary report...
Summary saved to batch_data/quality_control_summary.csv
QUALITY CONTROL SUMMARY:
============================================================
Total batches: 3
Passed: 2
Failed: 1
Pass rate: 66.7%
Summary
In this chapter, you learned how to work with files in Python:
Reading Files
open() function: Basic file operations
Context managers: with statement (best practice)
Methods: .read(), .readline(), .readlines()
Iterate line by line: Memory-efficient for large files
Writing Files
Write mode: 'w' (overwrites)
Append mode: 'a' (adds to end)
Methods: .write(), .writelines()
CSV Files
csv.reader() and csv.writer(): Basic CSV operations
csv.DictReader() and csv.DictWriter(): Dictionary-based (more readable)
Common format for data exchange
File Paths
pathlib.Path: Modern path handling
Absolute vs relative paths
Directory operations: Create, check existence
Best Practices
Always use context managers (with statement)
Handle errors (file not found, permissions)
Close files (automatic with context managers)
Check file existence before reading
Use meaningful file names
Organize data in directories
Quick Reference
# Reading a file
with open('data.txt', 'r') as file:
    content = file.read()
# Writing a file
with open('output.txt', 'w') as file:
    file.write("Results\n")
# Reading CSV
import csv
with open('data.csv', 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['column_name'])
# Writing CSV
with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['name', 'value'])
    writer.writeheader()
    writer.writerow({'name': 'test', 'value': 123})
# Working with paths
from pathlib import Path
data_file = Path('data') / 'experiment.csv'
if data_file.exists():
    content = data_file.read_text()
File I/O is fundamental for real-world data analysis and scientific computing!