Reading Files Line by Line in Bash
Process files safely — one line at a time. The canonical while-read pattern, mapfile, and how to handle edge cases like binary files and missing newlines.
Processing files — config files, logs, CSVs, lists — is one of Bash's most common tasks. Doing it correctly means handling spaces in lines, lines without trailing newlines, and avoiding the performance cost of launching subshells in a tight loop.
The Canonical Pattern
# The correct way to read a file line by line
while IFS= read -r line; do
echo "Line: $line"
done < input.txt
# Breaking down the idiom:
# IFS= — prevent leading/trailing whitespace stripping
# read -r — don't interpret backslash escapes
# < input.txt — file is redirected to while's stdin (not cat | while!)
Don't use `cat file | while read`
`cat file | while read line; do ...; done` runs the while loop in a subshell, because each command in a pipeline runs in its own subshell. Any variables you set inside the loop are lost when it ends.
Instead, redirect with `< file` or use process substitution `< <(command)` — these keep the loop in the current shell.
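A quick way to see the difference for yourself (a throwaway demo; assumes /tmp is writable):

```shell
#!/usr/bin/env bash
printf 'a\nb\nc\n' > /tmp/demo.txt

count=0
cat /tmp/demo.txt | while IFS= read -r line; do
  count=$((count + 1))        # runs in a subshell — lost when the pipe ends
done
echo "after pipe:     $count" # still 0

count=0
while IFS= read -r line; do
  count=$((count + 1))        # runs in the current shell
done < /tmp/demo.txt
echo "after redirect: $count" # 3
```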
Parsing Delimited Fields
# Parse CSV-ish data (simple, no quoted commas)
while IFS=, read -r name age city; do
echo "Name: $name | Age: $age | City: $city"
done < people.csv
# Parse /etc/passwd (colon-delimited)
while IFS=: read -r user _ uid gid gecos home shell; do
echo "$user → $shell"
done < /etc/passwd
# Parse key=value config file
while IFS='=' read -r key value; do
  # Skip comments and blank lines
  [[ "$key" =~ ^[[:space:]]*# ]] && continue
  [[ -z "$key" ]] && continue
  # Strip spaces (simple approach: removes ALL spaces, not just leading/trailing)
  key="${key// /}"
  value="${value// /}"
  echo "Config: $key = $value"
done < app.conf
mapfile — Read All Lines into Array
# Read entire file into array (one element per line)
mapfile -t lines < /etc/hosts # -t trims trailing newlines
echo "Total lines: ${#lines[@]}"
# Access by index
echo "Line 0: ${lines[0]}"
# Process
for line in "${lines[@]}"; do
[[ "$line" == "#"* ]] && continue # skip comments
echo "$line"
done
# From command output
mapfile -t pids < <(pgrep python3)
echo "Python PIDs: ${pids[*]}"Performance — Avoid Subshells in Loops
# Slow — spawns a subshell for each line's awk
while IFS= read -r line; do
  field=$(echo "$line" | awk '{print $1}') # ← subprocess per line
  echo "$field"
done < bigfile.txt
# Fast — use read with IFS to split fields directly
while read -r field rest; do # first word → $field, rest → $rest
echo "$field"
done < bigfile.txt
# Even better — process the entire file with one awk/sed call
awk '{print $1}' bigfile.txt
Edge Cases
# Handle file with no trailing newline (last line still read)
while IFS= read -r line || [[ -n "$line" ]]; do
# || [[ -n "$line" ]] catches the last line if file lacks trailing newline
echo "$line"
done < file_without_newline.txt
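You can reproduce the missing-newline case with printf (a throwaway demo; the plain loop drops the final line, the guarded loop keeps it):

```shell
#!/usr/bin/env bash
# A file whose last line has no trailing newline
printf 'first\nlast' > /tmp/no_newline.txt

plain=0
while IFS= read -r line; do
  plain=$((plain + 1))          # read returns nonzero on the partial last line
done < /tmp/no_newline.txt

safe=0
while IFS= read -r line || [[ -n "$line" ]]; do
  safe=$((safe + 1))            # the || guard still processes the partial line
done < /tmp/no_newline.txt

echo "plain=$plain safe=$safe"  # plain=1 safe=2
```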
# Skip empty lines
while IFS= read -r line; do
[[ -z "$line" ]] && continue
echo "$line"
done < file.txt
# Limit to first N lines
# Limit to first N lines (process substitution keeps the loop in this shell)
while IFS= read -r line; do
  process "$line"
done < <(head -n 100 bigfile.txt)
Why does `cat file | while IFS= read -r line; do x=1; done; echo $x` print nothing (empty)?
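A related aside: Bash 4.2+ has a `lastpipe` option that runs the last command of a pipeline in the current shell (it only takes effect when job control is off, as in scripts) — a small sketch:

```shell
#!/usr/bin/env bash
shopt -s lastpipe   # Bash 4.2+; effective only with job control off (scripts)
x=""
printf 'one\ntwo\n' | while IFS= read -r line; do
  x=$line           # with lastpipe, this runs in the current shell
done
echo "x=$x"         # x=two — the assignment survived the pipeline
```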
Write a script count-errors.sh that:
- Takes a log file path as argument
- Reads it line by line
- Counts lines containing the word "ERROR" (case-insensitive)
- Also counts lines containing "WARN"
- Prints a summary: "Errors: 5, Warnings: 2"
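One possible solution, as a sketch (wrapped in a function so it is easy to test; `nocasematch` makes `[[ == ]]` globs case-insensitive, avoiding a `grep` subprocess per line):

```shell
#!/usr/bin/env bash
# count-errors.sh — count ERROR and WARN lines (case-insensitive) in a log.
# Usage: ./count-errors.sh /path/to/app.log
count_errors() {
  local file=$1 errors=0 warnings=0 line
  shopt -s nocasematch   # case-insensitive [[ == ]] pattern matching
  while IFS= read -r line || [[ -n "$line" ]]; do
    [[ "$line" == *ERROR* ]] && errors=$((errors + 1))
    [[ "$line" == *WARN* ]]  && warnings=$((warnings + 1))
  done < "$file"
  shopt -u nocasematch
  echo "Errors: $errors, Warnings: $warnings"
}

# Only run when given a real argument (so the file can also be sourced)
[[ -n "${1:-}" ]] && count_errors "$1"
```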