Convert a TXT File to CSV Format Using Python
Working with data in different file formats is a common task for programmers and data analysts. One such scenario is converting a TXT file to CSV (Comma-Separated Values) format, which represents data as rows and columns. Python’s standard library includes the modules needed for file handling and CSV output, making this conversion straightforward.
Understanding the Problem
Let’s assume we have a TXT file named “log.txt” containing data in the following format:
title1,intro1
title2,intro2
title3,intro3
...
Our goal is to convert this TXT file into a CSV file named “log.csv” with the following structure:
title,intro
title1,intro1
title2,intro2
title3,intro3
...
Approach and Solution
To solve this problem, we’ll use Python’s built-in csv module, which provides functionality for reading and writing CSV files. Here’s the step-by-step approach:
- Open the input TXT file in read mode.
- Strip any leading or trailing whitespace from each line.
- Split each line by the comma (,) delimiter to separate the values.
- Open the output CSV file in write mode.
- Create a csv.writer object to write data to the CSV file.
- Write the header row (column names) to the CSV file.
- Write each line of data to the CSV file.
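The steps above can also be tried in memory before touching any files. The following is only a minimal sketch: the sample string is a hypothetical stand-in for log.txt, and an io.StringIO buffer stands in for log.csv.

```python
import csv
import io

# Hypothetical sample input standing in for log.txt (note the stray
# whitespace and the blank line, which the steps are meant to handle).
sample = "title1,intro1\n  title2,intro2  \n\ntitle3,intro3\n"

# Read each line, strip whitespace, skip blank lines, split on commas.
rows = [line.strip().split(",") for line in io.StringIO(sample) if line.strip()]

# Write the header row and then the data rows with csv.writer.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["title", "intro"])
writer.writerows(rows)

print(rows)
# [['title1', 'intro1'], ['title2', 'intro2'], ['title3', 'intro3']]
```

Calling buffer.getvalue() returns the CSV text that would otherwise be written to disk.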
Python Code and Explanation
Here’s the Python code that implements the solution:
import csv

with open('log.txt', 'r') as in_file:
    # Both generators are lazy: nothing is read until writerows() runs.
    stripped = (line.strip() for line in in_file)
    lines = (line.split(",") for line in stripped if line)
    # Nested so that in_file is still open while the rows are written.
    with open('log.csv', 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('title', 'intro'))
        writer.writerows(lines)
Let’s break down the code and understand each step:
- Import the csv module:
import csv
This line imports the csv module, which provides functionality for reading and writing CSV files.
- Open the input TXT file in read mode:
with open('log.txt', 'r') as in_file:
This line opens the “log.txt” file in read mode using the open() function and the context manager (with statement). The file object is assigned to the variable in_file.
- Strip whitespace and split lines:
stripped = (line.strip() for line in in_file)
lines = (line.split(",") for line in stripped if line)
These lines use generator expressions to perform two operations:
  - stripped is a generator expression that strips leading and trailing whitespace from each line in the input file using the str.strip() method.
  - lines is another generator expression that splits each stripped line by the comma (,) delimiter using the str.split() method. The if line condition ensures that empty lines are skipped.
- Open the output CSV file in write mode:
with open('log.csv', 'w', newline='') as out_file:
This line opens the “log.csv” file in write mode using the open() function and the context manager (with statement). The newline='' argument turns off Python’s newline translation so that the csv module controls the line endings itself; without it, extra blank lines can appear between rows on Windows. The file object is assigned to the variable out_file.
- Create a csv.writer object:
writer = csv.writer(out_file)
This line creates a csv.writer object, which is responsible for writing data to the CSV file. The out_file object is passed as an argument to the csv.writer constructor.
- Write the header row:
writer.writerow(('title', 'intro'))
This line writes the header row to the CSV file using the writerow() method of the csv.writer object. The header row contains the column names “title” and “intro”.
- Write data rows:
writer.writerows(lines)
This line writes all the data rows to the CSV file using the writerows() method of the csv.writer object. The lines generator expression, created earlier, is passed as an argument and provides the data rows to be written. Because generator expressions are lazy, the input file is only read at this point, so in_file must still be open when writerows() runs.
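One limitation of the walkthrough is worth noting: str.split(",") treats every comma as a separator, so a quoted field such as "Hello, world" would be split into two columns. If the input file follows CSV quoting rules (an assumption, since the sample data contains no quoted fields), csv.reader can parse the lines instead. The sketch below uses a hypothetical helper name, convert_with_reader, and strips whitespace from each field rather than from each whole line, which differs slightly from the script above.

```python
import csv


def convert_with_reader(src_path, dst_path, header=("title", "intro")):
    """Hypothetical helper: copy rows from src_path to dst_path, adding a header."""
    # newline="" lets the csv module handle line endings on both files.
    with open(src_path, "r", newline="") as in_file, \
            open(dst_path, "w", newline="") as out_file:
        reader = csv.reader(in_file)
        writer = csv.writer(out_file)
        writer.writerow(header)
        for row in reader:
            # Strip whitespace around each field (not around the whole line).
            cleaned = [field.strip() for field in row]
            # Skip blank lines and rows whose fields are all empty.
            if any(cleaned):
                writer.writerow(cleaned)
```

Calling convert_with_reader('log.txt', 'log.csv') would then play the same role as the script, with quoted commas kept inside a single column.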
Example Usage
Let’s assume we have a “log.txt” file with the following content:
title1,intro1
title2,intro2
title3,intro3
After running the Python script, a new “log.csv” file will be created with the following content:
title,intro
title1,intro1
title2,intro2
title3,intro3
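To reuse the conversion for other files, the same logic can be wrapped in a function. This is a sketch under stated assumptions rather than part of the original script: the function name txt_to_csv, its parameters, and the returned row count are all hypothetical, and the demo writes its sample input to a temporary directory so that it does not overwrite any existing log.txt.

```python
import csv
import tempfile
from pathlib import Path


def txt_to_csv(txt_path, csv_path, header=("title", "intro")):
    """Hypothetical wrapper around the script above; returns the number of data rows."""
    count = 0
    with open(txt_path, "r") as in_file, \
            open(csv_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(header)
        for line in in_file:
            line = line.strip()
            if line:  # skip empty lines, as in the original script
                writer.writerow(line.split(","))
                count += 1
    return count


if __name__ == "__main__":
    # Self-contained demo using the sample data from this section.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "log.txt"
        dst = Path(tmp) / "log.csv"
        src.write_text("title1,intro1\ntitle2,intro2\ntitle3,intro3\n")
        print(txt_to_csv(src, dst), "rows written")
        print(dst.read_text())
```

Returning the row count is a design choice that makes it easy to check that the conversion processed as many lines as expected.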