Converting a TXT file to CSV in Python

Convert a TXT File to CSV Format Using Python

Working with data in different file formats is a common task for programmers and data analysts. One such scenario is converting a plain TXT file to CSV (Comma-Separated Values), a tabular format that spreadsheets and data tools can read directly. Python's standard library includes the csv module for reading and writing CSV files, which makes this conversion straightforward.


Understanding the Problem

Let’s assume we have a TXT file named “log.txt” containing data in the following format:

title1,intro1
title2,intro2
title3,intro3
...

Our goal is to convert this TXT file into a CSV file named “log.csv” with the following structure:

title,intro
title1,intro1
title2,intro2
title3,intro3
...

Approach and Solution

To solve this problem, we’ll use Python’s built-in csv module, which provides functionality for reading and writing CSV files. Here’s the step-by-step approach:

  1. Open the input TXT file in read mode.
  2. Strip any leading or trailing whitespace from each line.
  3. Split each line by the comma (,) delimiter to separate the values.
  4. Open the output CSV file in write mode.
  5. Create a csv.writer object to write data to the CSV file.
  6. Write the header row (column names) to the CSV file.
  7. Write each line of data to the CSV file.

Python Code and Explanation

Here’s the Python code that implements the solution:

import csv

with open('log.txt', 'r') as in_file:
    stripped = (line.strip() for line in in_file)
    lines = (line.split(",") for line in stripped if line)
    with open('log.csv', 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        writer.writerow(('title', 'intro'))
        writer.writerows(lines)

Let’s break down the code and understand each step:

  1. Import the csv module:
   import csv

This line imports the csv module, which provides functionality for reading and writing CSV files.

  2. Open the input TXT file in read mode:
   with open('log.txt', 'r') as in_file:

This line opens the “log.txt” file in read mode using the open() function and the context manager (with statement). The file object is assigned to the variable in_file.

  3. Strip whitespace and split lines:
   stripped = (line.strip() for line in in_file)
   lines = (line.split(",") for line in stripped if line)

These lines use generator expressions to perform two operations:

  • stripped is a generator expression that strips leading and trailing whitespace from each line in the input file using the str.strip() method.
  • lines is another generator expression that splits each stripped line by the comma (,) delimiter using the str.split() method. The if line condition skips lines that are empty after stripping, which covers both blank lines and lines containing only whitespace.
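
To make the two stages concrete, the sketch below runs them on a few in-memory sample lines (the sample strings are made up for illustration and stand in for the contents of log.txt). It also shows one limitation worth knowing about: str.split(",") does not understand CSV quoting, so a quoted field that contains a comma is split in two. If the input may contain such fields, parsing with csv.reader is one possible alternative.

```python
import csv

# Hypothetical sample lines, standing in for the contents of log.txt.
sample = ["title1,intro1\n", "\n", "  title2,intro2  \n"]

# Same two stages as in the article: strip, then split non-empty lines.
stripped = (line.strip() for line in sample)
rows = [line.split(",") for line in stripped if line]
print(rows)  # [['title1', 'intro1'], ['title2', 'intro2']]

# A quoted field that contains a comma is broken apart by str.split...
quoted = '"Hello, world",intro1'
naive = quoted.split(",")
print(naive)  # ['"Hello', ' world"', 'intro1']

# ...whereas csv.reader honors the quotes and keeps the field intact.
parsed = next(csv.reader([quoted]))
print(parsed)  # ['Hello, world', 'intro1']
```

For input in the simple title,intro format shown earlier, both approaches give the same rows.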

  4. Open the output CSV file in write mode:
   with open('log.csv', 'w', newline='') as out_file:

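This line opens the “log.csv” file in write mode using the open() function and the context manager (with statement), and assigns the file object to the variable out_file. The newline='' argument turns off Python's own newline translation so that the csv module controls the line endings. Without it, platforms that translate \n to \r\n (such as Windows) would write \r\r\n at the end of each row, which shows up as an extra blank line between rows.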
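
The effect of newline='' can be checked by reading the raw bytes back. The following is a small sketch, not part of the original script: it writes two rows to a temporary file (the demo.csv name is made up for this example) and prints exactly what ended up on disk. By default csv.writer terminates each row with \r\n, and with newline='' those bytes are written unchanged.

```python
import csv
import os
import tempfile

# Write two rows to a temporary file created just for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.csv")
    with open(path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(("title", "intro"))
        writer.writerow(("title1", "intro1"))

    # Read the raw bytes back to see the exact line endings on disk.
    with open(path, "rb") as raw:
        data = raw.read()

print(data)  # b'title,intro\r\ntitle1,intro1\r\n'
```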

  5. Create a csv.writer object:
   writer = csv.writer(out_file)

This line creates a csv.writer object, which is responsible for writing data to the CSV file. The out_file object is passed as an argument to the csv.writer constructor.

  6. Write the header row:
   writer.writerow(('title', 'intro'))

This line writes the header row to the CSV file using the writerow() method of the csv.writer object. The header row contains the column names “title” and “intro”.

  7. Write data rows:
   writer.writerows(lines)

This line writes all the data rows to the CSV file using the writerows() method of the csv.writer object. The lines generator created earlier is passed as the argument. Because the generator is lazy, each line is read, split, and written in turn, so the whole input file never has to be held in memory.

Example Usage

Let’s assume we have a “log.txt” file with the following content:

title1,intro1
title2,intro2
title3,intro3

After running the Python script, “log.csv” will be created (or overwritten, if it already exists) with the following content:

title,intro
title1,intro1
title2,intro2
title3,intro3
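
To try the conversion without touching real files, the script can be wrapped in a function and run against sample data in a temporary directory. This is a minimal sketch under a few assumptions: the function name txt_to_csv and its parameters are hypothetical (they do not appear in the original script), and the explicit encoding="utf-8" assumes the input file is UTF-8 encoded.

```python
import csv
import os
import tempfile


def txt_to_csv(txt_path, csv_path, header=("title", "intro"), encoding="utf-8"):
    """Copy comma-separated lines from txt_path to csv_path, adding a header row.

    Hypothetical helper: same logic as the script above, with the paths,
    header, and encoding made configurable.
    """
    with open(txt_path, "r", encoding=encoding) as in_file:
        stripped = (line.strip() for line in in_file)
        rows = (line.split(",") for line in stripped if line)
        with open(csv_path, "w", newline="", encoding=encoding) as out_file:
            writer = csv.writer(out_file)
            writer.writerow(header)
            writer.writerows(rows)


with tempfile.TemporaryDirectory() as tmp_dir:
    txt_path = os.path.join(tmp_dir, "log.txt")
    csv_path = os.path.join(tmp_dir, "log.csv")

    # Create the sample input shown above.
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("title1,intro1\ntitle2,intro2\ntitle3,intro3\n")

    txt_to_csv(txt_path, csv_path)

    # Read the result back with csv.reader to check the rows.
    with open(csv_path, newline="", encoding="utf-8") as f:
        result = list(csv.reader(f))

print(result)
```

The printed list should start with the header row, followed by the three data rows from the sample input.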