Nilson Junior
20/04/2024 14:15

Reading and Writing CSV Files with Node.js

  • #JavaScript
  • #Node.js
  • #Node Package Manager (NPM)

Knowing how to handle files in CSV format is an important skill today.

The main purpose of this article is to show one way of handling CSV files with Node.js.

To follow along, we will use two example CSV files from the Kaggle website. One is a product list and the other is the category list for those products.

Since the product list has more than one million records, we will use only one thousand of them for this example.
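
One possible way to cut a sample out of the large file is sketched below. It uses only the fs module built into Node.js. The helper name headLines and the sample file name in the usage note are assumptions made for illustration, and the sketch assumes that no quoted field contains a line break.

```javascript
const fs = require('fs');

// Hypothetical helper: returns the first `maxLines` lines of a file
// without loading the whole file into memory.
function headLines(path, maxLines) {
  const fd = fs.openSync(path, 'r');
  const chunk = Buffer.alloc(64 * 1024);
  const parts = [];
  let newlines = 0;
  try {
    let bytesRead;
    while (newlines < maxLines &&
           (bytesRead = fs.readSync(fd, chunk, 0, chunk.length, null)) > 0) {
      let end = bytesRead;
      for (let i = 0; i < bytesRead; i++) {
        // 0x0a is the line feed byte
        if (chunk[i] === 0x0a && ++newlines === maxLines) {
          end = i + 1; // keep the line feed that closes the last wanted line
          break;
        }
      }
      parts.push(Buffer.from(chunk.subarray(0, end))); // copy, chunk is reused
    }
  } finally {
    fs.closeSync(fd);
  }
  return Buffer.concat(parts).toString('utf8');
}
```

Calling headLines('./amazon_products.csv', 1001) would return the header line plus the first one thousand records, which could then be saved with fs.writeFileSync under a name such as amazon_products_sample.csv.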

We are going to create a script that will read both files and create a third file with information from both of them. In this case, we will use the product code and the product name from the product file, and the category description from the category file.

We are going to use NPM as the package manager.

Create a local directory and run the following command in the terminal to initialize the project.

$ npm init

We need to install the dependencies that will be used to read and write the CSV files.

$ npm i file-system
$ npm i csv-parse
$ npm i csv-stringify

In general, the file-system package is responsible for reading and writing files in the local directory. It builds on the fs module that ships with Node.js.

The csv-parse package converts CSV text into an array of records, while csv-stringify converts records back into CSV text.

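After installing all dependencies, we create an index.js file, load the installed modules, and declare the lists that the script will fill.

const fs = require('file-system')
const { parse } = require("csv-parse")
const { stringify } = require("csv-stringify")
// Shared lists filled by the functions below
const productList = [], categoryList = [], productCategoryList = []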

In the same directory, we also add the two csv files and then we create a function responsible for reading the product file.

The product file contains rows whose category ID is not a valid number, such as the header row, which holds a column name in that position. For this reason, we try to convert the value to an integer and keep the record only if the conversion succeeds.

// Reads the product file and fills productList with the valid records.
const processProductFile = () => {
  return new Promise((resolve, reject) => {
    const path = './amazon_products.csv';
    const stream = fs.createReadStream(path);
    const parser = parse();
    // Reject if the file cannot be opened or read
    stream.on('error', reject);
    stream.on('ready', () => {
      console.log('Reading product file')
      stream.pipe(parser);
    });
    parser.on('readable', function () {
      let record;
      while ((record = parser.read()) !== null) {
        // Column 8 holds the category ID; skip rows where it is not a number
        const categoryId = parseInt(record[8], 10)
        if (Number.isInteger(categoryId)) {
          const product = {
            "productId": record[0],
            "productName": record[1],
            "productCategoryId": categoryId,
          }
          productList.push(product)
        }
      }
    });
    parser.on('error', function (err) {
      console.error(err.message)
      reject(err);
    });
    parser.on('end', function () {
      console.log('Product process completed')
      resolve();
    });
  });
}
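
The validity check used above can be tried in isolation. The following is a minimal sketch; the helper name toCategoryId is an assumption for illustration and is not part of the original script.

```javascript
// Hypothetical helper: returns the category ID as an integer, or null when
// the field does not start with a number (header text, empty strings, ...).
function toCategoryId(value) {
  const id = parseInt(value, 10);
  return Number.isInteger(id) ? id : null;
}

console.log(toCategoryId('104'));         // 104
console.log(toCategoryId('category_id')); // null
console.log(toCategoryId(''));            // null
console.log(toCategoryId('12abc'));       // 12, parseInt stops at the first non-digit
```

As the last call shows, parseInt accepts any value that merely starts with digits. If that is too lenient for a given file, a stricter test such as a regular expression on the whole field may be a better fit.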

In the same way, we create a function responsible for reading the category file.

// Reads the category file and fills categoryList with the valid records.
const processCategoryProductFile = () => {
  return new Promise((resolve, reject) => {
    const path = './amazon_categories.csv';
    const stream = fs.createReadStream(path);
    const parser = parse();
    // Reject if the file cannot be opened or read
    stream.on('error', reject);
    stream.on('ready', () => {
      console.log('Reading category file...')
      stream.pipe(parser);
    });
    parser.on('readable', function () {
      let record;
      while ((record = parser.read()) !== null) {
        // Column 0 holds the category ID; skip rows where it is not a number
        const categoryId = parseInt(record[0], 10)
        if (Number.isInteger(categoryId)) {
          const category = {
            "categorytId": categoryId,
            "categoryName": record[1]
          }
          categoryList.push(category)
        }
      }
    });
    parser.on('error', function (err) {
      console.error(err.message);
      reject(err);
    });
    parser.on('end', function () {
      console.log('Category process completed')
      resolve()
    });
  });
}
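
Both reader functions wrap stream events in a Promise so that they can be awaited. The pattern itself can be sketched with the stream module built into Node.js. The helper name collect is an assumption, and Readable.from only stands in for a real parser here.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: resolves with every item the stream emits,
// or rejects with the stream's error.
function collect(stream) {
  return new Promise((resolve, reject) => {
    const items = [];
    stream.on('data', (item) => items.push(item));
    stream.on('error', reject);
    stream.on('end', () => resolve(items));
  });
}

// Readable.from stands in for a parser; it emits each array item in turn.
collect(Readable.from([['1', 'Books'], ['2', 'Toys']]))
  .then((rows) => console.log(rows.length)); // 2
```

Passing the error object to reject lets the caller see why the read failed instead of receiving an empty rejection.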

Then we create a function that reads the combined list and writes it to a CSV file.

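// Writes productCategoryList to product-category.csv.
const downloadFile = () => {
  const columns = {
    productId: 'productId',
    productName: 'productName',
    categoryName: 'categoryName',
  }
  // Return a Promise so that "await downloadFile()" waits for the write to finish
  return new Promise((resolve, reject) => {
    stringify(productCategoryList, { header: true, columns: columns }, (err, output) => {
      if (err) return reject(err);
      fs.writeFile('product-category.csv', output, (err) => {
        if (err) return reject(err);
        console.log('product-category.csv saved.');
        resolve();
      })
    })
  })
}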
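
Part of what csv-stringify does for us is quoting fields that contain commas, quotes, or line breaks. For illustration only, the usual CSV quoting rules can be sketched by hand as follows. The helper names escapeField and toCsvLine are assumptions, and this is not how csv-stringify is implemented.

```javascript
// Hypothetical helper: quotes a field only when it contains a comma,
// a double quote, or a line break; inner quotes are doubled.
function escapeField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: joins the escaped fields of one record into a CSV line.
function toCsvLine(fields) {
  return fields.map(escapeField).join(',');
}

console.log(toCsvLine(['B01', 'Bag, 20" wheels', 'Luggage']));
// B01,"Bag, 20"" wheels",Luggage
```

Product names often contain commas, which is a good reason to rely on a library rather than joining the values with a plain comma.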

The run function calls the other functions to read both files and write the new one. It also builds the list that is used to create the new CSV file.

const run = async () => {
  await processProductFile()
  await processCategoryProductFile()
  // Pair each product with the category that has the same ID
  productList.forEach(product => {
    categoryList.forEach(category => {
      if (product.productCategoryId === category.categorytId) {
        const productCategory = [product.productId, product.productName, category.categoryName]
        productCategoryList.push(productCategory)
      }
    })
  })
  await downloadFile()
}

// Start the script and report any failure
run().catch((err) => console.error(err))
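
The nested loops compare every product with every category. As an alternative, the same pairing could be done with a Map lookup, as in the sketch below. It assumes that category IDs are unique. The function name joinByCategory and the sample data are made up for illustration, while the key names (including categorytId) follow the code above.

```javascript
// Made-up sample data in the same shape the reader functions produce.
const products = [
  { productId: 'B01', productName: 'Suitcase', productCategoryId: 104 },
  { productId: 'B02', productName: 'Backpack', productCategoryId: 110 },
  { productId: 'B03', productName: 'Unknown', productCategoryId: 999 },
];
const categories = [
  { categorytId: 104, categoryName: 'Suitcases' },
  { categorytId: 110, categoryName: 'Backpacks' },
];

// Hypothetical helper: builds one [id, name, category] row per product
// whose category ID exists in the category list.
function joinByCategory(productItems, categoryItems) {
  const nameById = new Map(categoryItems.map((c) => [c.categorytId, c.categoryName]));
  return productItems
    .filter((p) => nameById.has(p.productCategoryId))
    .map((p) => [p.productId, p.productName, nameById.get(p.productCategoryId)]);
}

// Two rows are returned; B03 is dropped because category 999 does not exist.
console.log(joinByCategory(products, categories));
```

With a Map, each product needs a single lookup instead of a pass over the whole category list, which matters more as the files grow.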

To execute the script we type the following command:

$ node index.js

There are other ways to process CSV files; this is just one example.

The full project is available in the GitHub repository: csv-manager
