My favourite library for reading, writing, and manipulating Excel files in Node.JS is ExcelJS. In this tutorial, we’ll look at how to install the library and how to dynamically read an Excel file.
In the business world, there’s a joke that businesses run off Excel files. Sadly, this can be very true – I’ve received a lot of Excel files over the years and had to do data imports or build visualizations off of them. I’ve used code like we’ll look at today in a lot of applications.
Installing ExcelJS
Installing ExcelJS is pretty easy. In the root of your project, just simply run “npm i exceljs” from the terminal or command prompt.
Setting Up Our Code for ExcelJS
We can import or require ExcelJS by just requiring exceljs. If you need ES5 transpiled code, you’ll need to import ‘exceljs/dist/es5’ instead and add the core-js dependencies too.
In Microsoft Excel, a workbook is a collection of worksheets (or “spreadsheets”) in a single file. In the Microsoft Excel API, operations usually begin from the workbook object.
One thing to keep in mind whenever doing something with Excel, is that everything is one based.
Reading a Particular Worksheet
There’s a few different ways that we can access worksheets. Worksheets can be accessed by iteration, through the Id, or by the worksheet tab name.
I prefer to access worksheets by iteration or by name because these methods are a lot more predictable. When doing By Id or by Number keep in mind that the numbers aren’t necessarily sequential because a worksheet may no longer exist.
Accessing Worksheets Through Iteration
As mentioned above, I find accessing worksheets through iteration to be the easiest and more predictable way of reading an Excel sheet.
With ExcelJS we have a few different ways we can iterate through through the worksheets. We can do it by a for loop, a .forEach off of workbook.worksheets.forEach, or doing workbook.eachSheet.
I find workbook.eachSheet to be the easiest to read.
Build In Metrics
Within ExcelJS, there are a few different variables that we can use to make life easier.
We can use worksheet.actualColumnCount to get the number of columns that potentially have any data in them.
We can use worksheet.actualRowCount to get the size of the rows that potentially have data in them.
As we’re iterating through rows, we can use row.cellCount to get the number of cells.
Reading a Particular Cell
In Excel, a cell is the boxes you see in the grid of an Excel worksheet. Cells are traditionally referenced by the column letter, and row number. For example “A1” is the first cell on an Excel worksheet.
Within ExcelJS, we can also use a number instead of a letter as the column reference. For example “1,1” could also be a reference to “A1”.
If we need the value of the cell, we need to use .getCell() and then use the .value property. Like this:
It’s much easier to use a number as a reference, although it can be confusing when you are comparing the code to a spreadsheet. And of course, when looping this is a lot easier than trying to translate a number to a letter.
Dynamically Reading an Entire Document
When I work with Excel documents in Node.js I like to convert them into a JavaScript array of objects so I can easily manipulate them. So, an Excel document that looks like this will end up looking like this JavaScript array of objects.
I usually start with a helper function that takes in the path to the Excel file, and then it reads it and loops through the various worksheets.
For each worksheet it assumes that the first row is a header row. In a lot of cases, this is a good assumption but it never hurts to check. 😊
For each worksheet, we loop through the rows and and the available columns building an object that we eventually add an object called “theRow” into the “theData” array.
At the end, we return theData which is our array of objects.
Wrapping It Up
As you can see, having a knowledge of loops and the ExcelJS library can allow us to do some pretty advanced things without necessarily having to have a human do a lot of data entry.
In this blog post on ExcelJS we covered how to read in an Excel file and convert it to a JavaScript array that contains objects based on headers built from the first row.