bilouro/nodejs-file-reading

nodejs-file-reading

A small Node.js ETL toolkit for parsing positional / fixed-width text files with multiple line types (parent/child) and persisting the resulting structured data to either PostgreSQL (via Sequelize) or MongoDB (via Mongoose).

Originally built to process several versions of a banking/insurance transaction format (the m41, m51, m80, m90 mapping families), but the engine is format-agnostic: each format is described declaratively as a "file mapping" — discriminator + per-line attribute layout — so adding a new format is a config exercise, not new code.

What problem it solves

Some legacy interchange formats (insurance, banking, healthcare, AS/400 exports) ship as flat text files where:

  • Each line is a fixed-width record.
  • The first N characters are a discriminator that selects which schema applies to the rest of the line.
  • Records form a tree: a "header" is the parent of "lines", which can be parents of "events", and so on.

This library:

  1. Parses the file into a flat array of typed objects according to a mapping.
  2. Reconstructs the tree (parent UUIDs are propagated; children are pushed into nested attributes of their parent).
  3. Optionally re-binds the parsed objects into a different schema (e.g. domain events) using a separate "binding map".
  4. Persists either via Sequelize (Postgres) or Mongoose (MongoDB).
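Step 2 can be illustrated with a minimal sketch. It is not the library's actual code: the `buildTree` function, the `discriminator` and `parentUuid` field names, and the "20.00" record type are hypothetical, and Node's built-in `crypto.randomUUID` stands in for the `uuid` package. It also assumes a child attaches to the most recently seen record of its parent type.

```javascript
const { randomUUID } = require("node:crypto");

// Rebuild a parent/child tree from a flat, file-ordered list of records.
// parentRules maps a child discriminator to the rule that locates its parent.
function buildTree(records, parentRules) {
  const lastSeen = new Map(); // discriminator -> most recent record of that type
  const roots = [];

  for (const record of records) {
    const node = { ...record, uuid: randomUUID() };
    const rule = parentRules.get(node.discriminator);

    if (!rule) {
      roots.push(node); // no parent rule: this is a top-level record
    } else {
      const parent = lastSeen.get(rule.parentDiscriminator);
      if (!parent) {
        throw new Error(
          `No "${rule.parentDiscriminator}" seen before "${node.discriminator}"`
        );
      }
      node.parentUuid = parent[rule.parentAttribute]; // propagate the parent's UUID
      if (!parent[rule.childName]) {
        parent[rule.childName] = [];
      }
      parent[rule.childName].push(node); // nest the child under its parent
    }
    lastSeen.set(node.discriminator, node);
  }
  return roots;
}

// "20.00" is a made-up third level, used only to show nesting depth.
const parentRules = new Map([
  ["10.00", { parentDiscriminator: "00.00", parentAttribute: "uuid", childName: "lines" }],
  ["20.00", { parentDiscriminator: "10.00", parentAttribute: "uuid", childName: "events" }],
]);

const flat = [
  { discriminator: "00.00", name: "header" },
  { discriminator: "10.00", name: "line A" },
  { discriminator: "20.00", name: "event A1" },
  { discriminator: "10.00", name: "line B" },
];

const tree = buildTree(flat, parentRules);
console.log(JSON.stringify(tree, null, 2));
```

Tracking only the last record seen per discriminator keeps the reconstruction to a single pass, which fits files where children always follow their parent.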

Project layout

.
├── index.js                    # main entry — m80 example, writes to Postgres via Sequelize
├── indexM41.js / indexM51.js   # entry points for older message formats
├── indexM90.js                 # m90 entry point (Postgres)
├── indexM90MongoDBVersion.js   # same flow but writing to MongoDB
├── positionalFileHelper.js     # core parser (positional → objects with nested tree)
├── positionalFileHelperClass.js
├── convertHelper.js            # binding map → re-shape parsed objects (e.g. into events)
├── nestedUtil.js               # set/push helpers for nested attributes
├── uniquifyHelper.js
├── fileMapDBConvert.js
├── dbConnection.js             # Sequelize models: dataTransfers, dataTransferLines, dataTransferEvents
├── mongo.js                    # Mongoose connection / models
├── mappers/
│   ├── m41FileMapping.js       # per-version line layouts (discriminator → list of attributes)
│   ├── m41BindMapping.js       # per-version "produce events from objects" rules
│   ├── m51FileMapping.js / m51BindMapping.js
│   ├── m80FileMapping.js / m80BindMapping.js
│   └── m90FileMapping.js / m90BindMapping.js
├── files/                      # sample input files
└── *.test.js                   # Jest tests for parser, helpers, each entry point

How a file mapping is declared

// mappers/m41FileMapping.js
function getFileMapping() {
  return {
    discriminatorInitialPosition: 0,   // where the discriminator starts
    discriminatorLength: 5,            // its length
    lines: new Map([
      ["00.00", [
        { name: "regexc", initialPosition: 0,  length: 5,  type: "string",  required: true },
        { name: "codexc", initialPosition: 0,  length: 2,  type: "integer", required: true },
        { name: "datexc", initialPosition: 34, length: 14, type: "date",    required: false,
          dateFormat: "YYYYMMDDHHmmss" },
        // …
      ]],
      ["10.00", [
        { name: "parentRef", type: "parent",
          parentDiscriminator: "00.00", parentAttribute: "uuid", childName: "lines" },
        // …
      ]],
    ]),
  };
}
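To show how a mapping of this shape could drive parsing, here is a minimal sketch of reading a single line. The `parseLine` function, the `label` attribute, and the sample line are hypothetical. Trimming values and returning `null` for blank optional fields are assumptions, not documented behavior of positionalFileHelper.js.

```javascript
// A mapping in the same shape as getFileMapping(), trimmed to two string fields.
// The "label" attribute is invented for this example.
const mapping = {
  discriminatorInitialPosition: 0,
  discriminatorLength: 5,
  lines: new Map([
    ["00.00", [
      { name: "regexc", initialPosition: 0, length: 5, type: "string", required: true },
      { name: "label", initialPosition: 5, length: 10, type: "string", required: false },
    ]],
  ]),
};

function parseLine(line, fileMapping) {
  // 1. Read the discriminator to find out which layout applies.
  const start = fileMapping.discriminatorInitialPosition;
  const discriminator = line.substring(start, start + fileMapping.discriminatorLength);
  const attributes = fileMapping.lines.get(discriminator);
  if (!attributes) {
    throw new Error(`Unknown discriminator "${discriminator}"`);
  }

  // 2. Slice each attribute out of the line by zero-based position and length.
  const record = {};
  for (const attr of attributes) {
    const raw = line
      .substring(attr.initialPosition, attr.initialPosition + attr.length)
      .trim();
    if (raw === "" && attr.required) {
      throw new Error(`Missing required attribute "${attr.name}"`);
    }
    record[attr.name] = raw === "" ? null : raw;
  }
  return record;
}

// Build a sample line: discriminator, a 10-character padded field, then unmapped data.
const sampleLine = "00.00" + "ACME".padEnd(10) + "ignored";
const parsed = parseLine(sampleLine, mapping);
console.log(parsed);
```

Attributes may overlap (as `regexc` and `codexc` do in the mapping above), which is why each one is sliced independently rather than consumed left to right.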

Supported attribute types: integer, string, date (parsed according to dateFormat), and parent (a back-reference to a previously seen line with the given parentDiscriminator).
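The value types can be illustrated with a small conversion sketch. It is an assumption about how such a step might look, not the library's code: `convertValue` and `parseDate` are hypothetical names, and `parseDate` is a hand-rolled stand-in for moment that only understands the tokens in "YYYYMMDDHHmmss" and interprets the result as UTC. The parent type is left out because it is resolved structurally, not by converting a slice of text.

```javascript
// Minimal date parser for fixed-width formats built from YYYY, MM, DD, HH, mm, ss.
// A stand-in for moment so the example runs without dependencies.
function parseDate(raw, dateFormat) {
  const parts = { YYYY: 1970, MM: 1, DD: 1, HH: 0, mm: 0, ss: 0 };
  for (const token of Object.keys(parts)) {
    const at = dateFormat.indexOf(token); // case-sensitive: "MM" is month, "mm" is minute
    if (at === -1) continue; // token absent from this format: keep the default
    const digits = raw.substring(at, at + token.length);
    if (digits.length !== token.length || !/^\d+$/.test(digits)) {
      throw new Error(`Cannot parse "${raw}" as ${dateFormat}`);
    }
    parts[token] = Number(digits);
  }
  // Assumption: treat the timestamp as UTC.
  return new Date(Date.UTC(parts.YYYY, parts.MM - 1, parts.DD, parts.HH, parts.mm, parts.ss));
}

// Convert the raw text of one attribute according to its declared type.
function convertValue(raw, attr) {
  const text = raw.trim();
  if (text === "") return null; // blank field: no value
  switch (attr.type) {
    case "integer":
      if (!/^-?\d+$/.test(text)) {
        throw new Error(`"${attr.name}": "${text}" is not an integer`);
      }
      return Number.parseInt(text, 10);
    case "date":
      return parseDate(text, attr.dateFormat);
    case "string":
      return text;
    default:
      throw new Error(`Unsupported type "${attr.type}"`);
  }
}

const datexc = { name: "datexc", type: "date", dateFormat: "YYYYMMDDHHmmss" };
const codexc = { name: "codexc", type: "integer" };

const when = convertValue("20240131235959", datexc);
const code = convertValue("00", codexc);
console.log(when.toISOString(), code);
```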

Run

npm install
node indexM41.js     # processes ./files/358M41…txt with the m41 mapping → Postgres
node indexM90MongoDBVersion.js   # m90 variant → MongoDB

DB connection details live in dbConnection.js / mongo.js — adapt for your environment.

Tests

npm test    # Jest, runs all *.test.js

Coverage: the core parser, the nested-attribute helpers, the converter, and each per-version entry point have their own test file.

Dependencies

  • sequelize + pg (Postgres ORM)
  • mongoose + mongodb (MongoDB ODM)
  • moment (date parsing; now in maintenance mode, so consider dayjs or luxon if extending)
  • uuid (line UUIDs)
  • jest (tests, dev)

Status

Working, but kept as an archival reference: it solves a real ETL pattern and is not actively maintained. It is a useful starting point for anyone parsing similar formats.
