A small Node.js ETL toolkit for parsing positional / fixed-width text files with multiple line types (parent/child) and persisting the resulting structured data to either PostgreSQL (via Sequelize) or MongoDB (via Mongoose).
Originally built to process several versions of a banking/insurance transaction format (the m41, m51, m80, m90 mapping families), but the engine is format-agnostic: each format is described declaratively as a "file mapping" — discriminator + per-line attribute layout — so adding a new format is a config exercise, not new code.
Some legacy interchange formats (insurance, banking, healthcare, AS/400 exports) ship as flat text files where:
- Each line is a fixed-width record.
- The first N characters are a discriminator that selects which schema applies to the rest of the line.
- Records form a tree: a "header" is the parent of "lines", which can be parents of "events", and so on.
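
As a rough illustration of the layout described above, the sketch below slices the discriminator off each record. The sample lines and the `splitRecord` helper are hypothetical and are not taken from the repository; only the 5-character discriminator at offset 0 mirrors the m41 mapping.

```javascript
// Hypothetical sample: three fixed-width records with a 5-character discriminator.
const sampleLines = [
  "00.00HEADER-A   ",
  "10.00LINE-1     ",
  "10.00LINE-2     ",
];

// Assumed layout: discriminator at offset 0, length 5 (as in the m41 mapping).
const DISCRIMINATOR_START = 0;
const DISCRIMINATOR_LENGTH = 5;

// Split one record into its discriminator and the remaining payload.
function splitRecord(line) {
  const end = DISCRIMINATOR_START + DISCRIMINATOR_LENGTH;
  return {
    discriminator: line.substring(DISCRIMINATOR_START, end),
    body: line.substring(end),
  };
}

const records = sampleLines.map(splitRecord);
console.log(records.map((r) => r.discriminator));
```

The discriminator is then used as the lookup key for the attribute layout that applies to the rest of the line.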
This library:
- Parses the file into a flat array of typed objects according to a mapping.
- Reconstructs the tree (parent UUIDs are propagated; children are pushed into nested attributes of their parent).
- Optionally re-binds the parsed objects into a different schema (e.g. domain events) using a separate "binding map".
- Persists either via Sequelize (Postgres) or Mongoose (MongoDB).
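
The tree reconstruction step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in positionalFileHelper.js: the `parentRules` map, the `buildTree` function, the `parentUuid` field, and the "20.00" discriminator are all hypothetical, and Node's built-in `crypto.randomUUID` stands in for the `uuid` package.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical parent rules: child discriminator -> where it attaches.
// "20.00" is an invented third level used only for this example.
const parentRules = new Map([
  ["10.00", { parentDiscriminator: "00.00", childName: "lines" }],
  ["20.00", { parentDiscriminator: "10.00", childName: "events" }],
]);

// Turn a flat, file-ordered list of records into a tree.
function buildTree(flatRecords) {
  const lastSeen = new Map(); // discriminator -> most recent record of that type
  const roots = [];
  for (const record of flatRecords) {
    const node = { ...record, uuid: randomUUID() };
    const rule = parentRules.get(node.discriminator);
    if (!rule) {
      roots.push(node); // no parent rule: treat as a top-level record
    } else {
      const parent = lastSeen.get(rule.parentDiscriminator);
      if (!parent) {
        throw new Error(`No "${rule.parentDiscriminator}" seen before "${node.discriminator}"`);
      }
      node.parentUuid = parent.uuid; // propagate the parent's UUID to the child
      if (!parent[rule.childName]) parent[rule.childName] = [];
      parent[rule.childName].push(node); // nest the child under its parent
    }
    lastSeen.set(node.discriminator, node);
  }
  return roots;
}

const tree = buildTree([
  { discriminator: "00.00", name: "header" },
  { discriminator: "10.00", name: "line 1" },
  { discriminator: "20.00", name: "event 1a" },
  { discriminator: "10.00", name: "line 2" },
]);
console.log(JSON.stringify(tree, null, 2));
```

Attaching each child to the most recently seen record of its parent type relies on the file listing parents before their children, which is the ordering the format description implies.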
.
├── index.js # main entry — m80 example, writes to Postgres via Sequelize
├── indexM41.js / indexM51.js # entry points for older message formats
├── indexM90.js # m90 entry point (Postgres)
├── indexM90MongoDBVersion.js # same flow but writing to MongoDB
├── positionalFileHelper.js # core parser (positional → objects with nested tree)
├── positionalFileHelperClass.js
├── convertHelper.js # binding map → re-shape parsed objects (e.g. into events)
├── nestedUtil.js # set/push helpers for nested attributes
├── uniquifyHelper.js
├── fileMapDBConvert.js
├── dbConnection.js # Sequelize models: dataTransfers, dataTransferLines, dataTransferEvents
├── mongo.js # Mongoose connection / models
├── mappers/
│ ├── m41FileMapping.js # per-version line layouts (discriminator → list of attributes)
│ ├── m41BindMapping.js # per-version "produce events from objects" rules
│ ├── m51FileMapping.js / m51BindMapping.js
│ ├── m80FileMapping.js / m80BindMapping.js
│ └── m90FileMapping.js / m90BindMapping.js
├── files/ # sample input files
└── *.test.js # Jest tests for parser, helpers, each entry point
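
To give a feel for what convertHelper.js and the `*BindMapping.js` files do, here is a hedged sketch of re-binding a parsed object into an event-shaped object. The binding map shape (`target`, `source`, `constant`), the `bindObject` function, and the event type name are invented for illustration; the repository's actual binding format may differ.

```javascript
// Hypothetical binding map: each rule fills one target field, either from a
// constant or by copying an attribute of the parsed object.
const eventBinding = [
  { target: "type", constant: "TRANSFER_RECEIVED" },
  { target: "occurredAt", source: "datexc" },
  { target: "code", source: "codexc" },
];

// Re-shape one parsed object according to the binding rules.
function bindObject(parsed, binding) {
  const result = {};
  for (const rule of binding) {
    result[rule.target] = "constant" in rule ? rule.constant : parsed[rule.source];
  }
  return result;
}

// Hypothetical parsed header object.
const parsedHeader = { regexc: "00.00", codexc: 0, datexc: "2024-01-31T23:59:59Z" };
const event = bindObject(parsedHeader, eventBinding);
console.log(event);
```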
// mappers/m41FileMapping.js
function getFileMapping() {
  return {
    discriminatorInitialPosition: 0, // where the discriminator starts
    discriminatorLength: 5,          // its length
    lines: new Map([
      ["00.00", [
        { name: "regexc", initialPosition: 0, length: 5, type: "string", required: true },
        { name: "codexc", initialPosition: 0, length: 2, type: "integer", required: true },
        { name: "datexc", initialPosition: 34, length: 14, type: "date", required: false,
          dateFormat: "YYYYMMDDHHmmss" },
        // …
      ]],
      ["10.00", [
        { name: "parentRef", type: "parent",
          parentDiscriminator: "00.00", parentAttribute: "uuid", childName: "lines" },
        // …
      ]],
    ]),
  };
}

Supported attribute types: integer, date (with dateFormat), string, parent (a back-reference to a previously seen line with the given discriminator).
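
The following sketch shows one way the attribute definitions could be applied to a single line. It is an assumption-laden illustration rather than the library's parser: `parseCompactDate`, `parseAttribute`, and `parseLine` are hypothetical names, the date is parsed by hand and interpreted as UTC (the repository uses moment), only the `YYYYMMDDHHmmss` format is handled, and the sample line is made up.

```javascript
// Parse a "YYYYMMDDHHmmss" string by hand. Assumption: timestamps are UTC.
function parseCompactDate(text) {
  const m = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(text);
  if (!m) return null;
  const [year, month, day, hour, minute, second] = m.slice(1).map(Number);
  return new Date(Date.UTC(year, month - 1, day, hour, minute, second));
}

// Extract and convert one attribute from a fixed-width line.
function parseAttribute(line, attr) {
  const raw = line
    .substring(attr.initialPosition, attr.initialPosition + attr.length)
    .trim();
  if (raw === "") {
    if (attr.required) throw new Error(`Missing required attribute "${attr.name}"`);
    return null;
  }
  switch (attr.type) {
    case "integer": {
      const value = Number.parseInt(raw, 10);
      if (Number.isNaN(value)) throw new Error(`"${attr.name}" is not an integer: "${raw}"`);
      return value;
    }
    case "date":
      return parseCompactDate(raw);
    case "string":
      return raw;
    default:
      throw new Error(`Unsupported attribute type "${attr.type}"`);
  }
}

// Apply every positional attribute; "parent" attributes carry no text of
// their own, so this sketch leaves them to the tree-building step.
function parseLine(line, attributes) {
  const result = {};
  for (const attr of attributes) {
    if (attr.type === "parent") continue;
    result[attr.name] = parseAttribute(line, attr);
  }
  return result;
}

const attributes = [
  { name: "regexc", initialPosition: 0, length: 5, type: "string", required: true },
  { name: "codexc", initialPosition: 0, length: 2, type: "integer", required: true },
  { name: "datexc", initialPosition: 34, length: 14, type: "date", required: false },
];

// Hypothetical line: discriminator, 29 filler characters, 14-digit timestamp.
const line = "00.00" + " ".repeat(29) + "20240131235959";
const parsed = parseLine(line, attributes);
console.log(parsed);
```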
npm install
node indexM41.js # processes ./files/358M41…txt with the m41 mapping → Postgres
node indexM90MongoDBVersion.js # m90 variant → MongoDB

DB connection details live in dbConnection.js / mongo.js; adapt them for your environment.
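
One possible way to adapt the connection details is to read them from environment variables instead of editing the files. The sketch below is only a suggestion: `loadDbSettings` is a hypothetical helper, the `PG*` names follow the usual libpq convention, and `MONGO_URI` plus all default values are assumptions, not something the repository defines.

```javascript
// Hypothetical helper: collect connection settings from the environment,
// falling back to local defaults (the defaults are assumptions).
function loadDbSettings(env = process.env) {
  return {
    postgres: {
      host: env.PGHOST || "localhost",
      port: Number(env.PGPORT || 5432),
      database: env.PGDATABASE || "etl",
      username: env.PGUSER || "postgres",
      password: env.PGPASSWORD || "",
    },
    mongoUri: env.MONGO_URI || "mongodb://localhost:27017/etl",
  };
}

// Example with an explicit environment object instead of process.env.
const settings = loadDbSettings({ PGHOST: "db.internal", PGPORT: "6543" });
console.log(settings.postgres.host, settings.postgres.port);
```

The resulting values could then be passed to the Sequelize and Mongoose setup in dbConnection.js and mongo.js.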
npm test # Jest, runs all *.test.js

Coverage: the core parser, the nested-attribute helpers, the converter, and each per-version entry point have their own test files.
- sequelize + pg (Postgres ORM)
- mongoose + mongodb (Mongoose ODM)
- moment (date parsing; note: now in maintenance mode, consider dayjs / luxon if extending)
- uuid (line UUIDs)
- jest (tests, dev)
Working but archival/reference: it solves a real ETL pattern but is not actively maintained. Useful as a starting point for anyone parsing similar formats.