databook-cli

Node.js CLI for DataBook semantic documents — Markdown files that carry typed RDF/SPARQL/SHACL payloads alongside human-readable prose and self-describing YAML frontmatter.

Spec: github.com/kurtcagle/databook · Namespace: https://w3id.org/databook/ns#

Installation

# From the package directory
npm install -g .

# Or run directly
node bin/databook.js <command>

Requires Node.js ≥ 18.0.0 (uses native fetch).

Databook Property Reference Guide

A guide to the YAML frontmatter and data block properties for DataBooks is available here.

Commands

`databook head` — Inspect a DataBook

Extracts frontmatter and block metadata. Never modifies input. Useful for pipeline inspection and conditional branching.

# Default: frontmatter + block summary as JSON
databook head source.databook.md

# Specific block metadata
databook head source.databook.md --block-id primary-block

# All output formats
databook head source.databook.md --format json     # default
databook head source.databook.md --format yaml
databook head source.databook.md --format xml
databook head source.databook.md --format turtle

# Pipeline use: extract block ids with role=primary
databook head source.databook.md --format json \
  | jq -r '.blocks[] | select(.role == "primary") | .id'

# Check triple count before processing
TRIPLE_COUNT=$(databook head source.databook.md --format json \
  | jq '.frontmatter.graph.triple_count')

# Stdin
cat source.databook.md | databook head --format yaml
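
For Node.js consumers, the same two checks can be done without jq. This is a minimal sketch that assumes the JSON output has the shape implied by the jq expressions above (`blocks[].role`, `blocks[].id`, `frontmatter.graph.triple_count`); the function names and the sample object are hypothetical.

```javascript
// Sketch only: the JSON shape is inferred from the jq filters above,
// not taken from a published schema.

// Return the ids of all blocks whose role is "primary".
function primaryBlockIds(head) {
  return (head.blocks ?? [])
    .filter((block) => block.role === "primary")
    .map((block) => block.id);
}

// Return frontmatter.graph.triple_count, or null when it is absent.
function tripleCount(head) {
  const count = head.frontmatter?.graph?.triple_count;
  return Number.isInteger(count) ? count : null;
}

// Hypothetical sample, standing in for parsed `databook head` output.
const sample = {
  frontmatter: { graph: { triple_count: 42 } },
  blocks: [
    { id: "primary-block", role: "primary" },
    { id: "notes", role: "supporting" },
  ],
};

console.log(primaryBlockIds(sample));
console.log(tripleCount(sample));
```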

`databook push` — Send DataBook RDF to a triplestore

Pushes RDF blocks to a SPARQL-compatible triplestore via the SPARQL 1.1 Graph Store Protocol (GSP). Each block becomes a discrete named graph. Frontmatter provenance is pushed to a #meta graph by default.

Pushable block types: turtle, turtle12, trig, json-ld, shacl, sparql-update

# Push all RDF blocks to Fuseki
databook push ontology.databook.md \
  --endpoint http://localhost:3030/ds/sparql

# Push one block with an explicit graph IRI
databook push ontology.databook.md \
  --block-id primary-block \
  --graph https://example.org/my-graph \
  --endpoint http://localhost:3030/ds/sparql

# Merge (POST) instead of replace (PUT)
databook push ontology.databook.md --endpoint ... --merge

# Suppress meta graph
databook push ontology.databook.md --endpoint ... --no-meta

# Dry-run to see what would be sent
databook push ontology.databook.md --endpoint ... --dry-run

# Auth via env var (recommended for CI)
DATABOOK_FUSEKI_AUTH="Basic YWRtaW46cGFzc3dvcmQ=" \
  databook push file.databook.md --endpoint http://host/ds/sparql

Named graph assignment (priority order):

1. --graph <iri> (single-block only)
2. frontmatter.graph.named_graph (single-block documents only)
3. Fragment-addressing rule: {document.id}#{block-id}
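
The priority order above can be sketched as a small resolver. This is an illustration, not the CLI's internal code: `resolveGraphIri` and the fields of its context object are hypothetical names, and the reading of "single-block only" as "exactly one block is being pushed" is an assumption.

```javascript
// Sketch of the named-graph priority order. All names are hypothetical.
function resolveGraphIri(blockId, ctx) {
  const { graphFlag, namedGraph, documentId, blocksInDocument, blocksBeingPushed } = ctx;

  // 1. --graph <iri>: assumed to apply only when a single block is pushed.
  if (graphFlag && blocksBeingPushed === 1) return graphFlag;

  // 2. frontmatter.graph.named_graph: single-block documents only.
  if (namedGraph && blocksInDocument === 1) return namedGraph;

  // 3. Fragment-addressing rule: {document.id}#{block-id}.
  return `${documentId}#${blockId}`;
}

// A two-block document falls through to the fragment-addressing rule.
console.log(
  resolveGraphIri("primary-block", {
    documentId: "https://example.org/ontology",
    blocksInDocument: 2,
    blocksBeingPushed: 2,
  })
);
```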

GSP endpoint inference: /sparql → /data, /query → /data. Override with --gsp-endpoint for non-Fuseki stores.
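
One way to picture the inference rule is as a rewrite of the last path segment. The sketch below assumes the rule applies only to a trailing `/sparql` or `/query` segment and leaves any other URL unchanged; `inferGspEndpoint` is a hypothetical name, and the second argument stands in for the value of --gsp-endpoint.

```javascript
// Sketch of GSP endpoint inference. Assumption: only a trailing
// /sparql or /query path segment is rewritten to /data.
function inferGspEndpoint(endpoint, gspOverride) {
  // An explicit override (the --gsp-endpoint value) always wins.
  if (gspOverride) return gspOverride;

  const url = new URL(endpoint);
  url.pathname = url.pathname.replace(/\/(sparql|query)\/?$/, "/data");
  return url.toString();
}

console.log(inferGspEndpoint("http://localhost:3030/ds/sparql"));
// http://localhost:3030/ds/data
```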

`databook pull` — Fetch RDF from a triplestore into a DataBook

Retrieves RDF from a SPARQL endpoint into a DataBook. Operates in three modes:

| Mode | Trigger | Protocol |
| --- | --- | --- |
| Named graph fetch | Default | GSP GET |
| External query | `--query <file>` | SPARQL POST |
| Fragment-ref | `--fragment <block-id>` | SPARQL POST using embedded block |

# Fetch named graph to stdout
databook pull sensors.databook.md \
  --endpoint http://localhost:3030/ds/sparql

# Fetch specific graph IRI
databook pull sensors.databook.md \
  --endpoint http://localhost:3030/ds/sparql \
  --graph https://example.org/sensors

# Execute embedded SPARQL block and replace data block in-place
databook pull sensors.databook.md \
  --endpoint http://localhost:3030/ds/sparql \
  --fragment sensor-construct \
  --block-id sensor-graph \
  --stats \
  --out sensors.databook.md       # same path = atomic in-place update

# External .sparql/.rq file
databook pull onto.databook.md \
  --endpoint http://localhost:3030/ds/sparql \
  --query queries/extract.sparql \
  -o result.ttl

# Dry-run shows the extracted SPARQL query
databook pull sensors.databook.md \
  --fragment sensor-construct \
  --dry-run

--stats recomputes graph.triple_count and graph.subjects in frontmatter using N3.js after a successful Turtle/TriG pull.

`databook process` — Execute a pipeline DataBook

Executes a processor-registry DataBook as a DAG pipeline against a source DataBook. Supports:

Full pipeline mode (-P <process-databook>): Multi-stage DAG with build:dependsOn ordering
Single-operation shorthand: --sparql, --shapes, --xslt, --xquery

Supported processors (configured via processors.toml):

| Type | Tool | processors.toml key |
| --- | --- | --- |
| `sparql` | Apache Jena ARQ | `jena-sparql` |
| `shacl` | Apache Jena SHACL | `jena-shacl` |
| `xslt` | Saxon HE 12 | `saxon-xslt` |
| `xquery` | Saxon HE 12 | `saxon-xquery` |
| `sparql-anything` | SPARQL Anything 0.9 | `sparql-anything` |

# Full pipeline
databook process source.databook.md \
  -P pipeline.databook.md \
  -o output.databook.md

# Single SPARQL CONSTRUCT
databook process source.databook.md \
  --sparql queries.databook.md#construct-graph \
  -o output.databook.md

# Single SHACL validation
databook process source.databook.md \
  --shapes shapes.databook.md#person-shapes \
  -o report.databook.md

# With VALUES parameter injection
databook process source.databook.md \
  --sparql queries.databook.md#typed-query \
  --params '{"type":"ex:Person"}' \
  -o people.databook.md

# Dry-run shows execution plan
databook process source.databook.md -P pipeline.databook.md --dry-run

DAG execution model: Stages are topologically sorted by build:dependsOn edges. Within a topological layer, build:order is the tiebreaker.
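
The ordering rule can be illustrated with a layered topological sort. This is a sketch under stated assumptions, not the CLI's scheduler: stages are modelled as plain objects with hypothetical `id`, `dependsOn`, and `order` fields standing in for build:dependsOn and build:order, and a missing `order` is treated as 0.

```javascript
// Layered topological sort: each round emits every stage whose
// dependencies have already run, sorted by `order` within the layer.
function planStages(stages) {
  const byId = new Map(stages.map((s) => [s.id, s]));
  const pending = new Map(stages.map((s) => [s.id, new Set(s.dependsOn ?? [])]));

  for (const [id, deps] of pending) {
    for (const dep of deps) {
      if (!byId.has(dep)) throw new Error(`Stage ${id} depends on unknown stage ${dep}`);
    }
  }

  const plan = [];
  while (pending.size > 0) {
    // The current layer: remaining stages with no unmet dependencies.
    const layer = [...pending.keys()].filter((id) => pending.get(id).size === 0);
    if (layer.length === 0) throw new Error("Cycle detected in dependsOn edges");

    // Tiebreaker within the layer.
    layer.sort((a, b) => (byId.get(a).order ?? 0) - (byId.get(b).order ?? 0));

    for (const id of layer) {
      plan.push(id);
      pending.delete(id);
    }
    for (const deps of pending.values()) {
      for (const id of layer) deps.delete(id);
    }
  }
  return plan;
}

console.log(
  planStages([
    { id: "load", order: 2 },
    { id: "fetch", order: 1 },
    { id: "merge", dependsOn: ["load", "fetch"] },
    { id: "report", dependsOn: ["merge"] },
  ])
);
// [ 'fetch', 'load', 'merge', 'report' ]
```

Grouping by layer first and sorting second keeps the tiebreaker from ever reordering a stage ahead of something it depends on.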

Configuration

`processors.toml`

Declares deployment details for processors and endpoints. Three-layer discovery chain (later layers override earlier):

{package}/processors.default.toml     ← shipped template (read-only)
~/.config/databook/processors.toml    ← user-level
{project}/.databook/processors.toml   ← project-level

Do not commit processors.toml to version control. Add it to .gitignore.

Example:

[default_endpoint]
sparql = "http://localhost:3030/ds/sparql"

[endpoints."http://localhost:3030"]
auth = "Basic YWRtaW46cGFzc3dvcmQ="

[processor."https://w3id.org/databook/plugins/core#jena-sparql"]
command   = "/usr/local/jena/bin/sparql"
version   = "6.0.0"
jvm_flags = "-Xmx4g"

[processor."https://w3id.org/databook/plugins/core#jena-shacl"]
command   = "/usr/local/jena/bin/shacl"
version   = "6.0.0"
jvm_flags = "-Xmx4g"

[processor."https://w3id.org/databook/plugins/core#saxon-xslt"]
jar       = "/usr/local/lib/saxon-he-12.jar"
version   = "12.0"
jvm_flags = "-Xmx2g"

See processors.default.toml for the full schema with all supported keys.
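
The three-layer discovery chain amounts to merging parsed configuration objects in order. The sketch below works on already-parsed objects (TOML parsing is left out) and assumes a key-by-key deep merge; whether the CLI merges nested tables or replaces them wholesale is not stated here, so treat that as an assumption. `mergeLayers` is a hypothetical name.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Merge layers left to right; later layers override earlier ones.
// Assumption: nested tables are merged key by key rather than replaced.
function mergeLayers(...layers) {
  const result = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer ?? {})) {
      result[key] =
        isPlainObject(value) && isPlainObject(result[key])
          ? mergeLayers(result[key], value)
          : value;
    }
  }
  return result;
}

const jena = "https://w3id.org/databook/plugins/core#jena-sparql";

// Stand-ins for the parsed package, user, and project files.
const packageDefaults = {
  default_endpoint: { sparql: "http://localhost:3030/ds/sparql" },
  processor: { [jena]: { command: "sparql", jvm_flags: "-Xmx1g" } },
};
const userLevel = { processor: { [jena]: { jvm_flags: "-Xmx4g" } } };
const projectLevel = { default_endpoint: { sparql: "http://host/ds/sparql" } };

console.log(JSON.stringify(mergeLayers(packageDefaults, userLevel, projectLevel), null, 2));
```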

Authentication

The auth credential is resolved in priority order:

1. --auth <credential> flag
2. DATABOOK_FUSEKI_AUTH environment variable
3. processors.toml: the auth or auth_env key of the matching [endpoints."<url>"] table

Credential forms: Basic <base64>, Bearer <token>, or bare <base64> (auto-prefixed with Basic).
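
The resolution order and the auto-prefix rule can be sketched together. The function names are hypothetical, the environment is passed in as a plain object for testability, and two details are assumptions: `auth` is checked before `auth_env` within the config table, and the scheme match is case-insensitive.

```javascript
// Bare base64 is prefixed with "Basic"; explicit schemes pass through.
function normalizeCredential(raw) {
  const value = raw.trim();
  return /^(Basic|Bearer)\s+\S/i.test(value) ? value : `Basic ${value}`;
}

// Priority: flag, then DATABOOK_FUSEKI_AUTH, then the endpoint's config table.
function resolveAuth({ authFlag, env = {}, endpointConfig = {} } = {}) {
  // Assumption: a literal `auth` value wins over `auth_env` indirection.
  const fromConfig =
    endpointConfig.auth ??
    (endpointConfig.auth_env ? env[endpointConfig.auth_env] : undefined);

  const raw = authFlag ?? env.DATABOOK_FUSEKI_AUTH ?? fromConfig;
  return raw ? normalizeCredential(raw) : null;
}

console.log(resolveAuth({ env: { DATABOOK_FUSEKI_AUTH: "YWRtaW46cGFzc3dvcmQ=" } }));
// Basic YWRtaW46cGFzc3dvcmQ=
```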

Exit Codes

| Code | Meaning |
| --- | --- |
| `0` | Success |
| `1` | Runtime error (partial failure, processor error) |
| `2` | Usage / configuration error (bad args, missing flags) |
| `3` | Authentication failure (401 / 403) |
| `4` | Endpoint unreachable (connection refused, DNS failure) |
| `5` | Empty result: query succeeded but returned no triples/rows |
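
A wrapper script might map these codes to a decision. The sketch below is one possible policy, not part of the CLI: `classifyExit` is a hypothetical helper, and the choices to retry only on code 4 and to flag code 5 separately are assumptions.

```javascript
// Meanings copied from the exit code table above.
const EXIT_CODES = Object.freeze({
  0: "success",
  1: "runtime error",
  2: "usage or configuration error",
  3: "authentication failure",
  4: "endpoint unreachable",
  5: "empty result",
});

function classifyExit(code) {
  return {
    code,
    meaning: EXIT_CODES[code] ?? "unknown",
    ok: code === 0,
    // Assumption: only an unreachable endpoint is worth an automatic retry.
    retryable: code === 4,
    // Assumption: callers may want to treat "no triples/rows" as a soft failure.
    empty: code === 5,
  };
}

console.log(classifyExit(4));
```

In a Node.js script the code would typically come from the `status` field returned by `child_process.spawnSync`.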

POSIX Pipeline Compatibility

All commands are POSIX-composable:

# Inspect output of a process pipeline
databook process source.databook.md -P pipeline.databook.md \
  | databook head --format json

# Round-trip: push, transform in store, pull back
databook push sensors.databook.md --endpoint http://host/ds/sparql
# ... external store transformations ...
databook pull sensors.databook.md \
  --endpoint http://host/ds/sparql \
  --fragment sensor-construct \
  --block-id sensor-graph \
  --stats \
  --out sensors.databook.md

Fragment Addressing

Any payload reference uses the form {document}#{block-id}:

# Relative path
--sparql queries.databook.md#construct-graph

# Absolute path
--sparql /data/queries.databook.md#construct-graph

# Current document (same-document fragment)
--fragment sensor-construct

# Full IRI
--sparql https://example.org/queries-v1#select-all
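
The four reference forms above can be told apart with a small parser. This is a sketch with a hypothetical function name and return shape; it assumes the block id is whatever follows the last `#`, and that a reference without `#` is a same-document block id.

```javascript
// Split a {document}#{block-id} reference and classify the document part.
function parseFragmentRef(ref) {
  const hash = ref.lastIndexOf("#");
  const document = hash === -1 ? "" : ref.slice(0, hash);
  const blockId = hash === -1 ? ref : ref.slice(hash + 1);

  if (!blockId) throw new Error(`Missing block id in reference: ${ref}`);

  let kind;
  if (document === "") kind = "same-document";
  else if (/^[a-z][a-z0-9+.-]*:\/\//i.test(document)) kind = "iri";
  else if (document.startsWith("/")) kind = "absolute-path";
  else kind = "relative-path";

  return { document: document || null, blockId, kind };
}

for (const ref of [
  "queries.databook.md#construct-graph",
  "/data/queries.databook.md#construct-graph",
  "sensor-construct",
  "https://example.org/queries-v1#select-all",
]) {
  console.log(parseFragmentRef(ref));
}
```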

Dependencies

| Package | Role |
| --- | --- |
| `commander` | CLI argument parsing |
| `js-yaml` | YAML frontmatter parsing |
| `@iarna/toml` | `processors.toml` parsing |
| `n3` | Turtle parsing for stats; process DataBook catalogue parsing |

Requires Node.js ≥ 18 for native fetch.

DataBook Spec References

Conventions — stdin/stdout, fragment addressing, processors.toml, exit codes
head spec
push spec
pull spec
process spec
