Publish to Virtuoso

DAG ID: `publish_to_virtuoso` | Schedule: Manual trigger only | File: `dags/publish_to_virtuoso.py`

What it does

Publishes validated RDF graphs to Virtuoso. For each graph:

1. Drops the existing named graph (clean slate)
2. Uploads the TTL data (chunked for large files)
3. Computes RDF statistics
4. Generates metadata TTL (using nfdicore/BFO vocabulary) and appends it to the same graph
5. Writes a publish report JSON
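The per-graph steps above can be sketched as a small orchestration function. This is only an illustration of the ordering, not the code in `dags/publish_to_virtuoso.py`: the `GraphStore` interface, its method names, and the shape of the returned report entry are all hypothetical, and catching failures per graph is an assumption.

```python
from typing import Protocol


class GraphStore(Protocol):
    """Hypothetical interface; the real DAG's helpers may look different."""

    def drop_graph(self, graph_uri: str) -> None: ...
    def upload_ttl(self, graph_uri: str, ttl_path: str) -> None: ...
    def compute_stats(self, graph_uri: str) -> dict: ...
    def append_metadata(self, graph_uri: str, stats: dict) -> None: ...


def publish_graph(store: GraphStore, graph_uri: str, ttl_path: str) -> dict:
    """Run the per-graph steps in order and return one report entry."""
    entry = {"graph": graph_uri, "source": ttl_path, "status": "ok"}
    try:
        store.drop_graph(graph_uri)              # 1. clean slate (destructive)
        store.upload_ttl(graph_uri, ttl_path)    # 2. upload the TTL data
        stats = store.compute_stats(graph_uri)   # 3. RDF statistics
        store.append_metadata(graph_uri, stats)  # 4. metadata into the same graph
        entry["stats"] = stats
    except Exception as exc:  # assumption: a failure is recorded, not re-raised
        entry["status"] = "failed"
        entry["error"] = str(exc)
    return entry  # 5. the caller would collect entries into the report JSON
```

Because the drop happens first, a failure in a later step leaves the named graph empty or partially loaded, which is why the source data should be checked before triggering.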

Destructive operation

This DAG drops existing named graphs before uploading. Make sure the source data is correct before triggering.

| Source | Description |
|---|---|
| `matwerk_sharedfs` (Variable) | Shared filesystem |
| `matwerk-virtuoso_crud` (Variable) | Virtuoso CRUD endpoint URL |
| `matwerk-virtuoso_sparql` (Variable) | Virtuoso SPARQL endpoint URL |
| `matwerk-virtuoso_user` (Variable) | Virtuoso username |
| `matwerk-virtuoso_pass` (Variable) | Virtuoso password |
| `virtuoso_chunk_bytes` (Variable, optional) | Chunk size for uploads (default: 5 MiB) |
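To illustrate what a byte-based chunk size such as `virtuoso_chunk_bytes` implies, here is a minimal sketch of splitting a payload into upload-sized pieces. It assumes the data is serialized one statement per line (N-Triples-like), so that cutting at line ends never splits a statement; how the DAG actually chunks Turtle is not described here, and the name `iter_chunks` is hypothetical.

```python
from typing import Iterator

DEFAULT_CHUNK_BYTES = 5 * 1024 * 1024  # 5 MiB, the documented default


def iter_chunks(data: bytes, max_bytes: int = DEFAULT_CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive chunks of `data`, cutting only at line ends.

    Assumes one statement per line. Each chunk is at most `max_bytes`
    long, except that a single line longer than `max_bytes` is yielded
    on its own rather than being split.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    chunk: list[bytes] = []
    size = 0
    for line in data.splitlines(keepends=True):
        # Flush the current chunk before it would exceed the limit.
        if chunk and size + len(line) > max_bytes:
            yield b"".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield b"".join(chunk)
```

Concatenating the chunks in order reproduces the original payload, so each chunk can be sent as a separate append to the same named graph.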

Published sources (5 graphs):

| Named Graph | Variable | TTL File |
|---|---|---|
| `matwerk/spreadsheets_assertions` | `matwerk_last_successful_merge_run` | `spreadsheets_asserted.ttl` |
| `matwerk/spreadsheets_inferences` | `matwerk_last_successful_reason_run` | `spreadsheets_inferences.ttl` |
| `matwerk/spreadsheets_validated` | `matwerk_last_successful_validated_run` | `spreadsheets_merged_for_validation.ttl` |
| `matwerk/zenodo_validated` | `matwerk_last_successful_harvester_zenodo_run` | `zenodo_merged_for_validation.ttl` |
| `matwerk/endpoints_validated` | `matwerk_last_successful_harvester_endpoints_run` | `endpoints_merged_for_validation.ttl` |
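The table above can be expressed as data, which makes the relationship between a named graph, its run Variable, and its TTL file explicit. This is a sketch rather than the DAG's actual configuration: the `PublishSource` class and its field names are hypothetical, and treating `{stage}` as the part of the named graph after `matwerk/` is an assumption based on the graph URI pattern `https://nfdi.fiz-karlsruhe.de/matwerk/{stage}` given on this page.

```python
from dataclasses import dataclass

# Base taken from the named-graph URI pattern documented on this page.
GRAPH_BASE = "https://nfdi.fiz-karlsruhe.de/"


@dataclass(frozen=True)
class PublishSource:
    named_graph: str   # e.g. "matwerk/zenodo_validated"
    run_variable: str  # Airflow Variable pointing at the last successful run
    ttl_file: str      # TTL file name produced by that run

    @property
    def stage(self) -> str:
        # Assumption: the stage is the segment after "matwerk/".
        return self.named_graph.split("/", 1)[1]

    @property
    def graph_uri(self) -> str:
        return GRAPH_BASE + self.named_graph


SOURCES = [
    PublishSource("matwerk/spreadsheets_assertions",
                  "matwerk_last_successful_merge_run",
                  "spreadsheets_asserted.ttl"),
    PublishSource("matwerk/spreadsheets_inferences",
                  "matwerk_last_successful_reason_run",
                  "spreadsheets_inferences.ttl"),
    PublishSource("matwerk/spreadsheets_validated",
                  "matwerk_last_successful_validated_run",
                  "spreadsheets_merged_for_validation.ttl"),
    PublishSource("matwerk/zenodo_validated",
                  "matwerk_last_successful_harvester_zenodo_run",
                  "zenodo_merged_for_validation.ttl"),
    PublishSource("matwerk/endpoints_validated",
                  "matwerk_last_successful_harvester_endpoints_run",
                  "endpoints_merged_for_validation.ttl"),
]
```

How the run Variable and TTL file name combine into a path on the shared filesystem is left out on purpose, since that layout is not documented here.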

| Output | Description / Location |
|---|---|
| `publish_report.json` | Detailed status report for each graph |
| `{stage}__metadata.ttl` | Per-graph metadata TTL |
| Named graphs in Virtuoso | `https://nfdi.fiz-karlsruhe.de/matwerk/{stage}` |

Task flow: `init_publish_dir` -> `publish_all`

Downstream triggers: none. After this DAG succeeds, optionally trigger `dump_and_archive` to create a Zenodo release.