How We Replaced Fragmented Spreadsheets With an Automated Data Pipeline Across Six Cannabis Facilities and Cut Compliance Reporting From 3 Days to 2 Hours

Cannabis is one of the most data-intensive regulated industries in the world. Every plant has to be tracked from seed to sale. Every transfer between facilities has to be logged. Every gram of inventory has to reconcile against state or provincial reporting systems. And if the numbers don’t match, the consequence isn’t just a fine. It can be licence revocation.

Most cannabis operators know this. What they don’t have is the data infrastructure to handle it at scale.

One of our clients — a multi-state cannabis operator running six cultivation, processing, and dispensary facilities — was managing all of this with spreadsheets, manual exports, and a compliance team that spent three full days every reporting cycle pulling data from disconnected systems, reconciling it by hand, and formatting reports for regulators.

They weren’t doing anything wrong. The spreadsheets were accurate — most of the time. But the process was fragile, slow, and entirely dependent on two people who knew where everything lived. When the business grew from three facilities to six, the process didn’t scale. The compliance team went from “stretched” to “one mistake away from a regulatory issue.”

They came to Exillar not for a dashboard or a new tool, but for the data layer underneath — an automated pipeline that would connect their seed-to-sale tracking, inventory management, and compliance reporting into a single, reliable system.

This is what we built, how it works, and what changed.

The Problem: Six Facilities, Seven Data Sources, Zero Automation

The client’s data problem wasn’t complexity — it was fragmentation. Every facility had its own systems, its own exports, and its own way of tracking things.

1. Seed-to-sale tracking system (Metrc)

The state-mandated tracking platform. Every plant, every transfer, every sale has to be logged here. But Metrc is a compliance tool, not an analytics platform. Getting data out of it for reporting or reconciliation required manual CSV exports.

2. Point-of-sale systems across four dispensaries

Each dispensary ran its own POS. Sales data, inventory movements, and customer transaction records lived in four separate databases with four different schemas.

3. Cultivation management software

Two cultivation facilities used different grow management platforms. Plant health data, harvest yields, and batch tracking were siloed in each.

4. Inventory management spreadsheets

Processing and packaging inventory was tracked in Excel. Updated manually. Version control was “whoever saved last wins.”

5. Accounting system

Financial data — COGS, revenue by facility, tax obligations — lived in QuickBooks. Reconciling financial data against operational data required manual cross-referencing.

6. Lab testing results

Third-party lab results for potency, terpenes, and contaminants came in as PDFs and were manually entered into spreadsheets for batch tracking.

7. State reporting templates

Each state had its own reporting format, its own data requirements, and its own submission schedule. The compliance team rebuilt reports from scratch for each jurisdiction every cycle.

The result: a compliance team of three spending three full days per reporting cycle — roughly every two weeks — manually pulling, cleaning, reconciling, and formatting data from seven sources across six facilities. The process worked. Until it didn’t scale.

Why Cannabis Data Infrastructure Is Different From Every Other Industry

Cannabis operators face data challenges that don’t exist in most other regulated industries. Understanding these constraints shaped every architectural decision in the pipeline.

Seed-to-sale traceability is legally mandatory

Unlike most supply chains where traceability is a best practice, cannabis traceability is a legal requirement. Every plant must be tracked from the moment it’s planted to the moment it’s sold to a customer. Gaps in the chain aren’t operational problems — they’re compliance violations.

Multi-state operators face different regulations in every market

A six-facility operator across three states has to comply with three different regulatory frameworks, three different reporting formats, and three different data submission requirements. There’s no federal standard. Every state is different.

Inventory discrepancies trigger audits

In most industries, a 2% inventory variance is a rounding error. In cannabis, any discrepancy between physical inventory and what’s reported in the seed-to-sale system can trigger a regulatory audit. The tolerance for error is effectively zero.

Data lives in state-mandated systems the operator doesn’t control

Metrc, BioTrack, and other seed-to-sale platforms are mandated by the state. Operators have to use them, but they don’t control the data model, the export format, or the API capabilities. Building a pipeline on top of these systems means working within constraints you can’t change.

Financial data and operational data have to reconcile perfectly

Tax authorities in legal cannabis markets require that financial reporting aligns with seed-to-sale tracking data. Revenue reported to the state tax authority has to match the sales reported to the cannabis regulatory body. If they don’t — both agencies come asking questions.

These constraints mean cannabis data infrastructure can’t be built with generic pipeline tools and default configurations. Every decision — ingestion frequency, validation rules, reconciliation logic, error handling — has to account for the regulatory reality.

What We Built: Architecture of the Supply Chain Data Pipeline

The pipeline has four layers, each designed to solve a specific part of the data fragmentation problem.

Layer 1 — Data Ingestion Layer

Automated connectors to all seven source systems: Metrc API integration, POS system database connectors (four dispensaries, two different POS platforms), cultivation management software API connections, automated ingestion of Excel-based inventory files, QuickBooks API for financial data, PDF parsing for lab test results (OCR + structured extraction), and state reporting template reverse-engineering for automated formatting. Data is pulled on a schedule matched to each source’s update frequency.
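
The per-source schedule can be pictured as a small lookup table plus a due check. What follows is a minimal sketch, not the production scheduler. The intervals for Metrc, POS, and financial data are the ones quoted in the FAQ at the end of this article; the source keys and the `is_due` and `due_sources` helpers are hypothetical names chosen for illustration, and the POS "operating hours" restriction is left out for brevity.

```python
from datetime import datetime, timedelta

# Sync intervals per source. The values mirror the frequencies quoted in the
# FAQ; the dictionary keys are illustrative names, not real connector IDs.
SYNC_INTERVALS = {
    "metrc": timedelta(hours=2),
    "pos": timedelta(minutes=30),
    "quickbooks": timedelta(days=1),
}


def is_due(source: str, last_sync: datetime, now: datetime) -> bool:
    """Return True when the source's interval has elapsed since its last sync."""
    return now - last_sync >= SYNC_INTERVALS[source]


def due_sources(last_syncs: dict[str, datetime], now: datetime) -> list[str]:
    """List every source whose next pull is due, in alphabetical order."""
    return sorted(s for s, ts in last_syncs.items() if is_due(s, ts, now))
```

Keeping the intervals in one table means a frequency change is a configuration edit rather than a change to connector code.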

Layer 2 — Transformation and Normalisation Layer

Raw data from seven sources arrives in seven different formats. This layer normalises everything into a unified data model: plant lifecycle records from Metrc and cultivation software are merged into a single timeline per plant, inventory records are reconciled into a single inventory view per facility, financial records are mapped to operational data by facility and batch, and lab results are linked to specific batches and harvests. Every transformation is documented and testable. No black boxes.
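
The "single timeline per plant" idea can be sketched in a few lines. This is an illustration under assumptions, not the client's transformation code: the `PlantEvent` fields, the event type labels, and the `build_timelines` function are hypothetical, and real feeds would first need their own schemas mapped into this shape.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlantEvent:
    plant_id: str
    occurred_at: datetime
    event_type: str   # illustrative labels such as "planted" or "harvested"
    source: str       # which system reported the event


def build_timelines(*event_feeds: list[PlantEvent]) -> dict[str, list[PlantEvent]]:
    """Merge events from any number of feeds into one ordered timeline per plant."""
    timelines: dict[str, list[PlantEvent]] = defaultdict(list)
    for feed in event_feeds:
        for event in feed:
            timelines[event.plant_id].append(event)
    for events in timelines.values():
        # Sort by time; ties are broken by source name so output is deterministic.
        events.sort(key=lambda e: (e.occurred_at, e.source))
    return dict(timelines)
```

Because the function is pure (events in, timelines out), it can be covered by ordinary unit tests, which is one way to keep a transformation "documented and testable".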

Layer 3 — Data Warehouse

A single, structured source of truth storing complete plant lifecycle data (seed to sale) for every plant across all facilities, real-time inventory by facility and batch, financial data reconciled against operational data, lab testing results linked to batches, and historical compliance reports for audit trail.

Layer 4 — Reporting and Compliance Layer

Automated report generation for each state’s regulatory requirements. The system knows each state’s reporting format, data requirements, and submission schedule. Reports are generated automatically, validated against source data, and staged for the compliance team’s review before submission. The compliance team reviews and submits. The system does the assembly.
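
One way to picture per-state configuration on top of a unified model is a column mapping per state plus a check of the staged report against its source records. The sketch below assumes exactly that; `StateReportConfig`, the field names, and the two checks in `validate_report` are hypothetical and far simpler than a real regulatory template.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StateReportConfig:
    state: str                    # placeholder label, not a real jurisdiction
    column_map: dict[str, str]    # state's column header -> unified field name
    weight_header: str            # header of the column that carries weight


def build_report(records: list[dict], config: StateReportConfig) -> list[dict]:
    """Project unified records onto one state's column layout."""
    return [
        {header: record[field] for header, field in config.column_map.items()}
        for record in records
    ]


def validate_report(records, report, config, weight_field="weight_g") -> list[str]:
    """Check the staged report against the source records it was built from."""
    problems = []
    if len(report) != len(records):
        problems.append("row count mismatch")
    source_total = sum(r[weight_field] for r in records)
    report_total = sum(row[config.weight_header] for row in report)
    if not math.isclose(source_total, report_total, abs_tol=1e-9):
        problems.append("total weight mismatch")
    return problems
```

An empty list from `validate_report` would mean the report can be staged for human review; anything else would hold it back.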

Seed-to-Sale Tracking: How the Pipeline Handles Plant Lifecycle Data

Seed-to-sale tracking is the backbone of cannabis compliance. The pipeline maintains a complete lifecycle record for every plant across all facilities.

What gets tracked per plant:

Planting date, strain, facility, and grow room
Every transfer between facilities (cultivation to processing, processing to dispensary)
Harvest date, wet weight, dry weight, and trim weight
Processing records (extraction, infusion, packaging)
Lab testing results (potency, terpenes, contaminants, pass/fail)
Final product creation (which plants/batches went into which products)
Sale records (which products sold, when, where, to whom)
Waste and destruction records (unsold or failed product)

How the pipeline maintains accuracy:

Every Metrc record is cross-referenced against the cultivation management software and POS data. If a transfer is logged in Metrc but doesn’t appear in the receiving facility’s POS system within 24 hours, the system flags it for investigation. If a batch’s lab results show a potency that’s statistically outside the range for that strain, the system flags it — not as an error, but as a data point that warrants verification. These validation rules were designed with the compliance team and reflect the specific checks that regulators perform during audits.
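
The two checks described above can be sketched as follows. The 24-hour window comes from the rule as stated; everything else is an assumption made for illustration, including the record shapes, the function names, and the choice of a z-score with a threshold of 3 to stand in for "statistically outside the range".

```python
from datetime import datetime, timedelta
from statistics import mean, stdev

RECEIPT_WINDOW = timedelta(hours=24)  # from the rule described above


def unreceived_transfers(transfers, received_ids, now):
    """Return IDs of transfers older than the window with no matching receipt."""
    return [
        t["transfer_id"]
        for t in transfers
        if t["transfer_id"] not in received_ids
        and now - t["logged_at"] > RECEIPT_WINDOW
    ]


def potency_needs_review(potency, strain_history, z_threshold=3.0):
    """Flag a potency reading far from the strain's historical mean.

    The z-score test and the threshold of 3 are assumptions; the article only
    says the value is 'statistically outside the range' for the strain.
    """
    if len(strain_history) < 2:
        return False  # not enough history to estimate spread
    spread = stdev(strain_history)
    if spread == 0:
        return potency != strain_history[0]
    return abs(potency - mean(strain_history)) / spread > z_threshold
```

Both functions return something to verify rather than raising an error, which matches the intent that an unusual potency is a prompt for a second look, not a rejection.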

Inventory Reconciliation: Automated, Not Manual

Inventory reconciliation was the single most time-consuming task the compliance team performed manually. The pipeline automates it.

How reconciliation works:

Every 4 hours, the system compares three views of inventory:

Any discrepancy between these three views is flagged immediately with the specific product and facility, the size of the discrepancy (in grams), the likely source, and a recommended action.

Timing lag handling:

Most inventory discrepancies in cannabis aren’t theft or loss; they’re timing lags, where a transfer logged in one system hasn’t synced to another yet. The pipeline distinguishes between “expected timing lag” discrepancies (which resolve within 24 hours) and “persistent discrepancies” (which need investigation). Before the pipeline, the compliance team spent an entire day per cycle reconciling inventory across six facilities by hand. Now the system does it automatically every 4 hours, and the team only intervenes when a persistent discrepancy is flagged.
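
The comparison and the lag classification can be sketched together. This is a simplified illustration, not the reconciliation engine itself: the `Discrepancy` record, the function names, and the zero default tolerance are assumptions, and the inventory views are passed in as a generic name-to-grams mapping so the sketch does not presume which systems they are.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

LAG_WINDOW = timedelta(hours=24)  # lags are expected to resolve within this window


@dataclass(frozen=True)
class Discrepancy:
    facility: str
    product: str
    grams: float          # spread between the highest and lowest view
    first_seen: datetime


def find_discrepancy(facility, product, views, first_seen, tolerance_g=0.0):
    """Compare the quantity (in grams) reported by each inventory view."""
    spread = max(views.values()) - min(views.values())
    if spread <= tolerance_g:
        return None
    return Discrepancy(facility, product, round(spread, 3), first_seen)


def classify(discrepancy, now):
    """Label a discrepancy by how long it has stayed open."""
    age = now - discrepancy.first_seen
    return "expected_timing_lag" if age <= LAG_WINDOW else "persistent"
```

Tracking `first_seen` across runs is what lets a discrepancy graduate from an expected lag to a persistent one without anyone re-checking it by hand.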

Compliance Reporting: From 3 Days to 2 Hours

The reporting layer is where the pipeline’s value is most visible to the compliance team.

Before: 3 days of manual work per reporting cycle

Day 1: Export data from Metrc, POS systems, cultivation software, and Excel files. Manually clean and format.
Day 2: Cross-reference data sources. Investigate and resolve discrepancies. Rebuild reports in each state's required format.
Day 3: Quality check reports against source data. Format for submission. Submit.

After: 2 hours of review per reporting cycle

What the reporting layer handles per state:

Requirement	How It's Handled
Seed-to-sale tracking reports	Auto-generated from unified plant lifecycle data
Inventory reconciliation reports	Auto-generated from continuous reconciliation engine
Sales and tax reports	Auto-generated from POS data reconciled against financial records
Waste and destruction reports	Auto-generated from Metrc waste records cross-referenced against processing data
Transfer manifests	Auto-generated from Metrc transfer data with facility-level validation

Data Quality and Validation: Catching Errors Before Regulators Do

In cannabis, data quality isn’t a nice-to-have — it’s a compliance requirement. The pipeline includes 47 automated validation rules designed to catch errors before they reach a regulator’s desk.

Categories of validation:

Completeness checks

Timeliness checks

Anomaly detection

Every validation failure generates an alert to the appropriate team — operations for operational issues, compliance for regulatory issues, finance for financial discrepancies. The alerts include the specific data points, the rule that was violated, and a recommended action.
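
Alert routing of this kind can be sketched as a lookup from failure category to team. The three teams are the ones named above; the category labels, the `Alert` fields, and the fallback to compliance for unknown categories are assumptions made for the example.

```python
from dataclasses import dataclass

# Routing table based on the three audiences named above. The category labels
# are illustrative; the real rule set and its categories are not shown here.
ROUTES = {
    "operational": "operations",
    "regulatory": "compliance",
    "financial": "finance",
}


@dataclass(frozen=True)
class Alert:
    team: str
    rule: str
    data_points: dict
    recommended_action: str


def route_failure(rule, category, data_points, recommended_action):
    """Build an alert addressed to the team responsible for this category."""
    # Assumption: anything uncategorised goes to compliance as the safe default.
    team = ROUTES.get(category, "compliance")
    return Alert(team, rule, data_points, recommended_action)
```

Carrying the rule name, the data points, and the recommended action on the alert itself means the recipient does not have to query the warehouse to understand what fired.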

The Build Process: Discovery to Production in 5 Weeks

Week	Phase	What Happened
Week 1	Discovery	Mapped all 7 source systems across 6 facilities. Documented every data flow, export format, and manual process. Defined validation rules with the compliance team.
Week 2	Ingestion Layer	Built connectors to Metrc, POS systems, cultivation software, and QuickBooks. Set up automated file ingestion for Excel inventory files and PDF lab results.
Week 3	Transformation & Warehouse	Built the unified data model. Created transformation logic to normalise data from all sources. Deployed the warehouse with facility-level and batch-level views.
Week 4	Reconciliation & Validation	Built the automated inventory reconciliation engine. Implemented 47 validation rules. Configured alerting for discrepancies and violations.
Week 5	Reporting, Testing & Handover	Built automated report generation for each state. End-to-end testing against two reporting cycles of historical data. Documentation of every component. Training for compliance and operations teams.

Five weeks. Seven source systems. Six facilities. Three states. Discovery to production.

Results: What Changed After Go-Live

Compliance reporting cut from 3 days to 2 hours

The compliance team reviews and submits. The pipeline does the assembly, reconciliation, and formatting. What used to consume three full days every two weeks now takes a focused two-hour review session.

Zero compliance discrepancies in the first four reporting cycles

Every report submitted in the first two months passed regulatory review without a single query or correction request. Previously, the team averaged 2-3 minor corrections per cycle.

Inventory discrepancies identified 6x faster

Persistent inventory variances that used to be discovered during the biweekly manual reconciliation are now flagged within 24 hours. Two significant discrepancies — both data entry errors at a processing facility — were caught and corrected within a day of occurring.

Compliance team redeployed from data assembly to strategic work

Three days of manual data work every two weeks meant the compliance team was spending roughly 30% of their time on data assembly. That time is now spent on regulatory strategy, licence applications, and proactive audit preparation.

New facility onboarding dropped from 2-3 weeks to 3 days

When the client opened their seventh facility, connecting it to the pipeline took 3 days. Previously, integrating a new facility’s data into the manual reporting process took 2-3 weeks of building new spreadsheets and training.

Full audit trail from seed to sale

Every data point — from planting to sale — is traceable through the pipeline. When a regulator asks “show me the chain of custody for this batch,” the compliance team can produce the complete record in minutes, not hours.
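
A chain-of-custody lookup can be thought of as walking a lineage graph upstream from the batch or product in question and collecting every event along the way. The sketch below assumes a simple `parents` mapping (entity to the entities it was made from) and event dictionaries with ISO-formatted dates; both shapes, and the function name, are hypothetical.

```python
def chain_of_custody(entity_id, parents, events):
    """Collect events for an entity and everything upstream of it, oldest first."""
    seen, stack = set(), [entity_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against revisiting shared ancestors
        seen.add(current)
        stack.extend(parents.get(current, []))
    # ISO-formatted date strings sort chronologically as plain strings.
    return sorted(
        (e for e in events if e["entity_id"] in seen),
        key=lambda e: e["at"],
    )
```

With lineage stored this way, answering a regulator's question becomes a single traversal rather than a search across exports.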

What This Pipeline Replaced (And What It Didn't)

What it replaced:

Manual CSV exports from Metrc and POS systems
Excel-based inventory tracking at processing facilities
Manual data reconciliation across systems
Hand-built compliance reports reformatted for each state
The “two people who know where everything lives” dependency

What it didn’t replace:

Metrc and state-mandated tracking systems — the pipeline reads from them, doesn't replace them
The compliance team — they still review, verify, and submit every report
Physical inventory counts — the pipeline reconciles against them, but someone still has to count
Decision-making — the pipeline surfaces data; humans decide what to do with it

The pipeline is infrastructure, not a product. The compliance team uses the same regulatory portals they always used. What changed is the data layer underneath — the manual, fragile, person-dependent process of getting data from seven systems into a usable, trustworthy format.

When an Automated Data Pipeline Makes Sense for Cannabis Operators

You operate 3+ facilities

With one or two facilities, a well-organised spreadsheet process can work. From three upward, the fragmentation becomes structural and the manual reconciliation time grows nonlinearly.

Your compliance team spends more than 2 days per reporting cycle on data assembly

If the time is going to pulling, cleaning, and formatting — not analysis and review — the process is the bottleneck, not the team.

You’ve had a compliance discrepancy in the last 12 months

Even a minor one. If the process is fragile enough to produce errors, it’ll produce bigger ones as you scale.

You’re expanding to new states or new facilities

Every new facility and every new state multiplies the complexity. If onboarding a new facility takes weeks instead of days, the architecture can’t scale with the business.

Inventory reconciliation depends on specific people

If two people leaving would create a compliance crisis, the knowledge isn’t in a system — it’s in their heads. That’s a risk, not a process.

Frequently Asked Questions

Does this replace our seed-to-sale tracking system (Metrc, BioTrack, etc.)?

No. The pipeline integrates with your state-mandated tracking system: it reads data from it and validates against it. You continue using Metrc, or whichever system your state requires, exactly as before.

How do you handle different regulations across states?

Each state’s reporting requirements are configured separately in the reporting layer. The pipeline maintains a unified data model underneath, but the reports it generates are formatted to each state’s specific requirements. When you expand to a new state, we add the state configuration — typically 2-3 days of work.

What happens if our POS system or cultivation software changes?

The ingestion layer is built with modular connectors. Replacing a POS system means building a new connector to the new system — typically 1-2 days — not rebuilding the pipeline. The transformation, warehouse, and reporting layers don’t change.
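
The modular-connector idea can be illustrated with a small interface that every source implements. This is a sketch under assumptions: the `Connector` protocol, the `InMemoryConnector` stand-in, and the `ingest` function are hypothetical names, and a real connector would wrap a database or API client instead of a list.

```python
from typing import Iterable, Protocol


class Connector(Protocol):
    """Minimal contract every source connector is assumed to satisfy."""

    name: str

    def fetch(self) -> Iterable[dict]:
        ...


class InMemoryConnector:
    """Stand-in connector used here in place of a real POS or API client."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = list(rows)

    def fetch(self):
        return iter(self._rows)


def ingest(connectors: list[Connector]) -> list[dict]:
    """Pull from every connector and tag each row with its source."""
    return [{**row, "source": c.name} for c in connectors for row in c.fetch()]
```

Because `ingest` depends only on the interface, swapping one POS connector for another changes the list passed in and nothing downstream.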

Is our data secure?

The pipeline runs within your cloud environment. No operational data leaves your infrastructure. Access is role-based — the compliance team sees compliance data, operations sees operational data, and financial data is restricted to finance and leadership.

How often does the pipeline sync data?

It depends on the source. Metrc syncs every 2 hours. POS systems sync every 30 minutes during operating hours. Lab results are ingested as they arrive. Financial data syncs daily. These frequencies are configurable based on your needs.

Can the pipeline handle additional data sources we add later?

Yes. The ingestion layer is designed to be extensible. Adding a new data source typically takes 1-3 days depending on the source’s API capabilities.

How much does this cost?

It depends on the number of facilities, source systems, and states. A 3-facility pipeline with standard reporting typically falls between £25k and £45k. A 6+ facility multi-state build typically falls between £40k and £70k. We’ll give you a clear number after the discovery call, with no surprise invoices.

What if we only have 2-3 facilities?

The pipeline still makes sense if your compliance reporting is consuming more than 2 days per cycle or if you’re planning to expand. For operators with 1-2 facilities and straightforward reporting, a well-organised manual process may still be sufficient — and we’ll tell you that honestly.