
Building the digital backbone

Moving to a Databricks-centred data and analytics platform, joined by middleware, observed with Grafana and made real through portfolio rationalisation. A practical view on turning a tangle of systems into one backbone the whole business can build on.


Executive summary

Most large organisations do not have a data problem. They have a wiring problem. Decades of point solutions, bespoke interfaces and departmental reporting have left a data estate that is expensive to run, slow to change and impossible to trust end to end. Every new question means another integration, and every answer arrives a little too late.

The way out is not another tool. It is a backbone: a single, governed foundation that source systems plug into once, that analytics and AI build on safely, and that the business can see into clearly. Our view is that this backbone is best anchored on a lakehouse, connected by modern middleware, served through a governed data and analytics layer, and observed end to end with Grafana, while the legacy portfolio is deliberately rationalised rather than left to sprawl.

Two threads run through this paper. The first is the SAP to lakehouse path, because for most large organisations the core that runs the business lives in SAP, and freeing that data is the single highest-value move. The second is agentic AI, because once data is governed and live, the prize is not just better dashboards but software that can reason and act on the business safely. The backbone is what makes both possible.

Why a backbone, not another platform

Buying another platform without fixing the wiring just adds a node to the tangle. A backbone is different in intent: it is the small set of shared services everything else depends on, designed so that adding a system makes the whole estate stronger, not more fragile.

  • One governed copy of trusted data, not a copy per report
  • Systems integrated once, through a stable contract, not point to point
  • Analytics and AI that build on the same foundation, not their own silos
  • A single view of health, cost and performance across the estate

The target state

The target is a layered architecture with clear responsibilities. Source systems feed an integration layer. The integration layer lands data into the lakehouse. The lakehouse serves a governed data and analytics layer. Consumers, from dashboards to applications to AI agents, draw from that layer. Observability and governance run across all of it.

The integration middleware

The middleware is what stops the backbone becoming another web of brittle connections. An event- and API-led layer decouples the systems that produce data from the systems that consume it, so either side can change without breaking the other.

  • API-led access to core systems such as SAP, with stable contracts
  • Event streaming for near-real-time movement where it earns its keep
  • One place to apply security, throttling and audit, consistently
  • Reuse of integrations across teams, instead of rebuilding each time
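To make the decoupling concrete, here is a minimal in-memory sketch of the idea. It is not a real broker or any specific middleware product: the `EventBus` class, the `OrderCreated` contract and its fields are all hypothetical names chosen for illustration. The point it tries to show is that the producer publishes against a stable contract and never needs to know who consumes it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class OrderCreated:
    """Hypothetical event contract; the fields are illustrative only."""
    order_id: str
    amount: float
    schema_version: int = 1


class EventBus:
    """In-memory stand-in for the middleware layer (not a real broker)."""

    def __init__(self) -> None:
        self._subscribers: Dict[type, List[Callable]] = {}
        self.audit_log: List[str] = []

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event: object) -> int:
        # One place to record an audit entry; a real layer would also
        # apply security and throttling here.
        self.audit_log.append(f"published {type(event).__name__}")
        handlers = self._subscribers.get(type(event), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
finance_totals: List[float] = []
supply_chain_orders: List[str] = []

# Two consumers reuse the same integration; the producer knows neither.
bus.subscribe(OrderCreated, lambda e: finance_totals.append(e.amount))
bus.subscribe(OrderCreated, lambda e: supply_chain_orders.append(e.order_id))

delivered = bus.publish(OrderCreated(order_id="SO-1001", amount=250.0))
```

Adding a third consumer in this sketch is one more `subscribe` call, with no change to the producer, which is the property the middleware is meant to protect.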

The lakehouse at the core

At the centre sits a Databricks lakehouse: one foundation for both analytics and AI, rather than a warehouse for reporting and a separate lake for data science. A medallion structure keeps it disciplined. Raw data lands in a bronze layer, is cleaned and conformed into silver, and is shaped into business-ready gold products that the rest of the organisation consumes.

"The test of a backbone is simple: can a new team answer a new question without building new plumbing? If yes, the foundation is doing its job."
WAJD Group

Because storage is open and compute is elastic, the same governed data serves a finance dashboard, a supply chain model and an AI agent without three copies and three sets of rules.
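The bronze, silver and gold flow can be sketched in a few lines of plain Python. This is an illustration of the layering only, assuming made-up order records and cleaning rules. On a real lakehouse the same steps would typically run as governed tables and pipelines, not in-memory lists.

```python
from typing import Dict, List

# Bronze: raw records as landed, including a duplicate and a malformed row.
bronze: List[Dict[str, str]] = [
    {"order_id": "1", "region": " emea ", "amount": "100.50"},
    {"order_id": "1", "region": " emea ", "amount": "100.50"},
    {"order_id": "2", "region": "APAC", "amount": "not-a-number"},
    {"order_id": "3", "region": "apac", "amount": "40.00"},
]


def to_silver(rows: List[Dict[str, str]]) -> List[dict]:
    """Clean and conform: cast types, normalise text, drop bad rows and duplicates."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # in practice a rejected row would be quarantined, not dropped
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append({
            "order_id": row["order_id"],
            "region": row["region"].strip().upper(),
            "amount": amount,
        })
    return silver


def to_gold(silver_rows: List[dict]) -> Dict[str, float]:
    """Business-ready product: revenue by region."""
    totals: Dict[str, float] = {}
    for row in silver_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer only reads from the one before it, so a fix to a cleaning rule in silver flows into every gold product without touching the raw landing.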

The SAP to lakehouse path

For most large organisations the systems that run the business live in SAP, and that is exactly where the data is hardest to reach. The instinct to build report after report inside the ERP only deepens the lock-in. The higher-value move is to make SAP a first-class, governed source on the backbone, while leaving SAP to do what it does best: run the transactions.

We treat the SAP estate as a source, not a destination. Business data is extracted in a controlled, supported way and landed into the lakehouse, where it can be combined with everything else the organisation knows.

  • Extract through supported, change-aware methods, not brittle table scraping
  • Preserve business meaning, so an SAP object stays understandable in the lakehouse
  • Combine SAP with non-SAP data to answer the questions ERP reporting never could
  • Keep SAP clean and current, with analytics and AI load moved off the core

The result is the best of both worlds: a stable, well-run SAP core, and a lakehouse where its data is finally free to drive analytics, planning and AI alongside every other source.
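As a rough illustration of what "change-aware" means in practice, the sketch below uses a simple high-water mark. It is a generic pattern, not SAP's own extraction interface: the `changed_at` field, the document numbers and the function names are assumptions standing in for whatever change indicator the supported extraction method actually exposes.

```python
from typing import Dict, List, Tuple

# Hypothetical source rows; "changed_at" is an assumed change indicator.
source_rows = [
    {"doc_id": "4500001", "status": "OPEN", "changed_at": 10},
    {"doc_id": "4500002", "status": "OPEN", "changed_at": 12},
    {"doc_id": "4500001", "status": "CLOSED", "changed_at": 15},
]


def extract_changes(rows: List[dict], watermark: int) -> Tuple[List[dict], int]:
    """Return rows changed after the watermark, plus the new watermark."""
    changed = [r for r in rows if r["changed_at"] > watermark]
    new_watermark = max((r["changed_at"] for r in changed), default=watermark)
    return changed, new_watermark


def apply_changes(target: Dict[str, dict], changes: List[dict]) -> None:
    """Upsert by business key, applying changes in order so the latest wins."""
    for row in sorted(changes, key=lambda r: r["changed_at"]):
        target[row["doc_id"]] = row


lakehouse_table: Dict[str, dict] = {}
first_batch, watermark = extract_changes(source_rows, watermark=0)
apply_changes(lakehouse_table, first_batch)

# A second run with no new changes extracts nothing and moves nothing.
second_batch, watermark_after = extract_changes(source_rows, watermark)
```

Only changed records travel on each run, which is what keeps analytical load off the transactional core.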

Data, analytics and AI

On top of the lakehouse sits the layer the business actually touches. A semantic layer gives consistent definitions, so a metric means the same thing everywhere. Self-service analytics lets teams explore safely within guardrails. And because the data is already governed and AI-ready, machine learning and agentic AI become a natural extension of the platform rather than a separate science project.

  • Consistent, governed metrics and a shared semantic layer
  • Self-service analytics inside clear guardrails
  • Machine learning and agentic AI on the same governed foundation
  • A clear path from a dashboard insight to an automated action
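One way to picture a shared semantic layer is as a registry where each metric is defined once and every consumer asks for it by name. The sketch below assumes two invented metrics, `net_revenue` and `return_rate`, over a hypothetical gold sales table. Real semantic layers are richer, but the principle is the same.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

# Hypothetical metric definitions, registered once and reused by every consumer.
METRICS: Dict[str, Callable[[List[Row]], float]] = {
    "net_revenue": lambda rows: sum(r["gross"] - r["returns"] for r in rows),
    "return_rate": lambda rows: (
        sum(r["returns"] for r in rows) / sum(r["gross"] for r in rows)
    ),
}


def compute(metric: str, rows: List[Row]) -> float:
    """Look the metric up by name; unknown names fail loudly."""
    if metric not in METRICS:
        raise KeyError(f"'{metric}' is not a governed metric")
    return METRICS[metric](rows)


gold_sales = [
    {"gross": 200.0, "returns": 20.0},
    {"gross": 300.0, "returns": 30.0},
]

# A dashboard and an AI agent ask for the same metric and get the same answer.
dashboard_value = compute("net_revenue", gold_sales)
agent_value = compute("net_revenue", gold_sales)
```

Because nobody re-implements the formula locally, a change to a definition is made in one place and shows up everywhere at once.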

Agentic AI on the backbone

This is where the backbone stops being a cost centre and becomes an advantage. Once data is governed, live and trustworthy, the prize is not just faster reporting. It is agentic AI: software that can perceive the state of the business, reason about it, and take action, safely and under control.

Agents are only as good, and as safe, as the foundation they stand on. An agent acting on stale, ungoverned, scattered data is a liability. An agent on the backbone draws from one governed source, with lineage behind every fact and policy around every action.

  • Grounded in governed data, so agents reason from one version of the truth
  • Guardrails and human-in-the-loop approval for anything high-impact
  • Every action logged, explainable and reversible, never a black box
  • Continuous evaluation, so agents are measured and tuned, not trusted blindly

The backbone is what turns agentic AI from a demo into something you can run in production. It is why we treat data foundations and AI ambition as one programme, not two.
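The guardrail and audit ideas above can be sketched as a thin wrapper around whatever an agent proposes. Everything here is illustrative: the `GuardedExecutor` class, the impact threshold and the action names are assumptions, and a production control would sit on top of real policy, identity and approval tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProposedAction:
    name: str
    impact_value: float  # hypothetical measure of business impact


@dataclass
class GuardedExecutor:
    approval_threshold: float  # assumed policy: at or above this, a human decides
    approver: Callable[[ProposedAction], bool]
    audit_log: List[str] = field(default_factory=list)

    def execute(self, action: ProposedAction) -> str:
        if action.impact_value >= self.approval_threshold:
            approved = self.approver(action)
            self.audit_log.append(f"{action.name}: human decision approved={approved}")
            if not approved:
                return "rejected"
        # Low-impact or approved actions proceed, and the execution is logged.
        self.audit_log.append(f"{action.name}: executed")
        return "executed"


# For the example, the human approver declines everything it is asked about.
executor = GuardedExecutor(approval_threshold=10_000, approver=lambda a: False)

low = executor.execute(ProposedAction("reorder_stock", 500))
high = executor.execute(ProposedAction("change_supplier", 50_000))
```

The low-impact action runs on its own, the high-impact one waits for a person, and both leave a trace in the log.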

Observability with Grafana

A backbone you cannot see is a backbone you cannot trust. Grafana gives one pane of glass across three things that are usually watched separately: the health of the platform, the health of the data pipelines, and the health of the business metrics that ride on top.

  • Platform and pipeline health, with alerting before users feel it
  • Data freshness and quality signals, surfaced not buried
  • Cost and consumption visibility, so spend stays deliberate
  • Business KPIs and technical signals in the same view
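A data freshness signal is one of the simpler things to compute and chart. The sketch below is plain Python, not a Grafana API: it assumes a one-hour freshness SLA and invented status bands, and only shows the kind of value a pipeline might publish for a dashboard or alert rule to pick up.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_loaded: datetime, now: datetime, sla: timedelta) -> dict:
    """Classify a dataset's freshness against an assumed SLA."""
    lag = now - last_loaded
    if lag <= sla:
        status = "ok"
    elif lag <= 2 * sla:
        status = "warning"
    else:
        status = "critical"
    return {"lag_minutes": lag.total_seconds() / 60, "status": status}


# Fixed reference time so the example is repeatable.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sla = timedelta(hours=1)

fresh = freshness_status(now - timedelta(minutes=30), now, sla)
late = freshness_status(now - timedelta(minutes=90), now, sla)
stale = freshness_status(now - timedelta(hours=5), now, sla)
```

Alerting on the "warning" band is what gives the team time to act before users notice stale numbers.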

Rationalising the portfolio

A backbone only pays back if the old wiring comes out. Standing up new capability while leaving every legacy system in place doubles the running cost and the risk. Portfolio rationalisation is the discipline of deciding, system by system, what to keep, what to consolidate and what to retire as the backbone takes the load.

  • A clear inventory of applications, data flows and their real cost
  • A strangler approach, redirecting flows to the backbone gradually
  • Decommissioning on a plan, capturing the saving rather than carrying it
  • Fewer integrations, fewer licences, fewer things to secure
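The strangler approach can be pictured as a routing table that moves one flow at a time and tells you when the legacy system has nothing left to serve. The `FlowRouter` class and flow names below are hypothetical, a sketch of the bookkeeping, not of any particular tool.

```python
from typing import Dict, List


class FlowRouter:
    """Routes each named data flow to 'legacy' or 'backbone' (names are illustrative)."""

    def __init__(self, flows: List[str]) -> None:
        # Every flow starts on the legacy system.
        self.routes: Dict[str, str] = {flow: "legacy" for flow in flows}

    def migrate(self, flow: str) -> None:
        if flow not in self.routes:
            raise KeyError(f"unknown flow: {flow}")
        self.routes[flow] = "backbone"

    def route(self, flow: str) -> str:
        return self.routes[flow]

    def legacy_retirable(self) -> bool:
        """The legacy system can be decommissioned once no flow depends on it."""
        return all(target == "backbone" for target in self.routes.values())


router = FlowRouter(["finance_reporting", "inventory_feed"])

router.migrate("finance_reporting")
partially_migrated = router.legacy_retirable()

router.migrate("inventory_feed")
fully_migrated = router.legacy_retirable()
```

Tying decommissioning to a check like `legacy_retirable` keeps the saving on the plan, so it is not left as an afterthought.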

Governance, security and trust

Governance is not a layer you add at the end. It is built into the backbone from the first day: unified access control, end-to-end lineage, and clear ownership of every data product. The same governance that satisfies the auditor is what makes self-service and AI safe to offer widely.

How to get there

This is a journey delivered in increments, never a big bang. We sequence it so value arrives early and risk stays low.

  • Assess: map the estate, the costs and the highest-value domains
  • Establish: stand up the backbone itself: middleware, lakehouse and governance
  • Migrate by domain: move one business domain at a time onto the backbone
  • Decommission: retire the legacy it replaces and capture the saving
  • Run and improve: operate it as a managed service, observed and tuned

Common pitfalls

  • Treating it as a tools purchase rather than an operating model change
  • Lifting and shifting the old mess into a new platform unchanged
  • Leaving governance and observability until after go-live
  • Never decommissioning, so cost goes up instead of down
  • Building for reporting only, with no path to AI

How WAJD Group helps

We do this end to end and we stay for the run. We assess the estate, design the backbone, build the middleware, lakehouse and governance, migrate domain by domain, and then operate it as a managed service with Grafana observability and clear SLAs. You get a backbone that is current, governed and AI ready, with one partner accountable for keeping it that way.

Untangling a sprawling data estate?

Tell us what your wiring looks like today. We will sketch the backbone that would replace it, and the order to get there.
