Reconciling Land and Production Data: Why It's the Hardest Problem in Upstream Data

If you ask ten upstream data people what the hardest problem in their world is, you’ll usually get the same answer in different words.

Land and production don’t agree.

The well shows up in production with one set of volumes. The land system shows it with one set of interests. Accounting books revenue based on a third version of both. By the time the monthly run is done, three teams have spent half a week reconciling differences that nobody actually wants to be reconciling.

We’ve worked on this in some form with almost every operator we’ve ever talked to. The specifics are different. The shape of the problem is the same. And the reason it’s hard isn’t that any single piece is hard. It’s that the pieces were built independently, by different people, for different purposes, and they have to agree at the end of every month no matter what.

This is the problem that swallows the most analyst time in upstream. It’s also the one that gets the least attention from the broader data world, because it’s domain-specific in a way that doesn’t translate to other industries.

Here’s why it’s actually hard, and where to start when you decide to fix it.


Why land and production were never designed to agree

The two systems came from different places.

Production data was built around operations. The original goal was tracking what came out of the ground, where it went, and what equipment touched it along the way. The data shape reflects that. Wells, completions, meters, allocations, runs.

Land data was built around legal and contractual relationships. The original goal was tracking who owns what, who’s entitled to what, and what happens when those things change. The data shape reflects that too. Tracts, leases, units, working interests, royalty interests, overriding interests.

The connection between them lives in the revenue run. Production volumes get allocated to wells. Wells get tied to leases or units. Leases and units have ownership stacks. The ownership stacks determine who gets paid for the volumes. Every link in that chain depends on data from a different system, and the links don’t naturally line up.
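
As a rough illustration of that chain, here is a minimal Python sketch that follows one well’s monthly volume through to its owners. Every identifier and number is invented, and each owner’s share is collapsed to a single decimal, which is a simplifying assumption. Real ownership stacks are more layered than this.

```python
from decimal import Decimal

# Hypothetical sample data; all identifiers and numbers are invented.
well_volumes = {"WELL-A": Decimal("1200")}   # allocated volume for one month (bbl)
well_to_lease = {"WELL-A": "LEASE-7"}        # the link that neither system owns
ownership_stack = {
    "LEASE-7": {
        "Operator Co": Decimal("0.75"),
        "Royalty Owner": Decimal("0.20"),
        "Override Owner": Decimal("0.05"),
    }
}

def owner_volumes(well_id):
    """Follow volume -> well -> lease -> ownership stack for one well."""
    volume = well_volumes[well_id]        # from the production system
    lease_id = well_to_lease[well_id]     # from the cross-reference
    stack = ownership_stack[lease_id]     # from the land system
    return {owner: volume * share for owner, share in stack.items()}

shares = owner_volumes("WELL-A")
for owner, volume in shares.items():
    print(owner, volume)
```

Each of the three lookups inside owner_volumes reads from a different system. A stale value in any one of them changes the result without raising an error.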

Most operators didn’t build that connection deliberately. It accumulated. Somebody wrote a query in 2009 that pulled production from one system, joined it to land in another, and produced the monthly revenue file. That query got modified, patched, re-modified, and is now the load-bearing piece of business logic that nobody is sure how to replace.


Where the disagreements actually come from

When the numbers don’t match, the cause is usually one of a small number of things.

The well identifier doesn’t match. Production refers to the well by API number. Land refers to it by lease and well name. The cross-reference table that ties them together was maintained by hand. Somewhere in there, a well got renamed and only one of the two systems caught it.
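
A check for this kind of drift can be sketched in a few lines of Python. The data is made up, the dictionary layout is hypothetical, and comparing API numbers on their first ten digits is an assumption that may not suit every operator.

```python
import re

def api10(raw):
    """Strip punctuation and keep the first 10 digits of an API number.

    Assumption: matching at the 10-digit level is what both teams want.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) < 10:
        raise ValueError(f"not enough digits for an API number: {raw!r}")
    return digits[:10]

# Hypothetical data: API numbers as production writes them.
production_apis = ["42-123-45678", "4212345679-00-00", "42-123-45680"]

# Hand-maintained cross-reference: API number -> (lease, well name).
xref = {
    "4212345678": ("SMITH LEASE", "SMITH 1H"),
    "4212345679": ("JONES LEASE", "JONES 2"),
}

# Wells as the land system currently names them.
land_wells = {("SMITH LEASE", "SMITH 1H"), ("JONES LEASE", "JONES 2H")}

# Production wells the cross-reference has never heard of.
missing_from_xref = sorted(a for a in map(api10, production_apis) if a not in xref)

# Cross-reference rows pointing at a name land no longer uses.
stale_xref = sorted(api for api, key in xref.items() if key not in land_wells)

print(missing_from_xref)
print(stale_xref)
```

In this made-up data, the second list catches the renamed well: land now calls it JONES 2H, and the cross-reference still says JONES 2.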

The allocation method went stale. The production allocation methodology that was set up years ago no longer reflects how the field actually operates. Volumes are getting split based on assumptions that haven’t been valid since the last recompletion. The land system doesn’t know that, and the revenue run doesn’t either.

The interest changed and only one system was updated. A working interest changed hands mid-quarter. Land has the new owner. Accounting has the old owner. Production doesn’t care either way, but the revenue file has to pick one, and it often picks the wrong one.
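
One way to keep the revenue file from guessing is to date every interest record and look up owners as of a specific date. The sketch below assumes a hypothetical InterestRecord shape with an inclusive start date and an exclusive end date. Which date governs (production month, assignment date, or booking date) is a business decision the code can’t make.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

@dataclass(frozen=True)
class InterestRecord:
    """Hypothetical effective-dated interest row; not any vendor's schema."""
    well_id: str
    owner: str
    decimal_interest: Decimal
    effective_from: date             # inclusive
    effective_to: Optional[date]     # exclusive; None means still in effect

def owners_as_of(records, well_id, as_of):
    """Return {owner: interest} for the records in effect on as_of."""
    return {
        r.owner: r.decimal_interest
        for r in records
        if r.well_id == well_id
        and r.effective_from <= as_of
        and (r.effective_to is None or as_of < r.effective_to)
    }

# Invented example: a 25% interest changes hands on 2024-02-15.
records = [
    InterestRecord("WELL-A", "Operator Co", Decimal("0.75"), date(2020, 1, 1), None),
    InterestRecord("WELL-A", "Old Owner", Decimal("0.25"), date(2020, 1, 1), date(2024, 2, 15)),
    InterestRecord("WELL-A", "New Owner", Decimal("0.25"), date(2024, 2, 15), None),
]

january = owners_as_of(records, "WELL-A", date(2024, 1, 31))
march = owners_as_of(records, "WELL-A", date(2024, 3, 31))
print(sorted(january))
print(sorted(march))
```

With dated records, the question “who owned this in January” has one answer no matter when the query runs.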

The well moved between units. A well that was producing into one unit got reassigned, or the unit boundaries got redrawn after a pooling order. The land system reflects the new arrangement. The production system still allocates based on the old one.

The historical record is ambiguous. A well drilled in 1982 has been recompleted, reassigned, and renamed three times. Whose volumes are those, exactly, when you go back to reconcile a prior period? Different teams have different defensible answers, and none of them are written down.

None of these are bugs in the traditional sense. They’re cases where the business changed and the data systems didn’t change together.


Why this is harder than it sounds

A reasonable person looking at this from the outside would say: just pick a master, make the others align to it, and move on. We’ve seen that approach attempted many times. It rarely works.

The reason is that there isn’t a single right answer to “which system is the master.” Production data is closer to the physical reality of what came out of the ground. Land data is closer to the legal reality of who is entitled to what. Accounting is closer to what actually got booked and paid. Each one is authoritative for some questions and not for others.

Picking one and forcing the others to match it doesn’t make the data correct. It just hides the disagreements somewhere else.

The honest version of the problem is that you need all three systems to agree, and the agreement has to be defensible. That means somebody has to make business decisions about what the rules are, document them, and then enforce those rules consistently. That work is genuinely hard, and it’s the work most operators try to skip.


The shape of a real solution

When this problem gets solved, it tends to follow a similar arc.

Define the canonical questions first. Before any technical work, write down the questions that the reconciliation has to answer. What was the production for well X in month Y? Who owned what percentage of well X in month Y? What revenue was booked for well X in month Y? These three questions sound simple. The act of writing them down forces the disagreements out into the open.

Pick the source of truth for each entity. Not for everything. For each specific entity. The well master probably belongs to operations. The interest stack probably belongs to land. The revenue allocation methodology probably belongs to a joint conversation between accounting and the field. The point isn’t to crown one system king. It’s to say, for each entity, which system makes the call.
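
Written down, that decision can be as small as a lookup table. The Python sketch below is a hypothetical illustration: the entity and system names are placeholders, and the assignments simply mirror the “probably” guesses above. The resolver returns the authoritative value and also reports which systems disagree, so the disagreement stays visible.

```python
# Hypothetical mapping: which system makes the call for each entity.
SOURCE_OF_TRUTH = {
    "well_master": "operations",
    "interest_stack": "land",
}

def resolve(entity, values_by_system):
    """Return the authoritative value plus the systems that disagree with it."""
    owner = SOURCE_OF_TRUTH[entity]
    chosen = values_by_system[owner]
    dissenting = sorted(
        system for system, value in values_by_system.items() if value != chosen
    )
    return chosen, dissenting

# Invented example: land and accounting disagree on the current owner.
chosen, dissenting = resolve(
    "interest_stack",
    {"land": "New Owner", "accounting": "Old Owner"},
)
print(chosen, dissenting)
```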

Build the cross-reference deliberately. The connection between production wells and land entities should be a real, owned dataset. Not a query. Not a tribal-knowledge translation that lives in someone’s head. A table, maintained by a named person, with a documented update process. This is where most reconciliation problems originate, and it’s where most of the leverage is.
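
As a sketch of what “a real, owned dataset” might look like, the Python below models one cross-reference row with effective dates and a named maintainer, plus a check for rows that claim the same well at the same time. The field names and sample rows are hypothetical, not taken from PPDM or any vendor schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class XrefRow:
    """Hypothetical cross-reference row linking a production well to land."""
    api_number: str
    lease_id: str
    land_well_name: str
    effective_from: date             # inclusive
    effective_to: Optional[date]     # exclusive; None means still in effect
    maintained_by: str               # the named person who owns this row

def overlapping_rows(rows):
    """Flag consecutive rows for one API number whose date ranges overlap."""
    by_api = {}
    for row in rows:
        by_api.setdefault(row.api_number, []).append(row)
    problems = []
    for api_rows in by_api.values():
        ordered = sorted(api_rows, key=lambda r: r.effective_from)
        for earlier, later in zip(ordered, ordered[1:]):
            # An open-ended or late-ending row collides with its successor.
            if earlier.effective_to is None or earlier.effective_to > later.effective_from:
                problems.append((earlier, later))
    return problems

# Invented rows: the SMITH rename was closed out properly, the JONES one was not.
rows = [
    XrefRow("4212345678", "LEASE-7", "SMITH 1", date(2015, 1, 1), date(2021, 6, 1), "A. Analyst"),
    XrefRow("4212345678", "LEASE-7", "SMITH 1H", date(2021, 6, 1), None, "A. Analyst"),
    XrefRow("4212345679", "LEASE-9", "JONES 2", date(2018, 1, 1), None, "A. Analyst"),
    XrefRow("4212345679", "LEASE-9", "JONES 2H", date(2022, 1, 1), None, "A. Analyst"),
]

problems = overlapping_rows(rows)
for earlier, later in problems:
    print(earlier.land_well_name, "overlaps", later.land_well_name)
```

A check like this could run whenever the table is edited, which is one way to make the documented update process enforceable.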

Land it in a model that knows about the relationship. This is where PPDM earns its keep. The model has thought through the relationship between wells, completions, leases, and ownership in ways that most homegrown schemas haven’t. We wrote about what PPDM actually buys you in “What the PPDM Model Actually Gives You (and What It Doesn’t).” The reconciliation problem is one of the places where the model pays off most directly.

Instrument the disagreements. Once the rules are written down and the data lands in a single model, you can measure the gap. How many wells have a production record but no working interest? How many have working interests that don’t sum to one hundred percent? How many revenue lines reference an operator that doesn’t match the current land record? You can’t fix what you can’t measure, and the measurement is what eventually convinces the business that the work is paying off.
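
The three measurements above can be sketched in plain Python. The data structures here are invented stand-ins for whatever the landed model actually exposes. Interests are held as Decimal values so that a stack that sums to exactly one is not flagged because of floating-point rounding.

```python
from decimal import Decimal

# Invented sample data standing in for the landed model.
production_wells = {"W1", "W2", "W3"}
working_interests = {
    "W1": [Decimal("0.60"), Decimal("0.40")],
    "W2": [Decimal("0.50"), Decimal("0.45")],   # sums to 0.95
}
revenue_lines = [("W1", "Operator A"), ("W2", "Operator B")]
land_operator = {"W1": "Operator A", "W2": "Operator C"}

# 1. Wells with a production record but no working interest.
no_wi = sorted(production_wells - set(working_interests))

# 2. Wells whose working interests don't sum to one hundred percent.
bad_sum = sorted(
    well for well, parts in working_interests.items() if sum(parts) != Decimal("1")
)

# 3. Revenue lines whose operator doesn't match the current land record.
operator_mismatch = sorted(
    well for well, operator in revenue_lines if land_operator.get(well) != operator
)

print("no working interest:", no_wi)
print("interest sum != 100%:", bad_sum)
print("operator mismatch:", operator_mismatch)
```

Tracking the length of each list month over month gives the business a trend line rather than an anecdote.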


What not to do

A few patterns we’ve seen that consistently make things worse.

Don’t build a parallel reconciliation system. The instinct is to write a new tool that takes data from production, data from land, data from accounting, and produces a “reconciled” view. All you’ve done is add a fourth source of truth. Whatever caused the original disagreements still exists, and now there are four numbers that disagree instead of three.

Don’t try to reconcile history before fixing the inputs. New bad data keeps arriving while you’re cleaning up old bad data. We made this point in “Data Quality in Upstream Oil and Gas: What Goes Wrong and Where to Start,” and it applies just as strongly here. Fix the ingestion first. Backfill second.

Don’t skip the business decisions. A data team can build any reconciliation logic you ask for. They can’t decide for you which version of an interest stack is the official one when two systems disagree. If the business won’t make those calls, the data work is going to be wasted. Get the decisions on paper first.

Don’t treat the result as a one-time project. The relationships between production, land, and revenue change continuously. The reconciliation has to be a continuous process, with named owners and a documented cadence. The companies that get this right treat it as ongoing operational work, not as a project that finishes.


Why it’s worth doing

The reason this matters is that the cost of the disagreement compounds.

Every monthly close requires a manual scramble. Every diligence request requires the team to assemble a fresh version of the truth. Every reserves report has to be reconciled by hand. Every analyst hour spent on reconciliation is an hour not spent on anything that creates value.

Operators who get this right don’t spend less time on land and production. They spend the same amount of time, but on different things. Less on chasing down disagreements, more on actually using the data. The monthly close goes from a week of reconciliation to a day of review. Diligence answers come back in days instead of weeks. The analytical layer on top of the data starts to mean something, because the underlying numbers are trustworthy.

This is also the area where adopting an industry data model pays off the most. We covered the migration path from spreadsheet-based stacks in “From Spreadsheets to a Real Data Stack: A Realistic Migration Path for Mid-Size Operators,” and the reconciliation problem is one of the clearest places where the new foundation earns its keep. The disagreements stop being random surprises and start being a managed, measured part of the operation.

The land and production reconciliation is the hardest problem in upstream data. It’s also one of the highest-value problems to actually solve. The companies that pick it up and work through it deliberately tend to find that everything else gets easier afterward.


We Were Just at PPDM 2026

We spent April 27 through 29 at the PPDM Energy Data Convention in Houston. If you were there and we didn’t get a chance to connect, or if reconciliation pain is the conversation you wish you’d had, we’d love to hear what you’re working on.
