Make Smarter Decisions with Trusted Data. 

Discover the Power of Reference Data for your Business.





Establish a single, consistent, and authoritative source of reference data across different systems, applications, and business processes with our reference data management solution. RDM ensures that reference data is accurate, up-to-date, and synchronized across various systems to maintain data integrity and improve operational efficiency.

Should Reference Data Management be Centralised?

The answer is a resounding yes. Managing reference data centrally is crucial for organizations to ensure the accuracy and consistency of their data, and to enable the sharing of datasets between organizations.

What is reference data?

Gartner defines reference data as "a consistent and uniform set of identifiers and attributes that accurately describe the strategic information required for an organization to function properly."

In a 2010 blog post, Gartner's Andrew White quotes someone's perception that "reference data management provides solutions around data that is not mission critical across an enterprise". This description is used to differentiate reference data from mission-critical master data such as your customers, products, and suppliers.

We don't agree!

To paraphrase Dorothy from The Wizard of Oz, we're not in 2010 anymore!

Reference data are Mission-Critical

Simply put, reference data are attributes that describe, categorise, define, and give context to other data.

For example, a list of countries and country codes may be used for multiple purposes, including ‘Country of Origin’, ‘Country of Residence’ and ‘Country Issuing Passport’. If different values are used to describe the same country, it becomes impossible to aggregate or compare data from different sources.

What is Reference Data Management?

Reference data management (RDM) refers to the process of managing and maintaining reference data within an organization.

Reference data is the data that provides context or defines the meaning of other data within a system. It includes data such as codes, classifications, lists, taxonomies, and other reference information that is used to describe or categorize data elements.

The goal of reference data management is to establish a single, consistent, and authoritative source of reference data that can be used across different systems, applications, and business processes within an organization. RDM ensures that reference data is accurate, up-to-date, and synchronized across various systems to maintain data integrity and improve operational efficiency.

Key aspects of reference data management include:

  1. Data Governance: Establishing policies, standards, and procedures to govern the creation, maintenance, and use of reference data. This ensures data quality, consistency, and compliance.

  2. Data Modelling: Defining the structure and relationships of reference data elements, establishing hierarchies, and creating taxonomies or classifications to organize the data.

  3. Data Integration: Integrating reference data with other systems and applications to enable consistent data usage across the organization. This involves data mapping, transformation, and synchronization.

  4. Data Quality Management: Implementing measures to assess and improve the quality of reference data, including data profiling, cleansing, and validation.

  5. Change Management: Managing the lifecycle of reference data, including versioning, updates, and retirements. This ensures that reference data remains accurate and relevant over time.

  6. Metadata Management: Capturing and managing metadata associated with reference data, such as data definitions, business rules, and usage information.

  7. Security and Access Control: Implementing appropriate security measures and access controls to protect reference data from unauthorized access or modifications.
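Change management (item 5) is often the least intuitive of these aspects, so a small illustration may help. The sketch below is one possible way to version reference values with validity dates so that historical reports can still resolve the label that applied at the time. The class and function names, the `IND-07` code, its labels, and the dates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReferenceValue:
    code: str
    label: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the value is still current

    def is_active(self, on: date) -> bool:
        """True if the value was in force on the given date."""
        return self.valid_from <= on and (self.valid_to is None or on < self.valid_to)


def lookup(values, code, on):
    """Return the label that was valid for a code on a given date, or None."""
    for value in values:
        if value.code == code and value.is_active(on):
            return value.label
    return None


# Hypothetical history: the label for one code was revised on 1 January 2023.
history = [
    ReferenceValue("IND-07", "Mining", date(2020, 1, 1), date(2023, 1, 1)),
    ReferenceValue("IND-07", "Mining and Quarrying", date(2023, 1, 1)),
]

print(lookup(history, "IND-07", date(2021, 6, 30)))
print(lookup(history, "IND-07", date(2024, 6, 30)))
```

Retiring a value then amounts to setting its end date rather than deleting the row, which keeps older data interpretable.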

By effectively managing reference data, organizations can achieve greater data consistency, accuracy, and reliability across their systems and applications. This, in turn, supports better decision-making, enhances operational efficiency, and enables effective data analysis and reporting.

Errors in reference data directly affect the integrity of reports.

Imagine, for example, that you need to run a report identifying customers by the country that they live in. If reference data is not managed centrally then ambiguities may arise, particularly when data is consolidated from multiple applications for reporting purposes.

For example, in one application the code "SA" may represent "Saudi Arabia". In another, the same code may represent "South Africa", as illustrated in the table below.

This can cause confusion when trying to understand how many customers live in one or the other country. 

Application     Code   Value          Total Customers
Application A   SA     Saudi Arabia   583
Application B   SA     South Africa   2712
Consolidated    SA     Saudi Arabia   3295
Consolidated    ZA     South Africa   0
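The collision in the table above can be reproduced in a few lines. The sketch below is illustrative only: the application names and per-application customer counts come from the table, while the mapping table and the choice of ISO 3166-1 alpha-2 codes (SA for Saudi Arabia, ZA for South Africa) as the central standard are assumptions made for the example.

```python
from collections import Counter

# Customer counts per application, keyed by each application's local country code.
source_counts = {
    "Application A": {"SA": 583},   # locally, SA means Saudi Arabia
    "Application B": {"SA": 2712},  # locally, SA means South Africa
}

# Hypothetical centrally managed mapping: (application, local code) -> standard code.
central_mapping = {
    ("Application A", "SA"): "SA",  # Saudi Arabia
    ("Application B", "SA"): "ZA",  # South Africa
}


def consolidate_naive(counts):
    """Sum customers by raw local code, ignoring what the code means locally."""
    totals = Counter()
    for app_counts in counts.values():
        totals.update(app_counts)
    return dict(totals)


def consolidate_mapped(counts, mapping):
    """Translate each local code to the central standard before summing."""
    totals = Counter()
    for app, app_counts in counts.items():
        for local_code, customers in app_counts.items():
            totals[mapping[(app, local_code)]] += customers
    return dict(totals)


naive = consolidate_naive(source_counts)
mapped = consolidate_mapped(source_counts, central_mapping)
print(naive)   # every customer lands under the single code "SA"
print(mapped)  # Saudi Arabia and South Africa are reported separately
```

The mapping table is itself reference data, which is why it needs the same central ownership as the code lists it connects.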

In another example, the table below has two entries with code 1, each representing a different category. In addition, the category "Bird" is represented by two codes. Another common problem is variations of the same value (e.g. Jan, January, january), each with its own code. These kinds of issues wreak havoc with reporting.

Code   Value
1      Cat
1      Dog
2      Bird
3      Bird
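Problems like those in the table above are straightforward to detect automatically. The following is a minimal sketch, not a full data quality tool: the function name and the normalisation rule (trim whitespace and ignore case) are assumptions made for illustration.

```python
from collections import defaultdict

# The example list from the table above, as (code, value) pairs.
reference_list = [(1, "Cat"), (1, "Dog"), (2, "Bird"), (3, "Bird")]


def find_conflicts(pairs):
    """Return codes with several values and values with several codes.

    Values are compared case-insensitively with surrounding whitespace removed.
    """
    values_by_code = defaultdict(set)
    codes_by_value = defaultdict(set)
    for code, value in pairs:
        normalised = value.strip().casefold()
        values_by_code[code].add(normalised)
        codes_by_value[normalised].add(code)
    ambiguous_codes = {c: sorted(v) for c, v in values_by_code.items() if len(v) > 1}
    duplicated_values = {v: sorted(c) for v, c in codes_by_value.items() if len(c) > 1}
    return ambiguous_codes, duplicated_values


ambiguous_codes, duplicated_values = find_conflicts(reference_list)
print(ambiguous_codes)    # codes that map to more than one value
print(duplicated_values)  # values that are reachable through more than one code
```

This check catches case variants such as "January" and "january". Abbreviations such as "Jan" would need an agreed synonym list, which is itself a piece of reference data to govern.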


At one banking client, we found issues, similar to the above, with the reference data used to categorise clients for risk reporting.

The result was that the bank had an incorrect understanding of its risk exposure to a particular industry segment, which could have had serious consequences that management was completely unaware of. 

More reasons why reference data management needs central, business oversight

Managing reference data centrally is crucial for organizations to ensure the accuracy and consistency of their data, and to enable the effective sharing of datasets between organizations.


Here are some additional reasons why reference data should be managed centrally, collated from several sources:

  • Centralized management of reference data ensures consistency across business processes and applications, preventing similar things from being described in different ways. This is important for ensuring the accuracy and reliability of data.
  • Disorganized, incorrect, or inaccessible reference data can disrupt and delay business processes. Assigning the wrong code to a transaction can distort billing, inventory counts, and replenishment, or send shipments to the wrong destination. Better management of data and its corresponding reference data can deliver substantial business value in the form of more accurate analytics, effective sharing of datasets between organizations, and, ultimately, better customer service, increased operating margin, and more productive employees.
  • Reference data should be managed centrally to ensure that it is up-to-date and accurate. Organisations that have hundreds of individually developed reference files or tables struggle to maintain and update each table. Reference data can quickly become outdated and inconsistent, causing errors in application performance and data integration.
  • Centralized management of reference data enables the effective sharing of datasets between organizations, promoting mutual understanding through the use of industry-standard reference data codes.

Reference Data should be governed centrally

Reference data carries meaning. It establishes permissible values, facilitates consistency, and maps internal data against external data and/or standards. Although it makes up a small share of total data volume, reference data accounts for 25% to 50% of tables in databases and affects reporting accuracy and data governance.

To ensure integrity, reference data should be managed centrally with business oversight. 

Companies should implement formal processes to ensure that changes to reference data are approved by an authorised business stakeholder, and not left to a junior SQL programmer with access to a table.
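As a rough illustration of such a process, the sketch below models a change request that only takes effect once an authorised approver signs it off, with every step written to an audit trail. It is a generic sketch and does not represent the API of any particular product. The class names, the `data.steward` and `analyst` users, and the status values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChangeRequest:
    code: str
    new_value: str
    requested_by: str
    status: str = "pending"


@dataclass
class ReferenceList:
    name: str
    approvers: set  # hypothetical: users authorised to approve changes
    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def _log(self, action, user, request):
        # Each audit entry records when, what, who, and which value was affected.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append((stamp, action, user, request.code, request.new_value))

    def submit(self, request):
        """Record a proposed change without touching the live values."""
        self._log("submitted", request.requested_by, request)
        return request

    def approve(self, request, approver):
        """Apply a change only if the approver is an authorised stakeholder."""
        if approver not in self.approvers:
            raise PermissionError(f"{approver} may not approve changes to {self.name}")
        if request.status != "pending":
            raise ValueError("request has already been decided")
        self.values[request.code] = request.new_value
        request.status = "approved"
        self._log("approved", approver, request)


countries = ReferenceList(name="Country", approvers={"data.steward"})
request = countries.submit(ChangeRequest("ZA", "South Africa", requested_by="analyst"))
countries.approve(request, approver="data.steward")
print(countries.values)
```

The key design choice is that the live values can only change through `approve`, so direct edits by someone who merely has table access are no longer part of the process.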

Figure: A reference list in Data360 Govern

Using Data360 Govern for Reference Data Management

Manual processes to manage reference data can be time-consuming, complex and prone to error.

For example, one financial services client has a dedicated team responsible for updating reference data. To add or modify reference data, a requester must complete a spreadsheet and email it to the team. The team then works through various checks and approvals, again driven by email or meetings, before making changes to the relevant tables.

This process is painful but an improvement on having no process at all.

One can also treat reference data as a class of master data, and implement a full-blown master data management system to govern it. This may, however, be overkill, given that reference data is typically less complex than master data.

Using Data360 Govern, we can quickly and (fairly) easily:

  • Configure reference lists to specific requirements
  • Ingest reference data sets (via API or once-off import)
  • Add, modify or delete values via a GUI or via API, with a built-in approval process before sharing
  • Manage relationships (e.g. hierarchies) within and between reference data and other assets
  • Manage requests to update or add reference data via workflow, with a full audit trail
  • Track and approve (or reject) changes with a full audit trail and automated workflow
  • Publish shared reference lists to internal or external consumers

By building reference data governance into our data catalogue, we are eating our own cooking, so to speak: the reference data we use to categorise and govern data is itself governed and standardised. It allows us to understand how reference data is used in reporting, to promote data sharing, or to support key business processes.

And, ultimately, it helps to ensure the integrity of data across the organisation, and within the partner network.

Get Started Today!


Phone: +27 11 485 4856