When do you need to use a business glossary versus a data dictionary versus a data catalog? Although the terms sound similar, they are very different tools that can help your business manage and use its data strategically.


Data is a critical asset for any business. We know that. It doesn’t matter the size of an organization—large, medium, or small—its data is essential to making business decisions and to remaining competitive. We also know that as the volume of data continues to grow, companies need to make managing their data a priority if they want to understand what has happened in the business, answer questions about why it happened, and make informed decisions going forward.

Data management needs to be part of the overall business strategy so that everyone in the organization understands data and uses it in the same way. But where do you start? There are three tools we recommend that will help keep you organized and will enhance your data management strategy: a business glossary, data dictionary, and data catalog.

White and blue diagram illustrating the pros and cons of a business glossary, data dictionary, and data catalog.

All three tools—business glossary, data dictionary, and data catalog—can help an organization better manage its data. Here’s a list of pros and cons for each.

Although they are related, these tools serve different purposes. In this blog, we will define all three (business glossary, data dictionary, and data catalog) and discuss what’s needed to build and govern each, as well as the pros and cons to consider.

What is a Business Glossary?

A business glossary contains concepts and definitions of business terms frequently used in day-to-day activities within an organization—across all business functions—and is meant to be a single authoritative source for commonly used terms for all business users. It is the entry point for all organizations that have any kind of data initiative in play. A business glossary is the red thread that connects the business terms and concepts to policies, business rules, and associated terms within the organization. When creating a business glossary, you should have:

  • Cross-functional input, as well as consensus/approval for agreed-upon understandings of key business concepts and business terms.
  • Accessibility to common business terms—words, phrases, acronyms, or business concepts—across the organization so that everyone speaks the same business language.
  • Cross-referenced terms and their relationships. This provides context and helps business users easily identify how terms relate to one another.
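
As a rough illustration of the three points above, a glossary entry can be modeled as a small record with an owner, an approval flag, and cross-references. This is only a sketch: the class names, field names, and sample terms are hypothetical and not taken from any particular glossary tool.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """One business term. All field names here are illustrative assumptions."""
    name: str
    definition: str
    owner: str                       # business function that stewards the term
    approved: bool = False           # set once stakeholders reach consensus
    related: set[str] = field(default_factory=set)  # cross-referenced term names


class BusinessGlossary:
    def __init__(self) -> None:
        self._terms: dict[str, GlossaryTerm] = {}

    def add(self, term: GlossaryTerm) -> None:
        # Key on a lowercased name so "Churn Rate" and "churn rate" are one entry.
        self._terms[term.name.lower()] = term

    def lookup(self, name: str) -> GlossaryTerm:
        return self._terms[name.lower()]

    def link(self, a: str, b: str) -> None:
        # Store the cross-reference in both directions, using canonical names.
        term_a, term_b = self.lookup(a), self.lookup(b)
        term_a.related.add(term_b.name)
        term_b.related.add(term_a.name)


glossary = BusinessGlossary()
glossary.add(GlossaryTerm(
    "Active Customer",
    "A customer with at least one purchase in the last 12 months.",
    owner="Sales",
    approved=True,
))
glossary.add(GlossaryTerm(
    "Churn Rate",
    "Share of active customers lost during a period.",
    owner="Finance",
))
glossary.link("active customer", "churn rate")
```

Looking up either term then shows the other in its `related` set, which is the kind of context a business user needs when one definition depends on another.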

Although you do not need to have a data governance program in place to build, use, and maintain a business glossary, you should still have a governance strategy for the business glossary itself. In order to have cross-functional consensus, you need stakeholders from all business functions whose responsibility it is to meet regularly to discuss terms and concepts that might overlap departments. This will allow for approval and documentation of definitions, which is important, especially if two departments define the same metric differently. It’s fine to have two different definitions so long as the stakeholders have verified that it is an acceptable deviation, and it is documented and made accessible for the business users who need it. In some cases, you may have a tie-breaking decider—such as a CEO—choosing one definition over the other.

Once the business term or concept is defined and approved, the designated stakeholders need to ensure that definition is used consistently throughout the organization. A business glossary is a key artifact for any data-driven organization and will help in setting up future data initiatives as the company’s analytics needs mature. Here’s what to consider when creating a business glossary:

  • Pro: You don’t need to invest in new technology to create a business glossary. You can use something as simple as Microsoft Excel or Google Sheets to set up the business glossary and place it in SharePoint or Google Workspace to provide access to your business users.
  • Pro: It is the lexicon of business language, which will allow for cross-functional collaboration among your business users.
  • Pro: It can be used as an onboarding and coaching tool for new employees within your organization.
  • Con: If not implemented correctly, it can cause misunderstandings, put too much emphasis on bureaucracy, and introduce bias into your business language.
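
If the glossary lives in a spreadsheet, a short script can catch some of these problems before they spread. The sketch below assumes the sheet has been exported to CSV with the columns `term`, `definition`, `owner`, and `approved_on`. Those column names and the sample rows are assumptions made for illustration.

```python
import csv
import io

# Hypothetical column layout for the exported glossary sheet.
REQUIRED_COLUMNS = ("term", "definition", "owner", "approved_on")


def check_glossary(csv_text):
    """Return a list of human-readable problems found in the exported sheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    first_seen = {}  # lowercased term -> row number where it first appeared
    for row_no, row in enumerate(reader, start=2):  # row 1 is the header
        term = (row["term"] or "").strip()
        if not term:
            problems.append(f"row {row_no}: empty term")
            continue
        if not (row["definition"] or "").strip():
            problems.append(f"row {row_no}: '{term}' has no definition")
        key = term.lower()
        if key in first_seen:
            # A repeated term is flagged for review, not treated as an error.
            problems.append(f"row {row_no}: '{term}' duplicates row {first_seen[key]}")
        else:
            first_seen[key] = row_no
    return problems


sheet = """term,definition,owner,approved_on
Active Customer,Customer with a purchase in the last 12 months,Sales,2024-01-15
Churn Rate,,Finance,
active customer,Customer with an open subscription,Marketing,2024-02-01
"""
problems = check_glossary(sheet)
for problem in problems:
    print(problem)
```

A duplicate is only flagged for review here, since two departments may legitimately keep different definitions once the stakeholders have approved and documented the deviation.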

As stated earlier, a business glossary is the starting point for any data initiative, but it is also a prerequisite to building a data dictionary.

Alation’s Business Glossary illustrated in graph with title, description, template, and articles.

Alation’s Business Glossary enables the creation of definitions, policies, rules, and KPIs through a rich, user-friendly interface. A business glossary can be initiated with Microsoft Excel or Google Sheets to get the process started and ensure that it’s working properly. Photo: Alation

What is a Data Dictionary?

A data dictionary is a more technical and thorough documentation of data and its metadata. It consists of detailed definitions and descriptions of data dimension and measure names (in databases, data tables, etc.), their calculations, their types, and related information. Whereas with a business glossary you provide definitions for terms and concepts, in a data dictionary, you provide information on the type of data you have and everything that is related to it. This information is most commonly useful for technical users that work on the backend of your systems and applications so that they can more easily design a relational database or data structure to meet business requirements. When creating a data dictionary, you should have:

  • A business glossary already in place, and you should have a governance strategy to ensure your business users are using it.
  • A data integration tool that will automate the process for building and maintaining the data dictionary. The effort required to do this manually is not worth the value you will get out of it. Take advantage of tools that have built-in capabilities, such as dbt, where you can enter descriptions as you are programming, and they will be automatically documented to create a data dictionary. dbt also includes an automated data impact and lineage graph. There are lots of tools that have these built-in capabilities, so check to see if your existing tool does, or look around for one that fits your purposes.
  • Attributes such as data type, size, allowed values, default values, and constraints, along with any other relevant technical metadata. Taking the time to capture these upfront and making your data dictionary user-friendly will help with data quality across the organization.
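
To make the idea of an automated data dictionary concrete, here is a minimal sketch that reads column metadata straight from a database schema instead of documenting it by hand. It uses Python's built-in `sqlite3` module as a stand-in for a real warehouse and is not how dbt works internally. The `orders` table and the descriptions are invented for the example.

```python
import sqlite3


def build_data_dictionary(conn, descriptions):
    """Collect one entry per column from the schema of a SQLite database.

    `descriptions` maps (table, column) to business-friendly text, for
    example definitions carried over from the business glossary.
    """
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, column, col_type, notnull, default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            entries.append({
                "table": table,
                "column": column,
                "type": col_type,
                "not_null": bool(notnull),   # as declared in the schema
                "default": default,          # SQL text of the default, or None
                "primary_key": bool(pk),
                "description": descriptions.get((table, column), ""),
            })
    return entries


# Hypothetical table used only to demonstrate the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL DEFAULT 'new',"
    " total_usd REAL)"
)
data_dictionary = build_data_dictionary(
    conn,
    {("orders", "total_usd"): "Order value in US dollars, including tax."},
)
for entry in data_dictionary:
    print(entry)
```

Because the technical attributes come from the schema itself, only the descriptions need human input, which is the part a business glossary and governance committee can supply.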


Unlike a business glossary, a data dictionary will likely require a more formal data governance program, with a governance committee made up of individuals from both the business and IT sides.

The business team should be responsible for requesting changes to a metric’s definition, while the IT team should be responsible for implementing the change and communicating it to the organization. Establishing lines of communication between the two groups will promote trust. Here’s what to consider when creating a data dictionary:

  • Pro: A data dictionary will ultimately serve as a lexicon of business language for technical teams across the organization and help with metadata management, allowing them to do their jobs more effectively. The technical metadata of each data element within a data dictionary helps to clarify business requirements for the IT team working on the backend of systems or applications.
  • Pro: A data dictionary helps to improve master data management and ensure data quality across the organization, as well as to integrate data from multiple sources more efficiently. Depending on the tool you use, you can enter the definition once and use it for multiple applications.
  • Con: Although a data dictionary helps reduce overall time and costs with data initiatives when implemented properly, it does require an extra step for data integration developers. When building code for a new job that will integrate two data sources, they will need to either look to a business glossary for definitions or work with the data governance committee to get the definitions and add them to the code. If the data dictionary is not automated, the developer will have to manually document the new definition in the data dictionary in addition to adding it to the code.

A data dictionary is a subset of a business glossary, but both are required to build a data catalog.

dbt data dictionary capabilities illustrated with column, type, description, tests, and more.

Whether your data is stored in a data warehouse, data lake, or lakehouse, running dbt docs will propagate table and column definitions to create an automated data dictionary. Source: dbt

What is a Data Catalog?

A data catalog is the pathway, or bridge, between a business glossary and a data dictionary. It is an organized inventory of an organization’s data assets that informs both business and technical users about the datasets available on a topic and helps them locate those datasets quickly. Users have a clear, accessible view of what data the organization has, where it came from, where it is located now, who has access to it, and what risks or sensitivities may be involved, all in one central location. When creating a data catalog, you should have:

  • A business glossary and a data dictionary already in place, and you should have a data governance committee to ensure your business and technical users are using both.
  • A tool that can automate the process. A data catalog should not be set up manually; you will need to use a tool to set it up, as well as to maintain it. There are lots of tools to choose from, including Alation, Alteryx, and Qlik just to name a few. You may also already have cataloging capabilities built into existing tools—whether it’s your source system or a business intelligence (BI) tool.
  • Subject-matter experts. Because a data catalog is a comprehensive artifact, and it is built for both the business and technical users, you will need individuals who have competencies in both.
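
A catalog entry can be pictured as a record that ties a dataset to its glossary terms and records where it lives, where it came from, who may use it, and how sensitive it is. The following is a simplified sketch with hypothetical field names and sample datasets. Dedicated catalog tools store far more metadata and populate it automatically.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Inventory record for one data asset. Field names are illustrative."""
    dataset: str
    location: str            # where the data is stored now
    source_system: str       # where the data came from
    owner: str
    sensitivity: str         # risks or sensitivities to be aware of
    allowed_roles: set[str] = field(default_factory=set)
    glossary_terms: set[str] = field(default_factory=set)


def find_datasets(catalog, term):
    """Return the entries tagged with a business glossary term (case-insensitive)."""
    wanted = term.lower()
    return [
        entry for entry in catalog
        if wanted in {t.lower() for t in entry.glossary_terms}
    ]


def can_access(entry, role):
    """Anyone can see the entry; only listed roles can read the data itself."""
    return role in entry.allowed_roles


catalog = [
    CatalogEntry(
        dataset="orders",
        location="warehouse.sales.orders",
        source_system="e-commerce platform",
        owner="Sales Operations",
        sensitivity="internal",
        allowed_roles={"analyst", "engineer"},
        glossary_terms={"Active Customer", "Order"},
    ),
    CatalogEntry(
        dataset="customer_profiles",
        location="warehouse.crm.customer_profiles",
        source_system="CRM",
        owner="Marketing",
        sensitivity="contains personal data",
        allowed_roles={"engineer"},
        glossary_terms={"Active Customer"},
    ),
]

matches = find_datasets(catalog, "active customer")
for entry in matches:
    print(entry.dataset, entry.location, can_access(entry, "analyst"))
```

Searching by a business term returns both datasets, even though an analyst can only read one of them, so users can discover what exists and then request access from the owner.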

In terms of governance, you should follow the same structure as with a data dictionary. However, you should also have a second, smaller committee of individuals with both technical and business competencies who work alongside the data governance committee set up for the data dictionary. The best way to maintain a data catalog is to integrate it as naturally and intuitively as possible into existing processes. For example, whenever a new data source is added, updating the data catalog should be part of the process for adding that source.

Here’s what to consider when creating a data catalog:

  • Pro: A data catalog supports regulatory compliance by providing quick and easy access to where certain data is stored and who uses it.
  • Pro: It fosters a data culture throughout the organization by providing data and content for self-service applications. It allows users to get what they want, when they need it, and trust that it is accurate because of the transparency a data catalog provides.
  • Con: Although a data catalog helps to reduce risk and improves data efficiency and analysis, it requires skill to develop. Creating a good data catalog takes both business and technical abilities, and individuals who have both are rare and in high demand.

Custom data catalog example with organized data assets separated by blue cells.

A data catalog is an organized inventory of data assets and provides knowledge of all aspects of metadata. Users can access a data catalog without access to the data asset itself. This saves time and improves employee productivity, as well as promoting transparency and trust in the data.

Adopting Best Practices for Data Initiatives

Although the terms business glossary, data dictionary, and data catalog sound similar, they play very different roles within your organization. Each is valuable, but not every organization needs all three, at least not right away. It depends on where you are in your analytics maturity and how much time and how many resources you can dedicate to building and maintaining each artifact. As you consider your options, start with:

  • Build a Business Glossary: This is the easiest way to get started, and it is also a prerequisite for any data initiative you have. Once you create a business glossary and take the necessary steps to maintain it, you will be one step closer to building a data-driven culture within your organization and to scaling up your data and analytics maturity.
  • Examine Your Existing Tools: Before you make any new technology purchases, take the time to see what capabilities are built into your existing tools. If you find that you have data dictionary capabilities, use them, and make updating and maintaining the data dictionary part of your data integration processes.
  • Promote a Data Culture in Your Organization: The key to making any of these programs work is the organization’s willingness to adopt them. Just because you ask business or technical users to adopt a program doesn’t mean they fully understand and endorse it. The more you encourage a data culture and communicate the importance behind it, the more natural it becomes for everyone to get on board.

 


Christina Salmi
Christina leads the Data Strategy Service Line, helping our customers to think and act strategically about data and analytics.