This blog post was written by Solution Architect Hayley Tuller. It was last updated on 05/31/22.
One of the most important responsibilities Salesforce professionals have is ensuring our orgs remain secure. Achieving a strong security posture is about so much more than just passwords and session settings, however. It’s critical to know what data is in your Salesforce org, what security obligations that data falls under, and what the possible risk is should that data be compromised.
I know what you’re thinking. It’s impossible for an Admin to keep all this information in their heads. Fortunately, Data Classification is here to help!
What is Data Classification in Salesforce?
Fields in Salesforce have attributes, also called field-level metadata. Some examples are pretty obvious, such as the field’s Label or its Data Type (Picklist or Text, for instance). Fortunately for us, Salesforce can also capture critical security information at the field level using these attributes. Configuring, capturing, and reporting on this data is collectively known as Data Classification.
The core of Data Classification in Salesforce is a set of metadata points on each field, specifically:
- Data Owner
- Field Usage
- Data Sensitivity Level
- Compliance Categorization
To see these fields, navigate to an object in the object manager, select a field, and view the details. You’ve probably seen these fields dozens of times, and perhaps haven’t really thought deeply about them until now. Here they are on the standard Email field on the Contact object:
In this case, I’ve already configured these attributes so we can see how this works in a real-life case. The two most important fields are Data Sensitivity Level and Compliance Categorization. These capture, respectively, your organization’s expression of the risk of loss or compromise for this data point and the regulations that may govern its usage. Both can use customized values. Let’s take a bit of a deep dive on the first one.
Data Sensitivity Level is, at the end of the day, an expression of risk. In other words, it answers the question, “How bad is it if this data is lost or compromised?” The default values are Public, Internal, Confidential, Restricted, and MissionCritical, but there are many other ways to express the risk involved. To really make Data Classification work for you, get your key stakeholders involved and workshop how you want to assess and indicate risk. Some organizations use a simple “Low/Medium/High” rubric, while others prefer a schema based around handling, like “Open, Internal Only, Restricted.” However you choose to express risk, this is where you do the hard work! Your organization must come to a consensus.
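If you want to reason about these levels in a script, it can help to treat them as an ordered scale. Here is a minimal sketch, assuming the default values rank from Public (lowest risk) up to MissionCritical (highest). The `Sensitivity` enum, the `BY_NAME` lookup, and the `highest_risk` helper are names I made up for illustration; they are not part of Salesforce.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Default Data Sensitivity Level values, in an ASSUMED low-to-high risk order."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4
    MISSION_CRITICAL = 5


# Maps the default picklist values named in this post to the enum above.
BY_NAME = {
    "Public": Sensitivity.PUBLIC,
    "Internal": Sensitivity.INTERNAL,
    "Confidential": Sensitivity.CONFIDENTIAL,
    "Restricted": Sensitivity.RESTRICTED,
    "MissionCritical": Sensitivity.MISSION_CRITICAL,
}


def highest_risk(levels):
    """Return the most sensitive level among a collection of picklist values."""
    return max(BY_NAME[name] for name in levels)


if __name__ == "__main__":
    print(highest_risk(["Internal", "Restricted", "Public"]).name)
```

If your organization settles on a custom schema such as “Low/Medium/High,” you would swap in your own values and ordering.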
Fortunately, Compliance Categorization is a lot easier. This is a multi-select picklist that captures which regulation or regulations may govern a given data point. Most of us have heard of GDPR (General Data Protection Regulation) by now, and know it is the EU regulation governing the use and storage of personal data. However, GDPR is just one regulation you need to be aware of as a Salesforce Admin. Luckily for us, Salesforce has built in a set of default values that can apply.
Salesforce Help has a breakdown on these different laws, but it’s wise to consult with your legal department or partners to determine what applies.
Data Owner and Field Usage are much simpler. Data Owner is simply the user or queue that owns, or is responsible for, this data point. Field Usage gives you a way to capture whether the field is in active use, planned for deprecation, or intended to be hidden. Watch out here! Setting a field’s usage to “DeprecateCandidate” or “Hidden” doesn’t actually do anything to the visibility of that field in the UI or the API. It’s purely administrative, so many users of Data Classification don’t bother with this feature.
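If you prefer to inspect these four attributes outside of Object Manager, they can be queried. Below is a minimal sketch that only builds a SOQL string; it doesn’t connect to an org. It assumes the attributes are exposed on FieldDefinition under the API names BusinessOwnerId, BusinessStatus, SecurityClassification, and ComplianceGroup, so check those against the object reference for your API version. The `classification_query` function is a hypothetical helper, not a Salesforce API.

```python
def classification_query(object_api_name: str) -> str:
    """Build a SOQL string listing the classification attributes for one object."""
    # Reject anything that isn't a plain API name before embedding it in the query.
    if not object_api_name.replace("_", "").isalnum():
        raise ValueError(f"Unexpected object API name: {object_api_name!r}")

    # ASSUMED FieldDefinition field names; verify against the object reference.
    fields = [
        "QualifiedApiName",
        "BusinessOwnerId",         # Data Owner
        "BusinessStatus",          # Field Usage
        "SecurityClassification",  # Data Sensitivity Level
        "ComplianceGroup",         # Compliance Categorization
    ]
    return (
        f"SELECT {', '.join(fields)} FROM FieldDefinition "
        f"WHERE EntityDefinition.QualifiedApiName = '{object_api_name}'"
    )


if __name__ == "__main__":
    print(classification_query("Contact"))
```

You could paste the resulting string into whatever query tool you already use.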
Why Use Data Classification?
There are a number of reasons why you might need to have a clear record of the sensitivity of your data in Salesforce, but the most obvious and probably the most common is for compliance reasons. Your organization may need to comply with regulations that require it to document data sensitivity, and the Data Classification feature gives you a fast, safe, and accessible way to capture and document data sensitivity. Many backup applications also use these fields to help prioritize data for backups in case of catastrophic data loss.
The most immediate procedural reason an Admin may wish to employ Data Classification is to support Data Retention policies and procedures. Many organizations follow the best practice of archiving data not actively needed in Salesforce. This archiving can happen either outside of Salesforce or within Salesforce, in things like Big Objects. Obviously, archiving data in either way makes it less accessible to users, so admins need to be thoughtful about what data gets archived and when. Defining data retention parameters around data sensitivity can help ensure the most vital data stays quickly accessible, while data that is less sensitive or less mission-critical gets archived in a timely manner.
Finally, we all hope we never have to use it, but Admins need a disaster recovery plan. Being able to quickly and clearly report on the sensitivity of data that may have been lost or compromised can be a vital part of that plan, and Data Classification delivers this capability.
Implementing Data Classification
Now that we have a handle on what Data Classification in Salesforce actually means, let’s talk about how you might implement it.
Step 1 is to define your organization’s schema. We touched on this above; the key here is to get your stakeholders involved. Once you’ve defined your Data Sensitivity Levels and which Compliance Categories your organization wants to use, you’re ready to move forward.
Step 2 is configuring your schema in Salesforce. From Setup, search for and click on “Data Classification Settings.” From here you can access all your options – and there are only three! – for customizing Data Classification in Salesforce.
“Edit Data Sensitivity Picklist Values” will allow you to customize your risk levels. This works just like customizing any other set of picklist values, including the ability to deactivate a value or delete it. “Edit Compliance Categorization Picklist Values” works the same way, but it’s not likely you need to do major customization here. It’s possible you may wish to add some industry or community-specific regulation, but for most organizations, the default list will work just fine.
You also have the option to “Use Default Data Sensitivity Level” by clicking the checkbox labeled as such in Data Classification Settings. This will update the classification on several fields on the Account, Case, Contact, Lead, and User objects, plus a few related child objects. Only the “Confidential” and “Internal” levels are used, and they are mostly applied to fields like phone numbers and emails. In short, the default use is intuitive and may be a great place to start if you are keeping the default Data Sensitivity levels.
Now you’re ready to go forth and Classify your Data! But if you have a lot of data points to classify, opening each individual field could be a pain. Fortunately, Salesforce also provides a tool to get this done faster. From Setup, search for “Data Classification Upload.” Here you’ll find instructions on how to format a CSV file of the fields you wish to classify, along with an upload wizard. Using this tool will allow you to classify all your fields in one go, including any fields you may wish to classify that aren’t accessible from the Object Manager.
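If your classification decisions already live in a spreadsheet or a script, you can generate the upload file instead of typing it by hand. Here is a minimal sketch using Python’s standard csv module. The column headers are placeholders I chose for illustration; use the exact headers given in the instructions on the Data Classification Upload page. The `build_upload_csv` helper is likewise hypothetical.

```python
import csv
import io

# PLACEHOLDER headers: replace with the exact column names shown in the
# instructions on the Data Classification Upload page in Setup.
HEADERS = ["Object", "Field", "Data Sensitivity Level"]


def build_upload_csv(classifications):
    """Render (object, field, sensitivity) tuples as CSV text, sorted for easy review."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADERS)
    for obj, field, level in sorted(classifications):
        writer.writerow([obj, field, level])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        ("Contact", "Email", "Confidential"),
        ("Account", "Phone", "Internal"),
    ]
    print(build_upload_csv(rows), end="")
```

Sorting the rows keeps the file stable between runs, which makes it easier to compare versions as your schema evolves.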
While you’re here, you may wish to give the Data Classification Download tool a spin. Also accessible from Setup, it works the same way, only in reverse.
Step 3 is to monitor and improve. An Admin’s most important tool for monitoring Data Classification is reporting, but before you can run a report on your fancy new schema in Salesforce, you’ll need to create a Custom Report Type.
To get the report you’re looking for, the Primary Object should be “Entity Definition,” which captures metadata about the object, and the related object should be “Field Definition,” which captures metadata about individual fields.
Salesforce has a brief step-by-step on how to do this, but you should know this results in a Report Type that has a lot of unrelated data points in it. To make things easier, name your report type something descriptive, such as “Data Classification,” and customize the report layout so the key fields are shown by default. As my starting point, I display Label from the Entity Definition, and Label, Developer Name, Compliance Category, and Data Sensitivity from the Field Definition.
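Once you can export that report, a few lines of scripting can turn it into a quick health check. The sketch below is one way to do it, assuming each exported row is a dictionary keyed by the column names I listed above (“Developer Name,” “Data Sensitivity,” and so on). Your export’s headers may differ, and `summarize` is a made-up helper name.

```python
from collections import Counter


def summarize(rows):
    """Count fields per sensitivity level and list fields with no level set."""
    # Rows with a blank or missing level are grouped under "Unclassified".
    counts = Counter(row.get("Data Sensitivity") or "Unclassified" for row in rows)
    unclassified = [
        row["Developer Name"] for row in rows if not row.get("Data Sensitivity")
    ]
    return counts, unclassified


if __name__ == "__main__":
    sample = [
        {"Developer Name": "Email", "Data Sensitivity": "Confidential"},
        {"Developer Name": "Phone", "Data Sensitivity": "Internal"},
        {"Developer Name": "Fax", "Data Sensitivity": ""},
    ]
    counts, unclassified = summarize(sample)
    print(dict(counts))
    print(unclassified)
```

The list of unclassified fields gives you a to-do list for your next classification pass.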
What’s Next for Data Classification?
Whenever I learn about a new feature in Salesforce, I always like to review the open ideas in the Salesforce IdeaExchange covering that feature. Here are some standouts that you may also want to review and upvote! Maybe they will be coming soon to an org near you…
Upload ComplianceGroup in FieldDefinition for Data Classification – Currently, you can mass upload the Data Sensitivity level, but not the Compliance Category. This idea suggests adding that field to the mass update tool.
Include Data Classification Values When Deploying Fields – Data Classification values don’t move with a field when it’s deployed in a change set, so all this work has to be done in production. Wouldn’t it be easier to classify as you develop in a sandbox, and have those values deployed along with the change set? This idea thinks so!
Finally, here’s my idea! What’s yours?
Show Data Classification Details in the UI – Currently, Data Classification allows us to document the sensitivity of a given field in terms of the impact of its loss or compromise. To see this data, users must run a report on a custom Report Type. Please add an option to show this information in the UI near the field for given levels of sensitivity. This would keep users more mindful of data sensitivity and encourage best practices.