Data Classification 101
Data sensitivity is a relative thing. Where should I put sensitive data? Who can access it? Let's go over how to put a Data Classification Policy & Standard together.
Most people don’t read security policies, but if there is one policy that is helpful to a wide array of folks, it’s the Data Classification and Handling Policy.
All Roads Lead To Data… Classification
This Data Classification and Handling policy details exactly how data should be handled, and since everyone in the company handles data, it directly affects everyone, every day.
Today’s post is a guide on how to put together a Data Classification and Handling Standard. It can be part of your policy or a separate guide. As with all security documentation, make sure it’s accessible to everyone and that they are informed of its existence.
Why Is It So Important?
This goes back to the notion that all people inherently want to do the right thing. However, if they are never told what the right thing to do is, how will they know? (If you read my articles, you might be experiencing deja vu - I know I am).
That’s why we need Policies and Standards in the first place.
Data Classification is an often-referenced tool. Here are some use cases:
Software Engineers trying to understand whether to encrypt User IDs or not
Finance folks trying to know whether they are allowed to share any of the following with external parties:
Documents with transaction IDs
Employee IDs
Transactions with the last four digits of credit cards
The What - Data Inventory
The first step in the process is to gather ALL the data elements that exist in your company. I don’t care how granular they are: if it’s a piece of data with a label or category, it should be indexed and catalogued.
Of course this list will continue growing, which is why it’s important to make it a living document that lives in your internal wiki like Confluence or Notion.
Below is an example of a data element list:
Employee Names
Employee Addresses
Employee SSNs
Employee DOBs
Customer Names
Customer Addresses
Customer SSNs
Customer DOBs
Customer UUID
Transaction ID
Authentication Material
Passwords
Password Hashes
API Keys
Authentication Cookies
Current Company Financial Information
Historical Company Financial Information
Company Capitalization Table
Employee Salary and Compensation Information
Employee Reviews
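If your wiki page ever outgrows a plain list, the same inventory can be kept as structured data. Here is a minimal sketch in Python, assuming a hypothetical `DataElement` record with a `group` field (my own invention for illustration, not part of any standard) and a handful of entries taken from the list above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    name: str   # label exactly as it appears in the inventory
    group: str  # hypothetical grouping field, e.g. "Employee"


# A few entries from the example list; a real inventory would hold all of them.
INVENTORY = [
    DataElement("Employee Names", "Employee"),
    DataElement("Employee SSNs", "Employee"),
    DataElement("Customer SSNs", "Customer"),
    DataElement("Customer UUID", "Customer"),
    DataElement("Transaction ID", "Transaction"),
    DataElement("Password Hashes", "Authentication Material"),
    DataElement("API Keys", "Authentication Material"),
    DataElement("Company Capitalization Table", "Company"),
]


def add_element(inventory, name, group):
    """Add a new element, refusing case-insensitive duplicates."""
    if any(e.name.lower() == name.lower() for e in inventory):
        raise ValueError(f"{name!r} is already in the inventory")
    element = DataElement(name, group)
    inventory.append(element)
    return element


def by_group(inventory, group):
    """Return the names of every element in one group, in insertion order."""
    return [e.name for e in inventory if e.group == group]
```

Calling `add_element(INVENTORY, "Authentication Cookies", "Authentication Material")` grows the list the same way the living document would, and the duplicate check keeps two teams from cataloguing the same element twice.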
The Who - Data Classification
Now that you have identified all the data, you need to classify it into several buckets. How the data is classified depends slightly on the next step, how it’s handled, but for now we’ll just define the buckets. You always reserve the right to tweak and adjust as needed.
Generally we like to use four main buckets to classify data:
Public
Internal
Confidential
Restricted
You are welcome to add or remove buckets as you see fit. Government classification schemes, for example, can have many more levels and markings, so it’s up to your appetite for complexity here!
Let’s break down what each category represents. The way you want to approach all these categories from a “sensitivity” perspective is to understand what would happen if this data got into the “wrong” hands.
Public
This type of data is publicly accessible and ok for anyone to access. It might be a press release or any other data that is well known to the public in one way or another.
Internal
Most corporate data will fall under this category. This is data that can be viewed by employees and contractors and is needed for day-to-day business operations. This data is not to be viewed by non-employees, including your kids or spouse. 😁
Confidential
This data is limited to a subset of the above people. Pretty self-explanatory, but it could be limited to a job role, department, or combination of other factors. The idea here is to provide access only on a “need to know” basis.
Restricted
In addition to being confidential, this data is most likely regulated in some form or fashion. Examples include SSNs, DOBs, and full credit card numbers (primary account numbers, or PANs, which are part of cardholder data, or CHD) along with card security codes. This data should be treated extremely carefully and ideally should not be readily accessible by humans, but we’ll talk data handling in a minute.
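To make the four buckets concrete, here is a rough Python sketch that models them as an ordered enum. Only the Public and Restricted examples come from the descriptions above. The "Employee Reviews" mapping, the default of Internal for anything unlisted, and the "a document is as sensitive as its most sensitive element" helper are my own assumptions for illustration, so adjust them to your own standard.

```python
from enum import IntEnum


class Classification(IntEnum):
    # Ordered from least to most sensitive so levels can be compared.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


CLASSIFICATIONS = {
    "Press Releases": Classification.PUBLIC,
    "Employee Reviews": Classification.CONFIDENTIAL,  # assumption, not from the post
    "Employee SSNs": Classification.RESTRICTED,
    "Customer SSNs": Classification.RESTRICTED,
    "Customer DOBs": Classification.RESTRICTED,
}


def classify(element, default=Classification.INTERNAL):
    """Look up an element; unlisted data falls back to Internal (an assumption)."""
    return CLASSIFICATIONS.get(element, default)


def highest(elements):
    """Treat a document as being as sensitive as its most sensitive element."""
    return max(classify(e) for e in elements)
```

With this in place, `highest(["Transaction ID", "Customer SSNs"])` comes back as `Classification.RESTRICTED`, which is the kind of answer the Finance use case earlier is really asking for. Some teams prefer to default unknown data to a stricter level instead, and that is a one-argument change.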
The How - Data Handling
Data handling is how we are going to bring everything together. There are a variety of areas you have to consider here.
Physical handling of information
Logical handling of information
Transport method(s) allowed
Encryption at rest
Encryption in transit
Let’s break it down a little more…
Physical handling
Some questions to consider:
Can the data be printed?
If so, how should it be disposed of?
Can we put it on a USB stick?
Can we mail it?
If so, does it have to be tracked?
Logical Handling
Can the data be emailed to other employees? How about outside the company?
Can it be in Slack or Teams?
Can it be downloaded to a local computer? How about a personal machine?
Does it have to be encrypted at rest? What about in transit?
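One way to force yourself to answer every one of these questions for every bucket is to write the answers down as data. The sketch below is a hypothetical Python structure: the field names and every True/False value are placeholders I made up for illustration, not recommendations.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HandlingRules:
    may_print: bool
    may_copy_to_usb: bool
    may_email_externally: bool
    may_post_in_chat: bool       # e.g. Slack or Teams
    must_encrypt_at_rest: bool
    must_encrypt_in_transit: bool


# Placeholder answers only; your own standard decides the real values.
HANDLING = {
    "Public": HandlingRules(True, True, True, True, False, False),
    "Internal": HandlingRules(True, False, False, True, False, True),
    "Confidential": HandlingRules(False, False, False, False, True, True),
    "Restricted": HandlingRules(False, False, False, False, True, True),
}


def rule(level, name):
    """Return one handling answer, rejecting rule names that are not defined."""
    valid = {f.name for f in fields(HandlingRules)}
    if name not in valid:
        raise KeyError(f"unknown handling rule: {name}")
    return getattr(HANDLING[level], name)
```

Because every bucket must supply all six fields, a missing answer shows up as an error when the structure is built rather than as an awkward question from an employee six months later.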
Bringing It All Together
So now we have the what, the who, and the how… let’s bring it all together now. You have all the important elements of a good policy or standard document to define everything. You may even have a table like the one below to make it easy for folks to read.
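For illustration, here is a small Python sketch that renders a handling matrix as a plain-text table, with classification levels as columns and handling questions as rows. Every Yes/No in it is a made-up placeholder and the `render_table` helper is hypothetical, so treat it as a layout idea rather than as policy.

```python
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]

# Placeholder answers, one per level, in the same order as LEVELS.
MATRIX = {
    "Print": ["Yes", "Yes", "No", "No"],
    "Email externally": ["Yes", "No", "No", "No"],
    "Slack or Teams": ["Yes", "Yes", "No", "No"],
    "Encrypt at rest": ["No", "No", "Yes", "Yes"],
    "Encrypt in transit": ["No", "Yes", "Yes", "Yes"],
}


def render_table(levels, matrix):
    """Build an aligned text table: one row per question, one column per level."""
    label_width = max(len(question) for question in matrix)
    col_width = max(len(level) for level in levels)
    header = " " * label_width + " | " + " | ".join(
        level.ljust(col_width) for level in levels
    )
    lines = [header, "-" * len(header)]
    for question, answers in matrix.items():
        cells = " | ".join(answer.ljust(col_width) for answer in answers)
        lines.append(question.ljust(label_width) + " | " + cells)
    return "\n".join(lines)


print(render_table(LEVELS, MATRIX))
```

The output pastes cleanly into a wiki code block, and regenerating it from data means the table and the policy text are less likely to drift apart.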
Note Of Caution
However you decide to put together your policy or standard document, one thing to keep in mind is to be realistic about expectations for that data AND empower your employees to be able to meet those expectations. Give them the tools necessary and set them up for success.
So if you want them to encrypt emails to 3rd parties, but you don’t give them a tool… how do you expect them to do so? 🤔
Well, I hope this has been helpful for you. If so, please share!
What has been your experience? Leave a comment below.
Here is a detailed guide: https://www.spirion.com/data-classification/