Skarp

Chapter 13 of 27

Data Protection Strategies: Classification, Encryption, and Tokenization

Follow the lifecycle of sensitive data and choose the right mix of classification, encryption, and other controls to keep it protected wherever it travels.

Big Picture: Protecting Data Across Its Lifecycle

From Systems to Data

You have seen how enterprises secure complex hybrid environments. Now you zoom in on the data itself: how to keep it protected wherever it lives or travels.

Lifecycle View

Think in stages: discover and classify data, rate its sensitivity, apply controls, monitor and respond, then retire it securely when no longer needed.

Exam Mapping

For each scenario, ask: what data type and classification is this, where is it in the lifecycle, and which control (encryption, tokenization, masking, DLP) fits best?

Step 1: Data Classification and Handling Rules

Why Classify?

Data classification means labeling data by sensitivity and impact, then tying those labels to clear handling rules. You cannot pick controls until you know what you are protecting.

Typical Levels

Many organizations use tiers like Public, Internal, Confidential, and Restricted. Regulated data (cardholder, health, special category personal data) usually maps to the top tiers.

Handling Rules

Each label must have rules: encryption requirements, who can access, whether it can leave the company, and how it appears in test environments and logs.
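
To make the link between labels and handling rules concrete, here is a minimal sketch of a policy lookup table. The tier names come from the text above; the `HandlingRule` fields and the specific True/False values are illustrative assumptions, not a standard. Real values come from your organization's policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRule:
    """Handling requirements attached to one classification label (hypothetical fields)."""
    encrypt_at_rest: bool
    encrypt_in_transit: bool
    may_leave_company: bool
    mask_in_logs_and_test: bool


# Assumed example policy; every value here is illustrative only.
HANDLING_RULES = {
    "Public": HandlingRule(False, True, True, False),
    "Internal": HandlingRule(False, True, False, False),
    "Confidential": HandlingRule(True, True, False, True),
    "Restricted": HandlingRule(True, True, False, True),
}


def rules_for(label: str) -> HandlingRule:
    """Look up handling rules; unknown labels fail closed to the strictest tier."""
    return HANDLING_RULES.get(label, HANDLING_RULES["Restricted"])
```

Falling back to the strictest tier for unlabeled data is a design choice in this sketch: data that has not been classified yet is treated as sensitive until someone says otherwise.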

Classify This Data (Thought Exercise)

Work through these quick scenarios. Do not overthink; focus on mapping to a classification and a couple of required protections.

  1. Company public website FAQ
  • How would you classify it (Public/Internal/Confidential/Restricted)?
  • Is encryption at rest required? In transit?
  2. Employee salary spreadsheet on a shared drive
  • Likely classification?
  • At least two controls you would require (e.g., ACLs, encryption, DLP)?
  3. Customer credit card numbers stored in a database
  • Likely classification?
  • What techniques from this module would you combine (e.g., database encryption, tokenization, masking in logs)?
  4. IoT sensor data from manufacturing equipment, including operator IDs
  • What classification makes sense?
  • How might classification affect where this data is stored (cloud vs on‑prem) and how it is encrypted?

After you answer, compare your reasoning to this pattern:

  • The more personal, financial, or safety-related the data, the higher the classification.
  • Higher classification means stronger encryption, tighter access control, and stricter monitoring.

Use this mindset on Security+ exam questions: identify the data type and its implied classification before picking the control.

Step 2: Encryption Basics, In Transit vs At Rest

Plaintext vs Ciphertext

Encryption turns readable plaintext into unreadable ciphertext using a key and algorithm. Only someone with the right key can reverse the process.
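
The plaintext, key, and ciphertext relationship can be illustrated with a toy example. The sketch below uses a one-time-pad style XOR built only from the Python standard library; it is a teaching aid under that assumption, not something to deploy. Real systems should use a vetted algorithm such as AES through a maintained cryptography library.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (key must be as long as data)."""
    if len(key) != len(data):
        raise ValueError("key and data must be the same length")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"card=4111111111111111"
key = secrets.token_bytes(len(plaintext))  # random key, meant to be used only once

ciphertext = xor_bytes(plaintext, key)     # encrypt: unreadable without the key
recovered = xor_bytes(ciphertext, key)     # decrypt: the same key reverses the process
```

Applying the same key a second time restores the original bytes, which is the "only someone with the right key can reverse it" property in miniature.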

Transit vs Rest

Data in transit moves over networks and is protected with TLS, VPNs, and secure protocols. Data at rest sits on disks, databases, and backups and needs storage encryption.
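
For data in transit, a client usually gets TLS protection by building a properly configured context rather than writing any cryptography itself. The following is a minimal sketch using Python's standard `ssl` module; the function name `make_client_context` is a hypothetical helper, and the TLS 1.2 floor is an assumed policy choice.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()            # certificate and hostname checks are on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy: refuse anything older than TLS 1.2
    return ctx


context = make_client_context()
```

A socket would then be wrapped with `context.wrap_socket(sock, server_hostname=host)` so that the server's certificate is checked against the name you intended to reach.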

Limits of Encryption

Encryption hides data from eavesdroppers, but it does not decide who may access decrypted data. You still need strong access control and integrity checks.

Step 3: Encryption for Data at Rest (Full Disk, Volume, File, Database)

Full Disk and Volume

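Full disk encryption protects an entire drive, which makes it a good fit for lost or stolen laptops, but it offers little protection once the system is booted and unlocked. Volume encryption targets specific partitions, like a database volume, instead of the whole disk.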

File and Folder

File or folder encryption protects only selected data and can enforce per-user access, even when the OS is running and the disk is otherwise unlocked.

Database Encryption

Database options include transparent data encryption for whole databases and column-level encryption for specific fields like SSNs or card numbers.

Step 4: Key Management, Rotation, and PKI in Data Protection

Keys Matter Most

Strong crypto is useless if keys are weak or exposed. Generate keys securely, store them safely, and never hard-code them into source or config files.
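
One common way to keep keys out of source and config files is to have the runtime environment supply them, typically injected by a secrets manager or KMS. The sketch below assumes a hypothetical environment variable named `APP_DATA_KEY` holding a base64-encoded key; the variable name, helper names, and 32-byte minimum are all illustrative assumptions.

```python
import base64
import os
import secrets

KEY_ENV_VAR = "APP_DATA_KEY"  # hypothetical variable name, set by your secret store


def generate_key_b64(num_bytes: int = 32) -> str:
    """Create a random key from the OS CSPRNG, base64-encoded for storage in a secret store."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def load_key() -> bytes:
    """Read the key from the environment at runtime; fail loudly if it is missing."""
    encoded = os.environ.get(KEY_ENV_VAR)
    if not encoded:
        raise RuntimeError(f"{KEY_ENV_VAR} is not set; refusing to fall back to a default key")
    key = base64.b64decode(encoded, validate=True)
    if len(key) < 32:
        raise ValueError("key is shorter than 32 bytes")
    return key
```

Raising an error instead of falling back to a built-in default avoids the failure mode where a forgotten setting quietly leaves everything encrypted under a well-known key.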

Rotate and Separate

Rotate keys regularly and after incidents, and use different keys for different systems and purposes to limit the impact of any single key compromise.
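
Rotation and separation can be sketched together as a small key ring. In this illustration, each rotation creates a new versioned master key, and each purpose gets its own key derived from it. The `KeyRing` class and the HMAC-based derivation are simplifying assumptions; a production system would more likely rely on a KMS or a standard key derivation function such as HKDF.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional


def derive_purpose_key(master_key: bytes, purpose: str) -> bytes:
    """Derive a distinct 32-byte key per purpose using HMAC-SHA-256 (simplified derivation)."""
    return hmac.new(master_key, purpose.encode("utf-8"), hashlib.sha256).digest()


class KeyRing:
    """Keeps every key version so data protected under older keys stays readable after rotation."""

    def __init__(self) -> None:
        self._versions: Dict[int, bytes] = {}
        self.current_version = 0
        self.rotate()  # create version 1

    def rotate(self) -> int:
        """Generate a fresh master key and make it the current version."""
        self.current_version += 1
        self._versions[self.current_version] = secrets.token_bytes(32)
        return self.current_version

    def key_for(self, purpose: str, version: Optional[int] = None) -> bytes:
        """Return the key for a purpose, using the current version unless one is given."""
        chosen = self.current_version if version is None else version
        return derive_purpose_key(self._versions[chosen], purpose)
```

Storing the key version alongside each encrypted record lets you decrypt old data with `key_for(purpose, version=n)` while all new writes use the current key.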

PKI’s Role

PKI manages certificates and public keys, enabling TLS, secure email, and code signing. It underpins encryption and authentication across enterprise data flows.

Step 5: Tokenization, Masking, and Why They Matter

What is Tokenization?

Tokenization swaps a sensitive value for a non-sensitive token, with the mapping held in a secure system. Breaching the tokenized database alone reveals little.
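
The idea can be sketched with an in-memory token vault. Here the `TokenVault` class, the `tok_` prefix, and the dictionary storage are all hypothetical stand-ins for a hardened tokenization service; the point is only that the token is random and carries no information about the original value.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a secured tokenization service (illustration only)."""

    def __init__(self) -> None:
        self._token_to_value: dict = {}
        self._value_to_token: dict = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for the value, reusing the token if the value was seen before."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._token_to_value[token]
```

Downstream systems store and pass around only the token. An attacker who steals their database gets tokens that cannot be reversed without also compromising the vault.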

What is Masking?

Masking hides part of the data, such as showing only the last four digits of a card, so staff and logs see less sensitive information.
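
As a minimal sketch of last-four masking, the hypothetical helper below strips separators and replaces all but the final digits with a mask character. The function name and defaults are assumptions for illustration.

```python
def mask_card_number(card_number: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` digits; spaces and dashes are dropped first."""
    digits = "".join(c for c in card_number if c.isdigit())
    if visible <= 0 or len(digits) <= visible:
        # Too short to reveal anything safely, so mask every digit.
        return mask_char * len(digits)
    hidden = len(digits) - visible
    return mask_char * hidden + digits[hidden:]
```

Unlike tokenization, masking is one-way for the viewer: the masked string is what appears in a log line or support screen, while the full value (if it is kept at all) lives elsewhere under stronger controls.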

Why Use Them?

Tokenization and masking let systems function while limiting who ever sees real sensitive data, especially in logs, support tools, and test environments.

Step 6: End-to-End Scenario – Payment Data Through Its Lifecycle

Step 1–2: Capture and Transit

Customer card data is entered on a web page and protected with HTTPS. The web tier forwards it over mutual TLS or a VPN; logs mask the card number.

Step 3–4: Tokenize and Store

The payment processor tokenizes card numbers. The merchant stores only tokens and last four digits in a database protected by TDE or volume encryption.

Step 5–6: Keys and DLP

Keys live in a KMS or HSM and are rotated. DLP policies watch for card patterns leaving via email or file shares, blocking or alerting on violations.

Step 7: Data Loss Prevention (DLP) and Discovery

What DLP Does

DLP tools watch for sensitive data leaving endpoints, networks, or cloud apps and can log, warn, or block risky transfers.

How DLP Detects

They use pattern matching, keyword dictionaries, and classification labels to spot things like card numbers, IDs, or tagged confidential documents.
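
Pattern matching for card numbers is often paired with a checksum test to cut false positives. The sketch below combines a regular expression for 13 to 19 digit sequences with the Luhn check used by payment card numbers. The function names and the exact pattern are illustrative assumptions, not how any particular DLP product works.

```python
import re
from typing import List

# 13 to 19 digits, optionally separated by single spaces or dashes.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        n = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Return digit strings in the text that look like cards and pass the Luhn check."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

For example, `find_card_numbers("Please charge 4111 1111 1111 1111 today")` flags the well-known test number, while an arbitrary 16-digit reference such as `1234 5678 9012 3456` fails the checksum and is ignored.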

Typical Exam Clues

If a question asks how to stop users from emailing or uploading sensitive data outside the org, DLP plus classification is a strong candidate.

Quiz 1: Choosing the Right Control

Answer this exam-style question to test your understanding.

A company processes customer credit card payments through a web application. They want to ensure that if their customer database is breached, attackers cannot obtain usable card numbers, while still allowing recurring billing. Which control BEST meets this requirement?

  A. Implement full disk encryption on the database server
  B. Use tokenization for card numbers and store only tokens in the customer database
  C. Enable TLS for all connections between the web server and the database server
  D. Apply masking so that only the last four digits of the card number are displayed to customer service staff

Answer: B. Use tokenization for card numbers and store only tokens in the customer database

Full disk encryption (A) protects data at rest if the disk is stolen, but if attackers access the running database, they can see decrypted data. TLS (C) protects data in transit, not in a breached database. Masking (D) limits what staff see, but full numbers may still be stored. Tokenization (B) replaces card numbers with non-sensitive tokens in the merchant database, so a database breach does not reveal real card data while still supporting recurring billing via tokens.

Quiz 2: Data at Rest Options

Another quick check on at-rest encryption choices.

An organization stores employee HR records, including Social Security numbers, in a large relational database. They want to protect only the SSN field with encryption while minimizing changes to existing applications. Which solution is MOST appropriate?

  A. Full disk encryption on the database server
  B. Column-level encryption of the SSN field
  C. Network DLP on outbound email
  D. File-level encryption of database backup files

Answer: B. Column-level encryption of the SSN field

Full disk encryption (A) and file-level encryption of backups (D) protect entire storage, not a specific field. Network DLP (C) helps prevent data exfiltration but does not encrypt stored SSNs. Column-level encryption (B) targets just the sensitive SSN field, which matches the requirement, and it is a common pattern for protecting specific fields in a database. Column-level encryption can still require some query or application changes, but it is the only option listed that encrypts a single field.

Key Term Review

Flip through these cards to reinforce the most important terms before you move on.

Data classification
The process of labeling data based on its sensitivity and business impact, and tying those labels to specific handling and protection requirements.
Data in transit
Data that is moving across a network, such as web traffic, API calls, email, or VPN tunnels, typically protected with TLS or VPN encryption.
Data at rest
Data that is stored on persistent media such as disks, SSDs, databases, backups, and mobile devices, often protected with full disk, volume, file, or database encryption.
Full disk encryption (FDE)
An encryption method that protects an entire drive, including the operating system and user data, primarily to mitigate risks from lost or stolen devices.
Transparent Data Encryption (TDE)
A database feature that encrypts data at the storage level so that applications can access data normally while it remains encrypted on disk and in backups.
Tokenization
A technique that replaces a sensitive value with a non-sensitive token, with the mapping stored in a secure system, reducing exposure if databases are breached.
Masking
The practice of obscuring part of a data value, such as hiding all but the last few digits of a credit card number, to limit what users and logs can see.
Key rotation
The practice of periodically replacing cryptographic keys to limit the damage from key compromise and to meet security and compliance policies.
Public key infrastructure (PKI)
A system for creating, managing, distributing, and revoking digital certificates that bind public keys to identities, enabling secure communication and authentication.
Data Loss Prevention (DLP)
Technologies and processes used to detect and prevent unauthorized transmission or disclosure of sensitive data across endpoints, networks, and cloud services.

Step 8: Putting It All Together for Security+ and Real Environments

4-Step Exam Checklist

For each question: identify the data, locate it in the lifecycle, choose the main control type, and ignore distractors that do not solve the specific problem.

Real-World Mix

Enterprises blend classification, layered encryption, tokenization, masking, and DLP, all shaped by governance, risk, and compliance requirements.

Your Next Moves

Use Skarp diagnostics and mock exams to test these concepts; your spaced review and gap guide will reinforce areas where you struggle.

Key Terms

Hybrid environment
A hybrid environment is an enterprise environment that includes a mix of cloud, mobile, Internet of Things (IoT), operational technology (OT), and on-premises resources that must be monitored and secured.
