Chapter 13 of 27
Data Protection Strategies: Classification, Encryption, and Tokenization
Follow the lifecycle of sensitive data and choose the right mix of classification, encryption, and other controls to keep it protected wherever it travels.
Big Picture: Protecting Data Across Its Lifecycle
From Systems to Data
You have seen how enterprises secure complex hybrid environments. Now you zoom in on the data itself: how to keep it protected wherever it lives or travels.
Lifecycle View
Think in stages: discover and classify data, rate its sensitivity, apply controls, monitor and respond, then retire it securely when no longer needed.
Exam Mapping
For each scenario, ask: what data type and classification is this, where is it in the lifecycle, and which control (encryption, tokenization, masking, DLP) fits best?
Step 1: Data Classification and Handling Rules
Why Classify?
Data classification means labeling data by sensitivity and impact, then tying those labels to clear handling rules. You cannot pick controls until you know what you are protecting.
Typical Levels
Many organizations use tiers such as Public, Internal, Confidential, and Restricted. Regulated data (cardholder, health, special category personal data) usually maps to the top tiers.
Handling Rules
Each label must have rules: encryption requirements, who can access, whether it can leave the company, and how it appears in test environments and logs.
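One way to make handling rules concrete is a small lookup table keyed by classification label. The sketch below is illustrative only: the `HandlingRule` fields, the `rules_for` helper, and the specific values per tier are assumptions for this example, not a standard or a required policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandlingRule:
    encrypt_at_rest: bool
    encrypt_in_transit: bool
    may_leave_company: bool
    test_env_form: str  # "as-is", "masked", or "never" (hypothetical values)


# Hypothetical policy table; a real organization defines its own values.
HANDLING_RULES = {
    "Public": HandlingRule(False, True, True, "as-is"),
    "Internal": HandlingRule(False, True, False, "as-is"),
    "Confidential": HandlingRule(True, True, False, "masked"),
    "Restricted": HandlingRule(True, True, False, "never"),
}


def rules_for(label: str) -> HandlingRule:
    """Return the rules for a label; unknown labels fall back to the strictest tier."""
    return HANDLING_RULES.get(label, HANDLING_RULES["Restricted"])
```

Falling back to the strictest tier for unlabeled data is a design choice in this sketch: it means a missing label never results in weaker handling.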
Classify This Data (Thought Exercise)
Work through these quick scenarios. Do not overthink; focus on mapping to a classification and a couple of required protections.
- Company public website FAQ
  - How would you classify it (Public/Internal/Confidential/Restricted)?
  - Is encryption at rest required? In transit?
- Employee salary spreadsheet on a shared drive
  - Likely classification?
  - At least two controls you would require (e.g., ACLs, encryption, DLP)?
- Customer credit card numbers stored in a database
  - Likely classification?
  - What techniques from this module would you combine (e.g., database encryption, tokenization, masking in logs)?
- IoT sensor data from manufacturing equipment, including operator IDs
  - What classification makes sense?
  - How might classification affect where this data is stored (cloud vs on-prem) and how it is encrypted?
After you answer, compare your reasoning to this pattern:
- The more personal, financial, or safety-related the data, the higher the classification.
- Higher classification means stronger encryption, tighter access control, and stricter monitoring.
Use this mindset on Security+ exam questions: identify the data type and its implied classification before picking the control.
Step 2: Encryption Basics, In Transit vs At Rest
Plaintext vs Ciphertext
Encryption turns readable plaintext into unreadable ciphertext using a key and algorithm. Only someone with the right key can reverse the process.
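The plaintext, key, and ciphertext relationship can be shown with a teaching toy. The sketch below uses a one-time pad (XOR with a random key as long as the message) because it fits in a few lines of standard-library Python. It is an illustration only: real systems use vetted algorithms such as AES through a maintained cryptography library, and the `xor_bytes` helper and sample data are made up for this example.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad, teaching toy)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"account: 12345678"
key = secrets.token_bytes(len(plaintext))  # random key, never reused

ciphertext = xor_bytes(plaintext, key)     # unreadable without the key
recovered = xor_bytes(ciphertext, key)     # the same key reverses the process
```

Without `key`, the ciphertext bytes carry no usable information about the plaintext; with it, decryption is the same operation run again.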
Transit vs Rest
Data in transit moves over networks and is protected with TLS, VPNs, and secure protocols. Data at rest sits on disks, databases, and backups and needs storage encryption.
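For data in transit, most of the protection comes from configuring TLS correctly on the client and server. As a minimal sketch, assuming a Python client and a policy of TLS 1.2 or newer (the function name `strict_client_context` is hypothetical), the standard `ssl` module can build a context that verifies certificates and hostnames:

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client-side TLS context: certificate and hostname checks on, TLS 1.2+ only."""
    ctx = ssl.create_default_context()  # enables CERT_REQUIRED and check_hostname
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_client_context()
```

A socket would then be wrapped with `ctx.wrap_socket(sock, server_hostname=...)` so the server's certificate is checked against the name the client intended to reach.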
Limits of Encryption
Encryption hides data from eavesdroppers, but it does not decide who may access decrypted data. You still need strong access control and integrity checks.
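The integrity point can be illustrated with a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library to detect tampering; the helper names `tag` and `verify` and the sample record are assumptions for this example. Authenticated encryption modes such as AES-GCM combine confidentiality and integrity in one step, but the idea is the same.

```python
import hashlib
import hmac
import secrets


def tag(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 integrity tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, received_tag: bytes, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(tag(message, key), received_tag)


mac_key = secrets.token_bytes(32)
record = b"amount=100;to=alice"
record_tag = tag(record, mac_key)

untouched_ok = verify(record, record_tag, mac_key)
tampered_ok = verify(b"amount=900;to=alice", record_tag, mac_key)
```

Here `untouched_ok` is true and `tampered_ok` is false: a modified record no longer matches its tag, which is something encryption for confidentiality alone does not guarantee.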
Step 3: Encryption for Data at Rest (Full Disk, Volume, File, Database)
Full Disk and Volume
Full disk encryption protects an entire drive, which makes it a good fit for lost or stolen laptops. Volume encryption targets specific partitions, such as a database volume, instead of the whole disk.
File and Folder
File or folder encryption protects only selected data and can enforce per-user access, even when the OS is running and the disk is otherwise unlocked.
Database Encryption
Database options include transparent data encryption for whole databases and column-level encryption for specific fields like SSNs or card numbers.
Step 4: Key Management, Rotation, and PKI in Data Protection
Keys Matter Most
Strong crypto is useless if keys are weak or exposed. Generate keys securely, store them safely, and never hard-code them into source or config files.
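As a rough sketch of "generate securely, never hard-code," the example below creates a random 256-bit key and loads it from the process environment at runtime instead of from source. The variable name `APP_DATA_KEY` and both helper functions are hypothetical; in production the value would typically be injected by a secrets manager, KMS, or HSM rather than set by hand.

```python
import base64
import os
import secrets

KEY_ENV_VAR = "APP_DATA_KEY"  # hypothetical variable name for this sketch


def generate_key_b64(num_bytes: int = 32) -> str:
    """Generate a random key with a CSPRNG and return it base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def load_key(env_var: str = KEY_ENV_VAR) -> bytes:
    """Load the key from the environment; fail loudly instead of using a default."""
    encoded = os.environ.get(env_var)
    if not encoded:
        raise RuntimeError(f"{env_var} is not set; refusing to use a built-in key")
    key = base64.b64decode(encoded, validate=True)
    if len(key) < 32:
        raise ValueError("key is shorter than 256 bits")
    return key
```

The important property is the failure mode: if the key is missing, the program stops rather than silently falling back to a constant baked into the code.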
Rotate and Separate
Rotate keys regularly and after incidents, and use different keys for different systems and purposes to limit the impact of any single key compromise.
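Rotation and separation can be sketched as a versioned keyring per purpose. The `Keyring` class below is a hypothetical in-memory design for illustration; a real deployment would delegate this to a KMS or HSM. Each purpose gets its own keyring, new data uses the current version, and older versions stay available only until data encrypted under them has been re-encrypted.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Keyring:
    """Versioned keys for one purpose (hypothetical sketch, in memory only)."""
    purpose: str
    _keys: dict[int, bytes] = field(default_factory=dict)
    current_version: int = 0

    def rotate(self) -> int:
        """Create a new key version and make it the active one."""
        self.current_version += 1
        self._keys[self.current_version] = secrets.token_bytes(32)
        return self.current_version

    def current(self) -> tuple[int, bytes]:
        """Return (version, key) to use for new encryption operations."""
        if not self._keys:
            raise LookupError("no key yet; call rotate() first")
        return self.current_version, self._keys[self.current_version]

    def get(self, version: int) -> bytes:
        """Return an older key so existing data can still be decrypted."""
        return self._keys[version]

    def retire(self, version: int) -> None:
        """Delete an old key once nothing depends on it."""
        if version == self.current_version:
            raise ValueError("cannot retire the active key")
        del self._keys[version]


backups = Keyring("db-backups")   # separate keyrings per purpose
tokens = Keyring("token-vault")
v1 = backups.rotate()
tokens.rotate()
v2 = backups.rotate()             # scheduled or post-incident rotation
```

Storing the key version alongside each ciphertext is what makes rotation practical: the reader knows which key to ask for without trying them all.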
PKI’s Role
PKI manages certificates and public keys, enabling TLS, secure email, and code signing. It underpins encryption and authentication across enterprise data flows.
Step 5: Tokenization, Masking, and Why They Matter
What is Tokenization?
Tokenization swaps a sensitive value for a non-sensitive token, with the mapping held in a secure system. Breaching the tokenized database alone reveals little.
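A minimal sketch of the idea, assuming a simple in-memory vault: the `TokenVault` class and the `tok_` prefix are invented for this example, and a real vault would be a separate, hardened service with its own encryption and access control. The token is random, so it has no mathematical relationship to the original value.

```python
import secrets


class TokenVault:
    """Toy token vault: maps random tokens to the sensitive values they replace."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return the existing token for a value, or mint a new random one."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; only the vault can do this."""
        return self._token_to_value[token]


vault = TokenVault()
card_token = vault.tokenize("4111111111111111")  # well-known test card number
```

An application database would store only `card_token`. An attacker who copies that database, but not the vault, gets tokens that cannot be reversed.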
What is Masking?
Masking hides part of the data, such as showing only the last four digits of a card, so staff and logs see less sensitive information.
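The last-four-digits pattern can be sketched in a few lines. The function name `mask_card_number` and its parameters are assumptions for this example; it masks every digit except the final few and leaves separators such as spaces or hyphens in place.

```python
def mask_card_number(pan: str, visible: int = 4, mask_char: str = "*") -> str:
    """Replace all but the last `visible` digits with mask_char, keeping separators."""
    total_digits = sum(c.isdigit() for c in pan)
    keep_from = max(total_digits - visible, 0)  # index of first digit left visible
    out = []
    seen = 0
    for c in pan:
        if c.isdigit():
            out.append(c if seen >= keep_from else mask_char)
            seen += 1
        else:
            out.append(c)
    return "".join(out)


masked = mask_card_number("4111 1111 1111 1111")
```

Applying a function like this before values reach logs or support screens means the full number never appears there in the first place.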
Why Use Them?
Tokenization and masking let systems function while limiting who ever sees real sensitive data, especially in logs, support tools, and test environments.
Step 6: End-to-End Scenario – Payment Data Through Its Lifecycle
Step 1–2: Capture and Transit
Customer card data is entered on a web page and protected with HTTPS. The web tier forwards it over mutual TLS or a VPN; logs mask the card number.
Step 3–4: Tokenize and Store
The payment processor tokenizes card numbers. The merchant stores only tokens and last four digits in a database protected by TDE or volume encryption.
Step 5–6: Keys and DLP
Keys live in a KMS or HSM and are rotated. DLP policies watch for card patterns leaving via email or file shares, blocking or alerting on violations.
Step 7: Data Loss Prevention (DLP) and Discovery
What DLP Does
DLP tools watch for sensitive data leaving endpoints, networks, or cloud apps and can log, warn, or block risky transfers.
How DLP Detects
They use pattern matching, keyword dictionaries, and classification labels to spot things like card numbers, IDs, or tagged confidential documents.
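Pattern matching for card numbers is often paired with a checksum to cut false positives. The sketch below is a simplified illustration, not how any particular DLP product works: the regular expression and function names are assumptions. It finds 13 to 19 digit sequences, allowing spaces or hyphens, and keeps only those that pass the Luhn check used by payment card numbers.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in the text that look like cards and pass Luhn."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        candidate = re.sub(r"[ -]", "", match.group())
        if luhn_valid(candidate):
            hits.append(candidate)
    return hits


outbound_email = "Please charge 4111 1111 1111 1111 for order 4111111111111112."
found = find_card_numbers(outbound_email)
```

In this example only the first number is flagged; the second has the right shape but fails the checksum, so a policy built on this check would not alert on it.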
Typical Exam Clues
If a question asks how to stop users from emailing or uploading sensitive data outside the org, DLP plus classification is a strong candidate.
Quiz 1: Choosing the Right Control
Answer this exam-style question to test your understanding.
A company processes customer credit card payments through a web application. They want to ensure that if their customer database is breached, attackers cannot obtain usable card numbers, while still allowing recurring billing. Which control BEST meets this requirement?
- A. Implement full disk encryption on the database server
- B. Use tokenization for card numbers and store only tokens in the customer database
- C. Enable TLS for all connections between the web server and the database server
- D. Apply masking so that only the last four digits of the card number are displayed to customer service staff
Answer: B. Use tokenization for card numbers and store only tokens in the customer database
Full disk encryption (A) protects data at rest if the disk is stolen, but if attackers access the running database, they can see decrypted data. TLS (C) protects data in transit, not in a breached database. Masking (D) limits what staff see, but full numbers may still be stored. Tokenization (B) replaces card numbers with non-sensitive tokens in the merchant database, so a database breach does not reveal real card data while still supporting recurring billing via tokens.
Quiz 2: Data at Rest Options
Another quick check on at-rest encryption choices.
An organization stores employee HR records, including Social Security numbers, in a large relational database. They want to protect only the SSN field with encryption while minimizing changes to existing applications. Which solution is MOST appropriate?
- A. Full disk encryption on the database server
- B. Column-level encryption of the SSN field
- C. Network DLP on outbound email
- D. File-level encryption of database backup files
Answer: B. Column-level encryption of the SSN field
Full disk encryption (A) and file-level encryption of backups (D) protect entire storage, not a specific field. Network DLP (C) helps prevent data exfiltration but does not encrypt stored SSNs. Column-level encryption (B) targets just the sensitive SSN field, which matches the requirement, and it is a common pattern for protecting specific fields in a database.
Key Term Review
Review these terms to reinforce the most important concepts before you move on.
- Data classification
  - The process of labeling data based on its sensitivity and business impact, and tying those labels to specific handling and protection requirements.
- Data in transit
  - Data that is moving across a network, such as web traffic, API calls, email, or VPN tunnels, typically protected with TLS or VPN encryption.
- Data at rest
  - Data that is stored on persistent media such as disks, SSDs, databases, backups, and mobile devices, often protected with full disk, volume, file, or database encryption.
- Full disk encryption (FDE)
  - An encryption method that protects an entire drive, including the operating system and user data, primarily to mitigate risks from lost or stolen devices.
- Transparent Data Encryption (TDE)
  - A database feature that encrypts data at the storage level so that applications can access data normally while it remains encrypted on disk and in backups.
- Tokenization
  - A technique that replaces a sensitive value with a non-sensitive token, with the mapping stored in a secure system, reducing exposure if databases are breached.
- Masking
  - The practice of obscuring part of a data value, such as hiding all but the last few digits of a credit card number, to limit what users and logs can see.
- Key rotation
  - The practice of periodically replacing cryptographic keys to limit the damage from key compromise and to meet security and compliance policies.
- Public key infrastructure (PKI)
  - A system for creating, managing, distributing, and revoking digital certificates that bind public keys to identities, enabling secure communication and authentication.
- Data Loss Prevention (DLP)
  - Technologies and processes used to detect and prevent unauthorized transmission or disclosure of sensitive data across endpoints, networks, and cloud services.
Step 8: Putting It All Together for Security+ and Real Environments
4-Step Exam Checklist
For each question: identify the data, locate it in the lifecycle, choose the main control type, and ignore distractors that do not solve the specific problem.
Real-World Mix
Enterprises blend classification, layered encryption, tokenization, masking, and DLP, all shaped by governance, risk, and compliance requirements.
Your Next Moves
Use Skarp diagnostics and mock exams to test these concepts; your spaced review and gap guide will reinforce areas where you struggle.
Key Terms
- Hybrid environment
  - An enterprise environment that includes a mix of cloud, mobile, Internet of Things (IoT), operational technology (OT), and on-premises resources that must be monitored and secured.