Enterprise Self-Encrypting Drives
User Guide - Part 1
100515636, Rev. B
© 2015, Seagate Technology LLC All rights reserved.
Publication number: 100515636, Rev. B September 2015
Seagate, Seagate Technology and the Spiral logo are registered trademarks of Seagate Technology LLC in the United States and/or other countries. Seagate, and
SeaTools are either trademarks or registered trademarks of Seagate Technology LLC or one of its affiliated companies in the United States and/or other countries.
All other trademarks or registered trademarks are the property of their respective owners.
No part of this publication may be reproduced in any form without written permission of Seagate Technology LLC.
Call 877-PUB-TEK1 (877-782-8351) to request permission.
One gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one trillion bytes when referring to hard drive capacity. Accessible capacity may vary depending on operating environment and formatting. Quantitative usage examples for various applications are for illustrative purposes. Actual quantities will vary based on various factors, including file size, file format, features and application software. Seagate reserves the right to change, without notice, product offerings or specifications.
1.0 Introduction
1.1 The fundamentals of data encryption
    1.1.1 Encryption basics
    1.1.2 The Advanced Encryption Standard (AES)
    1.1.3 Block ciphers
    1.1.4 Cipher Block Chaining (CBC)
1.2 Hash functions
1.3 Drive locking
This user guide provides a comprehensive introduction to security and full disk encryption as it is implemented in Seagate Secure™ enterprise Self-Encrypting Drive (SED) models. SED models communicate with a host system using the standard protocol defined by the Trusted Computing Group (TCG), an organization sponsored and operated by companies in the computer, storage and digital communications industry.
Most of the published material on this subject is in the form of standards. Standards are documents which provide the definitive text on the subject and are the ultimate reference for the industry’s design and development teams. These documents, however, are hardly fodder for the inquisitive amateur and are not recommended as an alternative to your favorite nighttime reading. That’s where this manual comes in. If you have to know about encryption and data security as it applies to disk storage, you’ve come to the right place.
This manual forms Part 1 of the User Guide and will introduce and explain the subject matter using a stepped approach to ease you into the terminology used by the data security intellectuals with as little pain and mathematical wizardry as possible. In Part 2 of the User Guide, you will find the information necessary to communicate with the drive using the TCG protocol. In short, Part 1 tells you what you can do with the drive and Part 2 tells you how you can do it. If you are interested in the SED User Guide Part 2, Trusted Storage Architecture-Training Manual, please request it directly from your Seagate engineering contact. A Non-Disclosure Agreement is required for Part 2.
If you stay with us all the way to the back cover of Part 1, we can promise you a good working knowledge and understanding of:
• Data encryption and decryption
• Symmetric and asymmetric keys
• Digital signatures and secure messaging
• Drive locking
• Cryptographic data erase
• Encryption keys and authentication keys
• Security partitions
• Password and data access management
• Taking ownership of the drive and activating the security features
• User data bands
• SCSI security commands
• Authenticated firmware downloads
If this is what you were looking for, welcome aboard.
1.1.1 Encryption basics
Encryption is a process whereby a plain text or clear text message is disguised in such a way as to hide its meaning. It stands to reason that this would not be a particularly clever thing to do unless the process could be reversed and the encrypted text (also known as cipher text) could be decrypted (or deciphered) back to the original message.
In Figure 1, we see that a clear text message is encrypted by a piece of hardware we’ll call an encryption engine and subsequently decrypted back to the original clear text by passing it through a decryption engine. We call these engines because they work on the data as they pass through and perform the required conversion without introducing any noticeable delay in the data flow.
Figure 1. Encryption and decryption
Both the encryption and decryption engines use a key which is the secret ingredient in the text transformation processes. To keep things as simple as possible, this is a symmetric key, which means the same key is used for both the encryption and the decryption process. We’re all familiar with at least one symmetric key, the one which both locks and unlocks the door to our house.
Here’s a very simple example of symmetric encryption. Let’s suppose our encryption engine uses a very simple process which will add incoming plain text to a secret key value, letter by letter, and output the result as cipher text. In this case we’ll give all letters a value corresponding to their position in the alphabet such that:
“A” = 1, “B” = 2, “C” = 3, “D” = 4, ………., “Z” = 26, “ “ (space) = 27.
Now if we add “B” (2) to “G” (7) we get “I” (9). If the addition produces numbers which are greater than 27, for example (N + R) = (14 + 18) = 32, we simply loop back around the alphabet, (32 – 27) = 5 = “E”. Suppose our clear text message is “HELLO WORLD” and we have selected the secret encryption key “ENIGMA” (repeated as required to match the length of the clear text message), then for our very simple encryption engine, the encryption process would provide the following result:
HELLO WORLD (message) + ENIGMAENIGM (encryption key) = MSUSAAAB SQ (cipher text)
It isn’t rocket science to deduce that our decryption engine would need to perform a similar but opposite function (subtraction in this case) in order to reconstitute the original message:
MSUSAAAB SQ (cipher text) – ENIGMAENIGM (decryption key) = HELLO WORLD (message)
Any prying eyes that got access to our cipher text would not be able to deduce the original plain text message unless two things were known:
1. The secret key
2. How the encryption engine works—the algorithm used to compute the cipher text.
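For readers who like to see the arithmetic run, the letter-by-letter addition and subtraction described above can be sketched in a few lines of Python. This is only an illustration of the toy cipher in this section; the function names (value, letter, repeat_key, encrypt, decrypt) are made up for the example and have nothing to do with how a drive works.

```python
# Toy additive cipher from this section: "A" = 1 ... "Z" = 26, space = 27.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

def value(ch):
    """Position of a character in the 27-symbol alphabet (1-based)."""
    return ALPHABET.index(ch) + 1

def letter(n):
    """Character for a value in the range 1..27."""
    return ALPHABET[n - 1]

def repeat_key(key, length):
    """Repeat the key as required to match the message length."""
    return (key * (length // len(key) + 1))[:length]

def encrypt(message, key):
    out = []
    for m, k in zip(message, repeat_key(key, len(message))):
        total = value(m) + value(k)
        if total > 27:          # loop back around the alphabet
            total -= 27
        out.append(letter(total))
    return "".join(out)

def decrypt(cipher, key):
    out = []
    for c, k in zip(cipher, repeat_key(key, len(cipher))):
        diff = value(c) - value(k)
        if diff < 1:            # loop back around in the other direction
            diff += 27
        out.append(letter(diff))
    return "".join(out)

cipher_text = encrypt("HELLO WORLD", "ENIGMA")
print(repr(cipher_text))                    # 'MSUSAAAB SQ'
print(repr(decrypt(cipher_text, "ENIGMA"))) # 'HELLO WORLD'
```

Because the same key drives both directions, this is a symmetric cipher in miniature.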
Any cryptographer worth his salt would be able to break our code in a heartbeat, so we could beef it up by making the algorithm more complex. Instead of assigning sequential numbers to the letters of the alphabet, we could use a lookup table, assign any unique number we want to each of the alphabetic characters, process the message in reverse order, add redundant characters, and add any number of other algorithm complexities to make the code harder to break.
The other thing we could do is to use a more complex key which is as large as manageably possible (to cut down the number of repetitions) and one that does not form a readable word or phrase, making it more difficult for an attacker to break the code. A key made up of random characters would fit the bill since its structure is entirely unpredictable.
To make the encrypting process easier to handle by the electronic hardware, we could perform the encryption on a block by block basis. Additionally, if we made the key size equal to the block size, we could avoid having to repeat (concatenate) the key within the block. Real block ciphers work along these lines, although the key and the block need not be the same size: the AES cipher used in Self-Encrypting Drives always processes 128-bit blocks, while the more common key sizes are 128 and 256 bits.
1.1.2 The Advanced Encryption Standard (AES)
Believe it or not, in the real world of disk drive cryptography, the only secret is the encryption key itself. The algorithm (encrypting process) is not only well known but is a standard called the Advanced Encryption Standard (AES) which is recommended by the US government.
Two versions of this standard are used in Seagate disk drives, AES128 and AES256. The numbers refer to the bit-size of the encryption key used by the algorithm, which must be a 128-bit (16 byte) or 256-bit (32 byte) random number; the block size is 128 bits in both cases. Without knowing the encryption key, it is virtually impossible to decipher the code, and since the algorithm is in general use, the more attacks it survives unbroken, the higher our confidence in it.
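To give a feel for what a key of this size looks like, here is a minimal Python sketch that produces random 16-byte and 32-byte values using the standard library's secrets module. How a drive actually generates its keys is not described here, so treat new_key and the constant names as assumptions made purely for illustration.

```python
import secrets

AES128_KEY_BYTES = 16  # 128 bits
AES256_KEY_BYTES = 32  # 256 bits

def new_key(num_bytes):
    """Return num_bytes of cryptographically strong random data.

    Illustrative only: this is not how a drive creates its keys.
    """
    return secrets.token_bytes(num_bytes)

key128 = new_key(AES128_KEY_BYTES)
key256 = new_key(AES256_KEY_BYTES)

print(len(key128) * 8, len(key256) * 8)  # 128 256
print(key128.hex())                      # 32 hex digits, different on every run
```

A key like this has no readable structure, which is exactly the property the previous section asked for.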
Another advantage of being a standard is that it provides a common denominator for the manufacture of encrypting devices. So, all vendors are dancing to the same tune—this makes it easier to check that all vendors of encrypting hardware are compliant with government requirements.
For those interested in reading a short description of the AES128 algorithm and seeing a simplified block diagram, refer to Section 6.0.
1.1.3 Block ciphers
As mentioned in our discussion on AES, we encrypt the clear text message one fixed-size block at a time (128 bits per block for AES). In other words, we are using a block cipher in our encryption engines.
Figure 2 shows how a simple block cipher called Electronic Code Book (ECB) constructs the encrypted data. Each block of plain text (P) is encrypted with the key (E) and outputs cipher text (C) ready for storage. Since we are using the same encryption key for every block of data, we would expect identical blocks of clear text to produce identical blocks of cipher text. This is clearly undesirable since it provides an attacker with a clue that could be used to determine the key.
If attackers are able to manipulate the clear text and if they can view the resultant cipher text, they could use small and precise changes in the clear text, see what effect that had on the cipher text, and use this information to help identify the key. But don’t be alarmed, the word “if” appears twice in the last sentence. The first would normally not be satisfied and the second would never be satisfied.
As we will see later, the encrypted data never leaves the drive and is not available over the I/O (interface). Nevertheless, this characteristic of ECB is seen as a weakness which has been addressed and rectified by Cipher Block Chaining.
Figure 2. Electronic Code Book
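The ECB weakness is easy to demonstrate. The Python sketch below uses a small home-made block cipher (a four-round Feistel network built on SHA-256) as a stand-in for AES, because the standard library has no AES. The toy cipher, its round function and the names toy_encrypt_block and ecb_encrypt are all assumptions for illustration and must not be used to protect real data.

```python
import hashlib

BLOCK = 16  # block size in bytes (128 bits)

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key, i, half):
    # Hypothetical round function for the toy cipher (NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_encrypt_block(key, block):
    """Encrypt one 16-byte block with a 4-round Feistel network."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def ecb_encrypt(key, plaintext):
    """ECB: every block is encrypted independently with the same key."""
    assert len(plaintext) % BLOCK == 0, "sketch handles whole blocks only"
    return b"".join(
        toy_encrypt_block(key, plaintext[i:i + BLOCK])
        for i in range(0, len(plaintext), BLOCK)
    )

key = b"0123456789abcdef"
plaintext = b"SAME 16 BYTES!!!" * 2 + b"A DIFFERENT ONE."
ciphertext = ecb_encrypt(key, plaintext)
blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]

print(blocks[0] == blocks[1])  # True: identical clear text, identical cipher text
print(blocks[0] == blocks[2])  # False: different clear text block
```

The repeated block shows up as a repeated pattern in the output, which is the clue ECB hands to anyone who can see the cipher text.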
1.1.4 Cipher Block Chaining (CBC)
As you can see from Figure 3, CBC is similar to ECB except that the cipher text from the previous encryption block is XOR’d with the plain text in the current block. This effectively randomizes the clear text in every stage and prevents cipher text duplications. Ah yes, but what about the first stage? Clearly there is no cipher text available to the first stage, so we’ve compensated for this by adding a component called the Initialization Vector (IV). This isn’t magic, it’s simply a secret 128-bit number known only to the disk drive.
Figure 3. Cipher Block Chaining
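The chaining step can be sketched in Python as follows. As a stand-in for AES, the example uses a toy four-round Feistel cipher built on SHA-256; that cipher, the way the key and IV are generated here, and every function name are assumptions for illustration only. What matters is the XOR with the previous cipher text block (or with the IV for the first block).

```python
import hashlib
import secrets

BLOCK = 16  # block size in bytes (128 bits)

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key, i, half):
    # Hypothetical round function for the toy cipher (NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_encrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def toy_decrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):  # undo the rounds in reverse order
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, plaintext):
    assert len(plaintext) % BLOCK == 0, "sketch handles whole blocks only"
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # XOR the clear text with the previous cipher text, then encrypt.
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Decrypt, then XOR with the previous cipher text to recover clear text.
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

key = secrets.token_bytes(16)
iv = secrets.token_bytes(BLOCK)          # 128-bit Initialization Vector
plaintext = b"SAME 16 BYTES!!!" * 2      # two identical clear text blocks
ciphertext = cbc_encrypt(key, iv, plaintext)

# The two cipher text blocks differ even though the clear text blocks match.
print(ciphertext[:BLOCK] != ciphertext[BLOCK:])
print(cbc_decrypt(key, iv, ciphertext) == plaintext)  # True
```

Compared with ECB, the only extra ingredients are the IV and one XOR per block, yet identical clear text blocks no longer produce identical cipher text.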
1.2 Hash functions
Hash functions take an arbitrarily long string of bytes and produce a fixed size result, sometimes called the digest or fingerprint. In Figure 4, we see a string of bytes m being input to a hash function which produces a fixed size hash output h(m).
Figure 4. The hash function
The properties of a hash function can be simply stated as follows:
• It must be a one-way function. Given m it’s easy to compute h(m) but given h(m) it’s not possible to find m. That is, you cannot create the original message from the digest.
• It must have good collision resistance. A collision means computing the same hash h(m) for two different input streams. In other words it should be practically impossible to find two messages m1 and m2 such that h(m1) = h(m2).
Hash functions have been used in disk drives since the early days of magnetic storage. They have been used as an integrity check on data fields to signal an error condition when data corruption has taken place.
There are various hash functions in use but one of the more common provides a digest called Cyclic Redundancy Check (CRC). Today, we append a 4-byte CRC check to Fibre Channel and Serial Attached SCSI (SAS) data frames before they are sent by a controller across the interface to the drive. As the data frame is received, the drive computes the CRC (hash) of the data and compares it to the appended CRC. Any discrepancy indicates the data has been corrupted and the frame should be resent.
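The append-and-compare procedure can be sketched with the CRC-32 routine in Python's zlib module. This is a stand-in: the exact CRC parameters and byte ordering used on Fibre Channel and SAS frames are not given in this guide, and the helper names add_crc and check_crc are invented for the example.

```python
import zlib

def add_crc(payload):
    """Sender side: append a 4-byte CRC to the payload."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")

def check_crc(frame):
    """Receiver side: recompute the CRC and compare with the appended one."""
    payload, received = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received

frame = add_crc(b"data frame payload")
print(check_crc(frame))       # True: frame arrived intact

# Flip a single bit in the first byte to simulate corruption in transit.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
print(check_crc(corrupted))   # False: frame should be resent
```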
So, how do we select a suitable hashing algorithm that adequately meets the properties given above? Once again, the federal government made our job of selecting a good hashing function a lot easier because the NSA (National Security Agency) designed the Secure Hash Algorithm (SHA) which fits the bill very nicely. The SHA algorithm used in Seagate drives produces a digest of 256 bits and is consequently called SHA256.
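The fixed-size property is easy to see with the SHA-256 implementation in Python's hashlib module. The sketch below is only an illustration of the hash function itself; the helper name digest is made up, and nothing here describes how the drive uses SHA256 internally.

```python
import hashlib

def digest(message):
    """Return the SHA-256 digest h(m) of a byte string m."""
    return hashlib.sha256(message).digest()

short = digest(b"HELLO WORLD")
long_ = digest(b"HELLO WORLD" * 100000)

# The digest is 256 bits no matter how long the input is.
print(len(short) * 8, len(long_) * 8)  # 256 256

# Changing a single character gives a different digest.
print(digest(b"HELLO WORLD") == digest(b"HELLO WORLE"))  # False
```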