[Beowulf] Data Destruction

Jörg Saßmannshausen sassy-work at sassy.formativ.net
Wed Sep 29 21:51:40 UTC 2021

Hi Ellis,

Interesting concept. I did not know about Lustre's fscrypt-based encryption,
but then, I am not an expert on the details of PFSs.

Just to make sure I have understood the concept correctly: basically, Lustre
provides projects which are themselves encrypted, similar to the encrypted
containers I mentioned before. So in order to access the project folder, you
would need some kind of encryption key. Without that key, you only have
meaningless data in front of you. Is that understanding correct?
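
To make the idea concrete, here is a minimal sketch in Python of what
"discard the key and the data is gone" means. It is not how Lustre does it:
the ProjectStore class and its method names are hypothetical, and the cipher
is a toy keystream built from HMAC-SHA256 in counter mode so that the example
runs with the standard library only. A real system would use a vetted cipher
such as AES and a proper key manager.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 counter-mode keystream.

    Toy cipher for illustration only; the same call encrypts and decrypts.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class ProjectStore:
    """Hypothetical store: ciphertext on 'disk', keys kept separately."""

    def __init__(self):
        self.keys = {}   # project -> key (stands in for a key manager)
        self.blobs = {}  # (project, name) -> (nonce, ciphertext)

    def write(self, project, name, plaintext):
        # One random key per project, created on first write.
        key = self.keys.setdefault(project, secrets.token_bytes(32))
        nonce = secrets.token_bytes(16)
        self.blobs[(project, name)] = (nonce, keystream_xor(key, nonce, plaintext))

    def read(self, project, name):
        key = self.keys[project]  # raises KeyError once the key is shredded
        nonce, ciphertext = self.blobs[(project, name)]
        return keystream_xor(key, nonce, ciphertext)

    def shred(self, project):
        # Discard only the key; the ciphertext stays wherever it was stored.
        self.keys.pop(project, None)
```

After shred("some-project") the ciphertext is still sitting in the store (and
in any snapshot or RAID stripe that holds a copy), but without the key nothing
can turn it back into plaintext, while other projects remain readable.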

Does anybody happen to know whether a system similar to the one Lustre
offers is possible on Ceph?

The only problem I have with all these things is: at some point you will need
to access the decrypted data. Then you need to make sure that this data does
not leave your system. For that reason we are using a Data Safe Haven, where
data ingress and egress are done via a staging system.

Some food for thought.


All the best


On Wednesday, 29 September 2021, 16:42:46 BST, Ellis Wilson wrote:
> Apologies in advance for the top-post -- too many interleaved streams
> here to sanely bottom-post appropriately.
>
> Self-encrypting drives (SEDs), which carry a reasonably small mark-up for
> both HDDs and SSDs, provide full-drive or per-band ways to "wipe" the drive
> by revving the key associated with the band or drive.  For enterprise HDDs
> the feature is extremely common -- for enterprise SSDs it is hit or miss
> (NVMe drives tend to have it, SATA drives infrequently do).  This is your
> best bet for a solution where you're a-ok with wiping the entire system.
> Note there's non-zero complexity here, usually revolving around a KMIP
> server with a non-zero price, but it's (usually) not terrible.  My old
> employer (Panasas) supports this level of encryption in their most recent
> release.
>
> Writing zeros over HDDs or SSDs today is an extremely dubious solution.
> SSDs will just write the zeros elsewhere (or, more commonly, not write
> them at all), and HDDs are far more complex than in the olden days, so
> you're still given no hard guarantee that writing to LBA X actually
> overwrites the sectors that previously held LBA X.  Add a PFS and then a
> local FS in front of this and forget about it.  You're just wasting
> bandwidth.
>
> If you have a multi-tenant system and cannot just wipe the whole system
> by revving encryption keys on the drives, your options are static
> partitioning of the drives into SED bands per tenant, with a rather
> complex setup of a KMIP server and multiple parallel file systems to
> support that, or client-side encryption.  Lustre 2.14 provides the latter
> for file data via fscrypt, which is actually pretty slick.  This is your
> best bet to cryptographically shred the data for individual users.  I have
> no experience with other commercial file systems, so I cannot comment on
> who does or doesn't support client-side encryption, but whoever does should
> allow you to fairly trivially shred the bits associated with that
> user/project/org by discarding/revving the corresponding keys.  If you
> go the client-side encryption route and shred the keys, then snapshots,
> PFS, local FS, RAID, and all of the other factors here play no role, and
> you can safely promise the end-user that the data is mathematically "gone".
>
> Best,
> ellis
>
> On 9/29/21 10:52 AM, Paul Edmon via Beowulf wrote:
> > I guess the question is, for a parallel filesystem, how do you make sure
> > you have 0'd out the file without borking the whole filesystem, since
> > the data is spread over a RAID set and could be spread over multiple hosts?
> > 
> > -Paul Edmon-
> > 
> > On 9/29/2021 10:32 AM, Scott Atchley wrote:
> >> For our users that have sensitive data, we keep it encrypted at rest
> >> and in transit.
> >> 
> >> For HDD-based systems, you can perform a secure erase per NIST
> >> standards. For SSD-based systems, the extra writes from the secure
> >> erase will contribute to wear on the drives and possibly to their
> >> eventually wearing out. Most SSDs provide an option to mark blocks as
> >> zero without having to write the zeroes. I do not think that option is
> >> exposed up to the PFS layer (Lustre, GPFS, Ceph, NFS); I believe it is
> >> only available at the ext4 or XFS layer.
> >> 
> >> On Wed, Sep 29, 2021 at 10:15 AM Paul Edmon <pedmon at cfa.harvard.edu> wrote:
> >>     The former.  We are curious how to selectively delete data from a
> >>     parallel filesystem.  For example, we commonly use Lustre, Ceph,
> >>     and Isilon in our environment.  That said, if other types allow for
> >>     easier destruction of selective data, we would be interested in
> >>     hearing about it.
> >>     
> >>     -Paul Edmon-
> >>     
> >>     On 9/29/2021 10:06 AM, Scott Atchley wrote:
> >>>     Are you asking about selectively deleting data from a parallel
> >>>     file system (PFS) or destroying drives after removal from the
> >>>     system either due to failure or system decommissioning?
> >>>     
> >>>     For the latter, DOE does not allow us to send any non-volatile
> >>>     media offsite once it has had user data on it. When we are done
> >>>     with drives, we have a very big shredder.
> >>>     
> >>>     On Wed, Sep 29, 2021 at 9:59 AM Paul Edmon via Beowulf <beowulf at beowulf.org> wrote:
> >>>         Occasionally we get DUA (Data Use Agreement) requests for
> >>>         sensitive data that require data destruction (e.g. NIST
> >>>         800-88). We've been struggling with how to handle this in an
> >>>         era of distributed filesystems and disks.  We were curious how
> >>>         other people handle requests like this. What types of
> >>>         filesystems do people generally use for this, and how do
> >>>         people ensure destruction?  Do these types of DUAs preclude
> >>>         certain storage technologies from consideration, or are there
> >>>         creative ways to comply using more common scalable filesystems?
> >>>         
> >>>         Thanks in advance for the info.
> >>>         
> >>>         -Paul Edmon-
> >>>         
> >>>         _______________________________________________
> >>>         Beowulf mailing list, Beowulf at beowulf.org
> >>>         sponsored by Penguin Computing
> >>>         To change your subscription (digest mode or unsubscribe)
> >>>         visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
