… or, the lack of it.
A recent discussion at a customer made me take a closer look at support for encryption in the context of XaaS cloud service offerings as well as Hadoop. In general, this can be broken down into over-the-wire (cf. SSL/TLS) and back-end encryption. While the former is widely used, the latter is rarely found.
There might be different reasons why one wants to encrypt her data, ranging from preserving a competitive advantage to end-user privacy issues. No matter why someone wants to encrypt the data, the question is: do systems support this (transparently), or are developers forced to code this in the application logic?
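To make the "code this in the application logic" option a bit more concrete, here is a minimal sketch of encrypting data locally before it is handed to any storage service. It assumes the third-party Python package `cryptography` is installed and uses its Fernet recipe (symmetric, authenticated encryption). The function names are made up for illustration, and key management (where the key lives, how it is rotated) is deliberately left out.

```python
# Assumption: the third-party "cryptography" package is available
# (pip install cryptography). Function names are hypothetical.
from cryptography.fernet import Fernet, InvalidToken


def generate_key() -> bytes:
    """Create a new symmetric key; it has to be kept outside the cloud service."""
    return Fernet.generate_key()


def encrypt_for_upload(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate data locally, before any storage API sees it."""
    return Fernet(key).encrypt(plaintext)


def decrypt_after_download(key: bytes, token: bytes) -> bytes:
    """Verify and decrypt data fetched back from storage.

    Raises InvalidToken if the data was tampered with or the key is wrong.
    """
    return Fernet(key).decrypt(token)


if __name__ == "__main__":
    key = generate_key()
    blob = encrypt_for_upload(key, b"customer records")
    # 'blob' is what the application would upload instead of the plaintext.
    assert decrypt_after_download(key, blob) == b"customer records"
```

The storage provider only ever sees `blob`; whoever holds `key` is the only party able to read the data, which is exactly the burden that moves to the developer when the back end offers no encryption.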
IaaS-level. Especially in this category, file storage for app development, one would expect wide support for built-in encryption.
- Amazon’s S3 indeed provides server-side support for encryption
- Google Storage does not encrypt files
- Same for Rackspace’s Cloud Files – no encryption, ATM
- As well as for Microsoft’s Azure storage – not encrypting files
- And last but not least, HP Cloud’s Object Storage is in good company by not supporting encryption
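For the S3 case, server-side encryption is requested per object by sending the `x-amz-server-side-encryption` header with the upload. The following is only a sketch that builds such a PUT request with the Python standard library without sending it. The bucket and object names are hypothetical, and a real request would additionally need to be signed with AWS credentials, which is omitted here.

```python
import urllib.request

# Request header used by S3 to ask for server-side encryption of an object.
SSE_HEADER = "x-amz-server-side-encryption"


def build_sse_put(url: str, body: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request asking S3 to encrypt the object at rest."""
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={SSE_HEADER: "AES256"},
    )


# Hypothetical bucket and key, for illustration only. The request is unsigned,
# so S3 would reject it as-is; authentication is out of scope for this sketch.
req = build_sse_put("https://example-bucket.s3.amazonaws.com/report.csv", b"a,b\n1,2\n")
```

Note that in this model the provider holds the encryption keys; the application merely asks for encryption at rest.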
On the PaaS level things look pretty much the same: for example, AWS Elastic Beanstalk provides no support for encryption of the data (unless you consider S3) and concerning Google’s App Engine, good practices for data encryption seem to be only just emerging.
Offerings on the SaaS level provide an equally poor picture:
- Dropbox offers encryption via S3.
- Google Drive and Microsoft Skydrive seem to not offer any encryption options for storage.
- Apple’s iCloud is a notable exception: not only does it provide support but also nicely explains it.
- For many if not most of the above SaaS-level offerings there are plug-ins that enable encryption, such as provided by Syncdocs or CloudFlogger
In Hadoop-land things also look rather sobering; there are only a few efforts to make HDFS or the like do encryption, such as ecryptfs or Gazzang’s offering. Last but not least: for Hadoop in the cloud, encryption is available via AWS’s EMR by using S3.
Even where encryption is available, there are also questions of trust and legal jurisdiction regarding the cloud service provider. Storage is one thing: you can decrypt client-side and keep your private key outside the cloud. But if you want to use cloud computing power to actually process your encrypted data, the problem becomes more difficult. There is at least one company working on solutions to these problems: http://cloudtomo.com (disclaimer, I work there).
I think the reason AWS is the only one to support some sort of “server side encryption” is because the others saw that it doesn’t make any sense. Maybe the reasoning went something like, “this doesn’t make sense but customers are asking for it so let’s do it anyways”. I mean why on earth, if you have data worth encrypting, would you trust some random third party on the internet to have your keys and do it all for you? What’s the point? Any time Amazon needs to look at your data, they can – they have the key!
And SSL/TLS is horribly broken and vulnerable to MITM attacks too.
If you care about the privacy of your data, encrypt it properly before it leaves your control. Otherwise do not put it in the cloud, full stop.
An update regarding AWS: users of Zadara Storage at AWS have the option of server-based encryption of data-at-rest where the key encryption key (KEK) is user-generated. If you’d like to know more, see this security brief outlining data privacy and encryption: http://bit.ly/ZUDndd (disclaimer, I’m obviously with Zadara Storage)