Tuesday 25 March 2014

Can we really store everything forever?

There seems to be a growing expectation within organisations that we store every piece of data forever. Whenever I raise retention policies, particularly with legal or compliance teams, the response is that we need to keep the data forever. Whilst I understand that some data must be kept under legal hold, do we really need to store everything forever? Sometimes there is even a risk in keeping records beyond the date required by the regulators. On the flip side, in the world of Big Data we may not yet know that a piece of data is valuable.



So the challenge is that archived data takes up valuable space in an organisation's data centre, and is typically not used for revenue-generating purposes. It may allow a business to operate (think regulatory data), but the business doesn't make money from it, or doesn't yet. So it's a pure bottom-line cost, a cost that needs to be minimised as much as possible to improve the P&L. I like to call this type of data Write-Once-Read-Rarely, or cold storage.



Historically, long-term archive meant magnetic tape. Tape is still the highest-density storage medium, but the challenge with tape is that it degrades, so ensuring data integrity is difficult, particularly for data with long retention periods. Managing the tape pool becomes a full-time job, and remember that this is a non-revenue-generating post. Whilst disk is not as dense as tape, data integrity can be provided via software: parity, checksums, continual checking. A lot of organisations have gone to pure disk storage solutions for archive and backup. Another benefit of disk is faster data retrieval.
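
To give a feel for what "checksums and continual checking" means in practice, here is a minimal sketch in Python. It assumes the archive is just a dictionary of object names to bytes, and the names build_manifest and find_corrupted are my own, purely for illustration; a real archive platform would do this inside the storage layer and would use parity or replicas to repair what it finds.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a blob as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(archive: dict[str, bytes]) -> dict[str, str]:
    """Record a checksum for every object at the time it is archived."""
    return {name: sha256_of(blob) for name, blob in archive.items()}


def find_corrupted(archive: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Re-read every object and report those that are missing or whose
    checksum no longer matches the manifest (hypothetical helper)."""
    damaged = []
    for name in sorted(manifest):
        blob = archive.get(name)
        if blob is None or sha256_of(blob) != manifest[name]:
            damaged.append(name)
    return damaged


if __name__ == "__main__":
    archive = {"contract.pdf": b"signed in 2009", "trades.csv": b"id,amount\n1,100\n"}
    manifest = build_manifest(archive)

    # Simulate silent corruption of one object on the storage medium.
    archive["trades.csv"] = b"id,amount\n1,900\n"

    print(find_corrupted(archive, manifest))
```

Run on a schedule, a check like this is the "continual checking" part: it only detects damage, so it needs to be paired with a second copy or parity data to actually restore the object.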



There is another type of storage that can slot into an information life-cycle management (ILM) strategy, and that is cloud storage. The interesting thing with cloud is that the funding model typically changes from CapEx to OpEx, so it's pay-per-use, and it plays to my earlier point about non-revenue-generating infrastructure: outsource it to a utility provider. Obviously there are the usual cloud concerns (security, legal, mobility, etc.), but if you can put compensating controls in place, the value proposition is compelling.



Not all data types are suitable for storing in the cloud: there may be regulatory or jurisdictional constraints that mandate where data sits at rest. So I think we need some form of hybrid cloud storage: a mix of on- and off-premises storage, where the requirements and constraints dictate where the data is placed as part of life-cycle management. The end goal is to keep the cost back to the business as low as possible, so that a company's resources are used for driving up revenue.
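
As a rough illustration of constraints dictating placement, a placement rule could be sketched like this in Python. Everything here is an assumption of mine for the sake of the example: the Record fields, the 90-day threshold for "cold", and the two tier names. A real ILM policy would have more tiers and far more inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    must_stay_in_jurisdiction: bool  # regulatory constraint on where data is at rest
    days_since_last_read: int        # crude measure of how "cold" the data is


def place(record: Record, cold_after_days: int = 90) -> str:
    """Pick a storage tier for a record (hypothetical policy).

    Constraints are checked first; cost optimisation only applies
    to data that is free to move.
    """
    if record.must_stay_in_jurisdiction:
        return "on-premises"
    if record.days_since_last_read >= cold_after_days:
        return "cloud"
    return "on-premises"


if __name__ == "__main__":
    records = [
        Record("customer-ledger", must_stay_in_jurisdiction=True, days_since_last_read=400),
        Record("web-logs-2012", must_stay_in_jurisdiction=False, days_since_last_read=400),
        Record("monthly-report", must_stay_in_jurisdiction=False, days_since_last_read=10),
    ]
    for record in records:
        print(record.name, "->", place(record))
```

The ordering of the checks is the point: the regulatory constraint always wins, and only unconstrained, rarely read data is pushed out to the pay-per-use tier.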
