Introduction

This blog is about my personal passion: technology. Working in technology is fun, despite the odd working hours and stress. I have enjoyed working in the technology field and have learnt a great deal about the value it generates for everyone. I don't claim to be a super expert on technology, but I can share some thoughts that might be interesting.

I would like to write a few posts on various tech topics, from data storage to video serving, and see where this blog goes. So if you would like to share your thoughts, send your comments.

Happy blogging!
-ravi

Tuesday, April 8, 2008

HP's Upline Service

Recently HP announced a new service for sharing files easily over the Internet. The service follows a subscription model in which a subscriber uses a web browser to publish and share files. It also comes with some cool features like password-protected sharing, automatic backup, and data migration from an old PC to a new PC. The service seems to be aimed at consumers and small businesses who are burdened with storage growth at the edge and don't have many sophisticated backup/recovery and migration software options at economical prices. It charges $299/yr for unlimited storage, which seems like a good deal to me.

But the real question is who gets the maximum value out of this deal. Take a small business that adds about 500GB-1TB of storage every 2 years, and assume most of this growth is in the NAS space, that is, mostly files, and that the majority of them are related to Exchange data; then you can see the connection. Why is email growing so terribly fast? Because of attachments. The average attachment size in my own mailbox has grown from 500KB in 2001 to 3.5MB in 2007. If I plot the percentage of email storage by attachment size, I notice that 90% of the storage is occupied by attachments of 1MB-15MB, yet they represent less than 1% of total messages. So the bottom line is that if I take all the attachments out of my own inbox, I can save up to 90% of the storage growth. Email is not an optimal solution for attachments. We can clearly see what HP's Upline is going after. Even if they didn't think about email attachments, that is where they have a good bet on this type of service.

My Personal Email Data
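
If you want to run the same kind of check on your own mailbox, here is a rough Python sketch. The function name, the 1MB-15MB bucket boundaries expressed in KB, and the sample mailbox are all made up for illustration; they are not my real email data.

```python
def attachment_storage_share(message_sizes_kb, low_kb=1024, high_kb=15 * 1024):
    """Return (share of messages, share of storage) for messages whose
    size falls in [low_kb, high_kb]. Sizes are in kilobytes."""
    total_count = len(message_sizes_kb)
    total_size = sum(message_sizes_kb)
    if total_count == 0 or total_size == 0:
        return 0.0, 0.0
    large = [s for s in message_sizes_kb if low_kb <= s <= high_kb]
    return len(large) / total_count, sum(large) / total_size


# Hypothetical mailbox: 990 plain messages of 20 KB each
# and 10 messages carrying a 4 MB attachment each.
sizes = [20] * 990 + [4096] * 10
msg_share, storage_share = attachment_storage_share(sizes)
print(f"{msg_share:.1%} of messages hold {storage_share:.1%} of storage")
```

Even in this toy mailbox, a tiny fraction of the messages accounts for most of the bytes, which is the same pattern I see in my own inbox.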



Monday, April 7, 2008

SaaS: Is this a new type of tech secret-sauce?

SaaS is loosely defined in many ways: Software as a Service, Storage as a Service, or Some-tech-S as a Service. Why is it suddenly such an important service model? What happened to ASPs, the so-called Application Service Providers?

SaaS is a service offered by hosting companies that run software on your behalf and show you the results. Well, don't we already have companies doing this for us, like email services from Gmail or Yahoo, or photo sharing sites like Picasa? The difference is that we are talking about enterprise applications for enterprise-class customers. Enterprises love to keep control of their information within their physical boundaries. Not true anymore! If a cool startup wants to host a database application and provide an API for an enterprise to create and search its own version of the database, that is possible with SaaS.


SaaS wants to take the industry where ASPs couldn't. The main difference between the two, as I understand it, is that an ASP offers a much more customized version for each client, whereas SaaS exploits the commonality among clients' usage of an application or service and could potentially share the same application instance.

SaaS directly targets the costs (installation, operational, and managerial) of IT assets. A service can follow a "pay-as-you-go" model, which gives flexibility in allocating IT budgets. Enterprises love control over the budget, but security is a concern. Despite claims made using technical buzzwords, SaaS is weak on security. The data is stored and transmitted outside the walls of the corporation, with the service provider. There is no guarantee that security can't be breached. However, with strict regulation and policies, the risk of security breaches can be mitigated but cannot be eliminated.

De-duplication: what is it?

This is the data storage technology in which I see a lot of promise for the future. Of course, there is a lot of confusion about that promise, so beware that its potential depends upon the type of application.

De-duplication is a disruptive technology that reverses the “duplication” of data in traditional backup environments. The technology reduces the data to be backed up by an order of magnitude. Traditional backup solutions are burdened with a growing backup window and a growing amount of archived data. Compression techniques and retention policies have been implemented to address the data growth. However, these methods have done little to reduce that burden.

The de-duplication process works on reversing this data growth. The technology divides the data into segments, and only the segments that have been modified are backed up to secondary storage. Redundant segments are identified by a commonality factor and are not included in the backup. A typical de-dupe application is expected to deliver around 200-500x data reduction in a backup environment. This technology is disruptive to the current backup environment and will play a major role in the coming 3-5 years.
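
To make the segment idea concrete, here is a minimal Python sketch of it. It is my own toy illustration, not how any vendor implements it: I assume fixed-size 4 KB segments and use a SHA-256 fingerprint to decide whether a segment is redundant, and the names `backup`, `restore`, and `store` are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed fixed segment size, chosen only for illustration


def split_segments(data, size=SEGMENT_SIZE):
    """Cut the data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, store):
    """Save only segments the store has not seen before.

    Returns the recipe (ordered list of fingerprints needed to rebuild
    the data) and the number of new bytes actually written.
    """
    recipe = []
    new_bytes = 0
    for segment in split_segments(data):
        fingerprint = hashlib.sha256(segment).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = segment
            new_bytes += len(segment)
        recipe.append(fingerprint)
    return recipe, new_bytes


def restore(recipe, store):
    """Rebuild the original data from its recipe."""
    return b"".join(store[fp] for fp in recipe)


store = {}
# Monday: four distinct segments. Tuesday: only the third segment changed.
monday = b"".join(bytes([i]) * SEGMENT_SIZE for i in range(4))
tuesday = monday[:2 * SEGMENT_SIZE] + b"\xff" * SEGMENT_SIZE + monday[3 * SEGMENT_SIZE:]

monday_recipe, monday_new = backup(monday, store)
tuesday_recipe, tuesday_new = backup(tuesday, store)
print(monday_new, tuesday_new)
```

The second backup writes only the one changed segment, yet both days can still be restored in full from the same store.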

De-duplication saves WAN bandwidth and slows the growth of secondary storage by reducing the overall data in the backup process. In effect, it reduces network bandwidth costs, secondary storage costs, support costs, and installation costs. De-duplication can be performed at the source or the target of the backup data. Source de-dupe reduces network bandwidth as well as secondary storage, whereas target de-dupe reduces only secondary storage. EMC’s Avamar and the Data Domain appliance series are examples of source-based and target-based de-duplication, respectively.
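
The difference between the two placements shows up in the bytes that cross the network. The Python sketch below is a simplified model under my own assumptions: segments are identified by SHA-256 fingerprints, the source can ask the target which fingerprints it already holds, and the small cost of exchanging fingerprints is ignored. The function names are hypothetical and not taken from any product.

```python
import hashlib


def target_side_backup(segments, target_store):
    """Target de-dupe: every segment is sent; duplicates are dropped on arrival."""
    bytes_sent = 0
    for segment in segments:
        bytes_sent += len(segment)  # the whole segment crosses the network
        target_store.setdefault(hashlib.sha256(segment).hexdigest(), segment)
    return bytes_sent


def source_side_backup(segments, target_store):
    """Source de-dupe: a segment is sent only if the target does not have it."""
    bytes_sent = 0
    for segment in segments:
        fingerprint = hashlib.sha256(segment).hexdigest()
        # Stands in for a fingerprint lookup against the target.
        if fingerprint not in target_store:
            target_store[fingerprint] = segment
            bytes_sent += len(segment)
    return bytes_sent


a, b, c, d = (ch * 1000 for ch in (b"a", b"b", b"c", b"d"))
night1 = [a, b, c]
night2 = [a, b, d]  # only one segment differs from the previous night

target_store = {}
target_sent = [target_side_backup(n, target_store) for n in (night1, night2)]

source_store = {}
source_sent = [source_side_backup(n, source_store) for n in (night1, night2)]

print(target_sent, source_sent)
```

Both approaches end up storing the same four unique segments, but only the source-side version avoids resending data the target already has.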

Major storage players are focusing on de-duplication. Many backup software vendors have started working on de-dupe solutions as part of their offerings. Archiving vendors, such as VTL vendors, are integrating de-dupe into their archiving solutions.

De-duplication is a technology that can go beyond backup and archiving environments. Data reduction is a desired functionality in areas such as replication. De-duplication in the replication market is an untapped opportunity. It can reduce the data sent to the remote site in remote replication and increase the performance of asynchronous mirroring applications. Because the technology is at an early stage, replication vendors haven’t fully embraced de-duplication.

Many de-duplication solutions achieve data reduction by focusing on data changes over time, which can be referred to as temporal de-duplication. NetApp introduced a Single-Instance-Storage de-duplication solution that takes advantage of the commonality of data within storage, which can be referred to as spatial de-duplication. Spatial de-duplication is a relatively new concept and a potential area for vendors to tap new opportunities. It could reduce primary as well as secondary storage needs for certain applications.

De-duplication is a game-changing technology for the coming 3-5 years. Even though it is getting traction in backup environments, the technology has potential in many areas where data reduction is desired. Both temporal and spatial de-duplication have advantages in certain application environments.