From the Founder & VP Products at CloudSwitch

Ellen Rubin

Planning for Outages in the Cloud

The Amazon US East outage of just over a week ago was an eye-opener for many people. Here at CloudSwitch, it validated what we know about best practices for using the cloud. Not surprisingly, these reflect the traditional IT processes and systems that enterprises already rely on to protect data and keep applications available to users.

From an enterprise support perspective, I’m very happy about the variety of options we have to protect and back up data, scale application capacity up and down, and bring applications online and offline. Thanks to public clouds and products like CloudSwitch, it’s also easier than ever to make sure that you don’t have all of your “eggs in one basket.”

CloudSwitch was designed to bridge the worlds of the data center and public cloud, making it extremely easy and safe to move virtual machines into the cloud over an encrypted tunnel onto encrypted storage. You can also deploy multiple copies of your virtual machines to different availability zones, regions, or even clouds. Because CloudSwitch acts as a layer-2 bridge between the data center and cloud, virtual machines in the cloud can seamlessly access the data center and vice versa, allowing data and applications to live in either place. This lets you keep using your existing IT management tools while taking advantage of cloud storage and cloud compute power in new ways.

We encourage our customers to make full use of the opportunities that CloudSwitch enables:

  • Deploy virtual machines to multiple availability zones, regions and/or clouds.
  • Clone existing virtual machines or create hot “point in time” snapshots.
  • Continue to utilize traditional backup methods.
  • Employ traditional file system and/or database replication and load balancing to make your applications as available as possible.
  • Automate scripted deployments and life cycle actions on virtual machines.

When reviewing DR strategies with customers, we recommend the following approach:

  • Review all applications and associated virtual machines and prioritize them to determine the appropriate DR strategy.
  • Review and test monitoring of each application to ensure that you can detect and be alerted to application failures as quickly as possible.
  • Eliminate single points of failure for critical applications by providing multiple points of presence in the cloud.
  • If you are using replication technology to keep copies of virtual machines in sync, ensure that you have appropriate alerts in place to detect synchronization failures.
  • Determine whether failover and failback must be automated or manual processes.
  • If load balancing is required, weigh the options and limitations of the different load balancers. Amazon’s ELB, for instance, can balance across availability zones but not across regions.
  • For lower-priority systems that don’t require HA, consider scheduled automated clones/snapshots, traditional backups, or both.
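
The first and third steps in the list above (prioritizing applications and eliminating single points of failure) can be sketched in a few lines of Python. This is only an illustration: the `App` class, the three priority tiers, the location names, and the strategy wording are all assumptions made for the example, not part of CloudSwitch or any cloud provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    name: str
    priority: int  # hypothetical tiers: 1 = critical, 2 = important, 3 = low
    locations: set = field(default_factory=set)  # e.g. {"us-east", "us-west"}


def dr_strategy(app):
    """Map a priority tier to a DR approach (illustrative mapping only)."""
    if app.priority == 1:
        return "multiple points of presence with replication and failover"
    if app.priority == 2:
        return "replicated standby copy with manual failover"
    return "scheduled clones/snapshots, traditional backups, or both"


def single_points_of_failure(apps):
    """Return names of critical apps that run in fewer than two locations."""
    return [a.name for a in apps if a.priority == 1 and len(a.locations) < 2]


# Hypothetical inventory used only to exercise the functions above.
apps = [
    App("web-site", 1, {"us-east", "us-west"}),
    App("support-portal", 1, {"us-east"}),
    App("build-server", 3, {"us-east"}),
]

for app in sorted(apps, key=lambda a: a.priority):
    print(f"{app.name}: {dr_strategy(app)}")
print("Needs a second location:", single_points_of_failure(apps))
```

Keeping the inventory as data makes the review repeatable: rerun it whenever an application is added or moved, and any critical system that has slipped back to a single location shows up immediately.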

Eating Our Own Dogfood

In our own internal operations, we’ve deployed www.cloudswitch.com to Amazon’s US-East and US-West regions as well as to Terremark’s Enterprise Cloud. We use an open source file system synchronizer to keep these copies in sync. This application also has an automated backup process that backs up data to Amazon S3.
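
Whatever synchronizer is used, it helps to verify independently that the copies really match. The sketch below is not the synchronizer mentioned above; it is a hypothetical check that hashes every file under two directory trees and reports the differences. The function names and the idea of comparing locally mounted or downloaded copies are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Return {relative_path: sha256_hex} for every file under root."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def diff_manifests(primary, replica):
    """Compare two manifests; list paths missing, extra, or changed on the replica."""
    shared = set(primary) & set(replica)
    return {
        "missing": sorted(set(primary) - set(replica)),
        "extra": sorted(set(replica) - set(primary)),
        "changed": sorted(p for p in shared if primary[p] != replica[p]),
    }
```

A scheduled job could call `diff_manifests(manifest(primary_dir), manifest(replica_dir))` and raise an alert whenever any of the three lists is non-empty. Note that `manifest` reads each file fully into memory, which is fine for a web site's content but would need chunked hashing for very large files.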

CloudSwitch’s web portal for download/activation and support (home.cloudswitch.com) consists of database and application servers that are deployed to multiple Amazon regions and kept in sync with master-slave database replication.
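
Replication like this is only useful if a stalled or lagging replica is noticed quickly, as recommended in the DR checklist earlier. The following is a minimal sketch of such an alert check. The `status` dictionary, its keys, and the 60-second threshold are hypothetical; in practice the values would come from whatever status query your database exposes.

```python
def check_replication(status, max_lag_seconds=60):
    """Return alert messages for one replica.

    status is a dict with hypothetical keys:
      'name'        - label for the replica
      'running'     - True if replication is active
      'lag_seconds' - how far the replica is behind, or None if unknown
    """
    name = status["name"]
    lag = status["lag_seconds"]
    if not status["running"]:
        return [f"{name}: replication is stopped"]
    if lag is None:
        return [f"{name}: replication lag is unknown"]
    if lag > max_lag_seconds:
        return [f"{name}: replica is {lag}s behind (limit {max_lag_seconds}s)"]
    return []


# Hypothetical readings from two replicas in different regions.
readings = [
    {"name": "replica-west", "running": True, "lag_seconds": 5},
    {"name": "replica-east", "running": False, "lag_seconds": None},
]
for reading in readings:
    for alert in check_replication(reading):
        print("ALERT:", alert)
```

Treating "unknown" as an alert rather than as healthy is a deliberate choice: during an outage, the monitoring path often fails along with the replica it is watching.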

As our CTO John Considine noted in last week’s blog post, we and our customers had a range of experiences during the Amazon outage, which only reinforced the importance of planning and implementing the procedures outlined above. We relied too heavily on snapshots for some of our internal systems, and recognized after the fact that we needed a careful DR review and prioritization of the many applications we have running in the cloud.

Any DR/HA plan should be routinely validated. We’ve taken this recent event as an opportunity to validate ours internally, and we have been working with our customers to remind them of the options CloudSwitch gives them to protect data and make systems more resilient.

By Dave Armlin, Director of Customer Support at CloudSwitch

More Stories By Ellen Rubin

Ellen Rubin is the CEO and co-founder of ClearSky Data, an enterprise storage company that recently raised $27 million in a Series B investment round. She is an experienced entrepreneur with a record in leading strategy, market positioning and go-to-market efforts for fast-growing companies. Most recently, she was co-founder of CloudSwitch, a cloud enablement software company, acquired by Verizon in 2011. Prior to founding CloudSwitch, Ellen was the vice president of marketing at Netezza, where, as a member of the early management team, she helped grow the company to more than $130 million in revenues and a successful IPO in 2007. Ellen holds an MBA from Harvard Business School and an undergraduate degree magna cum laude from Harvard University.