|Have a ‘recovery’ plan before disaster strikes|
LAKSHMAN NARAYANASWAMY, CO-FOUNDER, SANOVI TECHNOLOGIES, BANGALORE.
D. Murali
Early in the morning, after a cup of hot coffee at home, when I meet Lakshman Narayanaswamy in Nageswara Rao park, the topic of discussion is not fitness or finance, but disaster recovery. Lakshman is the co-founder and VP-Products of Bangalore-based Sanovi Technologies (www.sanovi.com), which helps organisations ‘proactively manage disaster recovery (DR) environments.’
For starters, ‘disaster recovery’ is all about an organisation’s ability to recover and operate core processes/services that enable it to transact business with its customers and partners, explains Narayanaswamy during the course of a subsequent e-mail interaction with eWorld.
Excerpts from the interview.
What are the misconceptions and myths to watch for, in the DR space?
There are several misconceptions that managers must be aware of when planning for their organisation’s DR:
We have not been hit by a disaster, these things don’t happen here.
We are replicating our data, that is our DR plan.
My team of experts can bring up our infrastructure in case of need.
DR is for big companies; my company can postpone the decision.
It is a well-researched and documented fact that 90 per cent of outages are caused by banal happenings, such as someone tripping on a cable, a wrong configuration, a cable plugged into the wrong port, or a patch upgrade not working as expected. Less than 10 per cent of outages are caused by fire, flood, etc. The everyday, ordinary happenings may be low impact but cause business disruption nevertheless, resulting in loss of productivity and customer satisfaction.
Is there a quick test to find out if a business needs to consider DR? Also, is there an easy way to do DR?
A quick test that determines the need for DR can be summed up in two questions:
What are the critical business processes and related IT systems?
What is the impact to business if critical IT systems become unavailable?
Approximate answers to the above questions can reveal how dependent the business is on its critical processes and systems. A more thorough analysis can help understand the financial exposure and any other exposures, such as regulatory penalties, loss of reputation and loss of productivity, that adversely impact the business.
In terms of IT recovery readiness, what do you see as lessons from successful organisations?
Organisations that have deployed DR have the following common traits:
DR readiness has executive level sponsorship.
Recovery readiness is well integrated into their business process; it is not an add-on after-thought.
Focus is on inspecting, measuring and reporting on recovery metrics on a regular basis.
Reduced dependency on people; increased level of automation and processes that can be followed.
Are there levels of maturity in disaster recovery preparedness?
Gartner has described a DR maturity model that is useful to understand where a company stands and how it can progress towards a more mature DR capability. The lowest on the maturity chart is when there is no documented DR capability in the organisation.
Stage I is when DR is taken up as a project, some of the critical business processes and IT applications are identified and a DR capability is built and demonstrated. After the project is completed, the DR capability cannot be relied on.
Stage II is when DR readiness is implemented as a business process. This means that critical processes and IT systems have been identified and DR for these processes has been set up. Further, DR readiness is also tested, and areas that failed are identified and fixed. Also, in this stage companies work on including business users in the testing routine, to ensure that the continuity plan accounts for business processes and people, along with their needs and roles.
The final stage of maturity is when DR becomes integral to key business processes. The organisation has a strong focus on DR and it is part of a larger risk management group in the organisation.
Monitoring, testing and regular reporting on compliance with key recovery metrics are part of the reporting to the executive management. Further, the scope of testing includes business users and key partners and vendors.
The organisation as a whole approaches recovery readiness as part of its planning and operations; it is no longer an add-on that is done after a business process has been put in place.
What are the areas of DR that attract research and innovation?
DR readiness touches several facets of an organisation and has a life-cycle that it goes through. The key stages of the DR life-cycle are DR planning, DR solution design and provisioning, DR monitoring and validation, recovery, testing and reporting.
Typically a DR plan fits into a larger business continuity plan for the organisation. There are two key metrics that dictate recovery readiness. The recovery point is the amount of data that an organisation is willing to lose in case of an outage; and the recovery time is the maximum amount of time an organisation can wait for an IT system/application to come up.
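To make the two metrics concrete, here is a minimal Python sketch of how an outage or drill could be scored against them. It assumes the recovery point is expressed as a window of time (data written since the last successful replication is lost). The names `RecoveryObjectives` and `evaluate_outage`, and all the sample values, are illustrative assumptions rather than anything prescribed in the interview.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets agreed with the business (hypothetical values set by the caller)."""
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable time to restore service


def evaluate_outage(objectives, last_replicated_at, outage_at, restored_at):
    """Compare what was achieved in an outage or drill with the objectives."""
    achieved_rp = outage_at - last_replicated_at  # data written in this window is lost
    achieved_rt = restored_at - outage_at         # time the service was unavailable
    return {
        "achieved_recovery_point": achieved_rp,
        "achieved_recovery_time": achieved_rt,
        "rpo_met": achieved_rp <= objectives.rpo,
        "rto_met": achieved_rt <= objectives.rto,
    }


if __name__ == "__main__":
    # Illustrative targets: lose at most 15 minutes of data, be back within 2 hours.
    objectives = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=2))
    result = evaluate_outage(
        objectives,
        last_replicated_at=datetime(2010, 3, 17, 9, 50),
        outage_at=datetime(2010, 3, 17, 10, 0),
        restored_at=datetime(2010, 3, 17, 12, 30),
    )
    for name, value in result.items():
        print(f"{name}: {value}")
```

In this made-up case the recovery point target is met (10 minutes of data lost against a 15-minute allowance) while the recovery time target is missed (2.5 hours against 2 hours).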
There are many options and ongoing research on reducing the recovery point and recovery time. Traditionally, as recovery point came close to zero, the cost of the DR solution including the hardware and software became higher.
New technologies and data protection methods have reduced cost and made DR solutions with a close-to-zero recovery point quite viable. Examples of enabling technologies are virtual tape libraries, continuous data protection (CDP) solutions, and adaptive asynchronous replication.
Given the complexity and heterogeneous nature of the data centre, depending on people to recover complex applications demands diverse skill sets.
DR recovery and failover automation tools that are DR-aware and can coordinate the bring-up of the various dependencies of an application are now available and deliver over 80 per cent reduction in recovery time.
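Coordinating the bring-up of an application's dependencies is, at its core, an ordering problem. The sketch below shows one simple way to derive a start order using the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The component names and the `DEPENDENCIES` map are hypothetical, and this is not a description of how any particular automation tool works.

```python
from graphlib import TopologicalSorter

# Hypothetical application stack: each component maps to the components
# that must already be running before it can start.
DEPENDENCIES = {
    "storage": set(),
    "network": set(),
    "database": {"storage", "network"},
    "app_server": {"database"},
    "web_frontend": {"app_server", "network"},
}


def bring_up_order(dependencies):
    """Return a start order in which every component follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def bring_up(dependencies, start):
    """Start components in dependency order; `start` is a caller-supplied callable."""
    started = []
    for component in bring_up_order(dependencies):
        start(component)
        started.append(component)
    return started


if __name__ == "__main__":
    bring_up(DEPENDENCIES, start=lambda name: print(f"starting {name}"))
```

If the dependency map contains a cycle, `static_order()` raises `graphlib.CycleError`, which is a useful early warning that the documented recovery steps cannot be carried out as written.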
Monitoring and validation are recent innovations that help IT managers increase their DR readiness. Customers now have the tools to monitor recovery metrics, such as recovery point, on a real-time basis; this is very powerful, as it gives the user a real-time view of recovery readiness instead of requiring a drill to measure it.
Along with the monitoring of recovery metrics, validation of the primary and DR environment is a huge pro-active step that can make the difference between successful recovery and failure. IT managers find keeping primary and DR environments in sync to be a constant challenge.
Having, therefore, a tool that alerts them of changes in the various layers of the stack makes a dramatic difference to recovery readiness. (For instance, Sanovi DRM is a DR management software that takes a life-cycle approach to DR readiness; it provides capabilities to provision industry best-practice DR solutions, monitor RPO and RTO, automate recovery and DR drills and obtain comprehensive reports on compliance.)
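As a rough illustration of what such validation involves, the Python sketch below compares two inventories of settings, one collected from the primary site and one from the DR site, and reports every difference. The flat key/value inventories and the function name `find_drift` are assumptions made for this example; a real tool would have to collect this data from each layer of the stack.

```python
def find_drift(primary, dr):
    """Return {key: (primary_value, dr_value)} for every setting that differs.

    A key missing on one side is reported with None for that side.
    """
    drift = {}
    for key in sorted(primary.keys() | dr.keys()):
        p, d = primary.get(key), dr.get(key)
        if p != d:
            drift[key] = (p, d)
    return drift


if __name__ == "__main__":
    # Hypothetical inventories, keyed by "layer.setting".
    primary = {"os.patch_level": "12", "db.version": "10.2", "app.build": "451"}
    dr_site = {"os.patch_level": "11", "db.version": "10.2"}
    for key, (p, d) in find_drift(primary, dr_site).items():
        print(f"{key}: primary={p!r} dr={d!r}")
```

Run on a schedule, a check of this kind turns "are the two sites still in sync?" from a question answered during a drill into one answered continuously.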
Would you like to talk about the impact of cloud computing and other newer developments on DR?
Virtualisation and cloud computing are major developments that impact and influence how DR is done. Server virtualisation can eliminate some of the challenges in a traditional DR setup, such as keeping the OS environment in sync between the primary and DR since the complete machine is replicated to the DR on a periodic basis.
The definition and offering in a cloud model is a large enough scope to warrant a dedicated discussion. In summary, cloud computing approaches the DR challenge in a different manner. Irrespective of the underlying technologies, an infrastructure cloud is assumed to offer an in-built DR capability that can meet specific recovery metrics. This is a fast-evolving area and sure to offer innovative solutions at very attractive cost points.
|Do boardrooms crack the disaster recovery puzzle?|
The top find for ‘disaster’ in Google News, among the 50,755 results at the time of writing this, is the sombre observation by the Railways Minister Mamata Banerjee about the lack of a disaster management system in Kolkata. Thankfully, the Railway disaster management team came in from Howrah and Sealdah to assist the trapped people at the Park Street building.
Perhaps, the tragic blaze is one more grim reminder of the indispensability of disaster recovery (DR) readiness. The topic needs to get more boardroom attention and support, agrees Lakshman Narayanaswamy, Co-Founder and VP, Products, Sanovi Technologies, Bangalore (www.sanovi.com).
Risk management/mitigation gets its due attention and support only when the executive management understands its role and integrates it into the business, he adds, during an email interaction with Business Line.
“A good rule of thumb for executives to gauge their company’s commitment to DR or BCP (business continuity planning) is to ask themselves if they have seen a BCP/DR status update in the last quarter. If they have not, it augurs DR is not getting the boardroom attention it deserves.”
A good way to sensitise the management on the need for DR is to talk to them about impact of downtime and what is acceptable, suggests Lakshman. Often, it is the translation of business outage to financial impact that really brings the focus on the need for investing in DR and giving it the required boardroom attention, he adds.
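The translation from outage to financial impact can start as simple arithmetic. The Python sketch below is a deliberately rough, first-cut estimate that counts only lost revenue and idle staff time; every figure and parameter name is a hypothetical input, and it leaves out exposures such as regulatory penalties and reputational damage.

```python
def downtime_cost(outage_hours, revenue_per_hour, staff_affected, cost_per_staff_hour):
    """Rough financial impact of an outage: lost revenue plus idle staff cost."""
    lost_revenue = outage_hours * revenue_per_hour
    lost_productivity = outage_hours * staff_affected * cost_per_staff_hour
    return lost_revenue + lost_productivity


if __name__ == "__main__":
    # Hypothetical inputs: a 4-hour outage, 50,000 per hour in revenue,
    # 200 staff idle at a loaded cost of 30 per hour each.
    print(downtime_cost(4, 50_000, 200, 30))  # 200,000 + 24,000 = 224000
```

Even an estimate this crude gives the board a number to weigh against the cost of DR, and a basis for deciding how much downtime is acceptable.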
Excerpts from the interview:
What should be the DR objectives of financial institutions?
Financial institutions in India are mandated by regulatory authorities to have a business continuity and disaster recovery plan for their critical business processes. Besides meeting their regulatory obligations, financial institutions also have to reinforce their service commitment to their customers and partners by demonstrating transparency, reliability and trust.
Declaring their commitment to protecting customer information and providing uninterrupted services by investing in disaster recovery readiness are concrete steps that a business can take to reduce risk. Risk mitigation through DR planning should be a visible step for every financial institution.
Successful DR is the coming together of process, people and technology. Organisations must focus on enabling all three aspects to ensure a successful DR programme. An organisation’s DR needs are best served once it commits to putting the right structure in place; this enables the right level of visibility at the board and management level.
A Chief Risk Officer, who is separate from the IT Head, usually reports to the COO or the CEO. The office of the risk officer is charged with putting in place process and technologies to ensure that the financial institution is aware of the risks the business faces and the appropriate responses to various situations. The risk officer also facilitates the participation of various business units in the DR readiness process.
Where do you find the maximum investments happening, as regards DR preparedness?
Several companies understand the need for disaster recovery for critical IT applications. The immediate and big investment happens for infrastructure, capital expenditure on hardware, software, network and data centre to enable DR readiness. While this is the enabler for DR readiness, one must not stop after putting the hardware and software in place; this would be akin to buying a car and not accounting for the petrol required to utilise it.
Putting together a DR plan that works is similar to assembling a puzzle. One of the common actions IT takes is to invest in a data replication technology and feel that the company has DR in place. Replication is only one piece of the puzzle; after the data are available on the DR site, they still have to be consistent and the application should be recoverable.
Do you notice enterprises in the BFSI space often dangerously ignoring a few key pieces of DR?
After the business has approved a DR plan and the required spend for DR infrastructure has been done, there is an alarming gap in how soon business expects IT to recover versus what the operations are able to deliver. This is largely because reporting on DR readiness as an on-going metric has not been funded or accounted for.
DR monitoring and testing are key “last mile” links that make the difference between being able to recover when an outage happens and struggling to recover. Regulation mandates that critical applications be tested at regular intervals. As the number of applications grows, this becomes a herculean task that is not given the resources and time it needs.
Without regular testing, the DR team does not have the confidence that recovery is meaningful. Without investing in DR management, the DR manager does not have the visibility and the tools to be confident about recovery readiness.
Are you happy with the level of disclosures corporates make about their DR capabilities?
The current level of disclosures and transparency is not adequate for the consumers to feel confident their interests are being taken care of. The banking industry is driven by regulation, so banks have to submit to the RBI their DR readiness status every six months; ideally this information should be available to the consumer also. This directly reflects on the organisation’s commitment to protecting customer information and providing uninterrupted services.
The consumer can then take an informed decision on which organisation they want to transact with.
As companies become more dependent on IT for their critical functions, I would ideally like to see companies advertise the recovery metrics that they are committing to. As an example, I want my bank to advertise that it will not lose any information and that its core services will be restored in less than two hours if they are impacted by an outage.
Any other points of interest.
DR is often perceived by the management as a costly and rarely-used indulgence. A DR plan need not always be based on heavy infrastructure spend. The business must prudently consider all possible risk scenarios and make a conscious decision on which ones the organisation wants to respond to immediately, and which other risks it wants to develop a response to as the business grows or as certain business milestones are reached.
A simple DR plan is not necessarily an inadequate plan; instead, not having a plan is inexcusable. As an organisation makes the transition from manual and paper-based business process to IT-enabled business process, it must carefully evaluate which of the processes and related IT systems need to have a DR plan and the potential cost of doing so.
Another viable method of justifying the spend on DR is to utilise the DR infrastructure to share load as the business grows. A recovery capability that is predictable enables an agile IT organisation. Typically, if the production services go down, IT managers prefer to spend time fixing them rather than invoke recovery on to the DR site to continue services while the root cause of the primary outage is fixed.
With a recovery plan that is predictable, IT managers need not spend that time fixing the primary problem first; they can start services on the DR site, get the business going, and then spend their effort on fixing issues on the primary side.
There are several creative ways of planning and implementing a recovery strategy that meets business goals and budget needs. The important first step is to commit to enabling DR capability for the organisation.
The next time the opportunity presents itself, ask your IT head or COO the following questions to gauge your organisation’s DR readiness: Do we have a DR plan? When was the last time we tested the plan? If some or all of our core business processes/IT systems go down, do we understand their impact on the business?
Sanovi DRM enables the bank’s treasury application to be recovery ready
March 17, 2010 /India PRwire/ — Sanovi Technologies, one of the leading players in the Disaster Recovery Management and business continuity space, has announced the implementation of the Sanovi DR Suite at ING Vysya Bank.
ING Vysya Bank Ltd. is an entity formed in October 2002 by the coming together of the erstwhile Vysya Bank Ltd., a premier bank in the Indian private sector, and ING, a global financial powerhouse of Dutch origin. As at the end of December 2009, ING had total assets exceeding 1,164 billion euros, employed over 110,000 people and served over 85 million customers across 40 countries. This global identity, coupled with the backing of a financial powerhouse and the status of being the first Indian International Bank, has given added strength and dimension to ING Vysya Bank.
Sanovi offers solutions to proactively manage disaster recovery (DR) environments to ensure that business applications can be recovered in compliance with service level agreements. As the leading independent provider of DR management software solutions, Sanovi is focused on ensuring recovery readiness.
The bank’s treasury department uses Kondor from Reuters for its treasury-related applications. To meet the Reserve Bank of India’s regulatory prescriptions on enabling disaster recovery (DR), ING Vysya Bank was in search of a suitable DR solution.
“Given the nature of treasury applications, it is a complex implementation with multiple databases, multiple application processes on multiple servers and multiple start/stop scripts, using off-the-shelf host- or storage-based replication. Our expertise in creating tremendously challenging business continuity DR solutions for the banking sector has helped ING Vysya in implementing best-of-breed DR management solutions”, says Subramanian Parameshwaran, CEO, Sanovi Technologies.
Commenting on the project implementation, CVG Prasad, CIO of ING Vysya Bank said “We are very impressed with the coverage provided by Sanovi’s DR software on all aspects of DR life cycle. The Treasury application is DR ready thanks to the automation and dashboard provided by Sanovi’s software”.
Sanovi identified the areas of the application that needed work to become DR ready, such as finding and removing hard-coded IP address references, and documented all the steps for failover and DR drills so that they could be automated.
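Hard-coded IP addresses matter because a script that points at a primary-site address will not work unchanged at the DR site. The Python sketch below shows one simple way to flag IPv4 literals in the text of a configuration file or script. It is an illustration only, not Sanovi's method; the function name and sample text are hypothetical, and dotted version strings that happen to look like valid addresses (for example `1.2.3.4`) would still be reported and need manual review.

```python
import ipaddress
import re

# Candidate dotted quads; each match is validated below so that strings
# such as "999.1.1.1" are not reported as addresses.
_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def find_hardcoded_ips(text):
    """Return (line_number, address) pairs for IPv4 literals found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _CANDIDATE.finditer(line):
            try:
                ipaddress.IPv4Address(match.group())
            except ipaddress.AddressValueError:
                continue  # looks like a dotted quad but is not a valid address
            hits.append((lineno, match.group()))
    return hits


if __name__ == "__main__":
    # Hypothetical start-script fragment.
    sample = "\n".join([
        "db_host = 10.20.30.40",
        "# value 999.1.1.1 is not an address",
        "listen = app-primary.example.internal",
        "backup_target=192.168.5.17:9000",
    ])
    for lineno, address in find_hardcoded_ips(sample):
        print(f"line {lineno}: {address}")
```

Each hit is a candidate for replacement with a host name or a site-specific configuration value, so that the same script can run at either site.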
Sanovi DRM completely automates the switchover and switchback tests and ensures zero data loss when services transition from the primary to the DR site and then back to the primary.
Sandeep Kaul, Unit Head, IT Service Delivery, ING Vysya Bank, who was involved in the implementation of the DR software, commented, “Sanovi DRM exceeded our expectations. We were able to deploy the DR solution for our treasury application and do a DR drill, all within a week. I thank Sanovi for their professional competence and customer centricity”.
Sanovi was founded in 2002 to help organizations proactively manage disaster recovery (DR) environments and assure business managers that business applications can be recovered in compliance with service level agreements. As the leading independent provider of DR management software solutions, Sanovi is focused on ensuring recovery readiness across all levels of IT infrastructure.
Sanovi DR Management Suite is the leading management solution for aligning DR infrastructure with Recovery Time and Recovery Point objectives. Sanovi DRM™ combines monitoring, reporting, testing and workflow automation capabilities of complex IT infrastructure into a scalable, easy-to-use solution built on industry standards. The result is a unified disaster recovery management product family that delivers DR readiness validation and offers clear business and operational advantages.