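Measurement and Management

When emphasizing the importance of using data in management, one saying is quoted often: if you cannot measure it, you cannot manage it effectively.

Some people take this idea and conclude that having enough data is all it takes to solve a problem. Once a large amount of data has been collected and accumulated, the hope is that the key insight will emerge and we will know exactly what to do next.

But perhaps the first thing to do is ask this question: what outcome do we ultimately most want to see, and in what environment/reality will we need to work?

After determining the management outcome we want, we then ask what data we need and whether the investment in collecting the data would exceed the benefit.

Before measuring, ask the right questions first. The data collected will be more meaningful.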

Data, Analytics, Decisions

A typical workflow shows data leading to analytics, which in turn leads to decision-making.

That might make sense when data are what we have on hand and we ask ourselves… What can we do with the data?

Perhaps it is more instructive to look at the workflow in reverse.

The more critical, first question to ask is… What decisions are we trying to make?

That question leads to… What analysis or analytics output do we need to support the decision-making?

Which then leads to… What data do we have on-hand that can support the analytics? If not, where can we go to find the data we need for the analytics?
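The reverse workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the decision, the analytics outputs, and the data set names are all hypothetical, and `plan_data_needs` is a helper made up for this example.

```python
# Hypothetical example: work backward from one decision to the data it needs.
# Every name and value below is an illustrative assumption.
decision = "Should we add staff to the service desk?"

# Analytics outputs that would support the decision, and the data each requires.
analytics_needed = {
    "ticket volume trend": ["ticket open dates"],
    "average resolution time": ["ticket open dates", "ticket close dates"],
    "first-contact resolution rate": ["ticket reassignment history"],
}

# Data sets we already have on hand.
data_on_hand = {"ticket open dates", "ticket close dates"}


def plan_data_needs(analytics, available):
    """Split the required data into what we have and what we must go find."""
    required = {item for inputs in analytics.values() for item in inputs}
    return sorted(required & available), sorted(required - available)


have, missing = plan_data_needs(analytics_needed, data_on_hand)
print("Decision:", decision)
print("On hand:", have)
print("Need to find:", missing)
```

Starting from the decision keeps the list of required data short. Anything that does not feed an analytics output tied to the decision never makes it onto the list.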

Data science experts encourage us to ask questions first before doing the analytics.

The important question is always: what decision is this for?

Measurement and Management

“If You Can’t Measure It, You Can’t Manage It” is frequently quoted when emphasizing the importance of managing with data, not with intuition alone.

Some took the quote and ran with metric-gathering exercises. The data-gathering effort accumulates tons of data in the hope of finding the key insight that shows what the organization ought to be doing next.

Perhaps the question to ask first is this. What outcomes do we all want to see, and in what context/reality do we need to work?

After determining the outcome/process to manage, ask what data we need and whether the investment in collecting the data might outstrip the benefit.

Ask the hard questions first. The resulting data collected will be a lot more meaningful.

ITSM Tools Operation Continuity Plan – Part One

If your IT organization is like most others, you rely heavily on your IT service management (ITSM) tools for delivering IT services to your business customers or constituents. Many IT shops also have a comprehensive suite of ITSM tools they use as part of the various aspects of their operation. It is my personal belief that the ITSM tools operate like the ERP systems for many businesses – the ITSM tools are providing a critical service to the IT organizations.

While many ITSM products on the market today are well made and come with industrial-strength resiliency, technology failures or other disasters can still cause the tools to become unavailable. When a tool outage stretches from minutes into hours or days, you need a continuity plan to get the tool services restored so your IT organization’s operations can continue normally without further hindrance. The ITSM tools operation continuity plan need not be fancy or sophisticated, but it does need to be well thought out, with as many details called out beforehand as possible. This is part one of a two-part post where we will go over what components should go into such a plan.

Introduction
This section provides an overview of the plan. Why does the plan exist? Who is the owner accountable for drafting and/or executing the plan? How will the plan be maintained and tested for validity or accuracy? Any other high-level overview information about the plan is helpful to include in this section.

Invocation
This section describes the conditions that will trigger invoking this plan. It is important to point out who will be authorized to invoke and to implement this plan. It is also important to outline the availability requirements and targets once the plan is invoked.
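As a rough illustration, the invocation rule could be written down as precisely as a small function. The two-hour threshold and the role names below are assumptions made up for this sketch; your plan would state its own.

```python
from datetime import timedelta

# Assumed values for illustration only; a real plan defines its own.
INVOCATION_THRESHOLD = timedelta(hours=2)
AUTHORIZED_ROLES = {"IT Director", "Service Desk Manager"}


def should_invoke(outage_duration, requester_role):
    """Return True only when the outage is long enough AND the requester is authorized."""
    long_enough = outage_duration >= INVOCATION_THRESHOLD
    authorized = requester_role in AUTHORIZED_ROLES
    return long_enough and authorized


print(should_invoke(timedelta(minutes=20), "IT Director"))   # short outage
print(should_invoke(timedelta(hours=3), "IT Director"))      # long outage, authorized
print(should_invoke(timedelta(hours=3), "Desktop Analyst"))  # long outage, not authorized
```

The point is not to automate the decision. Writing the trigger and the authorized roles this explicitly leaves no room for debate during an actual outage.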

Scope
This section describes the ITSM modules, systems, infrastructure, services and facilities that will be part of this plan. A number of ITSM systems do not operate in isolation these days, so identifying all components required for a functional ITSM system could be daunting. That is OK. Just have a boundary in mind and do your best. If possible, include and provide as much information on the infrastructure that hosts the ITSM products as feasible. This could include the actual server names, databases, and other components deemed essential and critical for the operations of the ITSM tools. If you have a CMDB with the relationships documented, those relationships between the system components and your plan should be consistent with each other.

Depending on your operation, not all ITSM modules need to be part of this continuity plan. For example, I surmise tool modules or services such as Incident Management, Change Management, or anything the Service Desk uses could be high on the priority list to get restored ASAP. The Problem Management module can probably wait and get restored as part of the normal system recovery cycle.
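If your CMDB relationships can be exported, one way to draft the scope is to walk the dependencies of the priority modules. The sketch below assumes a simple "depends on" mapping. The server names and relationships are hypothetical, and it does not use any real CMDB product's API.

```python
from collections import deque

# Hypothetical CMDB export: each item maps to the components it depends on.
depends_on = {
    "Incident Management": ["itsm-app-01", "itsm-db-01"],
    "Change Management": ["itsm-app-01", "itsm-db-01"],
    "Problem Management": ["itsm-app-02", "itsm-db-01"],
    "itsm-app-01": ["itsm-db-01", "load-balancer"],
    "itsm-app-02": ["itsm-db-01"],
    "itsm-db-01": ["storage-array"],
}


def components_in_scope(priority_modules, relationships):
    """Collect every component the priority modules depend on, directly or indirectly."""
    seen = set()
    queue = deque(priority_modules)
    while queue:
        item = queue.popleft()
        for dependency in relationships.get(item, []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)
    return sorted(seen)


scope = components_in_scope(["Incident Management", "Change Management"], depends_on)
print(scope)
```

In this made-up data, `itsm-app-02` serves only the Problem Management module, so it stays out of the continuity scope and can wait for the normal recovery cycle.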

Data Dependencies and Considerations
This section includes comments about the data requirements that need to be met before the recovery plan can be implemented. What data is needed for the recovery and what preparation activities are required to get the data in place? This is more than just calling out what database servers are needed for recovery, which should have been discussed in the Scope section. I am talking about things such as how current the data need to be before the recovery procedures can be executed. Another consideration is how the data that was captured during the recovery phase will be incorporated back into the main database once the original systems come back online. The key objective here is not to lose data, during recovery and post recovery.
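The "how current" requirement can be expressed as a simple check before the recovery procedures start. This is a minimal sketch: the four-hour limit and the timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed requirement for illustration: recovery data may be at most 4 hours old.
MAX_DATA_AGE = timedelta(hours=4)


def data_is_current_enough(last_backup_at, recovery_started_at, max_age=MAX_DATA_AGE):
    """Return True when the most recent data copy is within the allowed age."""
    return recovery_started_at - last_backup_at <= max_age


recovery_start = datetime(2024, 1, 15, 12, 0)
print(data_is_current_enough(datetime(2024, 1, 15, 9, 30), recovery_start))  # 2.5 hours old
print(data_is_current_enough(datetime(2024, 1, 14, 23, 0), recovery_start))  # 13 hours old
```

If the check fails, the plan should say what happens next, for example which preparation activity brings the data up to date before recovery proceeds.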

Security and Access Considerations
This section includes the important details about security and access-related matters. For example, what access rights will your systems and personnel require in order to fully execute the plan? Often we put the security and access considerations on the back burner and forget about them. During the recovery phase, things are not working as expected and, after many rounds of discussion and troubleshooting, we realize the security access might be preventing things from working. Don’t put yourself in the position of being unprepared and wasting time. Figure out those security and access details beforehand and document them in the plan.
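One way to catch access gaps before a real recovery is to compare what the plan requires against what has actually been granted. The account and permission names below are hypothetical. In practice the granted list would come from whatever directory or access system you use.

```python
# Hypothetical access requirements documented in the plan.
required_access = {
    "recovery_admin": {"db_restore", "app_server_login"},
    "itsm_service_account": {"db_read_write", "mail_relay_send"},
}

# Hypothetical snapshot of what is currently granted in the recovery environment.
granted_access = {
    "recovery_admin": {"app_server_login"},
    "itsm_service_account": {"db_read_write", "mail_relay_send"},
}


def find_access_gaps(required, granted):
    """Return the missing permissions per account; an empty dict means no gaps."""
    gaps = {}
    for account, needed in required.items():
        missing = needed - granted.get(account, set())
        if missing:
            gaps[account] = sorted(missing)
    return gaps


print(find_access_gaps(required_access, granted_access))
```

Running a comparison like this as part of routine plan testing surfaces the missing rights while there is still time to fix them.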

External Dependencies and Considerations
This section calls out the systems, infrastructure, services, facilities, or interfaces that are external to the ITSM system but have inter-system dependencies that should be documented. Essentially, anything that has not been identified in the Scope section but is still required for recovery should be mentioned here. That way, all dependent systems and the nature of each dependency can be identified and taken into account during the plan execution. For example, we might want to include information about the email system and its key interface points because most ITSM systems rely on email for communication.

That is all for now. In the next post, we will conclude the discussion of the plan and cover the remaining topics:

  • Recovery Team and Communication
  • Recovery Procedure and Configuration Details
  • A Checklist of Key Actions or Milestones
  • Testing and Validation
  • Return-To-Normal-Operation Procedure