Enterprise Architect  

Strategies for Operational Risk Management
EA documentation simplifies management of operational risks.
by Vineet S. Rajput

September 15, 2004

In the uncertain world of business, every organization finds itself in a state of continuous risk management. This involves all kinds of risk, including credit risk, market risk, and operational risk. Credit and market risk have long been the key factors in evaluating corporations' financial positions. Recently, though, operational risk has started getting the attention it deserves.

Operational risk can be defined as the amount of exposure an organization has as a result of its operational structure. This includes risk due to processes, organizations, and technologies. Operational risks expose an organization's business to loss. They might arise from natural catastrophes such as floods and earthquakes, man-made catastrophes such as 9/11, or minor events such as a fraudulent transaction or system failure. Any time an organization's operations can fail, an operational risk exists.

Businesses as well as regulatory bodies and governments are focusing on managing operational risk. The Basel Committee on Banking Supervision, hosted by the Bank for International Settlements, has made operational risk a factor in determining capital adequacy. In a way, the Sarbanes-Oxley Act (see Resources) is aimed at guarding against operational inadequacies that result in false financial reporting. The Health Insurance Portability and Accountability Act (HIPAA) aims to ensure operational practices that protect patients' information. The Gramm-Leach-Bliley Act (GLBA) and the Patriot Act also attempt to ensure operational practices that reduce operational risks for financial institutions.

Managing operational risk is not only a regulatory need. An organization that demonstrates sound risk management practice will also achieve greater shareholder value through superior capital efficiency, data and risk management uniformity, enhanced credit ratings, reduced operational losses, and an improved credit risk-return profile. Such organizations are also more likely to be rewarded by customers with higher confidence and more of their business.

The regulatory needs and current business environment have prompted a slew of technology products, all of which claim to help with compliance with the regulations. Although some technology will eventually be necessary, these are essentially business issues and need a business solution. Enterprise architecture (EA) at its fullest is about defining a comprehensive business and technology strategy to achieve organizational goals. You can apply this approach and related tools to operational risk management.

Operational Risk Defined
The Basel committee has defined operational risk in detail for financial institutions, but much of the definition applies to other institutions as well. Other organizations might not be susceptible to all the operational risks identified by the new Basel Accord (also called Basel II), or might have some additional operational risk exposures. But for the purpose of this article, I will concentrate only on the operational risk aspects of Basel II.

Basel II defines operational risk as "risk of loss due to inadequate or failed internal processes, people, or systems, or from external events. This definition includes legal events but excludes strategic and reputational risk."

The Basel Capital Accord emphasizes active risk management and suggests three approaches. One of the approaches, the Advanced Measurement Approach (AMA), provides institutions with maximum flexibility and benefits. However, to qualify for AMA, institutions must demonstrate an active and adequate risk management practice in all areas of operational risk. These areas include internal fraud, external fraud, employment practices, customer practices and product design, system failure, physical damage to assets, and execution processes.

The management of risk involves accurate measurement or assessment of existing risks along with a well-defined mitigation approach commensurate with the risk exposure. Unfortunately, there has been precious little work done in the area of quantifying and mitigating operational risks.

Understand Risk
Not all risks are created equal. There is a big difference between a PC failing on a programmer's desk and a PC failing on a trader's desk. The first might be a minor inconvenience, while the second might impact customers and might even create millions of dollars in actual loss.

Traditionally, a risk has been quantified based on its impact and its probability of occurrence. These factors have been used to compute the net impact of the risk. If failure of a trader's PC has a potential impact of $10 million and the probability of its occurrence is 0.0005 (1 hour in a 2,000-hour work year), its expected risk exposure is $5,000.
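The arithmetic in the trader example can be sketched in a few lines of Python. The function and parameter names are illustrative assumptions; only the figures come from the example above.

```python
def expected_exposure(impact, hours_down, hours_per_year):
    """Weight the impact of a failure by its probability of occurrence."""
    # Probability is taken as the fraction of the work year lost to the failure.
    probability = hours_down / hours_per_year
    return impact * probability


# Figures from the trader PC example: $10 million impact, 1 hour in 2,000.
trader_pc = expected_exposure(impact=10_000_000, hours_down=1, hours_per_year=2_000)
print(f"Expected exposure: ${trader_pc:,.0f}")  # prints "Expected exposure: $5,000"
```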

These simplistic factors are usually used to evaluate the response to the risk. In this case, a spare $2,000 PC might be an appropriate response; however, creating a $10,000 fail-safe environment for that one trader might be overkill.
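One rough way to express this comparison is to check each response's cost against the exposure it removes. The sketch below assumes, for simplicity, that either response would eliminate the full $5,000 exposure; the function name and the decision rule are illustrative, not a prescribed method.

```python
def worth_buying(response_cost, exposure_removed):
    """A response pays for itself only if it costs less than the exposure it removes."""
    return response_cost < exposure_removed


exposure = 5_000  # expected exposure from the trader PC example

# Costs of the two candidate responses mentioned in the text.
options = {"spare PC": 2_000, "fail-safe environment": 10_000}

# Assumption: each response removes the entire exposure.
verdicts = {name: worth_buying(cost, exposure) for name, cost in options.items()}

for name, justified in verdicts.items():
    print(f"{name}: {'justified' if justified else 'overkill'}")
```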

However, unless you figure in the organization's detection and response capabilities, the picture of exposure stays incomplete. The ability to detect a threat before it can cause the impact will greatly reduce the potential damage. Similarly, once the threat occurs, organizational ability to respond will determine the true impact.

Assess Risks
All these factors are traditionally utilized in a Failure Mode and Effects Analysis (FMEA) to evaluate the overall risk profile of a complex system. The FMEA involves enumerating the system's various failure modes and then evaluating each mode for impact, probability of occurrence, and potential of detection.
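A common FMEA convention, not spelled out in this article, is to score each failure mode from 1 to 10 on the three factors and multiply the scores into a risk priority number (RPN). The sketch below assumes that convention; the failure modes and their scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # impact: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # probability: 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (practically undetectable)

    @property
    def rpn(self) -> int:
        """Risk priority number: the product of the three scores."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes with made-up scores, for illustration only.
modes = [
    FailureMode("Trader PC failure", severity=8, occurrence=3, detection=2),
    FailureMode("Fraudulent transaction", severity=7, occurrence=4, detection=6),
    FailureMode("Data center flood", severity=10, occurrence=1, detection=1),
]

# Rank the failure modes so the highest RPN gets attention first.
ranked = sorted(modes, key=lambda mode: mode.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN {mode.rpn}")
```

Note that the flood ranks last here despite its maximum severity, which hints at why a purely multiplicative score can understate rare, catastrophic events.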

To understand the risks of a complex system, such as an enterprise or any part thereof, you must understand:

  • Its parts.
  • Interaction of the parts.
  • Failure modes of the parts and interactions.
  • Impact of each part/interaction failure.

Enterprise architecture documentation provides critical information about the parts and interactions thereof and thus creates the foundation of any such risk analysis.
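As a minimal sketch of how documented parts and interactions feed such an analysis, the example below records which parts depend on which and then walks those dependencies to find everything affected when one part fails. The part names and dependencies are hypothetical, and a real EA repository would hold far richer information.

```python
from collections import deque

# Hypothetical interactions: each part maps to the parts it depends on.
depends_on = {
    "trade execution": ["order entry system", "trader desk"],
    "order entry system": ["database server"],
    "settlement": ["trade execution", "database server"],
    "trader desk": [],
    "database server": [],
}


def impacted_by(failed_part, depends_on):
    """Return every part that directly or indirectly depends on failed_part."""
    # Invert the mapping so we can look up the dependents of each part.
    dependents = {}
    for part, needs in depends_on.items():
        for need in needs:
            dependents.setdefault(need, []).append(part)

    # Breadth-first walk outward from the failed part.
    seen, queue = set(), deque([failed_part])
    while queue:
        current = queue.popleft()
        for part in dependents.get(current, []):
            if part not in seen:
                seen.add(part)
                queue.append(part)
    return seen


print(sorted(impacted_by("database server", depends_on)))
# prints ['order entry system', 'settlement', 'trade execution']
```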

Once you identify and document the process, organization, and computer systems supporting a function, assessing the risk becomes much easier. For example, if you know all the components that a fraudulent document might have to flow through, you can assess the risk at each of them. Once you add cross-functional traceability, you can also identify the best possible response.

But risk analysis isn't always straightforward. For example, an FMEA doesn't always account for "tail events," which are events of extreme risk but low probability or events with low impact but high probability. Tail events can be hidden because of:

  • Low impact and high probability of detection: These events are essentially overheads or process inefficiencies due to the amount of rework they cause.
  • Overlooking of some extremely low-probability failures: Events such as 9/11 could be easily overlooked because the probability is so low.

This necessitates that all failure modes are enumerated. EA documentation helps you avoid overlooking any such events through proper documentation of the failure processes that can be combined with other processes.

Manage Risks
Once you assess the risks, you can manage them in many ways. The ways and means of managing risks depend on various factors, including the nature of the risk, the organizational culture, the options available at a given time, and even individual personalities. The most common options for managing risks are:

  • Avoidance (reducing probability): You can achieve this by creating processes or systems that are inherently more reliable.
  • Monitoring (increased visibility/detection): You usually achieve this through improved monitoring. The monitoring might be manual (through human effort) or automatic (through mechanical/electronic means).
  • Improved response: Accomplish this through organizational (creating a position of responsibility) or electronic (standby systems) means.
  • Transferring (reducing impact): You usually do this by creating insurance of some kind that involves assumption of responsibility by another party. For instance, buying financial insurance might mitigate a tail event with high financial impact.
  • Assumption (acceptance): Sometimes, the risk might be so low or alternatives so expensive that the best course might be to accept/assume the risk.

Introduction to Enterprise Architecture
The final white paper for government organization CIOs (see Resources) defines enterprise architecture as:

Enterprise Architecture (EA) links the business mission, strategy, and processes of an organization to its IT strategy. It is documented using multiple architectural models or views that show how the current and future needs of an organization will be met.

Once you understand the true nature of enterprise architecture (it creates a complete mapping of business mission, strategy, processes, organization, and IT strategy), you can easily understand how useful a documented architecture could be to managing various types of the organization's risks.

The difficult part, however, is to understand how to document this critical link and its associated parts to derive real value for the business.

Over the past few years, many people and groups have proposed high-level frameworks to document enterprise architecture. The Zachman Framework, proposed by John Zachman (see Resources), is one of the more popular ones. This framework allows creation of a detailed documentation of enterprise architecture. It proposes creation of multiple viewpoints for the enterprise (planner, owner, designer, builder, and so on). It also isolates different types of items that need to be documented. These include assets (data and otherwise), functions, locations, organizations, events, and goals and strategies.

The Zachman Framework includes documentation of various "primitives" that are then combined to provide implementation views or composites. For example, an organization might be composed of many groups such as marketing, sales, manufacturing, transportation, and design. They might follow different processes to provide products and services to the customers.

Each of these (organizations, products, and processes) is considered a primitive. However, the implementation of the process through organizations to produce products and services is actually a composite. Each of these components/primitives can be combined with primitives of other types in a flexible manner to produce different results.

Once you understand the primitives and composites, you can create traceability matrices to see which goals are impacted or supported by which processes. Similarly, you can also understand which systems support which processes and goals. This kind of traceability creates a level of knowledge that you can leverage to create a risk model for any of the components.
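A traceability matrix of this kind can be sketched by chaining two simple mappings, one from systems to processes and one from processes to goals. All names below are hypothetical examples rather than part of the Zachman Framework itself.

```python
# Hypothetical primitives: which processes each system supports,
# and which goals each process supports.
system_to_processes = {
    "order entry system": ["trade execution"],
    "general ledger": ["financial reporting", "settlement"],
}
process_to_goals = {
    "trade execution": ["grow trading revenue"],
    "settlement": ["grow trading revenue", "meet regulatory deadlines"],
    "financial reporting": ["meet regulatory deadlines"],
}


def goals_for_system(system, system_to_processes, process_to_goals):
    """Trace a system through the processes it supports to the goals at stake."""
    goals = set()
    for process in system_to_processes.get(system, []):
        goals.update(process_to_goals.get(process, []))
    return goals


# Build the system-to-goal traceability matrix.
matrix = {
    system: sorted(goals_for_system(system, system_to_processes, process_to_goals))
    for system in system_to_processes
}
for system, goals in matrix.items():
    print(f"{system}: {', '.join(goals)}")
```

Reading the matrix in this direction shows which goals are exposed when a given system fails, which is a natural starting point for a risk model of that component.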

Basel II Operational Risks and EA
Operational risk is fundamentally created by the way an enterprise manages and utilizes its strategies, organization, processes, and assets to achieve its objectives. You'd find it nearly impossible to manage any complex entity unless it is properly documented.

The same is true with any modern enterprise. Enterprise architecture methodology is aimed at documenting as well as maintaining traceability between various components of the enterprise (see Figure 1). You'll find that enterprise architecture documentation is indispensable in simplifying operational risk management.

EA documentation helps you manage risk in several ways:

  • It provides traceability from risk to specific components. Once you identify the high-risk components, you can easily incorporate the required level of reliability in the component specification. This is true for electronic components (such as computer systems), process components, and organizational components. As a corollary, the documentation avoids over-specification of reliability for components that don't need it.
  • The documentation also allows for improvement of supervisory processes. It helps you identify whether human or electronic monitoring will be the best fit in a particular case.
  • Good EA documentation also promotes reuse. It helps identify the organizational, process, and computer components that are either already reused and thereby must be more reliable, or that can be reused to make existing systems more reliable.

I've included some examples for types of risks identified in Basel II and potential assessment and mitigation models using enterprise architecture (see Table 1).

Operational risk is an important aspect of risk that must be assessed and managed on par with credit risk and market risk. The new Basel Accord has acknowledged and included operational risk as one of the determining factors for calculating the capital requirement for international financial institutions. The accord provides incentives in the form of the Advanced Measurement Approach for active risk management by financial institutions.

The assessment and management of operational risk is still an emerging field. Comprehensive risk assessment and management needs a systematic approach. The discipline of enterprise architecture provides a systematic approach to documenting the enterprise organization, process, strategies, and systems. You can readily leverage EA in both assessment and mitigation planning for operational risk.

About the Author
Vineet Rajput is a senior IT professional with more than 20 years of global experience in IT and business strategy planning. Throughout his career, he has focused on maximizing IT value and managing business and IT risks. He is a certified PMP and has helped many organizations with business and IT process improvement. His other interests include simulation and expert systems.