FTP Online
N-Tier Is the New Frontier of IT Operations Management
N-tier distributed applications require new management approaches.
by Glenn Helton and Page Alloo

Posted March 15, 2004

The complexity of n-tier distributed applications complicates management. Fortunately, new classes of application management software will provide a means of managing these new applications in a top-down, holistic way (see Figure 1).

N-tier distributed applications, whether Java 2 Platform, Enterprise Edition (J2EE), .NET, or full-blown Web services varieties, create unprecedented IT operations management challenges. How does an IT operations staff stay on top of an application that has multiple, often dynamic interdependencies? How does an operations staff prevent an inappropriate change on one server, router, or piece of code from degrading the availability of one or more complex applications? How can even experienced operations managers quickly troubleshoot a failing application when so many potential points of failure exist across so many software and hardware layers, or when the fault stems from a configuration change made four or five days earlier?

Fortunately, the answers are coming. Solutions from familiar vendors and innovative startups (see vendor profiles) are appearing across the landscape of application management software.

Figure 1.



Size Matters
The complexity of n-tier distributed application architecture, in and of itself, complicates management of those applications. Instead of monolithic code running exclusively on dedicated servers, we now have many software components distributed across many platforms. In J2EE environments in particular, you might have multiple Enterprise JavaBeans (EJB) components and an additional middleware tier in the form of application servers such as BEA WebLogic or IBM WebSphere.

For all the wonderful flexibility and economy such an approach offers, the advantages come with a price. Application servers deliver the tremendous benefit of distributing workloads and providing load-balancing options, but they also introduce complexity because their style of middleware creates numerous many-to-many connections. Though the basic design of a distributed application might be well understood, and its underlying hardware elements well monitored, at any given moment it can be daunting to obtain a comprehensive picture of the application's health. A single business transaction often kicks off a sequence of processes. Each process might be supported by events that transpire at the business-logic, hardware, or network level. A glitch in one can ripple through the others.
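To make that ripple effect concrete, here is a minimal sketch in Python that models a single business transaction as a chain of steps spanning the web, application, database, and messaging tiers, and shows how one degraded element surfaces as a degraded transaction. All of the component names are hypothetical, and the model is deliberately simple compared with real n-tier interdependencies.

# Minimal sketch: a business transaction modeled as a chain of dependent
# steps spanning web, application, database, and messaging tiers.
# Component names are hypothetical illustrations, not real systems.

TRANSACTION_STEPS = [
    ("web-server-01", "accept order request"),
    ("app-server-03", "execute order-entry component"),
    ("db-cluster-a", "write order record"),
    ("msg-broker-02", "publish fulfillment event"),
]

# Health as reported by element-level monitoring (True = healthy).
component_health = {
    "web-server-01": True,
    "app-server-03": True,
    "db-cluster-a": False,   # a single degraded element...
    "msg-broker-02": True,
}

def transaction_status(steps, health):
    """Return overall status and the first step hurt by an unhealthy element."""
    for component, action in steps:
        if not health.get(component, False):
            # ...ripples into every step of the transaction that depends on it.
            return "degraded", f"{action} ({component})"
    return "healthy", None

status, failing_step = transaction_status(TRANSACTION_STEPS, component_health)
print(status, failing_step)   # degraded write order record (db-cluster-a)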

Server and application deployments were formerly infrequent. Now it is common to roll out new servers and patches weekly, taxing scarce system administration resources, increasing costs, and putting system availability at risk from misconfigured systems.

The Trouble With Troubleshooting
N-tier distributed applications usually have a web of interdependencies—for data, for bandwidth, for backup, for authentication. When a problem crops up, where do you start looking for the cause? In most cases, IT operations troubleshooters have ready sources of information. Monitoring software pushes alerts into the visual fields of datacenter staff, and logs of operations can readily show lists of recent changes to configurations. Fortunately, the collective experience and intelligence of IT operations staff can bring most troubleshooting situations to rapid and successful conclusions.

But what about other occasions? What happens when the time required for diagnosis far exceeds the time for actual resolution? What happens when the monitoring software pops up dozens of alerts and it isn't obvious which are completely redundant? What happens when it becomes increasingly apparent that the most recent configuration change—the most frequent source of problems—is not the source of the application illness? Savvy IT operations people will often grab the nearby change management database to look for pointers to problems. However, the change management database usually contains the intents of changes and not necessarily the patches and configurations that actually exist on the datacenter floor. Tick, tock. The escalation process begins. While outright application outages are rare, nagging degradations are not. And worse, IT operations can be embarrassed by a major degradation at a crucial time, even though it's a once-in-a-blue-moon occurrence.

Putting Them Through Changes
Veteran IT operations directors will tell you that, paradoxically, it's the planned changes that can cause the most serious issues. Typically, unplanned changes are easily noticed and quickly correctable. Hardware breakdowns, server crashes, and router misconfigurations tend to stand out. A swap, a restart, or a redo can often restore an application to full health.

But planned changes, especially the kind that involve propagation of software across a set of devices, can be notoriously difficult. Not only do immediate rollbacks send expenses—and blood pressures—up, but they often necessitate delays until the next deployment window, sometimes a week or more in the future. Despite deployment teams' best efforts, it's difficult to model complex n-tier distributed applications in staging environments—at least not without considerable expense in facilities, equipment, and personnel. And, as applications get bigger, it becomes more likely that differences between test and production environments will appear. Even subtle differences can wreck a deployment. This is an area in deep need of software for simulating software releases, and thankfully, it's coming.
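As a simple illustration of the kind of check such release-simulation software would automate, the following Python sketch compares a handful of hypothetical staging and production settings and flags the differences that could wreck a deployment. The setting names and values are invented for illustration; a real tool would gather them from the live environments themselves.

# Minimal sketch: flag differences between staging and production
# configurations before a planned change rolls out. All settings and
# values here are hypothetical.

staging = {
    "app_server_version": "7.0.4",
    "jvm_heap_mb": 1024,
    "db_driver": "oracle-9.2.0.1",
    "load_balancer_nodes": 2,
}

production = {
    "app_server_version": "7.0.3",   # one patch level behind staging
    "jvm_heap_mb": 2048,
    "db_driver": "oracle-9.2.0.1",
    "load_balancer_nodes": 8,
}

def config_differences(env_a, env_b):
    """Return settings whose values differ between the two environments."""
    keys = set(env_a) | set(env_b)
    return {k: (env_a.get(k), env_b.get(k))
            for k in keys if env_a.get(k) != env_b.get(k)}

for setting, (stage_val, prod_val) in sorted(config_differences(staging, production).items()):
    print(f"{setting}: staging={stage_val} production={prod_val}")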

In a worst case, a deployed change or patch might appear successful at first, but higher load factors reveal a hitch later. This situation can give rise to some mutual finger-pointing, because, after all, IT operations might be unable to instantly prove what is causing the n-tier distributed application's sickness. With so many runtime cross-dependencies, the evidence might be hard to assemble.

The Top-Down Era
As the major IT market analysts have described, IT operations management will progress through a series of transitions over the next few years. The terminologies of the individual analyst firms vary. Summit Strategies calls the end goal "dynamic computing," while JNoel Associates uses the IBM-inspired "autonomic computing" term. Gartner Group offers a vision of self-regulating applications it calls "business service management," and adds a realistic view of three stages IT organizations must traverse to arrive there. Gartner points out that IT must first graduate from the long phase called "element management," characterized by managing devices and processes component by component with basic automation solutions and plenty of human intervention.

The second phase, the one IT is finally entering now, is called "operations management." The third phase Gartner describes is service management, which focuses on reporting and management of services meaningful to the business customer, including service-level agreement (SLA) reporting, and interfaces to provisioning and billing.

The whole point of the major transition into the next phase of IT operations is moving from an element-by-element management approach to a more application-centric, holistic approach. Success in moving in this direction depends completely on new types of software that automate various aspects of IT operations.

This shift centers on enabling a top-down approach to application management. The traditional bottom-up approach worked fine in the mainframe era and into the client/server era. In these environments, each application tended to have its own underlying server and resources. Each server had mostly simple connections to another tier, so management efforts could concentrate on keeping particular components healthy. It made sense for network monitoring consoles to adapt their tried-and-true approaches to systems management. This element-by-element approach begins to break down once servers multiply, application processing gets distributed, and applications develop multiple dependencies across the datacenter.

As we'll discuss, the monitoring vendors (the "frameworks" sellers) have made, and are making, major efforts to add modules that address the new challenges of managing n-tier distributed applications. Those vendors, as well as most from adjacent spaces, plus startups from new software categories, are striving to create a top-down management paradigm. That is, they all want to provide, in whole or in part, methods IT operations can use to manage complex applications in a holistic, end-to-end fashion. (See the accompanying vendor profiles for details on these vendors' strategies.)

The Good, The Bad, and The Ugly Complexity
The move to n-tier distributed applications is rather gradual, and the problem of complexity is fairly evident, so why hasn't it been addressed before now? There's a three-fold answer. First, the issue of complexity has been attacked through efforts to simplify IT environments. Consolidation is the best catch-all phrase to describe recent efforts by enterprises to reduce their number of core applications, datacenters, and management consoles. These efforts tend to work, at least within the metrics of the old element-by-element paradigm.

The second part of the answer focuses on improved processes, both in development and IT operations. Corporate applications teams are adopting standardized development methodologies. These methods tend to increase development efficiency and reuse, and make applications more manageable. IT operations organizations are also improving management processes. You'll hear the term "IT Infrastructure Library (ITIL)-compliant" commonly used to describe methods and products. This is a clear recognition that application management processes can be standardized.

But the rest of the explanation lies in the lack, until recently, of software offering a top-down management capability. It's a case of complexity begetting more complexity, which tends to mask understanding.

Although software has been capable of implementing pre-established rules for years, until recently IT management software vendors have not been able to perfect ways of automatically finding and tracking all the elements that compose a complex application. Moreover, software creators are only now attaining the means of defining entire applications to the level required to make sense of these piece-parts and their interdependencies.

Remapping Application Management Product Categories
"Application management" software products have long existed, but the concept of a product category has been fuzzy. Analysts seldom track, as a distinct group, the loose collection of application management software products. Most enterprise IT management products have been considered systems management products, as appropriate in the element-by-element era. Add to those a collection of "support" products—such as IT service management (including help desk), "release management," and "change management" packages that provide general-purpose utility for applications—as well as operating systems and other types of software, including client software. Later, a broad group of application performance management products became ubiquitous, but these started out with mostly hardware-oriented metrics.

Now that there's a clear need for application-centric management, as well as the ability to create more advanced management software, a comprehensive enterprise application management category is coming into focus. Underlying this are important trends toward increasing levels of abstraction and resulting automation (see Figure 2).

IT Moves Toward Fly-by-Wire
Abstraction and automation are really part of the same tsunami. With software's ability to abstract the nature of complex infrastructures and applications comes the reward of automated management. This is the concept of virtualization, a term embraced by some, abused by others, and even hated by a few. Love or despise the term, the concept has moved well along in the area of storage, and it will do so in applications, whether it goes by the name of utility computing, grid computing, on-demand computing, service-oriented architecture, Web services, or whatever.

The vital first step in any effort to move toward abstraction and virtualization is automating configuration management. In fact, it can be argued that the new breed of configuration management software is the foundation for many efforts to automate IT's application management. Why? Because the ability to automatically generate accurate, up-to-date data about configurations is crucial to the overall automation process. If IT operations is just trying to nurse the day-to-day health of n-tier distributed applications, which by now have been split into many pieces adjusted on the fly, then reliable information about configurations is essential. No one can keep that amount of information on a spreadsheet any more.

And in an escalating troubleshooting situation, looking into the change management database or at available logs for configuration data is becoming impractical. That familiar data might not reflect the reality of actual planned or unplanned configurations. In the near future, the repository of choice will be the configuration management database (CMDB)—assuming that a CMDB can be automatically generated and updated.

The emerging category of configuration management software promises to do exactly that type of dynamic database construction and more. Automated discovery functionality contained in configuration management software will comb through the datacenter, as well as the underlying infrastructure, for configuration information. What's more, some of these configuration management solutions also demonstrate the ability to track configurations of software elements through application-specific plug-ins and customizable templates.
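As a rough, greatly simplified picture of what such automated discovery does at its lowest level, the Python sketch below probes a few hypothetical hosts for well-known service ports and records whatever responds as timestamped, CMDB-style configuration records. Commercial discovery engines go much further, fingerprinting software versions, reading configurations, and mapping interdependencies, so treat this strictly as an illustration.

import socket
from datetime import datetime, timezone

# Hypothetical hosts and service ports to probe; a real discovery engine
# would sweep address ranges and identify far more than open ports.
HOSTS = ["app-server-03.example.com", "db-cluster-a.example.com"]
PORTS = {80: "http", 443: "https", 1521: "oracle-listener", 7001: "weblogic-admin"}

def discover(hosts, ports, timeout=1.0):
    """Probe each host/port pair and return timestamped configuration records."""
    records = []
    for host in hosts:
        for port, service in ports.items():
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    reachable = True
            except OSError:
                reachable = False
            if reachable:
                records.append({
                    "host": host,
                    "service": service,
                    "port": port,
                    "discovered_at": datetime.now(timezone.utc).isoformat(),
                })
    return records

# Each run refreshes the CMDB-style records with what actually responds now.
for record in discover(HOSTS, PORTS):
    print(record)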

At the top of the heap of configuration management solutions are packages that also create topologies of entire applications, complete with cross-dependencies to other applications and infrastructure. This type of mapping capability has been pioneered by Relicore's Clarity and Troux Technologies' Troux 4, which refer to their own functionality as "blueprinting." These products and others come with their own basic topology models, but invite IT customers to do plenty of customization to make their "blueprints" as accurate as possible.

At least a couple of newcomers are pushing the envelope with more robust built-in topology models, seemingly capable of automatically capturing the complexities and dependencies of n-tier applications software. Collation's Confignia product automatically discovers—without the use of any agents—all the components of complex applications and their underlying infrastructures. Confignia dynamically maps all application interdependencies in their runtime environments, creating an end-to-end view of an entire application. It allows drill-downs on topological views to look at configurations and changes, and offers a set of analytical tools for troubleshooting. Additionally, it provides a way to answer what-if questions about how a hypothetical change will affect an application, given its known dependencies.
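The what-if capability rests on the same kind of dependency data. The following Python sketch, which illustrates the general technique rather than any particular vendor's implementation, walks a small, entirely hypothetical dependency graph to list everything downstream of a proposed change.

from collections import deque

# Hypothetical dependency map: each component lists the components that
# depend on it directly. Names are invented for illustration.
DEPENDENTS = {
    "db-cluster-a": ["order-entry-service", "reporting-service"],
    "order-entry-service": ["storefront-app"],
    "reporting-service": ["executive-dashboard"],
    "storefront-app": [],
    "executive-dashboard": [],
}

def impacted_by(component, dependents):
    """Walk the graph breadth-first and return everything downstream of a change."""
    impacted, queue = set(), deque([component])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            if downstream not in impacted:
                impacted.add(downstream)
                queue.append(downstream)
    return impacted

# "What if we patch db-cluster-a?" -> every application that could feel it.
print(sorted(impacted_by("db-cluster-a", DEPENDENTS)))
# ['executive-dashboard', 'order-entry-service', 'reporting-service', 'storefront-app']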

Another young company, Appilog, promises capabilities similar to Collation's, although it uses a somewhat different technical approach. Appilog's PathFinder provides dynamic discovery, service definition modeling, and visual topology mapping designed to show the interdependencies between critical applications and the servers, systems, databases, and network devices they rely on.

As configuration management software becomes established, other applications will likely begin tapping the configuration management database for information. That "machine-readable" data could solve numerous issues associated with management of virtualized applications in service-oriented environments.

Established Vendor Responses
Systems management or "framework" products have existed for years. The best known is probably HP OpenView, which began in its critical role of monitoring underlying networks. Such a role is still vital, and, to their credit, the framework products have broadened their abilities to manage servers and other elements. However, systems management tools are still best at looking after systems (i.e., hardware components) and are less successful at providing holistic views of entire complex applications.

In recent years, systems management vendors have acquired technology to offer modules capable of working with more complicated definitions of "applications." To that end, Hewlett-Packard has recently signed definitive agreements to acquire two management software companies, Novadigm and Consera Software. Once integrated with HP OpenView, those companies' products will add new automation capabilities in the areas of configuration management and dependency modeling, respectively.

Not to be outdone, IBM is on a multiyear mission to modernize its Tivoli product suite. Through an arduous development and acquisition effort, IBM is creating more holistic management capabilities. One example is IBM's purchase of ThinkDynamics, whose products exist as a sub-brand while IBM busies itself with all the associated integration issues. IBM Tivoli Intelligent ThinkDynamic Orchestrator provides the basis of a capability IBM calls "orchestration." It aims to automate many of the steps involved in provisioning and configuring n-tier applications, eventually leading to a capability for more dynamic, on-demand environments.

BMC's Patrol product is expanding in more automated directions. For example, BMC's acquisitions of Remedy ITSM software and IT Masters' MasterCell software are part of a larger effort to combine service management and availability management into a single solution.

In the sprawling area of application performance and availability management, vendors are also taking up the automated management challenge.

Mercury Interactive, now a 15-year-old company, is integrating its product portfolio for application delivery, application management, and IT governance. J2EE-specific solutions in the Mercury Interactive suite include J2EE Transaction Breakdown, Topaz for J2EE, and J2EE Deep Diagnostics. Each product is designed to automate some aspect of troubleshooting n-tier distributed applications.

Concord Communications' eHealth Suite contains various features to manage applications. Concord recently acquired netViz, a data-driven visualization software developer, and licensed Tavve's technology for root-cause analysis and discovery of Layer 2 and Layer 3 network topology.

Finally, provisioning, release management, and change management solutions are becoming more robust and automated. It will be interesting to see the progress of EDS and Opsware's efforts to establish a standardized Data Center Markup Language (DCML). Such a standard could help a range of application management software packages work together harmoniously.

Summary of New Approaches
Software is automating IT operations' application management by replacing manual tasks with computer-driven operations. We should expect to see major improvements in tracking the use and configurations of entire n-tier distributed applications, as well as their interdependencies. Auto-discovery of application components across all layers of the infrastructure is needed as a first step, followed closely by the generation of accurate, up-to-date application topology maps. Both the discovery and topology mapping should be as dynamic as possible to provide views of actual runtime conditions along with fresh data for troubleshooting purposes.

Established IT management software category vendors will likely broaden both their scale and scope, adding increasingly automated functionality to their product lines. As provisioning, change management, performance management, and other categories mature, IT operations groups can look forward to a more automated future. The ultimate result will be better availability, more efficient changes, and improved management productivity for n-tier distributed applications.

About the Authors
Glenn Helton and Page Alloo are partners of Positioning Strategies, a management consulting firm that serves technology companies from its Silicon Valley base in Los Altos, Calif. Founded 10 years ago, Positioning Strategies consults with some of the world's largest electronics companies, as well as emerging startups, including many in the enterprise software space. The company has advised IBM, Hewlett-Packard, Sun Microsystems, Hitachi, Advanced Micro Devices, ATI, National Semiconductor, and others. It has recently worked with young enterprise software companies such as Mercado, Collation, and Interwise. Reach the authors at glenn@positioning.com and page@positioning.com.