Do Enterprise Frameworks Pay? Part III
Issue Date: April 1998
Issue Number: 8.11
Category: FRAMEWORKS
Does the framework approach to distributed
systems management (DSM) really pay off? In
this third installment of our series examining
enterprise frameworks, we check the experiences
of Tivoli TME 10 users.
Since its founding in 1989, Tivoli has sold
a vision of best-of-breed enterprise management
via a common framework of base utilities and
'open' (published) APIs. Until recently, this
was in contrast to its principal rival, Computer
Associates, whose Unicenter product stressed
integration of CA's own modules.
In part one of our series, we surveyed a half
dozen CA Unicenter customers and discovered
that earning a payoff from DSM investments requires
significant investment in skill development
and systems integration. That's consistent with
findings and predictions from Meta Group and
other analysts that up to 70-80% of all enterprise
DSM implementations are likely to fall behind
schedule or settle for delivering abbreviated
functionality.
But DSM is a habit that's becoming increasingly
difficult to resist: Meta estimates that up
to 60% of distributed client/server installations
will have some DSM project underway this year,
rising to 95% by 2001.
CA's customers told us that Unicenter was not
as integrated or 'plug and play' as advertised,
a situation that demanded extra time and effort
to resolve. We found two causes: that CA's product
integration was looser than advertised, but
also that many users did not commit adequate
resources into their Unicenter projects.
The question is: have Tivoli's customers been
any luckier or wiser? This report focuses on
their experiences.
PRODUCT OVERVIEW
Tivoli is widely acknowledged as one of only
two or three suppliers capable of supplying
a genuine, leading-edge, enterprise-wide systems
management platform. The Tivoli Management Environment
(TME 10) is not used by many customers - they
number a couple of hundred - but they are almost
all blue chip organizations that spend a lot
of money on getting it right. (As a group, they
are larger than typical CA Unicenter customers
are.) And in some cases, TME implementations
can take years to install properly.
TME uses a framework approach that is supposed
to be more 'open' than its rivals - chiefly
Computer Associates' Unicenter and, on the
horizon, a revived HP OpenView. It supports
integration through published APIs and integration
toolkits, and promotes its best-of-breed strategy
with Tivoli 10/Plus Association partnership
programs. To date, 10/Plus includes roughly
30 certified partner products, with another
70 said to be in the development stream.
Conversely, rivals CA and HP have had to graft
frameworks and published APIs after the fact.
(In fact, CA dismisses the importance of frameworks,
and is currently giving its version away for
free.)
As such, the framework approach provides Tivoli
the halo effect of being more 'open', but at
the cost of functionality. For instance, prior
to its 1997 acquisition of Unison, Tivoli's
job scheduling functionality was inferior to
CA's, and prior to its acquisition of Software
Artistry, Tivoli lacked automated help desk
tools.
Yet, it is those very acquisitions that place
Tivoli's strategy in question. If it acquires
more third-party capabilities, will that stratify
some offerings as 'more equal' than others?
TME is focused around four goals for managing
distributed systems: managing availability,
deploying software, automating routine problem
resolution, and providing a security blanket/access
control.
These goals aren't very different from other
vendors playing in the DSM space, but what distinguishes
Tivoli is its ambition to be the hub player.
It does so using a CORBA-compliant object-oriented
framework for integrating services and functions,
which is the key to its ability to scale across
multiple, distributed platforms in heterogeneous
environments.
IBM's acquisition of Tivoli in 1996 proved a
double-edged sword: although in the long run,
it provided the deep pockets to make Tivoli
a credible enterprise solutions player, the
short-term penalty was energy diverted to absorbing
IBM legacy products such as NetView/6000 (whose
organization Tivoli subsumed) and ADSM (distributed
storage management).
The fruits of that effort are starting to emerge;
within the past month, Tivoli has released its
first mainframe (OS/390) product, which is an
adaptation of IBM's established SystemView offering.
Tivoli's product umbrella begins with the TME
framework, the hub used for integrating Tivoli
and third-party products. It does not perform
any systems or network management functions,
but instead provides the necessary hooks to
allow different tools (Tivoli and third-party)
to utilize Tivoli services. The framework includes
toolkits for integrating LAN management, collecting
data from tools such as Microsoft Systems Management
Server (SMS), Intel LANDesk Management Suite
(LDMS), and IBM NetFinity.
Additionally, it provides tools to develop interfaces
to applications (for application management).
Among the basic modules are the Tivoli Enterprise
Console (TEC), which provides a central point
of management that monitors events and allows
managers to respond to alarms, manually or automatically.
It monitors events such as system resource availability,
available (memory) swap space, disk utilization,
network traffic conditions, printer queue status,
user logins, SNMP status, etc.
This data can be used to generate 'business
views' delivered via Tivoli's applications management
modules (specific modules have been developed
for SAP, Lotus Domino/Notes, Microsoft Exchange,
Microsoft Internet servers, Netscape SuiteSpot,
IBM MQSeries (middleware), and the CATIA computer-aided
design program). This is accomplished through
the Tivoli Global Enterprise Manager (GEM).
Actually, CA has been promoting business views
in TNG, its latest incarnation of Unicenter,
and HP is also starting to promote the same
capability.
GEM's business views translate system and network
data into a view that shows the impact on business
transaction processes.
According to Computerwire's Distributed Systems
Management Tools Bulletin, the addition of GEM
reflects a transition for IT from a utility
towards a service mentality, where its activities
are tied more closely to enterprise value-added
business processes. Its emergence comes in
parallel with the new generation of ERP applications.
GEM can be configured to provide views of different
variables - or business transaction processes
- by user type, and it can do so across numerous
platforms, including S/390 mainframes (which
may be a server to distributed environments)
via interface to IBM SystemView.
Tivoli Courier
(software distribution) uses transaction-based
technology to synchronize the installation of
server and/or client software components.
To avoid bottlenecking enterprise environments
with single point of failure conditions, software
distribution can be cascaded onto multiple regional
or local servers for parallel distribution,
and policies can be set for automatic failover
or rollback. It is operated in conjunction with
Tivoli Inventory (asset management).
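The cascaded distribution described above can be pictured with a small simulation. This is only a sketch of the general idea (parallel fan-out through regional servers, with a per-region rollback policy); the class and function names are our own inventions for illustration and do not represent Tivoli Courier's actual interfaces.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Client:
    """Hypothetical desktop or server that receives a software package."""
    name: str
    version: str = "1.0"
    healthy: bool = True  # False simulates an install that fails

    def install(self, version: str) -> None:
        if not self.healthy:
            raise RuntimeError(f"install failed on {self.name}")
        self.version = version


@dataclass
class RegionalServer:
    """Hypothetical regional relay that pushes a package to its own clients."""
    name: str
    clients: list = field(default_factory=list)

    def distribute(self, version: str) -> bool:
        """Install on every client; undo the whole region if any install fails."""
        previous = {c.name: c.version for c in self.clients}
        try:
            for client in self.clients:
                client.install(version)
        except RuntimeError:
            # Rollback policy: restore every client in the region.
            for client in self.clients:
                client.version = previous[client.name]
            return False
        return True


def cascade(regions, version):
    """Push to all regional servers in parallel; report success per region."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda region: region.distribute(version), regions)
        return dict(zip((region.name for region in regions), results))


east = RegionalServer("east", [Client("e1"), Client("e2")])
west = RegionalServer("west", [Client("w1"), Client("w2", healthy=False)])
outcome = cascade([east, west], "2.0")
print(outcome)
```

In this toy run the failure of one client in the west region rolls that whole region back, while the east region proceeds independently, so no single region's problem blocks the others.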
A PREMIUM SOLUTION?
Like CA, Tivoli also sponsored research from
International Data Corp. (IDC) to chart the
ROI from TME 10 rollout. The IDC study, published
in early 1997, was based on 10 TME 10 customers,
with the following characteristics:
* Size: Over 20,000 employees and nearly $5
billion revenues.
* Desktops: 6480 (average).
* NOS: Primarily NetWare with some NT and OS/2.
* Servers: Over 90% have UNIX servers (actually,
we would have thought the figure would have
been closer to 100% given Tivoli's UNIX heritage);
77% have mainframes, 20% have AS/400 servers.
* Typical IT systems management team: $12 million
in salaries (average) for 176 FTEs, each paid
an average $70,000 annual salary (burdened;
a figure which we find fairly cheap), supplemented
by 10-12 unpaid gurus (which Meta calls 'shadow
IT').
* Tools Costs: $2.5 million (purchase), 15%
annual maintenance.
* Tools use: All used Tivoli, but in some cases,
not exclusively (some had a patchwork of different
vendor solutions, with a goal to eventually
migrate totally under the Tivoli umbrella).
The study used payback and net present value
(NPV) methods to calculate ROI. The average
annual investment per 100 users was calculated
to be $69,626. The returns, based on improved
management efficiency, productivity, and system
availability (the inverse of downtime) totaled
$221,367, yielding an average payback of 115
days.
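The 115-day figure can be reproduced with a simple payback calculation. The sketch below assumes that both the investment and the returns are annual amounts per 100 users and that returns accrue evenly through the year; it does not attempt the study's NPV method, and the function name is ours.

```python
def payback_days(annual_investment: float, annual_return: float,
                 days_per_year: int = 365) -> float:
    """Days until evenly accruing returns cover the investment."""
    return annual_investment / annual_return * days_per_year


investment_per_100_users = 69_626   # from the IDC study
returns_per_100_users = 221_367     # from the IDC study

days = payback_days(investment_per_100_users, returns_per_100_users)
print(round(days))  # about 115 days
```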
Compare that to the IDC figures for CA: the
typical TME 10 investment per 100 users is roughly
$69,000 versus $59,000 for Unicenter, with the payback
time about two-thirds longer (115 days versus 69 for Unicenter).
And Tivoli's pricing has been a real hurdle
for some accounts. A global $8 billion diversified
food products company had been using Tivoli
Courier as a point solution for automated software
distribution for three years. (It also had HP
OpenView for network management.)
In early 1997, the corporate IT group investigated
adding server and desktop change management,
configuration management, server event management,
inventory, and remote control that would scale
up gradually from 1,100 desktops to nearly 30,000
ultimately.
Pricing was the catch: the goal was $50/desktop,
but Tivoli could only get to about the $150-$160
level, and they weren't willing to guarantee
the price for more than 12 months.
That would have especially impacted the smaller,
less profitable business units (the organization
has decentralized P&L responsibilities),
which would likely have been among the last
to implement, and the ones least able to pay
inflation-adjusted prices. The customer is currently
talking with HP.
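To see why the per-desktop gap was a deal breaker, it helps to scale it to the full rollout. The sketch below assumes a round 30,000 desktops (the customer's target was "nearly" that many) and flat per-desktop pricing with no volume tiers; both are our simplifications.

```python
DESKTOPS_AT_FULL_SCALE = 30_000  # assumption: rounded from "nearly 30,000"


def license_cost(price_per_desktop: int,
                 desktops: int = DESKTOPS_AT_FULL_SCALE) -> int:
    """Total license cost at a flat per-desktop price."""
    return price_per_desktop * desktops


target_total = license_cost(50)                          # customer's goal
quoted_low, quoted_high = license_cost(150), license_cost(160)
gap_low, gap_high = quoted_low - target_total, quoted_high - target_total

print(f"target: ${target_total:,}")
print(f"quoted: ${quoted_low:,} to ${quoted_high:,}")
print(f"gap:    ${gap_low:,} to ${gap_high:,}")
```

On these assumptions the quote exceeds the target by $3.0 million to $3.3 million at full scale, before any price increase after the 12-month guarantee lapses.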
What are TME users getting compared to CA? As
we mentioned above, Tivoli and CA product capabilities
are gradually converging. But there are still
major differences, with Tivoli emphasizing best-of-breed,
making available a broader array of solutions
than might be available with a single vendor
solution.
Meta Group has endorsed this approach, stating
that no single vendor can satisfy all management
needs for legacy and distributed environments,
from desktop to server, application, database,
and network.
But as the IDC data demonstrates, Tivoli solutions
are more expensive than Unicenter. For smaller
organizations, short implementation times, low costs
and ease of deployment rank much higher on the
agenda than high-level integration and cost of
ownership.
No matter, Tivoli plans to go down market. Java-based
front ends are likely to become standard across
all TME 10 products. More than just another
pretty interface, Java's run-time deployment
mode translates to fewer systems integration
issues and less network bandwidth consumption,
keeping the infrastructure as light as possible
and making for easier, less costly implementation.
In the shorter term, by midyear, Tivoli plans
to introduce 'Bossman', a channel-ready TME
suite comprising a framework of limited scalability
along with five or six applications. They might
even be tailored for vertical markets and would
scale to support a single TMR (Tivoli Management
Region) - a workgroup version, effectively.
But even in this guise TME 10 is no one-stop
shop. As Joseph Ambrose, principal at the systems
house of Computer Sciences Corporation, has
it: 'I've put together practice guides [for
distributed systems management implementation]
that have around 30 essential major processes,
and around 100 sub-processes. Tivoli addresses
30% of those functions, at the most.'
Even if Tivoli only addresses 30%, that's far
higher than point solutions. While point solutions
save time and money, the use of integrated,
framework-based solutions, properly implemented,
can provide dramatic returns on investments.
One white paper shows that over five years,
the combined advantages of improvements in efficiency,
productivity and availability attained by sites
implementing a 'high risk, high reward' systems
management framework like Tivoli TME 10 should
yield around a 15-fold saving, against annual
investments in the system.
SERVICE CHARGES ADD TO BILL
Just as organizations that got involved with
an SAP R/3 implementation have found, TME 10
projects call for plenty of professional services.
In most cases, customers don't know what policies
they want to enforce (security, user administration,
etc.) and have to spend time and money on people
who can help define these rules before they
can begin implementation.
Some of the users whom we interviewed did not
use outside services; for the most part, they
implemented only portions of the Tivoli functionality.
Scoping for professional services among TME
10 implementations has not been done very well
to date, though Tivoli will concede privately
that software licensing costs are only just
the start of a long line of implementation expenses
associated with TME 10.
In fact, for every $1 million budgeted for systems
management software, there will be at least
another $600,000 worth of consulting fees to
account for business reengineering and process
modeling work, a further $350,000 for architecture
design and project management, and $350,000
more which will be spent in deployment and rollout.
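Added up, those service lines more than double the budget. The sketch below scales the three figures to any software budget; treating them as fixed proportions is our assumption, and since the figures are quoted as "at least", the result is best read as a floor.

```python
# Service dollars per $1 million of software, as quoted above.
SERVICE_DOLLARS_PER_MILLION = {
    "business reengineering and process modeling": 600_000,
    "architecture design and project management": 350_000,
    "deployment and rollout": 350_000,
}


def project_budget(software_budget: int) -> dict:
    """Minimum project budget implied by a given software budget."""
    services = {
        name: software_budget * dollars // 1_000_000
        for name, dollars in SERVICE_DOLLARS_PER_MILLION.items()
    }
    services_total = sum(services.values())
    return {
        "software": software_budget,
        "services": services,
        "services_total": services_total,
        "total": software_budget + services_total,
    }


budget = project_budget(1_000_000)
print(f"services: ${budget['services_total']:,}")  # $1,300,000
print(f"total:    ${budget['total']:,}")           # $2,300,000
```

In other words, services come to at least 1.3 times the software spend, for a total of at least 2.3 times the license budget.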
And, in a script borrowed from the SAP world,
good Tivoli skills are hard to find. Tivoli
itself acknowledges a shortage of at least 500
architecture, project management and deployment
specialists, and expects to need 1000 more professionals
by late 1999.
Not surprisingly, with costs running between $1,100
and $1,400/day for a skilled TME 10 contractor
and $1,200 to $1,800/day for implementation consultants,
professional services are an expensive but necessary
evil in these projects.
'TME 10 is a consultant's dream,' said a VP
from a national retail banking giant, who warned,
'Careful step-by-step planning is required to
keep implementation on schedule and within budget.'
A typical TME 10 user has tens of thousands
of desktop clients, runs 15 databases, and 12-15
mission-critical applications across eight operating
systems. Invariably some of these systems will
call for upgrade before integration with the
framework.
So what help do TME 10 users receive from Tivoli?
With such a big-ticket item as TME 10, much
of the pre-sales consulting will cover the issue
of cost justification, and Tivoli tells us it
has various software templates that allow prospective
customers to model ROI.
Most of these models will help compare systems
management tasks before and after a proposed
implementation, with the chief benefits being
cost avoidance in the following areas:
* Unit cost of software distribution.
* The cost to add/move/change an end-user.
* The cost of downtime and unavailability.
Frequently, the numbers
that come out of these analyses are unrealistically
large, assuming ideal conditions. Indeed, often
expectations will need to be scaled back before
the numbers start to look credible.
At this stage Tivoli also introduces the notion
of its Tivoli Implementation Method or TIM,
a scheme that it has developed to make TME 10
rollout faster, easier and cheaper. But in many
ways the ultimate payback will depend on the
objectives behind an implementation.
USER EXPERIENCES
Halifax Bank. This $168 billion
UK bank, with over 900 branches and 600 property
service outlets (whose business has grown drastically
as a result of a three-year-old merger), is
now well underway in the task of folding together
its acquired systems and IT assets, including
IBM and Unisys data centers. It is also in the
midst of completely rewriting its business applications
to run in client/server mode, and has invested
millions in a TME 10 rollout, described cryptically
as having a 'list price value' of $53 million.
Halifax's network is large, covering 30,000
Windows NT desktops, 2000 NT Servers, roughly
50 HP/UX servers, along with IBM and Unisys
mainframes. Bank branches are to operate over
a dozen three-tier applications such as PeopleSoft's
human resources applications suite, the HUON
insurance program, the Corebank banking application,
and a workflow and imaging application based
on View Style.
At Halifax the principal aims behind its TME
install were threefold:
* Reducing demands on
the operations group, by reducing the proliferation
of different tools and skillsets required for
systems management. Thanks to TME 10's GEM,
data sharing is bi-directional between the MVS
data centers and the 2,000+ Windows NT servers
and 50 HP/UX servers, providing true end-to-end
management from data center to desktop.
* Introducing a single automated software distribution
mechanism. TME 10's software distribution module
lets Halifax staff distribute software, monitor
key applications parameters and events, and
perform other tasks consistently across Windows
NT and UNIX systems, automating the rollout
of the new client/server applications as well
as daily distributions of (program) code and
data files for several key applications.
* Establishing a common look and feel across
the whole systems management control infrastructure.
The other option - managing multiple systems
with multiple control systems - would have increased
support staffing requirements. With TME 10,
Halifax requires only 40 support staff at a
central location to manage help desk and systems
administration for its 900-branch network via
Tivoli consoles and inventory capabilities.
Furthermore, working with a single set of tools
that is well integrated into the framework means
a myriad of systems management tasks from event
management to software distribution can be linked.
As part of a post-implementation review, the
company is about to start looking at the return
on investment from its TME 10 rollout. Early
indications are that returns are better than
expected, though Allen Bentley, the Halifax
project manager for enterprise management, admits
the calculations are far from clear cut and
that the whole process of cost justification
'ain't easy.'
Admittedly, added Bentley, rollout has not been
without bumps. 'It's not just the product, it's
all about understanding systems management in
the client/server environment,' he said.
Halifax will begin by looking at obvious factors,
such as reductions in downtime and staffing.
Among the easiest portions to calculate are:
* Automating software distribution (see Barnett
Banks case study, below).
* Automatically re-establishing a lost database
connection.
* Advanced problem diagnostics in improving
systems availability.
Many obvious benefits are harder to quantify. One
example is using distributed monitoring to watch
applications on remote systems for certain thresholds,
such as available disk space on local servers.
The early warning of a disk failure theoretically
saves downtime and repair effort, but putting
a price on cost avoidance is difficult, not
to mention extrapolating the amount of business
that wasn't lost.
Bentley accepts that analyst reports based on
case history reports and generic cost calculations
go only so far. They offer helpful guidelines
but every organization needs to ask the same
question: Do these figures apply to us? His
group will soon find out.
Delmarva Power & Light. A
modest-sized Tivoli shop, the Chesapeake Bay
area utility manages a network of 85 Sun UNIX
systems which range in size from technical workstations
to Enterprise 6000, multi-processor SMP servers
in the utility's engineering area (there are
no PC LANs in this group).
The system includes an Oracle database and various
engineering CAD/CAM tools, and some SAP R/3 application
and database servers, spread over 20 sites in
three states. The organization has Tivoli's
enterprise console, distributed server monitoring,
software distribution (to servers only; no clients),
and a token level of user administration for
UNIX shell accounts (this is a minority of users;
most interact via application-based security).
Delmarva is currently evaluating adding the
Unison job scheduling utility (recently acquired
by Tivoli).
Overall experience has been positive, according
to Blaine Boyles, project systems programmer,
although he recalled hotline support slowed
down for a period when IBM first folded Tivoli
support into IBM's call centers (he notes that
the problem has been addressed within the past
six months).
Boyles had few numbers to share, and wasn't
at liberty to discuss licensing or maintenance
costs.
The modest size of the installation has undoubtedly
made things easier. 'If you take a steady, incremental
approach, things go more easily,' Boyles said.
The benefits? Previously, they monitored server
activity via rudimentary homegrown tools, and
spent more time monitoring because the old systems
did not carry alarms or distributed sensing.
Boyles estimates time savings at 25% for monitoring.
Implementation of TME was bootstrapped in-house,
including training of three people for 2 weeks,
and requiring 2-3 months for implementation.
Our estimate of implementation staff costs:
$80,000.
At the time, the overall system numbered only
20 servers (expansion came with R/3). By the
time that R/3 was implemented, the Tivoli modules
were already up and running. 'By the time we
implemented R/3, it was just a natural extension
[of Tivoli] for us,' said Boyles.
The biggest challenge was configuration: how
to organize the console, setting policies and
alarm thresholds. Boyles admits that he's constantly
tweaking the settings.
Over this year there are plans to extend Tivoli
to the new NT servers, and drill down to LAN
management. 'That will be a major undertaking,'
Boyles concedes. They will once again do it
incrementally, beginning with a small pilot.
PageNet. As the world's largest
paging company, with 10 million North American
subscribers, the company operates 90 servers
in 80 cities. While IT operations are centralized,
UNIX servers running an internally developed
customer information system are highly decentralized.
'The application is distributed because that's
the way the company grew,' noted Clark Carradine,
lead systems engineer.
The distributed servers are managed remotely
from Plano headquarters; there are no local
UNIX administrators. Currently, projects are
underway to replace these applications with
client/server based packages.
PageNet uses Tivoli distributed server monitoring,
server-level software distribution, and limited
user administration for a handful of restricted
environments.
Events are reported via pages and emails; this
year, Tivoli Enterprise Console will be added,
the entire system will be extended to 100 NT
servers to manage email and print services,
and a link will be added to integrate the Remedy
help desk application. (Desktop administration
is currently not a major issue, since PCs only
run Microsoft Office.) PageNet also uses HP
OpenView for managing SNMP network devices;
at this point there are plans for a limited
integration with the Tivoli TEC console.
Tivoli was chosen over Unicenter three years
ago because of its framework approach and ease
of integration with existing point solutions
such as Legato Networker (backup system). PageNet
did not expand OpenView because, at the time,
it lacked the capability to monitor server resources
(Data General AViiONs). PageNet is using Tivoli
to monitor the servers and applications via
both canned Tivoli Sentries and internally developed
integration hooks to Tivoli, and to conduct event
management and limited problem resolution/restarts.
We were unable to pin down initial costs, but
assuming that the firm's $50,000 annual maintenance
represents 15% of up-front license cost, we
would estimate that initial cost was about $330,000.
Based on a conservative estimate that implementation
is about 2.5 times the cost of licensing, that
would put the overall up-front cost at $825,000.
PageNet also added a $48,000 dual-processor
Compaq Pentium Pro server running as the TMR
server.
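The back-of-envelope estimate above can be retraced as follows. The sketch takes our own reading of it, in which the 2.5 multiplier applied to license cost yields the overall up-front figure; if implementation were instead counted on top of the license, the total would be higher. Adding the TMR server to the total is our own step, not a number PageNet supplied.

```python
annual_maintenance = 50_000   # reported by PageNet
maintenance_rate = 0.15       # assumed share of up-front license cost

# Working backwards from maintenance to the license price.
license_estimate = annual_maintenance / maintenance_rate  # roughly $333,000
license_rounded = 330_000     # the rounded figure used in the text

implementation_multiplier = 2.5  # conservative rule of thumb from the text
overall_upfront = license_rounded * implementation_multiplier

tmr_server = 48_000           # dual-processor Compaq server
overall_with_server = overall_upfront + tmr_server

print(f"license estimate:    ${license_estimate:,.0f}")
print(f"overall up-front:    ${overall_upfront:,.0f}")
print(f"with the TMR server: ${overall_with_server:,.0f}")
```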
Among the benefits is the lower incidence of
trouble tickets recorded by the help desk (which
runs the Remedy help desk application). After
TME ramped up, trouble tickets for a business
critical application dropped from 125 to the
current 25/month level. With an average of one
staff hour to clear a ticket, at the burdened
rate of $40/hour (for level two UNIX support),
the savings are about $48,000 annually for that
application alone (clearing 125 tickets a month
had cost $60,000 a year).
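Worked from the stated inputs (a drop of 100 tickets a month, one staff hour per ticket, $40/hour burdened), the annual saving comes to $48,000, while the full former load of 125 tickets a month would have cost $60,000 a year. The function below is our own illustration of that arithmetic.

```python
def annual_ticket_cost(tickets_per_month: int,
                       hours_per_ticket: int = 1,
                       hourly_rate: int = 40) -> int:
    """Yearly level-two support cost of clearing trouble tickets."""
    return tickets_per_month * 12 * hours_per_ticket * hourly_rate


cost_before = annual_ticket_cost(125)   # before TME ramped up
cost_after = annual_ticket_cost(25)     # current level
annual_savings = cost_before - cost_after

print(f"before:  ${cost_before:,}")     # $60,000
print(f"after:   ${cost_after:,}")      # $12,000
print(f"savings: ${annual_savings:,}")  # $48,000
```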
The reductions also cascade down to level 1
support personnel at the help desk, who save
an average of 10-12 minutes/call; the time is
invested in other activities (there were no
staff reductions) such as managing chronic problems.
There are also savings to the end users (customer
service representatives), whose problems typically
took 2-3 hours of elapsed time to resolve
(during which their productivity was reduced).
There were no firm estimates for savings here
- although it could be noted that the effect
of these problems was that customer service
representatives were often unable to enroll
new accounts while the server was down. This
could have resulted in lost business - no numbers
were available here.
Carradine's major concern is that Tivoli has
been slow to port its new releases to the older
DG 88K servers; but otherwise, he reports few
if any implementation issues.
Barnett Banks. Florida's leading
bank, with $45 billion in deposits, which was
acquired by NationsBank in January of this year,
has taken a centralized approach to Tivoli,
and has restricted its efforts to profitable,
low-hanging fruit where returns are visible
and readily generated.
Before being acquired, Barnett was in the midst
of migrating from OS/2 to NT clients as the
new company standard; in all, the bank manages
over 20,000 desktops among its 620 southeast
US retail bank branches, 100 loan offices (from
a subsidiary which it previously acquired),
and other facilities.
Both Barnett and NationsBank had been Tivoli
TME 10 sites; however, both implemented with
different purposes. Barnett embraced centralized
desktop management whereas NationsBank had focused
more on enterprise systems management. 'We focused
on where we felt we could get the most bang
for the buck,' said Rod Stockwell, who heads
system management and engineering for Barnett.
Specifically, Barnett focused on automated software
distribution. 'We told our [internal] clients,
we can maintain and deliver applications from
the central site to their desktops, conducting
all updates and changes over the network. That
was something tangible that we could deliver,'
said Doug Register, director of network design
and implementation.
In part, Barnett's success was aided by the
fact that it already had previous experience
with an earlier-generation branch automation
project.
Around 1990-91, it used DCMS (Distribution Change
Management Facility), a legacy IBM banking
software distribution product which ran in a
CICS transaction environment via SNA networks
to a population of primarily single-application
PCs. As branch PC connections were upgraded
to Novell LANservers, Barnett developed some
custom extensions to DCMS. At that time, the
bank commissioned a benchmarking study which
estimated $13 million annual savings for software
change management, based on an average rate
of two software changes per PC per month.
It proved useful experience. Barnett had to
migrate off DCMS when 32-bit Windows 95 and
NT emerged. An eight-person architectural team,
drawn from a mix of UNIX and NT administrators,
application developers, and technical support
staff, met on a part-time basis for six months
to develop a change management process and
to choose a new solution.
It was as much culture as technology change.
'The word discipline became the key,' said Register.
'Our team took ownership of the desktop and
set down rules and procedures specifying how
developers could change the applications, but
they could not change the registries.'
By locking down the registry, the group hopes
to limit if not eliminate ad hoc end-user changes
to desktops. They sold the notion to developers
as a means of facilitating deployment and reducing
configuration-related headaches. For power users,
they are selling lockdown as a service, where
system administrators help power users install
what they require on their machines.
Stockwell conceded that, at the time they made
this decision, three years ago, TME/Tivoli Courier
was 'a diamond in the rough,' containing about
80% of the desired functionality - fan-out updates,
change control, automatic rollback, back out
of successful changes (when desired). But, to
this day, Tivoli remains a work in progress.
For instance, Courier is a 'push' technology
that uses a schedule to activate software distribution;
for laptops, this didn't work. They developed
a workaround.
The first Courier project was with EquiCredit,
a 100-office, 800-desktop subsidiary that Barnett
had previously acquired. Register and Stockwell
led a five-person team to define server and
client software configurations, and then standardize
them across the business unit.
It took the team four months, at a labor cost
that we would estimate at $250,000 (burdened).
Courier has achieved a 98.6% success rate for
software delivery (ratio of successful updates
versus aborted downloads).
Currently, they are adding 'wake up' capabilities
that allow remote on/off switching of PCs, which
is useful for maintenance tasks performed at
3am, and they are testing Intel's Wired for
Management technologies that would automatically
build profiles of hardware and software configurations
of each PC, identified by serial number. EquiCredit
went live with Courier in March 1997.
The only benefit that the team has documented
so far is successful software installation rates;
they have not tallied the number of updates
made, nor extrapolated the labor savings attributable
to that. But one important step taken to ensure
productivity was using the same Intersolv PVCS
tool that the bank's development staff was already
using for configuration management.
The result is that developers do not need to
learn a new tool when an application or upgrade
passes from QA to rollout phase. The team integrated
PVCS with Tivoli Courier.
'There's no single product that does everything
you need it to do,' said Register, explaining
the use of PVCS, and the fact that they have
been willing to customize their own enhancements,
rather than wait for Tivoli to add them.
Barnett has started extending Courier to the
portion of its retail branches located in Publix
supermarkets. The team has also been given the
lead to manage Courier for its new parent organization.
TDS Inc. A $1 billion Madison,
Wisconsin-based diversified telecommunications
firm, offering cellular, PCS digital, and rural
telephone service to different areas of the
US, TDS recently began replacing a failed TME
installation with Unicenter.
The organization is highly decentralized, with
a 377-MIP headquarters mainframe running a customer
billing batch and on-line application, along
with 85 Sun and IBM UNIX servers residing in
a headquarters data center and several major
regional operations facilities (the organization
also has numerous additional UNIX servers at
local telephone offices which are not administered
by corporate IT, and therefore were not part
of the Tivoli environment).
The point of pain was maintaining availability
and user administration in the distributed environment.
In 1995, when TDS had roughly half the UNIX
server population, it had a half dozen full-time
UNIX administrators. It was using Yellow Pages,
a UNIX utility for communicating user IDs and
passwords to distributed systems, and it was
writing its own scripts to retrieve event data
(e.g., utilization of CPU, memory, database,
and disk resources).
In all, it dedicated nearly two full-time administrators
to writing UNIX shell scripts, managing user
IDs, and monitoring system performance. Based
on a generic estimate of $45/hour burdened (three
years ago), we would estimate that in labor
alone TDS was spending almost $180,000 annually
for its UNIX systems management effort.
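The arithmetic behind that estimate can be sketched as follows.
The $45/hour burdened rate and the two full-time administrators
come from the figures above; the 2,000-hour working year is our
assumption, chosen because it is a common round figure.

```python
HOURS_PER_YEAR = 2000  # assumption: roughly 50 weeks x 40 hours


def annual_labor_cost(full_time_staff, burdened_rate_per_hour,
                      hours_per_year=HOURS_PER_YEAR):
    """Annual labor cost for a given headcount and burdened hourly rate."""
    return full_time_staff * burdened_rate_per_hour * hours_per_year


# Two full-time administrators at $45/hour burdened.
print(annual_labor_cost(2, 45))  # 180000
```

Since TDS dedicated "nearly" two administrators, the result is a
ceiling, which is why the estimate reads "almost" $180,000.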
The problem was not labor costs necessarily,
but the ability to continue managing what was
an expanding environment (at that time, the
company began phasing in SAP R/3). First, there
was the non-value added time and effort for
maintaining homegrown scripts, not to mention
scalability limitations of the basic Yellow
Pages user ID utility. 'We had reached the point
where Yellow Pages was having difficulty dealing
with the sheer number of users,' said Mike Love,
technology strategies manager. He noted that
technical staff had to gingerly move user ID
and password files around so as not to exceed
capacity limits on any server, and to be especially
careful to avoid corrupting the basic Yellow
Pages files. It was increasingly becoming a
tightrope act. Furthermore, the company did
not want to increase administrative staffing
as its UNIX environment grew.
At that time, TDS chose Tivoli because it found
few other alternatives; for instance, point
solutions such as BMC Patrol were even less
mature than Tivoli at the time. TDS also strongly
believed in the Tivoli framework approach.
'We wanted to bring in tools that could manage
Oracle and be able to fit it into the Tivoli
environment,' said Love, who added that no single
product could be all things to all administrators.
They invested in the low six figures for a TME
license.
TDS installed Tivoli event management and user
administration on a testbed, which worked fine.
After a month, they ramped up production at
a regional center. Tivoli event management continued
to work satisfactorily, but the same wasn't
true for user administration.
Symptoms included passwords resetting without
warning and delays of up to 20-30 minutes for
user logins to be accepted. The user administration
system was pulled from production quickly, with
the problem eventually identified as architectural
incompatibilities (Tivoli did not support TDS's
topology). After months of investigation, Tivoli
was de-installed. Recently, TDS signed a contract
to install a few basic Unicenter modules, including
event management, user administration, and the
TNG front end. Ironically, they still believe
strongly in the Tivoli framework approach, with
Unicenter's recent addition of a framework for
third-party solutions a factor in their recent
decision.
For instance, they are looking at installing
BMC Patrol to manage the Oracle database, along
with several backup point products, which they
hope to integrate with Unicenter.
The bottom line was that TDS did not wish to
increase staffing, and that downtime from overtaxed
system resources was a serious issue.
They have not calculated the cost of downtime,
but simply have the instinctive knowledge that
the failure of a mission-critical application
such as end-of-month financial closing has impact
greater than the resulting lost labor costs
alone.
START SIMPLE
The best successes from enterprise management
frameworks occur when points of pain are clearly
identified, and where bounds are placed on both
the goals and the scope of the project. Projects
that aim to perform event management are likely
to produce diminishing returns if the potential
benefit is not quantified at the outset. Table
1 outlines some recommendations.
Barnett Banks identified a clear cost center,
in this case, software distribution, and limited
its TME project to solving the specific problem.
Furthermore, while enterprise frameworks promise
grand visions, they are only as effective as
the infrastructures on which they are implemented.
For instance, while TDS's use of Tivoli was effective
at monitoring system utilization, and was able
to stretch out hardware purchases, it didn't
eliminate the need for infrastructure improvements
(they eventually added an EMC Symmetrix storage
array to resolve disk utilization bottlenecks).
A cost that is often overlooked is
the staff resources necessary for implementation.
As we discovered with Unicenter users, the effort
to develop the necessary hooks between the enterprise
management system and the resource being monitored
or managed is often more complex than it initially
appears.
This is yet another hint that, while global enterprise
management visions are honorable, when it comes
to managing resources, all politics is local.
It pays to focus, one resource at a time, and
it also pays to examine how the proposed solution
will impact existing practices, and whether/how
they must change. And it pays to track resource
time and costs.
And if implemented properly, effective event
monitoring can help postpone capacity upgrades.
TDS was able to delay the expenditure of about
$200,000 for a Sun Enterprise 10000 server for
about 3-6 months (roughly 10% of a server's lifetime)
due to an internally developed capability.
The economic ramifications of delayed investments
include cost of capital, and, given ongoing
hardware price/performance trends, more bang
for the server buck.
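As a rough illustration of the cost-of-capital effect, the sketch
below applies simple interest to the $200,000 purchase over the
3-6 month deferral. The 10% annual cost of capital is purely our
assumption (TDS did not supply a rate), and the calculation ignores
any price decline on the hardware itself.

```python
def deferral_saving(price, annual_cost_of_capital, months_deferred):
    """Simple-interest carrying cost avoided by postponing a purchase."""
    return price * annual_cost_of_capital * months_deferred / 12


def implied_lifetime_months(months_deferred, share_of_lifetime):
    """Server lifetime implied if the deferral is a given share of it."""
    return months_deferred / share_of_lifetime


ASSUMED_RATE = 0.10  # hypothetical annual cost of capital

for months in (3, 6):
    saving = round(deferral_saving(200_000, ASSUMED_RATE, months))
    lifetime = round(implied_lifetime_months(months, 0.10))
    print(f"{months} months deferred: about ${saving:,} saved, "
          f"implied lifetime {lifetime} months")
```

At the assumed rate, the avoided carrying cost comes to roughly
$5,000 to $10,000, and the "10% of a lifetime" figure implies a
server life of about two and a half to five years.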
In our next report, we will continue our study,
examining point solutions and future framework
contenders.
_____________________________________________________________________
TME 10 critical success factors:
* A comprehensive enterprise-wide architecture
design review (done properly, a job that will
take weeks rather than days) to map business
processes, network topology and system resources.
* Development of a detailed statement of work
and project plan and a project management process
that tempers customer expectations and minimizes
scope creep after initial planning.
* Design of a pilot that can be completed in
60 days, offers a manageable first deployment,
and produces clear ROI opportunities (typically,
such a project would take in no more than 300
target clients and maybe four servers).
* A budget that does not underestimate the need
for proper skills.
© 1998 ComputerWire Inc