© 1997 GIFPA Ltd.
Competitive pressures in the private sector and cost pressures in the public sector are forcing more and more software development and maintenance to be contracted out, either via single contracts, or by outsourcing all or part of the services. Yet the means of controlling these contracts, particularly for the customer, and even to some extent for the supplier, are still relatively crude and not well understood.
This paper explores the critical issues of how software contracts can be controlled, and the difficulties involved from the customer and supplier viewpoints. Customer and supplier objectives are examined, and ways in which the supplier can add value. The types of performance measures needed are described, and the importance of performance trade-offs is discussed. The difficulties of conventional software contracting processes are described, as well as a new 'SCUD' process which appears to overcome some of the difficulties. Finally, several case studies are used to illustrate the main messages of this paper, and some concluding lessons are drawn for suppliers, customers, and the Software Engineering community at large.
(Note that the focus of this paper is on controlling software contracts from a performance measurement viewpoint, with an emphasis on protection of the customer's interests. There are many other contractual issues which need to be addressed when outsourcing the supply of software, such as ownership of intellectual property rights, protection of the rights of staff transferred, retention of key skills, contract termination, etc., etc. These no less important issues are beyond the scope of this paper.)
If organisations choose to buy software externally rather than develop in-house, the minimum they will expect from their supplier is a service at a better price-performance ratio than they could achieve with their own resources. Unfortunately, few software customers have a clear idea what they want in terms of 'performance'. They will certainly want lower costs, but they may also want faster delivery, or better 'quality' of the delivered software, which could mean greater flexibility, greater ease of use, or fewer defects than would be expected from an in-house solution. Increasingly there is an expectation of off-loading development risk to the supplier. And the customer will expect the supplier to demonstrate some track-record of understanding the customer's business.
These are mixed and demanding objectives, and they are not often well thought out. For example, they are rarely prioritised.
In contrast, the supplier's objectives are generally clearer. First, he needs to make a profit. Second, he will want to satisfy the customer, for a satisfied customer will be the best source of repeat business, and failure to satisfy could be extremely expensive. If he is bidding competitively, the supplier will also want to differentiate himself in some way from the competition. This might be achieved by being a low-cost supplier, by offering some unique deal or, increasingly, by trying to move up the customer's value chain.
Customer and supplier objectives therefore differ in various ways, but they also overlap in that both want to achieve customer satisfaction. This is commonly expressed in both parties wanting a 'partnership' that must be made to work. However, if one party does not behave as expected, or becomes dominant, the relationship is unlikely to be satisfactory to the other party. Software contracting is therefore successful if both parties 'win' in achieving all their objectives. But the supplier starts out with the distinct advantage that software contracting is his business. By comparison, most customers are relatively inexperienced. True partnership is therefore a difficult balance to achieve.
When software is supplied, value is added at three levels:
- Application software adds value by providing information-processing functionality. This may be specified by the user, or come pre-packaged from the supplier. The customer will also specify quality requirements - ISO standard 9126 provides a good list. The more demanding these are, the greater the value added in achieving them.
- When the application software is implemented in production on an infrastructure, value is further added in varying degrees, for example in the extent to which the system is accessible and the extent of data implemented and, of course, by the routine processing of the application.
- All the software development and processing ultimately adds value by enabling achievement of business objectives, such as helping to lower costs, or making better decisions which provide competitive advantage. The net value accrued at this level is the benefits obtained by the business, offset by the costs of ownership of the application.
For each of these levels, we need different types of measures to determine the 'value added'.
At the Application level, the 'size' of the functionality delivered is a primary measure of value, but other measures, for example of the various quality attributes of the software product, are also needed to get a full picture. Measurement of value added at the Implementation level is fairly obvious and straightforward. At the Business level, on the other hand, whilst measures of business performance are well-understood, disentangling the contribution of the software is often difficult.
It is important to recognise that software suppliers are increasingly trying to move up the customer's value chain, in order to grow their business and to get a greater grip on their customers. Hence some major outsourcing suppliers are proposing contracts in which they are paid on the basis of the improvement in business performance arising from their services, as opposed to the more classic method of being paid for the software delivered. This seems to work satisfactorily when the customer wants to outsource a whole business process where information technology plays a major role, and the supplier is paid on, say, a cost per transaction processed. But the more ambitious schemes of aiming to be paid on the general improvement in profitability arising from development and implementation of improved systems seem fraught with difficulty. A general down-turn in prices, for example due to increased competition, could seriously undermine profitability, no matter how much had been invested in improved systems.
From here on, we will concentrate on the classic task of controlling software contracts at the Application level, with some reference to the interaction with the Implementation level. Customers should be aware, however, of the ambitions of the outsourcing supply industry, and of the greater difficulties of controlling value for money at these higher levels of added value.
Some of the most important performance measures needed at the Application and Production levels are shown in this diagram.
The vital point here is that significant trade-offs in performance are possible, yet customer organisations rarely have the data available to make informed choices, and may even be unaware of the possibilities and risks.
Perhaps the commonest customer failing when negotiating a software contract is to concentrate on the immediate development cost, whilst ignoring the fact that the life-time cost of ownership of a system is dominated by the production and on-going maintenance and support costs. Alternatively, a business may be prepared to pay more for a development if the system can be delivered very quickly to provide competitive advantage. Effort (and hence cost) and time are tradable to a considerable extent. Unfortunately there are conflicting views in the industry on the best description of the trade-off relationship.
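The shape of this trade-off can be illustrated with one published model, COCOMO-81 in its 'organic' mode (Boehm); the equations and schedule-compression multipliers below are that model's, the 50 KLOC project size is invented, and - as noted above - other models disagree on the relationship:

```python
# COCOMO-81 'organic' mode relations (Boehm), used here only to
# illustrate that effort and time are tradable.  The 50 KLOC size
# is an invented example.
def nominal_effort_pm(kloc):
    """Nominal development effort in person-months."""
    return 2.4 * kloc ** 1.05

def nominal_schedule_months(effort_pm):
    """Nominal elapsed development time in months."""
    return 2.5 * effort_pm ** 0.38

# COCOMO-81 SCED effort multipliers: compressing the schedule to a
# fraction of nominal inflates the effort (and hence the cost).
SCED = {1.00: 1.00, 0.85: 1.08, 0.75: 1.23}

effort = nominal_effort_pm(50)                 # roughly 146 person-months
schedule = nominal_schedule_months(effort)     # roughly 17 months
compressed_effort = effort * SCED[0.75]        # 25% faster costs ~23% more
```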
The golden rule for the customer is to define all his performance objectives, prioritise them, and then seek to define the set of performance goals which provides the best balance for meeting those objectives. It is very unlikely that pulling hard on just one performance lever will be enough to achieve all your objectives.
To be able to do this properly requires the customer organisation to have built up an understanding of the factors which influence performance in software delivery, and how they interact. This takes time, and is knowledge which even major software suppliers do not always seem to possess in all parts of their organisation.
Any software measurement programme must be founded on measures of software size. The first question asked in estimating any development effort is 'how big is it?'. And software size is the key component of performance measures such as 'productivity' (size/effort), 'delivery rate' (size/elapsed time), etc. There are several choices for this measure.
- Source Lines of Code ('SLOC') are still very widely used, although they suffer from several well-known deficiencies. Most important of these is that the actual number of SLOC is not known until the software is developed. SLOC can therefore only be used in estimating if they can be predicted from another measure obtained much earlier in the software life-cycle.
- Function Point Analysis ('FPA') was first proposed by Allan Albrecht of IBM in the late 1970s to overcome the main weaknesses of SLOC. He developed a composite index of counts of the functions required (inputs, outputs, inquiries, logical files and interfaces), and of the degree of influence of some 14 quality and technical requirements. The definition of the resulting index in units of 'Function Points' continues to be refined by the International Function Point Users Group. IFPUG Function Points have become the most widely adopted measure of software size in the business information systems world.
- 'MkII' Function Points were developed by the present author in the late 1980s to overcome certain perceived weaknesses in the IFPUG index, including by basing the size measure on concepts which had meantime come into use in requirements specifications, namely logical transactions and entities. The MkII index is also very easily adapted to apply to object-oriented models. As well as offering a means of software sizing, the method also has an integrated estimating method.
- Both Function Point methods evolved out of the business information systems world, and are mainly used for so-called 'data-rich' software. But conventional FPA does not work well for software whose characteristics are dominated by, for example, complex functionality (e.g. as in scientific and engineering programs, rule-based systems, operating systems, etc.), or which has major real-time constraints, such as telephony software. In these software domains, SLOC are commonly used as the measure of size. For estimating, therefore, many local FP-like sizing methods have been developed to enable early prediction of SLOC.
- Finally, some development organisations have recognised the applicability of the concept of 'standard-hours' from the world of Industrial Engineering as a means of measuring the work associated with software development tasks or deliverables. These measurements allow monitoring of, for example, productivity 'against standard', and the measures can be used for estimating.
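As an illustrative sketch, the following shows how a size measure feeds the derived performance measures described above, using the published IFPUG 'average' complexity weights and the standard value adjustment formula; all the counts, effort and duration figures are invented:

```python
# Sketch of an IFPUG-style size measure and the performance measures
# derived from it.  The weights are the published 'average' complexity
# weights; the counts, effort and duration below are invented.
WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}

def unadjusted_fp(counts):
    """Unadjusted Function Points from counts of the five function types."""
    return sum(WEIGHTS[ftype] * n for ftype, n in counts.items())

def value_adjustment_factor(gsc_ratings):
    """VAF from the 14 general system characteristics, each rated 0-5."""
    assert len(gsc_ratings) == 14
    return 0.65 + 0.01 * sum(gsc_ratings)

counts = {"EI": 20, "EO": 15, "EQ": 10, "ILF": 8, "EIF": 4}
size_fp = unadjusted_fp(counts) * value_adjustment_factor([3] * 14)

productivity = size_fp / 30.0   # FP per person-month, for 30 pm of effort
delivery_rate = size_fp / 9.0   # FP per elapsed month, for a 9-month project
```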
How good are these measures? The answer is that judged against the needs of the software contracting industry, they are not as good as we would all like them to be. But they are all we have.
As an illustration, multi-billion dollar software outsourcing contracts are being controlled, and legal disputes are being resolved, on the basis of Function Point measures. Yet these measures are known to have certain deficiencies and do not work reliably in all software domains. Used with great care, they can give adequate means of estimating and control, at least in the business software world. But clearly there is a need for better measures.
There are two approaches to contracting software which are commonly followed.
Neither is particularly satisfactory, especially from the customer viewpoint.
In the first approach, the customer defines his requirements in detail, and then draws up an Invitation to Tender ('ITT'), which is sent to prospective suppliers. The suppliers then have to study the ITT, develop their estimates, and write detailed proposals. This process is time-consuming and expensive for both parties. Preparing a detailed statement of requirements and an ITT requires considerable skill, and if potential suppliers have not been involved, good ideas may have been missed. The suppliers are constrained by the ITT, and may have limited opportunity to differentiate themselves and add value.
An alternative process, therefore, is for the customer to issue only an outline statement of requirements. Suppliers may be asked to bid to complete the detailed requirements (perhaps for a fixed price), with an indicative price to complete the development subsequently. This process has the advantage of requiring less time and effort for both the customer and suppliers through the bidding phase, and it allows the chosen supplier to bring his experience to contribute to the definition of the system.
However, the customer has less control in this process. Once the supplier is installed and is helping to shape the system definition, the customer's bargaining position is considerably weakened, if subsequently the price to complete the system seems to have risen above that which was indicated at the time of the initial bid. Where this process is followed and the supplier is permanently installed, as in an outsourcing contract, the process is even less satisfactory for the customer. For this process to be satisfactory, the customer needs other performance measures than just fixed or indicative prices.
A process which overcomes the sort of weaknesses described above has emerged recently in Australia, and is finding its way into outsourcing contracts. It is known as the 'SCUD' or 'Software Charged by Unit Delivered' process.
Here, as in the second process described above, the customer issues an ITT with only an outline statement of requirements. The ITT requires, however, that the suppliers bid to complete the system through all its phases, at a fixed price per Function Point, e.g. in units of $/FP. The preferred supplier is selected on the basis of his quoted unit price in $/FP and, of course, the usual other factors.
As the detailed requirements are developed by the customer and the chosen supplier, the first 'Baseline Function Point Count' is established. This, combined with the agreed unit price, enables the customer to decide if the overall cost is going to be affordable. When the scope is decided, development can proceed. The final price paid is determined by a final FP count for the delivered system, and the quoted unit price in $/FP. An independent third party is employed by both the customer and supplier to determine the FP counts at each stage, and the impact of changes made.
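The arithmetic of a SCUD contract is then straightforward; in the sketch below the unit price, baseline count and change impacts are all invented figures:

```python
# SCUD ('Software Charged by Unit Delivered') contract arithmetic.
# Unit price, baseline count and change impacts are invented figures.
UNIT_PRICE = 750.0                 # agreed $/FP from the winning bid

baseline_fp = 1200                 # first Baseline Function Point Count
projected_cost = baseline_fp * UNIT_PRICE   # affordability check

change_impacts_fp = [+40, -15, +25]   # FP impact of agreed scope changes,
                                      # counted by the independent third party
final_fp = baseline_fp + sum(change_impacts_fp)
final_price = final_fp * UNIT_PRICE   # what the customer actually pays
```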
This process appears to offer distinct advantages to both the customer and potential suppliers. The time and effort required for the bidding round is much reduced, and the customer has the assurance from the start that he is getting a good market unit price (it can be compared against benchmarks). The customer has the possibility of controlling the overall price to be paid by keeping a tight hold on the scope of his requirements, and the process moves much of the risk from customer to supplier.
As with most such processes, it appears deceptively simple at first sight. In practice, both customer and supplier need to be aware of potential pitfalls, and to manage the process with care. The most obvious point, as has been discussed above, is that other performance measures must be agreed in addition to the basic $/FP, and the limitations of these measures must be understood. For example, if the system has complex processing rules, which are not properly reflected in the FP measure of size, then the $/FP and the project estimates must be compensated to allow for this.
Managing changes could also become problematic. If the customer is very indecisive, and introduces changes more often than the supplier allowed for, then there is potential for conflict. In an extreme situation, one can imagine a supplier quoting his $/FP on the assumption that a particular package will meet the requirements. On more detailed examination, it could turn out that the package will not meet the requirements, and another solution at an entirely different $/FP might be needed.
All these potential problems can be overcome if considered carefully in advance, but the process needs to be managed with care. The process is being used by the Government of the State of Victoria in Australia to procure software, and is being introduced into outsourcing contracts in the USA and in the UK.
A UK retailer was negotiating the outsourcing of all its application development and maintenance services. A major goal was to reduce costs. Targets were therefore set for the bidders that within a given period they should achieve 'upper quartile' performance in application development productivity, measured in FP/man-month, according to a particular benchmarking service. The retailer had no previous experience of performance measurement in software activities.
In order to establish a baseline, some measurements of current productivity were made. These showed, for the small sample of projects, that productivity was around one-third of industry-average. This result was an unpleasant revelation to the retailer, but looked at positively, it indicated the scale of performance improvement and cost-reduction which should be achievable from outsourcing.
As a check, we asked that the 'Delivery Rate', in FP/elapsed week, be measured for these same projects. This performance parameter showed that the speed of delivery for these same projects had been very much higher than industry-average. It turned out that the projects used for the measurement sample had been set the objective of delivering as fast as possible, no matter what the cost, as the resulting systems were essential for competitive advantage. The high speed of delivery was achieved by pouring resources into the projects. Hence productivity was low, and the quality of the delivered software also suffered from the pace of development. But the business goals and benefits were achieved as planned, and the projects were a great success from a business viewpoint.
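The contrast can be sketched as follows; all figures are invented, and serve only to show that the same project data can sit far below one benchmark yet far above another:

```python
# Sketch of computing the two contrasted measures for a project sample.
# All figures are invented for illustration.
projects = [
    # (FP delivered, person-months of effort, elapsed weeks)
    (400, 100, 10),
    (600, 160, 12),
]

INDUSTRY_PRODUCTIVITY = 12.0   # FP per person-month (assumed benchmark)
INDUSTRY_DELIVERY = 10.0       # FP per elapsed week (assumed benchmark)

productivity = [fp / pm for fp, pm, _ in projects]        # heavy staffing depresses this...
delivery_rate = [fp / weeks for fp, _, weeks in projects]  # ...but buys speed of delivery
```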
If the outsourcing contract performance goals had been defined only on the basis of development productivity, then the retailer could have been bound into a situation where the supplier could meet his goals quite easily, but the retailer would suffer from lengthier development times and less business flexibility. This case illustrates the importance of understanding performance trade-offs, and the need to establish a balanced set of measures in line with business goals.
The IT Director of a UK Utility Company expressed his frustration with the service he was getting from his supplier of outsourced application maintenance services in the memorable words:
"We used to have them on fixed price, and we could never find them. Now we have them on Time and Materials, and we can't get rid of them."
These words illustrate perfectly the difficulties of controlling on input measures of resources consumed, rather than on output measures of work delivered. The answer was to develop some simple estimating formulae, specific to the local applications and environment, by which customer and supplier could rapidly agree the cost of any standard task.
Types of standard maintenance tasks were defined at various levels of complexity. The latter was expressed in, for example, the number and size of files or screen fields that had to be changed. The effort required to complete each standard task was established in units of 'standard-hours'. So when any user needed a maintenance change, a cost could be given according to the agreed formulae. Performance improvement targets could be set for the supplier, in terms of target reductions in the standard-hours for specified tasks, and hence also reductions in costs.
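Such an agreed formula can be as simple as a look-up table; the task types, standard-hours and charge rate below are invented for illustration:

```python
# Sketch of an agreed standard-task price list.  The task types, the
# standard-hours and the charge rate are invented for illustration.
STANDARD_HOURS = {
    ("screen-change", "simple"): 8,
    ("screen-change", "complex"): 24,
    ("file-change", "simple"): 12,
    ("file-change", "complex"): 40,
}
RATE = 60.0   # agreed $ per standard-hour

def quote(task, complexity):
    """Cost of a standard maintenance task, agreed without negotiation."""
    return STANDARD_HOURS[(task, complexity)] * RATE

price = quote("screen-change", "complex")   # quoted instantly from the formulae
```

Performance improvement targets then translate directly into reductions in the table's standard-hours, and hence into prices.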
In this case a major multinational manufacturer outsourced its world-wide legacy application maintenance and support to a single supplier. The key performance parameter for payments was agreed to be $/FP supported. This implied measuring the size of a portfolio of some hundreds of thousands of FPs, a formidable task. The problem was solved by a combination of very careful sampling of the portfolio and the use of approximation techniques for FP sizing, which enabled the task to be completed within acceptable timescales, cost and accuracy.
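The sampling approach can be sketched as follows; the portfolio sizes here are randomly generated stand-ins (the real figures are not public), but the mechanics of sizing a sample approximately and scaling up are as described:

```python
import random

# Sketch of sizing a large portfolio by sampling.  The 'true' sizes are
# randomly generated stand-ins; in practice only the sampled applications
# would be sized (approximately), then scaled up to the whole portfolio.
random.seed(1)
portfolio = [random.randint(200, 3000) for _ in range(400)]   # FP per application

sample_ids = random.sample(range(len(portfolio)), 40)          # size a 10% sample
sample_mean = sum(portfolio[i] for i in sample_ids) / len(sample_ids)

estimated_total_fp = sample_mean * len(portfolio)
true_total_fp = sum(portfolio)
relative_error = abs(estimated_total_fp - true_total_fp) / true_total_fp
```

Careful stratification of the sample (by application size and type) would tighten the error bound further than this naive random sample.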
The contract is now understood to be working smoothly. Independent benchmarks are used periodically to establish external performance comparators and, with experience, other performance measures have been added, notably product quality measures.
This supplier of military command and control systems faces severe estimating and bidding challenges. ITTs are voluminous, but of uneven detail. Some requirements are specified at a low level of detail, whilst other simple statements may cover a great deal of complex functionality. Increasingly, the solutions to such requirements are built mainly from 'COTS' (Commercial Off-The-Shelf) software from various sources, and other re-usable code. The supplier's task is to provide the 'glue' software to bind the COTS software into a coherent system. This may be required at various levels, for example the man-machine interface, the application, the middleware, and the operating systems of clients and servers.
It is clear that a variety of environment and COTS-specific estimating and performance measurement methods are needed. A particular challenge when an ITT is received is to form a view very rapidly on the size of the contract on offer, so that the bid strategy can be determined. This requires simple estimating 'rules of thumb' based on parameters such as the number of major entities or entity-groups, and COTS components required. These can be developed by keeping and analysing records of previous bids and projects, both of components delivered and of the effort involved. FPA may be valuable for certain components, for example where bespoke software is required.
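Such a rule of thumb might look like the following sketch; the coefficients are invented, and in practice would come from regression on the supplier's own records of past bids and projects:

```python
# Sketch of a bid-triage rule of thumb.  The coefficients are invented;
# in practice they are fitted to records of past bids and projects.
ENTITY_PM = 1.8   # person-months per major entity or entity-group
COTS_PM = 4.0     # person-months per COTS component to be integrated

def rough_effort_pm(major_entities, cots_components):
    """Quick first-cut effort estimate for sizing an ITT response."""
    return ENTITY_PM * major_entities + COTS_PM * cots_components

triage_estimate = rough_effort_pm(major_entities=35, cots_components=6)
```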
A recent case in the UK of a dispute between a customer and software supplier which was resolved in court, provides a number of lessons and set an important legal precedent. The case concerned a Local Government organisation, which had commissioned software at a cost of £1.3M to handle a new local tax. The software proved unreliable, and as a result the customer's costs doubled.
The court determined that software is 'goods' (that is, not the result of services provided) and hence has to be 'fit for purpose'. This means that it is the supplier's responsibility to ensure that the software 'behaves as advertised'. The supplier lost the case, had to pay substantial damages, and suffered adverse publicity. It turned out that the contract had been managed badly from several points of view, including a poor definition of requirements and poor estimating.
In general, this area of performance measurement to help control software contracts, especially outsourcing contracts, is rather under-developed. A combination of customer inexperience and weaknesses of the performance control methods has probably contributed to the seeming acceptance that budget and cost over-runs, and poor quality products are commonplace in the software supply industry.
But the game is changing. Suppliers are probably aware that customers are becoming more sophisticated, and certainly are more inclined to go to law or arbitration to settle disputes. So suppliers must build their experience in software performance measurement and estimating. (This seems to be an under-developed subject for even some of the best-known names in the industry!) As the software world becomes more competitive and professional, good estimating will become a matter of survival.
The SEI's Capability Maturity Model requires mastery of software metrics only at Level 4, the 'Managed Process' level. But it may take years to develop experience in performance measurement and to build a base of measures sufficient to help improve performance and estimating. Software producers should certainly have introduced performance measurement by the time they have reached maturity Level 2, the 'Repeatable Process' level. By definition, if the process is repeatable, then past performance measures can be used reliably for future estimating, and can be applied to re-usable software components.
Although customers may feel they can get overall better Value for Money by contracting out or outsourcing their software development, there are limits to what responsibilities can be shifted to the supplier, even with the most constructive partnership. Above all, customers must retain the ability to manage contracts and suppliers. They must also at least understand the subject of performance measurement if they are to retain control. (The measurement work itself can be sub-contracted to independent experts.) With this understanding, and an understanding of supplier objectives and needs, they can take on methods such as 'SCUD' for software contracting with confidence.
Finally, there are lessons for the whole software engineering community. The methods of measuring software are only just adequate for controlling certain types of development and maintenance activities. By comparison with the advances made in software engineering generally, the subject of software metrics makes progress at snail's pace.
There is a real need for improved software metrics, compatible with modern software development methods, and with a sound theoretical basis. The metrics must be capable of continuous improvement via the 'estimate - measure - refine' feedback cycle. They must be made to work in a coherent manner across all software domains. Undoubtedly there is much to learn from ideas on work measurement in Industrial Engineering - provided you have repeatable processes! This is one of the biggest challenges we face, if we are to justify the title of 'Software Engineers'.