Monday, December 19, 2016

How can my organization know if our Agile transformation is successful?

Scope of Report
It is commonly accepted that most organizations today have moved, are moving, or are evaluating a move toward the use of the Agile methodology. This report considers: (a) why organizations move to Agile; (b) what it means to adopt the Agile methodology and undergo a transformation; (c) how to measure whether your transformation is successful; and (d) how to ensure that the effects of the transformation continue.

Why the move to Agile?
An IT organization has certain responsibilities that relate directly to their business client and the rest of the organization. From a business perspective, there are five (5) core goals for any IT team.
  1. Effectively manage workflow
  2. Proactively manage end user expectations
  3. Accurately plan, budget and forecast deliveries
  4. Accurately estimate deliverables
  5. Show value to the organization and the client
Agile, when properly adopted, has been shown to be an effective development method that addresses each of these five goals. As with any new business strategy, the move to Agile would be an attempt to optimize business efficiencies that affect the bottom line and the client-supplier relationship.

What is Agile transformation?
Tom Cagley has suggested that a transformation is a “complete or major change in someone's or something's appearance, form, etc.”; in other words, a changeover, metamorphosis, transfiguration, or conversion. Transformation “evokes a long-term change program that will result in a large-scale, strategic change impacting a whole organization (or at least a significant part)”. For Agile, it means fostering an environment of teamwork, trust, and open communication to facilitate continuous or frequent delivery of working software.

When an organization embraces such a change, it typically goes through several stages. The first is discovery -- a realization of organizational needs and of how you will attempt to fulfill them through a process solution. This stage is also characterized by knowledge gathering and process analysis. The second is proof-of-concept -- coordination through the organization to solicit sponsors and stakeholders and to assign participants to test the solution. This is executed through a pilot program, or a sampling of teams using Agile, to generate interest and enthusiasm. Using the lessons learned, and both positive and negative feedback, the organization then moves to definition, a more structured approach to implementing Agile. The last stage is institutionalization, in which the transformation is complete and Agile is used throughout the organizational IT community. At this point Agile is not just a practice, but a ‘core foundation’ based upon innovation and business value.

Do we only start to measure when institutionalization occurs, or do we measure through all the process steps to realize when we have arrived at transformation? The answer is that we implement metrics as the process evolves so that we can measure process outcomes, adjust the implementation as necessary, and continue to progress until the goal is reached.

What then do we measure to gauge transformation?
Scrum is a common approach to implementing Agile project management. Other Agile and Lean frameworks include Extreme Programming (XP), Crystal, and Scaled Agile Framework Enterprise, to name a few. The measures and metrics mentioned in this paper can be applied to most if not all of them.
There are several key metrics that are used to measure the Scrum environment. To review the terms and the process, the following is the framework which is being measured.
  • A product owner creates a prioritized requirement list called a product backlog.
  • During sprint planning, the team pulls a subset from the product backlog to accomplish in a single sprint.
  • The team decides how to implement the features that are represented in the subset.
  • The team has to complete the work within a sprint of 1-4 weeks (2 weeks being typical).
  • The team meets each day to assess its progress (daily Scrum or Stand-up).
  • During the sprint, the Scrum Master facilitates delivery of value.
  • By the end of the sprint, the features (work performed) meet the definition of done and are ready for delivery.
  • At the end of the sprint, the team engages in a sprint review and retrospective.
  • For the next sprint, the team chooses another subset of the product backlog and the cycle begins again.
The following are the recommended metrics based upon process measurement within that framework. All of them imply that there are organizational targets that once met would support the transformation.

1. Velocity and Productivity
According to the Scrum Alliance: “Velocity is how much product backlog effort a team can handle in one sprint. This can be estimated by using the historical data generated in previous sprints, assuming the team composition and sprint duration are kept constant. Once established, velocity can be used to plan projects and forecast releases.”

Velocity is a measure of throughput - an indication of how much, on average, a particular team can accomplish within a time box. Velocity can be gauged by the number of user stories delivered in a sprint, by the number of story points delivered in a sprint, or by the number of function points delivered in a sprint. Since user stories are not generally considered equal in complexity or time to develop, they have too much variability to be a reliable measure. Story points are subjective and are generally only consistent within a stable team. Again there may be too much variability to measure at an organization level, or across teams.

While story points provide the micro view within teams, we need some way to measure the macro view across multiple teams. Function points can be used at the inception of the project to size the backlog, to determine the deliverability of the minimum viable product and to capture actual size at completion. This allows a quantitative view of volatility. In addition, function points are a rules-based measure of size and can therefore be applied consistently, which makes them useful for standardizing velocity or productivity. Productivity is size/effort, expressed as function points delivered per FTE or team member. Using function points as a basis for size, an organization can compare performance within dynamic teams and to the industry through the use of agile benchmark data.
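The productivity definition above (size divided by effort) can be sketched as a small calculation. This is only an illustration: the `Release` record, the team names and the figures are hypothetical and do not come from any benchmark.

```python
from dataclasses import dataclass


@dataclass
class Release:
    team: str
    function_points: float  # size delivered, in function points
    fte: float              # effort, in full-time equivalents


def productivity(release: Release) -> float:
    """Function points delivered per FTE (size / effort)."""
    if release.fte <= 0:
        raise ValueError("effort must be positive")
    return release.function_points / release.fte


# Hypothetical figures for two teams, for illustration only.
releases = [
    Release("Team A", function_points=120, fte=8),
    Release("Team B", function_points=90, fte=5),
]
for r in releases:
    print(f"{r.team}: {productivity(r):.1f} FP per FTE")
```

Because both teams are sized with the same rules-based unit, the two results can be compared directly, which would not be safe with team-specific story points.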

2. Running Tested Features (RTF)
In general terms, the Running Tested Features (RTF) metric reflects “how many high-risk and high-business-value working features were delivered for deployment. RTF, counts the features delivered for deployment denominated per dollar of investment. The idea is to measure, at every moment in the project, how many features/stories pass all their (automated) acceptance tests and are known to be working”. The two components are time (daily) and the number of running, tested features ready for delivery to the business client. This metric is often used in environments where operations or production environments are “owned” by separate organizations (often true in DoD and Government environments).
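As a rough sketch of the counting idea, assume each feature carries a list of acceptance test results; the feature names and the rule that a feature with no tests does not count are assumptions made for illustration.

```python
def running_tested_features(features: dict[str, list[bool]]) -> int:
    """Count features whose acceptance tests exist and all pass."""
    return sum(1 for results in features.values() if results and all(results))


# Hypothetical snapshot for one day of a sprint (True = test passed).
snapshot = {
    "login": [True, True],
    "search": [True, False],  # one failing test, so not counted
    "export": [],             # no acceptance tests yet, so not counted
}
print(running_tested_features(snapshot))
```

Recording this count once per day gives the two components named above: time and the number of running, tested features.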

3. Burn down/Burn up charts
According to Wikipedia, “A burn down chart is a graphical representation of work left to do versus time. The outstanding work (or backlog) is often on the vertical axis, with time along the horizontal. That is, it is a run chart of outstanding work. It is useful for predicting when all of the work will be completed.”

A burn up chart tracks progress towards a project's completion. In the simplest form, there are two lines on the chart. The vertical axis is amount of work, and is measured in units customized to your own project. Some common units are number of tasks, estimated hours, user stories or story points. The horizontal axis is time, usually measured in days.

These charts can allow you to identify issues (e.g. scope creep) so adjustments can be made early in the cycle. They are also effective tools for communicating with clients and management. The advantage of a burn up chart over a burn down chart is the inclusion of the scope line. It also allows you to visualize a more realistic completion date for the project, by extending a trend line from the scope as well as the completion line. Where the two trend lines meet is the estimated time of completion.
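The trend-line projection described above can be sketched as follows. This is a minimal illustration that assumes a straight-line (least-squares) trend for both the scope line and the completion line; the function names and the sample data are hypothetical.

```python
def fit_line(days, values):
    """Least-squares slope and intercept of values plotted against days."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    if sxx == 0:
        raise ValueError("need at least two distinct days")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_completion_day(days, scope, completed):
    """Day on which the completion trend line meets the scope trend line."""
    scope_slope, scope_intercept = fit_line(days, scope)
    done_slope, done_intercept = fit_line(days, completed)
    if done_slope <= scope_slope:
        return None  # scope grows at least as fast as work is completed
    return (scope_intercept - done_intercept) / (done_slope - scope_slope)


# Hypothetical sprint data: scope creeps by 2 points a day, 10 points are completed a day.
days = [0, 1, 2, 3, 4]
scope = [100, 102, 104, 106, 108]
completed = [0, 10, 20, 30, 40]
print(estimated_completion_day(days, scope, completed))
```

With a flat scope line the same data would project completion on day 10; the scope creep pushes the projection out to day 12.5, which is the kind of early signal the chart is meant to give.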

4. Technical Debt
Technical debt is a measure of the corners cut when developing a piece of functionality. For example, to prove that the functionality can be implemented and is desirable, the code may be written without full error trapping. As technical debt increases, it can become harder to add new features because of constraints imposed by previous poor coding. The measurement of technical debt was introduced in parallel with Extreme Programming (XP), which introduced the concept of “refactoring”, or regularly revisiting inefficient or hard-to-maintain code to implement improvements. XP builds in refactoring, restructuring and improving the code as part of the development process. Technical debt is typically measured using code scanners, which use proprietary algorithms to generate a metric based on the number of best practice rules that a particular piece of code infringes.

5. Defect Removal Effectiveness (DRE) and Defect Escape Rate (DER)
Measuring quality has always been a key metric, regardless of the life cycle methodology. The two key metrics in this area measure the ability to remove defects prior to release.

The question usually arises over the time frame for a ‘release’. Quite simply, it depends on your delivery schedule – if you do a real release every 2 weeks, then that may be your measure of time. It is important to be consistent. As with any defect measurement, you will have to decide what priority defects are considered and are they all treated equally in the equation.
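A minimal sketch of the two calculations, assuming the commonly used definitions: DRE as the share of all known defects that were removed before release, and DER as the share that escaped into the release. These definitions, and the choice of which defect priorities to count, are assumptions that an organization may set differently.

```python
def defect_removal_effectiveness(found_before_release: int, found_after_release: int) -> float:
    """Share of all known defects that were removed before the release."""
    total = found_before_release + found_after_release
    if total == 0:
        raise ValueError("no defects recorded for this release")
    return found_before_release / total


def defect_escape_rate(found_before_release: int, found_after_release: int) -> float:
    """Share of all known defects that escaped into the release."""
    total = found_before_release + found_after_release
    if total == 0:
        raise ValueError("no defects recorded for this release")
    return found_after_release / total


# Hypothetical counts for one two-week release.
print(f"DRE = {defect_removal_effectiveness(90, 10):.0%}")
print(f"DER = {defect_escape_rate(90, 10):.0%}")
```

Whatever definitions are chosen, the same release window and the same defect priorities should be used every time so that the trend is comparable.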

6. Work Allocation
There are three team metrics which can be used to support the outcomes of other metrics (cause and effect). The organization makes a sizable investment in building a solid cross-functional team with the right expertise for the product development. To protect the investment there is a key focus on building core product teams with deep product and technology knowledge. Rotating team members reduces team scalability, as continuity is constantly broken between releases. The following metrics are mainly targeted to gauge the impact of team assignments, team changes between releases, and how the time is actually used – all of which can affect delivery and costs:

1) Team utilization is quantified by the Team Utilization Quotient (TUQ). TUQ = Average time spent by team on the project
Example: A 5-month project uses 10 resources.
- 4 resources joined in the beginning
- 2 resources joined after 2.5 months (50% project left)
- 4 resources joined in the last month of the project (25% project left)
TUQ = {(4*1)+(2*.5)+(4*.25)}/10 = .60 = 60%
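The same calculation can be sketched in code. The function name and the cohort representation (head count paired with the fraction of the project remaining when that group joined) are assumptions made for illustration.

```python
def team_utilization_quotient(cohorts):
    """cohorts: list of (head_count, fraction_of_project_remaining_at_joining)."""
    team_size = sum(count for count, _ in cohorts)
    if team_size == 0:
        raise ValueError("team has no members")
    weighted = sum(count * fraction for count, fraction in cohorts)
    return weighted / team_size


# The example above: 4 joined at the start, 2 at the halfway point, 4 with 25% left.
tuq = team_utilization_quotient([(4, 1.0), (2, 0.5), (4, 0.25)])
print(f"TUQ = {tuq:.0%}")  # TUQ = 60%
```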

2) Team scalability is quantified by the Team Scalability Quotient (TSQ): TSQ = % of the team retained from the previous release

In the TUQ example, we built a team of 10 people. The team had low utilization because of team assignments. Assuming the team is ready to take on the next version of the product, replacing half of the team members with newer members to work on the new product release reduces team scalability by 50%.
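A small sketch of the TSQ calculation, assuming team membership is tracked by name and that the quotient is taken over the size of the current team; both choices are assumptions, and the member names are hypothetical.

```python
def team_scalability_quotient(previous_team, current_team):
    """Fraction of the current team that was retained from the previous release."""
    current = set(current_team)
    if not current:
        raise ValueError("current team has no members")
    retained = current & set(previous_team)
    return len(retained) / len(current)


# Hypothetical 10-person team in which half the members are replaced for the next release.
previous = [f"member{i}" for i in range(10)]
current = previous[:5] + [f"newcomer{i}" for i in range(5)]
print(f"TSQ = {team_scalability_quotient(previous, current):.0%}")  # TSQ = 50%
```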

The third team metric is Work Allocation. This is a simple chart showing what percentage of available time was spent across all work categories for the sprint. Time activities should not only consider development activities but must include the time spent with clients, customers and stakeholders. In Agile, which fosters a cooperative environment, time needed for communication and feedback is as important as the time to code and test.
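The data behind such a chart can be sketched as a percentage breakdown of logged hours. The category names and hours below are hypothetical.

```python
def work_allocation(hours_by_category: dict[str, float]) -> dict[str, float]:
    """Percentage of available time spent in each work category."""
    total = sum(hours_by_category.values())
    if total <= 0:
        raise ValueError("no time recorded for the sprint")
    return {category: 100 * hours / total for category, hours in hours_by_category.items()}


# Hypothetical hours logged by one team during one sprint.
sprint_hours = {
    "development": 200,
    "testing": 100,
    "client and stakeholder communication": 60,
    "other meetings": 40,
}
for category, share in work_allocation(sprint_hours).items():
    print(f"{category}: {share:.0f}%")
```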

The use of these metrics should encourage resource managers, Scrum masters and Scrum coaches, to carefully consider how time and resource allocation impacts team efficiency and scalability. The transformation of the organization is from hero building to team building, and if you want to gain a fair ROI, you will invest in developing cross-functional teams. Obviously, disrupting teams will not generate the delivery responses you seek. Conversely, as team dynamics are fostered and improve, so will velocity.

7. Customer Satisfaction and Team Satisfaction
Last but certainly not least, one of the measures which is highly revealing of performance is customer satisfaction. Customer satisfaction answers the question of whether the client is happy with the delivery, quality and costs of the functionality being delivered. Satisfaction provides a view into how the team is perceived by the clients.

Team satisfaction measures how the team is affected by agile adoption. Agile transformation provides an environment that values technical innovation, collaboration, teamwork, and open and honest communication which yields higher team satisfaction. Team satisfaction is positively correlated to productivity. Team satisfaction can be an indicator of how well the organization has implemented Agile.

How do you know that the effects of the transformation will continue?
The most common answer is “you don’t know for sure”. As a matter of record, experience has shown us that without continued measurement and adequate coaching, teams fall into entropy and lose efficiencies. A measurement feedback model should be in place to monitor performance levels, to know when to get coaching and how to address process improvements as needed.
At any point in the transformation, an independent assessment may be in order to determine where you are in comparison to where you want to be. Feedback from an assessment is critical for developing a fact-based plan for improvement.

The journey to transformation involves a cultural organizational change which can be thoroughly measured using common Agile metrics. The efficiencies of the new Agile environment can be quantified, maintained and improved through the use of a continuous measurement framework and periodic independent assessments.

SPAMCAST, Tom Cagley. Nov 2015. So You Want A Transformation!
Agile Metrics: Running Tested Features, 9 June 2014.
Wikipedia: Burn down chart.
Metrics Minute: Burn-Up Chart, Tom Cagley.
Metrics Minute: Burn-Down Chart, Tom Cagley.
Clarios Technology: What is a burnup chart?
Technopedia: Technical Debt.
XBOSOFT: Defect Removal Effectiveness Agile Testing Webinar Q&A.
Agile Helpline: Agile Team's Efficiency in Various Resource Allocation Models.
DCG Software Value. Webinar: Agile Metrics What, When, How, David Herron. Nov. 2015.

This blog was originally posted at

Wednesday, November 16, 2016

Four Steps to Assessing Software Value in an M&A

If there is one time when business value is front and center in a conversation, it is during a merger or acquisition process.  The acquiring company wants to know the true value of the company it’s acquiring and the company being acquired wants to prove its value as a viable option for acquisition.  In the case of a merger, both companies have these same two concerns – what is their real value and what is the value of the company with which they are potentially merging?

In today’s organizations, technology, and more specifically software, is an aspect that needs to be carefully assessed to determine its value to the M&A deal as an asset or potential liability (e.g. requiring significant upgrades or maintenance, or performing poorly).

To begin the evaluation process, I recommend looking at the software in relation to the business functions of the target company.  Is the software unique to the company’s line of business, or is it used for a business function that is common between the two organizations (e.g. HR, payroll, CRM)?  Most likely, the software that is performing the same function in both companies will be of little business value to the acquiring company as they will choose to keep their existing software.

However, a software solution that is unique to the target company could have tremendous value.  The challenge is that the acquiring company may not be familiar with the software and may have a limited understanding of its value or the risk associated with that software.  In addition, if there are only a few individuals who understand how to use and maintain the software (especially with proprietary software), there is a risk that they will not remain at the company and that, as a result, there will be no knowledge base to maintain and/or enhance the software.

I recommend taking four key steps during the acquisition process to determine the value of the target company’s software:
  1. Software Asset Due Diligence (ADD) – determine how the target organization relies on the software.
  2. Software Asset Risk Management (ARM) – assess the risk involved in transitioning to the target organization’s software.
  3. Software Asset Maturity Analysis (AMA) – determine the future ROI for the acquired software.
  4. Software Asset Integration Management (AIM) – analyze how to integrate the acquired software into the current environment.
A software assessment needs to be an integral part of the M&A process – no matter which side of the deal you’re on.  It can no longer be an afterthought.  Software can provide significant value or pose a huge risk for an organization and that needs to be determined up front.

I’m always interested in hearing from others about your experiences on how your organization has handled the software assessment process during a merger or acquisition.  What lessons have you learned?

Mike Harris

Wednesday, October 26, 2016

Measuring Software Value Using a Team Health Assessment

Software development is a team effort. Agile software development, in particular, depends on a high level of communication between team members. In order to be able to improve the business value they are delivering, it is important that the software development teams conduct regular self-assessments. By taking the time to conduct an in-depth assessment of the key areas that impact team performance and health, an organization can make modifications to their processes to enable continual improvement that can lead to increased business value.

In Agile, teams typically rely on sprint retrospectives to analyze their performance for continuous improvement. The challenge is that these events are team- and sprint-specific and often become wasteful ceremonies in that they don’t add any new value.

It is common for the team to reach a point where they have discussed and fixed the things they can fix and the things they can’t fix require organizational intervention, which is outside their span of control. It is easy – and probably correct – for teams in this situation to conclude that sprint retrospectives should be abandoned because, from a lean perspective, they are not adding value and so represent waste to be removed.  

Over the years, our team has leveraged the AgilityHealth℠ Radar (AHR) TeamHealth Assessment as an event to review team dynamics on a quarterly basis. This structured, facilitated event is an opportunity for a more strategic review than the sprint retrospective typically allows.
There are five vital areas that can impact the health of an Agile team: Clarity, Performance, Leadership, Culture, and Foundation. Each should be carefully evaluated to help the team identify their strengths, areas of improvements and top impediments to growth. From there, a growth plan outlining the target outcomes for the following few months can be developed.

The true value of an assessment like this comes from the open and honest conversations that take place enabling the team to evaluate their performance and outcomes and continually improve their processes for the future.  

Does your software development team regularly assess the team’s performance and make adjustments for future growth?  If so, is there a specific methodology your organization uses?

Mike Harris

Monday, October 10, 2016

How can I use SNAP to improve my estimation practices?

Scope of Report
This month’s report will focus on how to improve estimation practices by incorporating the Software Non-functional Assessment Process (SNAP), developed by the International Function Point Users Group (IFPUG), into the estimation process.

Software Estimation
The Issue
Software development estimation is not an easy or straightforward activity. Software development is not like making widgets where every deliverable is the same and every time the process is executed it is the same. Software development varies from project to project in requirements definition and what needs to be delivered. In addition, projects can also vary in what processes and methodologies are used as well as the technology itself. Given these variations it can be difficult to come up with a standard, efficient, and accurate way of estimating all software projects.

The Partial Solution
Software estimation approaches have improved, but the improvements have not been widely adopted, and many organizations still rely on a bottom-up approach based on expert knowledge. This technique involves looking at all of the tasks that need to be completed and using Subject Matter Experts (SMEs) to determine how much time will be required for each activity. Sometimes organizations ask each expert for input separately, but often a Delphi method is used. The Delphi method was developed in the 1950s by the Rand Corporation. Per Rand, “The Delphi method solicits the opinions of experts through a series of carefully designed questionnaires interspersed with information and feedback in order to establish a convergence of opinion”. The theory is that as the group converges, the estimate range will get smaller and become more accurate. This technique, like Agile planning poker, is still utilized, but it often relies on expert opinion rather than data.

As software estimation became more critical other techniques began to emerge. In addition to the bottom-up method, organizations began to utilize a top-down approach. This approach involved identifying the total cost and dividing it across the various activities that needed to be completed. Initially this approach also was based more on opinion than fact.

In both of the above cases the estimates were based on tasks and costs rather than on the deliverable. Most industries quantify what needs to be built/created and then based on historical data determine how long it will take to reproduce. For example, it took one day to build a desk yesterday so the estimate for building the same desk today will also be one day.

The software industry needed a way to quantify deliverables in a consistent manner across different types of projects that could be used along with historical data to obtain more accurate estimates. The invention of Function Points (FPs) made this possible. Per the International Function Point Users Group (IFPUG), an FP is a unit of measure that quantifies the functional work product of software development. It is expressed in terms of functionality seen by the user and is measured independently of technology. That means that FPs can be used to quantify software deliverables independently of the tools, methods, and personnel used on the project. They provide a consistent measure, allowing data to be collected, analyzed, and used for estimating future projects.

With FPs available the top-down methodologies were improved. This technique involves quantifying the FPs for the intended project and then looking at historical data for projects of similar size to identify the average productivity rate (FP/Hour) and determine the estimate for the new project. However, as mentioned above, not every software development project is the same, so additional information is required to determine the most accurate estimate.
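The top-down technique described above can be sketched as follows. This is a minimal illustration under stated assumptions: "similar size" is taken to mean within 25% of the new project's FP count, the average productivity rate is computed as total FPs over total hours, and the historical figures are hypothetical.

```python
def estimate_effort_hours(new_fp, history, tolerance=0.25):
    """Estimate effort from historical projects of similar FP size.

    history: list of (function_points, effort_hours) for completed projects.
    tolerance: assumed definition of 'similar size' (fraction of new_fp).
    """
    similar = [(fp, hours) for fp, hours in history
               if abs(fp - new_fp) <= tolerance * new_fp]
    if not similar:
        raise ValueError("no historical projects of similar size")
    # Average productivity rate in FP/hour across the similar projects.
    rate = sum(fp for fp, _ in similar) / sum(hours for _, hours in similar)
    return new_fp / rate


# Hypothetical history: two projects near 100 FP and one much larger project.
history = [(100, 1000), (110, 1100), (500, 4000)]
print(round(estimate_effort_hours(100, history)))
```

The 500 FP project is excluded from the comparison, which reflects the point that only projects of similar size should drive the productivity rate.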

Although FPs provide an important missing piece of data to assist in estimation, they do not magically make estimation simple. In addition to FP size, the type of project (Enhancement or New Development) and the technology (Web, Client Server, etc.) have a strong influence on the productivity. It is important to segment historical productivity data by FP size, type, and technology to ensure that the correct comparisons are being made. In addition to the deliverable itself, the methodology (waterfall, agile), the experience of personnel, the tools used, and the organizational environment can all influence the effort estimate. Most estimation tools have developed a series of questions surrounding these ‘soft’ attributes that raise or lower the estimate based on the answers. For example, if highly productive tools and reuse are available then the productivity rate should be higher than average and thus require less effort. However, if the staff are new to the tools, then the full benefit may not be realized. Most estimation tools adjust for these variances, which are intrinsic to an organization’s historical data.

At this point we have accounted for the functional deliverables and the tools, methods, and personnel involved. So what else is needed?

The Rest of the Story
Although FPs are a good measure of the functionality that is added, changed, or removed in a software development or enhancement project, there is often project work separate from the FP measurement functionality that cannot be counted under the IFPUG rules. These are typically items that are defined as Non-Functional requirements. As stated in the IFPUG SNAP Assessment Practices Manual (APM), ISO/IEC 24765, Systems and Software Engineering Vocabulary defines non-functional requirements as “a software requirement that describes not what the software will do but how the software will do it. Examples include software performance requirements, software external interface requirements, software design constraints, and software quality constraints. Non-functional requirements are sometimes difficult to test, so they are usually evaluated subjectively.”

IFPUG saw an opportunity to fill this estimation gap and developed the Software Non-functional Assessment Process (SNAP) as a method to quantify non-functional requirements.


IFPUG began the SNAP project in 2008 by initially developing an overall framework for measuring non-functional requirements. Beginning in 2009 a team began to define rules for counting SNAP and in 2011 published the first release of the APM. Various organizations beta tested the methodology and provided data and feedback to the IFPUG team to begin statistical analysis. The current version of APM is APM 2.3 and includes definitions, rules, and examples. As with the initial development of FPs, as more SNAP data is provided adjustments will need to be made to the rules to improve accuracy and consistency.

SNAP Methodology
The SNAP methodology is a standalone process; however, rather than re-invent the wheel, the IFPUG team utilized common definitions and terminology from the IFPUG FP Counting Practices Manual within the SNAP process. This also allows for an easier understanding of SNAP for those that are already familiar with FPs.

The SNAP framework is comprised of non-functional categories that are divided into sub-categories and evaluated using specific criteria. Although SNAP is a standalone process it can be used in conjunction with FPs to enhance a software project estimate.

The following are the SNAP categories and subcategories assessed:

Each subcategory has its own definition and assessment calculation. That means that each subcategory should be assessed independently of the others to determine the SNAP points for that subcategory. After all relevant subcategories have been assessed the SNAP points are added together to obtain the total SNAP points for the project.

Keep in mind that a non-functional requirement may be implemented using one or more subcategories and a subcategory can be used for many types of non-functional requirements. So the first step in the process is to examine the non-functional requirements and determine which categories/subcategories apply. Then only those categories/subcategories are assessed for the project.
With different assessment criteria for each subcategory it is impossible to review them all in this report; however, the following is an example of how to assess subcategory 3.3 Batch Processes:
Definition: Batch jobs that are not considered as functional requirements (they do not qualify as transactional functions) can be considered in SNAP. This subcategory allows for the sizing of batch processes which are triggered within the boundary of the application, not resulting in any data crossing the boundary.

SNAP Counting Unit (SCU): User identified batch job

Complexity Parameters:
1. The number of Data Elements (DETs) processed by the job
2. The number of Logical Files (FTRs) referenced or updated by the job

SNAP Points calculation:

Result: Scheduling batch job uses 2 FTRs so High complexity. 10*25 DETs = 250 SP
Each non-functional requirement is assessed in this manner for the applicable subcategories and the SP results are added together for the total project SNAP points.

SNAP and Estimation
Once the SNAP points have been determined they are ready to be used in the software project estimation model. SNAP is used in the historical top-down method of estimating, similar to FPs. The estimator should look at the total SNAP points for the project and look at historical organization data if available, or industry data for projects with similar SNAP points, to determine the average productivity rate for non-functional requirements (SNAP/Hour). Once the SNAP/Hour rate is selected the estimator can calculate effort by dividing the SNAP points by the SNAP/Hour productivity rate. It is important to note that this figure is just the effort for developing/implementing the non-functional requirements. The estimator will still need to develop an effort estimate for the functional requirements. This can be done by dividing the FPs by the selected FP/Hour productivity rate. Once these two figures are calculated they can be added together to give the total effort estimate for the project.

Estimate example:

Note that the SNAP points and the FPs are not added together, just the effort hours. SNAP and FP are two separate metrics and should never be added together. It is also important to make sure that the same functionality is not counted multiple times between SNAP and FPs as that would be ‘double counting’. So, for example, if multiple input/output methods are counted in FPs they should not be counted in SNAP.
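A minimal sketch of the combined estimate follows. The sizes and productivity rates are hypothetical and are not taken from IFPUG or industry data; only the effort hours are added, never the two size measures.

```python
def total_effort_hours(function_points, fp_per_hour, snap_points, snap_per_hour):
    """Return (functional_hours, non_functional_hours, total_hours)."""
    if fp_per_hour <= 0 or snap_per_hour <= 0:
        raise ValueError("productivity rates must be positive")
    functional_hours = function_points / fp_per_hour      # effort for the FP-sized work
    non_functional_hours = snap_points / snap_per_hour    # effort for the SNAP-sized work
    # FPs and SNAP points stay separate; only the hours are summed.
    return functional_hours, non_functional_hours, functional_hours + non_functional_hours


# Hypothetical project: 200 FPs at 0.1 FP/Hour and 250 SNAP points at 0.25 SNAP/Hour.
functional, non_functional, total = total_effort_hours(200, 0.1, 250, 0.25)
print(f"Functional: {functional:.0f} h, non-functional: {non_functional:.0f} h, total: {total:.0f} h")
```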

This initial estimate is a good place to start; however, it is also good to understand the details behind the SNAP points and FPs to determine if the productivity rate should be adjusted. For instance, with FPs, an enhancement project that is mostly adding functionality would be more productive than a project that is mostly changing existing functionality. Similarly, with SNAP, different categories/subcategories may achieve higher or lower productivity rates. For example, a non-functional requirement for adding Multiple Input Methods would probably be more productive than non-functional requirements related to Data Entry Validations. These are the types of analyses that an organization should conduct with their historical data so that it can be used in future project estimations.

FPs have been around for over 30 years so there has been plenty of time for data collection and analysis by organizations and consultants to develop industry trends; but it had to start somewhere. SNAP is a relatively new methodology and therefore has limited industry data that can be used by organizations. As more companies implement SNAP more data will become available to the industry to develop trends. However, that doesn’t mean that an organization needs to wait for industry data. An individual company can start implementing SNAP today and collecting their own historical data, conducting their own analyses, and improving their estimates. Organizational historical data is typically more useful for estimating projects anyway.

An estimate is only as good as the information and data available at the time of the estimate. Given this, it is always recommended to use multiple estimation methods (e.g. bottom-up, top-down, Delphi, Historical/Industry data based) to find a consensus for a reasonable estimate. Having historical and/or industry data to base an estimate upon is a huge advantage as opposed to ‘guessing’ what a result may be. Both FP/Hour and SNAP/Hour productivity rates can be used in this fashion to enhance the estimation process. Although the estimation process still isn’t automatic and requires some analysis, having data is always better than not having data. Also, being able to document an estimate with supporting data is always useful when managing projects throughout the life cycle and assessing results after implementation.

  • Rand Corporation
  • Counting Practices Manual (CPM), Release 4.3.1; International Function Point Users Group (IFPUG)
  • APM 2.3 Assessment Process Manual (SNAP); International Function Point Users Group (IFPUG)

This blog was originally posted at

Monday, October 3, 2016

The Magic Quadrant for Software Test Automation

One of the most fundamental questions test engineers ask before starting a new project is what tools they should use to help create their automated tests. Luckily, Gartner issues a yearly report to address this issue. This report, “Magic Quadrant for Software Test Automation,” focuses specifically on functional software test automation and the UI automation facilities of tools. The use cases the report considers with regard to each tool include:
  • They must support mobile applications
  • They must feature responsive design
  • They must support packaged applications
With those use cases as evaluation criteria, Gartner evaluated 12 major vendors:
  1. Automation Anywhere
  2. Borland
  3. Hewlett Packard Enterprise
  4. IBM
  5. Oracle
  6. Original Software
  7. Progress
  8. Ranorex
  9. SmartBear
  10. TestPlant
  11. Tricentis
  12. Worksoft
As part of its analysis, Gartner placed each vendor in one of four categories:
  1. Leaders – Those who support all three use cases.
  2. Challengers – Those who have strong execution but typically only support two of the use cases.
  3. Visionaries – Those who generally focus on a particular test automation problem or class of user.
  4. Niche Players – Those who provide unique functions to a specific market or use case.
Beyond that, the vendors were assessed by their ability to execute and their completeness of vision. In short, ability to execute is ultimately the ability of the organization to meet its goals and commitments. Completeness of vision is the ability of the vendor to understand buyers’ wants and needs and successfully deliver against them.

The result is above. It’s important to mention that Gartner notes that most organizations typically have more than one automation tool provider. In addition, many of the solutions are still maturing – and will continue to mature over time.

Gartner updates the report on an annual basis, and it’s valuable to any organization that does testing. Testing, as we often say at DCG, is a key part of the development process, but it’s one that is often overlooked. The information in this report can enable organizations to make educated choices about software vendors, resulting in improved software quality and execution.

Read the article: “Magic Quadrant for Software Test Automation.”

Mike Harris


Wednesday, September 14, 2016

How can I establish a software vendor management system?

Scope of Report
This month’s report will focus on two key areas of vendor management. The first is vendor price evaluation, which involves projecting the expected price for delivery on the requirements. The second is vendor governance. This is the process of monitoring and measuring vendor output through the use of service level measures.

Vendor Price Evaluation
“Vendor Price Evaluation” seeks to enable pricing based on an industry standard unit of measure for functionality that puts the buyer and seller on an even playing field for pricing, bid evaluation and negotiation.

Organizations leverage third party vendors for the development of many of their software initiatives. As such, they are continuously evaluating competing bids and looking for the best value proposition.
Being able to properly size and estimate the work effort is critical to evaluating the incoming vendor bids. Furthermore, an internally developed estimate provides a stronger position for negotiating terms and conditions. The effective delivery of an outsourced project is in part dependent on an open and transparent relationship with the vendor. A collaborative estimating effort provides for greater transparency, an understanding of potential risks, and a collective accountability for the outcomes.
To better control the process, an economic metric is recommended to provide the ability to perform true value analysis. This metric is based on historical vendor spending over a diverse sampling of outsourced projects, thus creating an experiential cost-per-unit “baseline”. Knowing the cost-per-unit price gives you leverage in negotiation. Instead of using hours billed as a fixed price measurement, you know the functional value of deliverables which allows billing on a per unit delivered basis.
To achieve this, we recommend the use of function points as a measure of the functional size of the project. Function Points (FPs) provide an accurate, consistent measure of the functionality delivered to the end user, independent of technology, with the ability to execute the measurement at any stage of the project, beginning at completion of requirements. Abundant function point-based industry benchmark data is available for comparison.

By comparing historical cost-per-FP to industry benchmark data, organizations can quickly determine whether or not they have been over- (or under-) spending. Under-spending may not seem like a problem, but under-bidding is an established vendor tactic to win business, and the resulting price may not be sustainable. If forced to sustain an unrealistically low price, vendors may respond by populating project teams with progressively cheaper (and weaker) staff to the point where quality drops and/or delivery dates are not met. At this point, having great lawyers to enforce the original contract doesn’t help much.

Implementing this approach provides an organizational methodology for bid evaluation and a metric for determination of future financial performance.
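As a rough illustration of the cost-per-FP comparison, the sketch below classifies a project against a benchmark. The benchmark value, the 15% tolerance band, and the project figures are assumptions made up for the example; they are not industry data.

```python
def cost_per_fp(total_cost, function_points):
    """Cost per function point for a delivered project."""
    if function_points <= 0:
        raise ValueError("function point count must be positive")
    return total_cost / function_points


def compare_to_benchmark(actual, benchmark, tolerance=0.15):
    """Classify a cost-per-FP figure against a benchmark.

    `tolerance` is a hypothetical band (here 15%) inside which the
    price is treated as being in line with the benchmark.
    """
    deviation = (actual - benchmark) / benchmark
    if deviation > tolerance:
        return "over-spending"
    if deviation < -tolerance:
        return "under-spending"
    return "in line with benchmark"


# Hypothetical projects of 500 FPs each, compared with an assumed benchmark of 1,000 per FP.
benchmark = 1000
for name, cost in [("Project A", 600_000), ("Project B", 350_000), ("Project C", 525_000)]:
    unit_cost = cost_per_fp(cost, 500)
    print(name, unit_cost, compare_to_benchmark(unit_cost, benchmark))
```

Note that the "under-spending" result is flagged rather than celebrated, for the reason given above: a price well below benchmark may be a bid the vendor cannot sustain.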

Vendor Governance
The key to a successful vendor governance program is an effective set of Service Level Agreements (SLAs) backed up with historical or industry benchmarked data and agreement with the vendor on the SLAs.

The measures, data collection, and reporting will depend on the SLAs and/or the specific contract requirements with the software vendor. Contracts may be based strictly on cost-per-FP or they may be based on the achievement of productivity or quality measures. A combination can also be used with cost-per-FP as the billable component and productivity and quality levels used for incentives and/or penalties.
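To make the combined model concrete, here is a minimal sketch of an invoice calculation in which cost-per-FP is the billable component and productivity and quality drive an incentive or a penalty. Every threshold, the 5% adjustment, and the function name are hypothetical; an actual contract would define its own terms.

```python
def invoice_amount(fps_delivered, price_per_fp, hours, defects,
                   target_fp_per_hour, max_defects_per_fp, adjustment=0.05):
    """Billable amount: FPs delivered times price, adjusted by SLA results.

    Assumed rules (illustrative only):
      - productivity at or above target earns an incentive of `adjustment`
      - defect density above the agreed maximum incurs a penalty of `adjustment`
    """
    base = fps_delivered * price_per_fp
    productivity = fps_delivered / hours        # FP per hour
    defect_density = defects / fps_delivered    # defects per FP

    factor = 1.0
    if productivity >= target_fp_per_hour:
        factor += adjustment
    if defect_density > max_defects_per_fp:
        factor -= adjustment
    return round(base * factor, 2)


# Hypothetical delivery: 400 FPs at 1,000 per FP, 3,200 hours, 10 defects.
amount = invoice_amount(400, 1000, 3200, 10,
                        target_fp_per_hour=0.1, max_defects_per_fp=0.05)
print(amount)
```

In this example the vendor meets the productivity target and stays under the defect threshold, so the base price of 400,000 is increased by the 5% incentive.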

There are a number of key metrics that must be recorded from which we can derive other measures. The key metrics commonly used are: Size, Duration, Effort, Staff, Defects, Cost & Computer resources. For each key metric, a decision must be made as to the most appropriate unit of measurement. See the appendix for a list of key metrics and associated descriptions.
The service level measures must be defined in line with business needs and, as each business is different, the SLAs will be different for each business. The SLAs may be the typical quality, cost, productivity SLAs or more focused operational requirements like performance, maintainability, reliability or security needs within the business. All SLAs should be based on either benchmarked or historical data and agreed with the vendor. Most SLAs are a derivative of the base metrics and thus quantifiable.

One output measure to consider adding is business value; fundamentally, the reason we are developing any change should be to add business value. Typically, business value isn’t an SLA but it can add real focus on why the work is being undertaken and so we are now recommending it. The business value metric can be particularly helpful in the client-vendor relationship because it helps to align the business priorities of the client and the vendor (or to highlight any differences!).
The key is to define the measures and the data components of the measures prior to the start of the contract to avoid disputes during the contract period.

Measurement reports for vendor management are typically provided during the due diligence phase of vendor selection and during the execution of the contract. During due diligence, the reports should provide the vendor with the client expectations regarding the chosen measures (e.g. Cost-per-FP, hours-per-FP, etc.). During the life of the contract, reports should be produced to show compliance to contract measures and to aid in identifying process improvement opportunities for all parties.
The typical reporting for vendor management consists of balanced scorecards for senior level management, project reports for project managers, and maintenance reports for support areas.

Balanced scorecard
These reports provide a complete picture of all measures required for managing the contract. These are typically summary reports that include data from multiple projects. The Balanced Scorecard Institute states that,

“the balanced scorecard was originated by Robert Kaplan and David Norton as a performance measurement framework that added strategic non-financial performance measures to traditional financial metrics to give managers and executives a more 'balanced' view of organizational performance”.

In the case of software vendor management, the scorecard should have multiple measures that show contract results. For example, even though productivity may be a key ‘payment’ metric, quality should also be included to ensure that in efforts to improve productivity, quality does not suffer. The report should also include a short analysis that explains the results reported to ensure appropriate interpretation of the data.

Project reporting
These reports focus on individual projects and are provided to project teams. The reports should contain measures that support the contract and provide insight into the project itself. Analysis should always be provided to assist teams with assessing their project and identifying process improvement opportunities to better meet the contract requirements.

Maintenance reporting
These reports are at an application level and would be provided to support staff. This data would provide insight into the maintenance/support work being conducted. Again, this would be in support of specific contract measures, but it can also be used to identify process improvement opportunities and/or identify which applications may be candidates for redesign or redevelopment.

Data Definition and Collection
Data definition and collection processes need to be developed to support the reporting. As stated in the book, “IT Measurement – Practical Advice from the Experts”, this step should,
“focus on data definition, data collection points, data collection responsibilities, and data collection vehicles”.

Who is going to collect the data? When will it be collected? How will it be collected? Where will it be stored? These are important questions to drive the implementation of the contract measurements, but they all depend on the most difficult step, data definition.

Data definition involves looking at all of the data elements required to support a measure and ensuring that both the client and the vendor have the same understanding of the definition. Since most software vendor contracts utilize productivity (FP/effort), this report will focus on defining the data elements of FPs and effort by way of example.

Function Point Data Definition
Function point guidelines should be developed for all parties to follow. This should include which industry standard will be used (e.g. International Function Point Users Group Counting Practices Manual 4.x) as well as any company specific guidelines. Company specific guidelines should not change any industry standard rules, but provide guidance on how to handle specific, potentially ambiguous situations. For example, how will purchased packages be counted -- will all functions be counted, or just the ‘customized’ functions? Another consideration is how changes to requirements throughout the lifecycle will be counted. For example, some organizations count functions one time for a project unless a changed requirement is introduced late in the life cycle (e.g. system testing), in which case a function may be counted more than once. Guidelines need to be established up front for as many situations as possible, but may need to be updated throughout the life of the contract as new situations arise.

Effort Data Definition
Effort can be one of the more contentious data elements to define in software vendor management systems. It is important to determine what life cycle activities are included in each aspect of the software vendor contract. For instance, if productivity is an SLA or a payment incentive, then vendors will want to exclude certain activities that clients may want to include. One example is setting up a test environment for a project. A vendor may want to exclude this from the productivity calculations while a client may think it should be included. A ‘rule of thumb’ is that if an activity is required specifically for the project, the effort should be included. If the activity is to set up something that is for all projects to use, then it should be excluded. So, in the test environment example, if the vendor is setting up scenarios or simulators to test specific project functionality, the effort should be included as part of the project productivity calculation. If the vendor is installing servers to host test data and tools, the effort should be excluded. There are more effort categories to examine than can be included in this report. One decision that cuts across categories is whether or not to include “overtime” hours. The recording of overtime hours in time management systems tends to vary widely, even within organizations, because many software development employees are not paid for overtime hours. The important thing is for vendors and clients to work together to define and document the guidelines.
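The ‘rule of thumb’ above can be expressed as a simple filter over recorded effort. The sketch below is only one possible reading: the `EffortEntry` structure, the `project_specific` flag, and the hours are hypothetical, and the real classification would come from the guidelines agreed between client and vendor.

```python
from dataclasses import dataclass


@dataclass
class EffortEntry:
    activity: str
    hours: float
    project_specific: bool  # True if the work exists only for this project


def countable_hours(entries):
    """Sum only the effort that is specific to the project."""
    return sum(e.hours for e in entries if e.project_specific)


def productivity(function_points, entries):
    """Productivity in FP per hour, using countable effort only."""
    hours = countable_hours(entries)
    if hours <= 0:
        raise ValueError("no countable effort recorded")
    return function_points / hours


# Hypothetical effort log for a 100 FP project.
entries = [
    EffortEntry("Design and build", 700, True),
    EffortEntry("Project test scenarios and simulators", 100, True),
    EffortEntry("Install shared test servers", 200, False),  # used by all projects
]
print(countable_hours(entries))     # 800
print(productivity(100, entries))   # 0.125
```

Had the shared server installation been included, the same project would report 0.1 FP/Hour instead of 0.125, which is why the definition needs to be agreed before the contract starts.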

Code Quality Analytics
In addition to the standard SLAs and beyond functional testing, code analytics and an application analytics dashboard can provide an IT organization with key insights into code quality, reliability and stability of the code being delivered by the vendor.

Code analytics tools, such as those provided by CAST Software, analyze the delivered code to detect structural defects, security vulnerabilities and technical debt. The metrics generated by these tools can be used as SLAs.

There is value in understanding what is being developed throughout the lifecycle. In this way security, performance and reliability issues can be understood and addressed earlier while still in development.

In a waterfall development environment, code analytics can be executed at defined intervals throughout the lifecycle and after deployment to production. In an Agile framework, code analytics can be run as part of each code build, at least once per sprint, and code quality issues can be resolved in real time.

Having this information early in the lifecycle enables fact-based vendor management. Code analytics, along with traditional measurements, provides the buyer with the information needed to manage their vendor relationships and ensure value from their IT vendor spend.

A robust vendor management system includes:
  • Pricing evaluation using industry standard measures to promote meaningful negotiations,
  • Service level metrics backed up with historical or industry benchmarked data and
  • Code analytics to ensure quality, reliability and stability are built into the systems being developed.
With these components in place an organization can efficiently manage vendor risk, monitor and evaluate vendor performance and ensure value is derived from every vendor relationship.
  • Balanced Scorecard Institute website
  • “IT Measurement – Practical Advice from the Experts.” International Function Point Users Group. Addison Wesley (Pearson Education, Inc.), 2002 – Chapter 6, Measurement Program Implementation Approaches.
  • CAST Software website – Application Analytics Software
Appendix - Key Metrics

Size
Project size can be described in several ways, with source lines of code (SLOC) and function points being the most common.

Function Points
The industry standard approach to functional size is Function Points (FPs). It is a technology agnostic approach and can be performed at any point of the lifecycle.

FP analysis provides real value as a sizing tool. Even in software developed using the latest innovations in technology, the five components of function point analysis still exist, so function point counting remains a valuable tool for measuring software size. Because an FP count can be done based on a requirements document or user stories, and the expected variance in FP counts between two certified function point analysts is between 5% and 10%, an accurate and consistent measure of the project size can be derived. And because FP analysis is based on the users’ view and is independent of technology, it continues to work just as well as technology evolves.

Source Lines of Code
Source lines of code (SLOC) gives a physical view of size but can only be derived at the end of a project.
It has some inherent problems. One is that inefficient coding produces more lines of code. Another is that the SLOC size of a project, if determined before the project is coded, is itself an estimate.
However, SLOC can be used retrospectively to review a project’s performance. In that case, consider using effective source lines of code (ESLOC) to remove the expert/novice factor noted above, in which less efficient coding produces more lines of code.

Code analysis tools like CAST can provide excellent diagnostics and even FP counts based on the code.

Story Points
Projects in an Agile framework typically use Story Points to describe their relative size. They work well within a team but are ineffective at an organization level to consider relative size.
For example, a team can double their velocity simply by doubling the number of story points they assign to each story. They can also vary from one team to another as they are only relevant to the team, and sometimes, the sprint in question.
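The point about velocity can be shown with a few lines of arithmetic. The story point values below are invented for the illustration.

```python
def velocity(story_points):
    """Velocity for a sprint: the sum of story points completed."""
    return sum(story_points)


# The same four stories, estimated on the team's original scale...
original = [3, 5, 8, 5]
# ...and again after the team doubles every estimate.
inflated = [2 * points for points in original]

print(velocity(original))   # 21
print(velocity(inflated))   # 42
```

The work delivered is identical in both cases; only the team's private scale has changed, which is why story points do not compare well across teams or at an organization level.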

Time (Duration)
This is simply the time taken to complete the project and/or support the application. It is calendar time, not effort.

Effort
Effort is the amount of time spent to complete a project and/or support an application. Typically, hours is the metric used as it is standard across organizations. Work days or months may have different definitions across organizations.

Effort is one of the more challenging pieces of data to collect and the granularity at which you can analyze your measures is determined by how you record and capture the effort.
In Agile teams, the total effort is relatively fixed but the work performed within it is flexible, so if you want to analyze testing performance, for example, you need to know how much of the effort was spent on testing.

Defects
Quality is a key measure in a vendor management situation, as the quality of the code coming into testing and into production determines how well the project performs. We are all aware of the “throw it over the wall” mentality when deadlines start to hit, and the resultant cost is defects being delivered to production.

A common question is how many defects should be expected for a project of a particular size.
The truth is that the answer is not straightforward, as organizations have different views of what a defect is and how to grade it. Set your criteria within the vendor framework first and then record going forward. A view of historical performance is extremely useful here as well.
Defects should be measured during user acceptance testing as well as after go-live during the warranty period, and used to predict future volumes and identify releases where further investigation or discussion is warranted.
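One simple way to use historical defect data in this manner is to compute an average defect density (defects per FP) and apply it to the size of a planned release. The sketch below assumes that approach; the release history, the 1.5x investigation threshold, and the function names are all hypothetical.

```python
def defects_per_fp(releases):
    """Average defect density across releases given as (name, fps, defects)."""
    total_fps = sum(fps for _, fps, _ in releases)
    total_defects = sum(defects for _, _, defects in releases)
    return total_defects / total_fps


def predict_defects(releases, planned_fps):
    """Expected defects for a planned release, based on historical density."""
    return defects_per_fp(releases) * planned_fps


def releases_to_investigate(releases, factor=1.5):
    """Names of releases whose density exceeds `factor` times the average."""
    baseline = defects_per_fp(releases)
    return [name for name, fps, defects in releases
            if defects / fps > factor * baseline]


# Hypothetical history: (release, size in FPs, defects found in UAT and warranty).
history = [("R1", 200, 10), ("R2", 300, 12), ("R3", 500, 78)]

print(round(defects_per_fp(history), 3))      # 0.1
print(round(predict_defects(history, 400)))   # 40
print(releases_to_investigate(history))       # ['R3']
```

This only works if the defect criteria agreed with the vendor are applied consistently across the releases in the history.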

Staff – FTEs
This is the people metric. It is usually measured in full-time equivalents (FTEs) so that the figures are comparable. You might have had 20 different people work on a project with a peak staff of 8 FTEs, or 10 people with the same effort and staffing profile; it is the FTE figure that is consistent and comparable.
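The head count versus FTE distinction can be illustrated as follows. The sketch assumes a standard full-time month of 160 hours and made-up hours per person; both are assumptions for the example rather than a standard definition.

```python
def ftes(hours_logged, full_time_hours=160):
    """FTEs for a period: total hours logged divided by one person's full-time hours.

    `full_time_hours` (160 per month here) is an assumed convention.
    """
    return sum(hours_logged) / full_time_hours


# Peak month, team A: 20 people each contributing 64 hours.
team_a = [64] * 20
# Peak month, team B: 6 people full time and 4 people half time.
team_b = [160] * 6 + [80] * 4

print(len(team_a), ftes(team_a))   # 20 8.0
print(len(team_b), ftes(team_b))   # 10 8.0
```

Both teams logged 1,280 hours in the month, so both have a peak staff of 8 FTEs even though the head counts differ.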

The type of staff resource can also be relevant here, so distinctions such as onshore/offshore, contractor/permanent/consultant or designer/manager/tester may need to be recorded.

Cost
This may be actual cost or a blended rate per hour. Where multiple currencies are involved, assumptions may need to be fixed about appropriate exchange rates.

Computer Resources
Computer resources covers the parameters of the technology environment, such as platform, programming language, etc. This final metric captures the “what?” and “how?” to allow comparison against similar project types by language and technical infrastructure.


Monday, September 12, 2016

How to Count Function Points from User Stories

I was recently involved in a consulting engagement where Agile methodologies were being implemented with User Stories as the documentation standard. The organization had used function points (FPs) for years on their waterfall projects and were wondering if they could use them for their Agile methodology – and if User Stories would be a good input into the FP counting process. The answer I provided was a resounding “YES.” Having User Stories is actually a huge advantage to counting FPs, especially early in the lifecycle, because User Stories are typically focused on the user perspective, just like FPs.

The only difficulty in using FPs in Agile methodologies is determining what to count and when to count. As with any metric, this always goes back to the purpose. For example, if you want to know the size of the final delivered product, then you count the FPs at the end of the project. If you want to estimate effort for a Sprint or Program Increment (PI), then you need to count at the beginning of the Sprint or PI. The key is defining the purpose early in order to have access to what you need at the time of data collection.

When actually counting FPs from User Stories, there are a few tips that help with the process. Depending on the level of the User Stories, more questions or assumptions may be needed to get to an accurate FP count. There are also key words used in User Stories that may help identify FP components (e.g. Maintain, Report, Enter, Select). Often User Stories equate to transactional functions in FPs, so it is important for the FP analyst to identify data functions as they go along.
More tips and advice, including real-life examples, will be provided in my upcoming webinar, “Counting Function Points from User Stories,” taking place on Wednesday September 28, 2016 at 12:00 pm EST. Please register here. If you have any questions before the webinar, just leave a comment and I’ll be sure to address them during the presentation.

Lori Limbacher
Estimation Specialist; Certified Function Point Specialist (CFPS)