Mars Polar Lander bug – requirement or code problem

In 2000, I used model-based testing to identify the bug that was the likely cause of the Mars Polar Lander (MPL) crash. See “Mars Polar Lander Fault Identification Using Model-based Testing” for additional detail.

The question came up as to whether the bug was a requirements problem or a code (implementation) problem. In many systems the root cause is indeed a requirements problem: the requirements are wrong, and even though the software fully satisfies them (verification passes), the system does not do what is actually needed (validation fails).

In the case of the MPL, however, the requirements were correct and the defect was in the software itself. The model was created from the requirements, the tests were generated from the model, and those tests exposed the defect in the software. Here is a short description of the defect in the software implementation (see the image at the bottom).

If there were two consecutive reads on the touchdown sensor of any of the three legs during the first phase of descent (5 km down to 40 meters), those values were stored (latched) in the software's state memory. The software was supposed to mark that sensor/leg as bad (and it did), because the lander could not yet be on the surface. However, when the software switched modes for the second, final phase of descent (below 40 meters), it used the sensor information still held in state memory, which caused it to command the thrusters to shut down, resulting in a crash from approximately 40 meters above the surface of Mars.

Once the software marked a leg as bad (during the first phase of descent), it should have cleared the latched value from state memory, but it did not. That single additional line of code was added, after the fact, to correct the implementation.
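To make the failure mode concrete, here is a minimal sketch of the latching logic. The names and structure are hypothetical and the logic is greatly simplified; this is not the actual flight code. The leg is marked bad, but the latched touchdown indication is never cleared from state memory, so it is still there when the final descent phase begins.

```python
class TouchdownMonitor:
    """Illustrative sketch only: hypothetical names and simplified logic,
    not the actual MPL flight software."""

    def __init__(self):
        # State memory, per leg: previous sensor read and latched touchdown flag.
        self.previous_read = [False, False, False]
        self.touchdown_latched = [False, False, False]
        self.leg_bad = [False, False, False]

    def first_phase_step(self, leg, sensor_value):
        """5 km down to 40 m: a deploying leg can generate spurious sensor reads."""
        if sensor_value and self.previous_read[leg]:
            # Two consecutive reads: the indication is latched in state memory,
            # and the leg is (correctly) marked bad.
            self.touchdown_latched[leg] = True
            self.leg_bad[leg] = True
            # The missing line of code: the latched value should be cleared here.
            # self.touchdown_latched[leg] = False
        self.previous_read[leg] = sensor_value

    def second_phase_step(self):
        """Below 40 m: touchdown sensing is enabled and the latched value is trusted."""
        if any(self.touchdown_latched):
            return "SHUTDOWN_ENGINES"   # premature shutdown ~40 m above the surface
        return "CONTINUE_DESCENT"
```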

The model represented the value of a leg sensor at two points in time, and the test driver injected the previous and current values for that sensor. Because there was no direct access to the state information, the technique used in the test driver (a common pattern with MATRIXx) was to call the MPL software twice: first with the previous sensor value, then again with the current sensor value, which propagates the information through state memory. On the second call, both values are latched in state memory. This is the test case that exposed the defect.
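Sketched against the hypothetical TouchdownMonitor above, the two-call pattern looks roughly like this: the first call stores the previous sensor value in state memory, the second call supplies the current value, and the mode switch then acts on the stale latched indication.

```python
monitor = TouchdownMonitor()

# Two consecutive spurious reads on leg 0 during the first descent phase.
monitor.first_phase_step(leg=0, sensor_value=True)   # previous value, stored in state memory
monitor.first_phase_step(leg=0, sensor_value=True)   # current value; both reads now latched

# Mode switch at 40 m: the stale latched value drives the shutdown command.
assert monitor.second_phase_step() == "SHUTDOWN_ENGINES"
```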

Mars Polar Lander Descent Path


Estimation Issues – Part 2: Interface-driven Requirements

In Part 1 of Estimation Issues, I discussed issues that often surface during program area analyses. We have seen on several programs that inadequate estimation of the size of the effort is a common problem, one that impacts cost, schedule, and an organization's ability to deliver on time. Diseconomies of scale appear when estimating size in source lines of code (LOC): in the 10,000 to 100,000 LOC range, growth is usually roughly linear, but for programs over 100,000 LOC, growth typically accelerates and can become exponential.

In this second post of the two-part series, I'll introduce another common issue: a lack of understanding of requirements. While one may argue that customers, systems engineers, and developers often do not fully understand the user-level requirements, let's assume for a minute that they do understand the "high-level" requirements. The bigger issue in large multi-tier systems is that cost estimates do not account for derived requirements (feature impacts). These derived requirements are associated with interface changes that result from design decisions. The vertical dependencies down through the infrastructure are not tracked well enough to understand the impacts, and the horizontal dependencies prevalent in service-based and distributed systems, along with non-functional design decisions (e.g., security, reliability), are not factored into the estimate.

What’s a derived requirement? Derived requirements are “lower-level” requirements created as a result of making design decisions. Design decisions result in system components or modules that have new or modified interfaces. These types of requirements can result from technology or platform choices and details. There can be many levels of derived requirements. The key concept is:

If there is an Interface Boundary, then there are Derived Requirements

The interfaces associated with a derived requirement play a significant role in the estimation, because the interfaces map more directly to the affected components than the derived requirement statement does.

Conceptually, the approach is straightforward: look at the interfaces that will change at each component boundary and estimate the impact. Identify the number of attributes that must be produced as outputs or processed as inputs at each boundary. That processing relates directly to the requirements that must be developed and the corresponding tests that must be planned, designed, and executed.
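As a rough illustration of that counting exercise, here is a minimal sketch with hypothetical component names, attribute counts, and a purely illustrative hours-per-attribute rate; a real estimate would calibrate the rate from historical data.

```python
from dataclasses import dataclass

@dataclass
class InterfaceChange:
    boundary: str       # component boundary, e.g. "OrderService -> BillingService"
    output_attrs: int   # attributes the producing component must generate
    input_attrs: int    # attributes the consuming component must process

def impact(changes, hours_per_attr=4.0):
    """Rough relative effort: the attributes to produce and process drive the
    requirements to write and the tests to plan, design, and execute."""
    total_attrs = sum(c.output_attrs + c.input_attrs for c in changes)
    return total_attrs, total_attrs * hours_per_attr

# Hypothetical derived requirement touching two component boundaries:
changes = [
    InterfaceChange("UI -> OrderService", output_attrs=6, input_attrs=6),
    InterfaceChange("OrderService -> BillingService", output_attrs=9, input_attrs=9),
]
attrs, est_hours = impact(changes)
print(f"{attrs} interface attributes, ~{est_hours:.0f} hours (illustrative rate)")
```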

Program managers, project leads, systems engineers, and customers can use this information to understand detailed impacts on a requirement-by-requirement basis. It supports better cost estimates and delivery scheduling, and lets personnel resource scheduling be factored more accurately into the development effort and cost. If tradeoffs are needed to address schedule or cost for a particular release, the quantified relative cost impacts can be discussed with customers, who should then have a better understanding of the immediate system needs.

We have developed an instrument to support the estimation process, which:

  • Promotes a more complete description of the requirements through systematic refinement of customer or high-level requirements
  • Provides a better understanding of the allocation of the requirements and associated interfaces related to the derived requirements
  • Quantifies the impact of a requirement across subsystems or components
  • Enables visualization of component dependencies as they relate to derived requirements and the various interface types
  • Promotes a more consistent and complete definition of interfaces across the different system components
  • Provides more accurate cost and schedule estimates through use of more fine-grained information

Let us help you with a program area analysis. Our method and tools help us work with you to dive deeply into program issues. We work with the team members to discuss the details of the system and the issues they face, and we often identify gaps in engineering practices. Ideally, we help teams understand those issues, resolve them, and bridge those gaps on current and future programs.

Please contact me to discuss how our approach might help your team improve its estimating effectiveness.


Engineering Analytics – Improving Cycle-time Prediction

You have probably heard of business analytics, the continuous, iterative exploration of past business performance to gain insight and drive business planning. Engineering analytics is similar, but focused on engineering performance, in this case prediction and estimation of cycle times. Many of the SSCI members and clients are high-maturity organizations. They have been improving their processes and practices and capturing data, such as control charts, that track statistics about their projects. Unfortunately, even with all of that statistical data, there are many factors, both technical and non-technical, that can cause significant variance from project to project.

We gave a webinar on Bayesian Networks – A New Class of Management Tools for Prediction, Estimation and Risk Management. We discussed applications, performed with SSCI members, that predict software reliability from quantitative defect data combined with subjective judgments about factors such as quality, complexity, and architectural stability. The key benefit is that a Bayesian network (BN) represents a causal model that combines sparse data with expert judgment, transforming qualitative knowledge about the processes into quantitative predictions.
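As a toy illustration of that combination (with made-up numbers, and only two variables rather than a full network), the sketch below applies Bayes' rule to merge an expert prior about architectural stability with a few sparse defect observations.

```python
from math import comb

# Expert judgment: prior belief about architectural stability (illustrative numbers).
prior = {"stable": 0.7, "unstable": 0.3}

# Probability that any given build is defect-prone under each state
# (would be elicited from experts or estimated from historical data).
p_defect_given = {"stable": 0.1, "unstable": 0.5}

# Sparse observation: 2 defect-prone builds out of 3 inspected.
defects, builds = 2, 3

def binom_likelihood(p, k, n):
    """Probability of observing k defect-prone builds out of n."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Bayes' rule: combine the prior (judgment) with the likelihood (data).
unnorm = {s: prior[s] * binom_likelihood(p_defect_given[s], defects, builds)
          for s in prior}
z = sum(unnorm.values())
posterior = {s: v / z for s, v in unnorm.items()}
print(posterior)   # belief shifts toward "unstable" after the observations
```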

We have been working on another project applying hybrid BNs to predict project cycle times. We use the quantitative statistical data collected in the organizations' control charts, and the BN combines it with subjective judgments about factors such as complexity, quality of the requirements, design reuse, engineers' expertise, and even judgments about subcontractors and suppliers. The results are quite impressive: applied to about 25 projects, all of which lasted more than one year, the models reduced the prediction variance by more than 50%.

We are interested in hearing about your interests and experiences with these types of prediction tools and techniques.


Estimation Issues – Part 1: Diseconomies of Scale

We often perform program area analyses, where we work with clients to understand issues and ultimately trace the issues to root causes that negatively impact their program performance. One issue that we have seen on several programs is inadequate estimation of the size of the effort, which impacts the cost, schedule and an organization’s ability to deliver on time.

Organizations sometimes use models of the software size, usually defined in terms of lines of code (LOC), as a basis for predicting cost and schedule. There are many factors that can lead to significant deviations in software size data; here are some common issues:

  • Underestimation of infrastructure
  • Lack of understanding of requirements
  • Unrealistic interpretation of original requirements and resource estimates to develop the system
  • Unexpected impact of legacy integration
  • Over expectation of the value of commercial off-the-shelf software (COTS)

In this first post of a two-part series on estimation issues, we're going to reflect on diseconomies of scale, which lead to inaccurate estimates of the size of the software development effort. In software, the larger the system becomes, the greater the cost of each unit. If software exhibited economies of scale, a 100,000-LOC system would be less than 10 times as costly as a 10,000-LOC system, but the opposite is almost always the case. Barry Boehm, in 2000, provided historical data reflecting the diseconomies of scale in software-intensive systems development: in the 10,000 to 100,000 LOC range, growth is usually roughly linear, but for programs over 100,000 LOC, growth typically accelerates and can become exponential.
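As a quick illustration of that nonlinearity, the sketch below uses a COCOMO-style effort model, Effort = a * KLOC^b with b > 1. The coefficients are the classic "semi-detached" values and are purely illustrative, not calibrated to any particular program.

```python
def effort_person_months(kloc, a=3.0, b=1.12):
    """COCOMO-style nominal effort; b > 1 models the diseconomy of scale."""
    return a * kloc ** b

for kloc in (10, 100, 1000):
    e = effort_person_months(kloc)
    print(f"{kloc:>5} KLOC -> {e:8.0f} person-months "
          f"({e / kloc:.2f} PM per KLOC)")

# Cost per KLOC rises with size: 10x the code costs more than 10x the effort.
```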

On a recent program, we used Boehm's graph to compare the predicted size with the actual size for a program that was greater than 100,000 LOC at the start of the effort. Boehm's typical growth curve for programs over 100,000 LOC accurately anticipated that the actual size of this particular program would be roughly 80% greater than the size the team had predicted.

We encourage you to use historical evidence like the diseconomies of scale before committing to a schedule or budget in the future. Part 2 of Estimation Issues will discuss other ways beyond lines of code to quantify the size of the development effort.

Let us help you with a program area analysis. Our problem area analysis method helps us work with you to dive deeply into program issues. We work with the team members to discuss the details of the system and the issues they face, and we often identify gaps in engineering practices. Ideally, we help teams understand those issues, resolve them, and bridge those gaps on current and future programs.

Please feel free to contact me to discuss problem area analyses or estimation approaches.
