Thursday, December 3, 2009

Software Testing - A Cautionary Tale

There are some basic rules of thumb that will serve you well in testing any application that deals with lists of data (and what application does not?).

1. "1, few, many"

2. "Make no assumptions"

3. "Remember to mix it up a little"

This case study covers a very interesting example of where following the rules of thumb *exactly* pays dividends.

The problem

There was an interactive reporting solution that was having performance issues: in essence, there was pathological performance degradation in some circumstances.

The lion's share of the wasted time was spent in a 3rd party component that really could not be

There were several passes at the problem, and the same things were always done:

  • Direct: take the reported issue and simply pause the debugger at the point of obvious pain
  • Indirect: scrub the code in search of opportunities
  • (Timed) validation tests: write an automated script to exercise the reporting component through a series of configurable settings
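The timed validation pass can be pictured with a minimal sketch like the one below. It is not the script used on the project: `build_report` and its `row_count` and `column_count` parameters are hypothetical stand-ins for the reporting component and its configurable settings.

```python
import itertools
import time


def time_call(func, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def run_timing_matrix(func, settings):
    """Time `func` for every combination of the values in `settings`.

    `settings` maps a parameter name to the list of values to try.
    Returns a list of (combination, seconds) pairs.
    """
    names = list(settings)
    results = []
    for values in itertools.product(*(settings[name] for name in names)):
        combo = dict(zip(names, values))
        results.append((combo, time_call(lambda: func(**combo))))
    return results


def build_report(row_count, column_count):
    """Hypothetical stand-in for the component under test."""
    return [[r * c for c in range(column_count)] for r in range(row_count)]


if __name__ == "__main__":
    matrix = run_timing_matrix(
        build_report,
        {"row_count": [1, 10, 500], "column_count": [1, 10, 500]},
    )
    for combo, seconds in matrix:
        print(combo, f"{seconds:.6f}s")
```

Taking the best of a few repeats reduces noise from other activity on the machine, and driving the settings from a dictionary makes it cheap to add another dimension later.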

These scrubs made a palpable difference in each iteration. For reasons related to the limitations of the 3rd party component and the application framework it was hosted in, traceability in the code was not great, and it proved difficult to get the job right the first time.

There were significant improvements made, however: minutes became seconds in some cases, giving an indication of the potential seriousness of the issue for an interactive application.

However, it turned out that even if the coding had been done "right first time", there was a hidden trap that would have caused a return to the bench due to a serious performance problem.

The trap

In this case, it pays to go into a little detail:

Developers and testers have a certain level of awareness of the importance of the run-time complexity of a given process: in essence, they tend to expect that (approximately) the run time goes as O(n), and in many applications there is strong pressure to keep n a small number, otherwise the user experience can be very disappointing. Many defects arise because n turns out not to be a small number. In this case, the performance figures appeared to be acceptable, if not excellent.

Developers and testers got a big surprise when, at the last minute of the product release cycle, a project was presented that wholly challenged the performance as seen by development and quality control. It took an age to carry out what was immediate with much larger chunks of data.

What was happening?

The developer debugged the application: the calling sequence looked like that for any other data, it only took a very different time.

What was different about the project?

Well, one of the settings had its limits raised from the default of several hundred to several thousand. There had never been any testing with this number of elements in these lists, but even so, a few thousand is a small number for modern machines running functions of an acceptable order in n.

So, expectations were challenged: what was happening?

The automated test was repeated with the limits set to the default-busting values: performance was still acceptable and much better than in the reported problem.

That narrowed it down to the data in the project. The automated test data was created in a controlled manner and could not help.

Back to the developer.

After hard work in the last days of the development cycle, the answer suddenly became clear while debugging and comparing the behavior of "good" large projects and "bad" large projects.

The behavior of a key function involved in exercising the interactive component was not simply characterized as O(n). A better characterization would be O(n) + O(x - y), where x and y are the counts of items in two lists used by the component.

While x and y stayed below the application defaults, the second term was never observed in the behavior, and the 3rd party component source was never examined at the level where the fault could be found. Once the application defaults were exceeded, this second term had the opportunity to become "very important".

Why did the automated tests not catch this when the larger sizes were set in the new settings?

Because the tests made a reasonable assumption: that only n mattered. It therefore went unnoticed by tester and developer that the sizes of the lists were all the same, so x - y was always zero.

The reported issue had real data with lists that were both large and different: all it took was for one list to be several thousand items long and the other to be small for the besetting performance problem to be exposed.
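The blind spot can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the real component: `toy_component_cost` and its `HIDDEN_WEIGHT` constant are invented, and the hidden term is assumed to grow with the difference between the two list sizes. It compares test cases where every list has the same size with cases where the sizes are mixed.

```python
import itertools

# Arbitrary weight for the hidden term; chosen only to make it visible.
HIDDEN_WEIGHT = 50


def toy_component_cost(n, x, y):
    """Hypothetical cost model: n units of expected work plus a hidden
    penalty that depends only on how much the two list sizes differ."""
    return n + HIDDEN_WEIGHT * abs(x - y)


def equal_size_cases(sizes):
    """The original test assumption: every list has the same size."""
    return [(s, s, s) for s in sizes]


def mixed_size_cases(sizes):
    """Rule of thumb #3: vary each size independently of the others."""
    return list(itertools.product(sizes, repeat=3))


def worst_overhead(cases):
    """Largest cost beyond the expected n seen across the cases."""
    return max(toy_component_cost(n, x, y) - n for n, x, y in cases)


sizes = [1, 10, 5000]  # "1, few, many"
print(worst_overhead(equal_size_cases(sizes)))  # 0
print(worst_overhead(mixed_size_cases(sizes)))  # 249950
```

With equal sizes the hidden term is exactly zero however large the lists get, so no amount of scaling up the test data would have revealed it. Mixing the sizes costs 27 cases instead of 3 here and exposes the problem immediately.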

The moral of the story

One could argue that the original bug report contained the core of the solution, providing an example of rule of thumb #3.

Rule of thumb #1 was initially thought to be adequately covered, but it was fatally compromised by the pathological behavior of the application. The application appeared to match the testers' and developers' internal model except under certain very specific conditions; sadly, those conditions were realistic for a key feature.
