Statistics
Analyse your testing data
Flakiness Analysis
To quantify flakiness, focus on how often a test fluctuates between passing and failing, that is, how often it intermittently passes and fails without any apparent change in the underlying code or environment.
Flakiness is quantified using the following definitions:
- Fluctuations occur when a test result alternates between pass and fail in successive executions.
- The more frequent these alternations, the more “flaky” the test is considered to be.
When extracting data from the database, order the results by execution time and, for each test, track the sequence of outcomes (i.e. its pass/fail history) over a period of time.
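A minimal sketch of this tracking step, assuming the database query returns rows of (test name, execution timestamp, status). The row layout, the `build_histories` name and the sample rows are hypothetical and only stand in for whatever your own schema provides.

```python
from collections import defaultdict
from datetime import datetime


def build_histories(rows):
    """Group (test_name, executed_at, status) rows into per-test outcome
    sequences, ordered by execution time (oldest first)."""
    histories = defaultdict(list)
    # Sort by timestamp so each history reflects the real execution order.
    for test_name, executed_at, status in sorted(rows, key=lambda r: r[1]):
        histories[test_name].append(status)
    return dict(histories)


# Hypothetical rows, deliberately out of order, as a query might return them.
rows = [
    ("Test A", datetime(2024, 1, 3), "Pass"),
    ("Test A", datetime(2024, 1, 1), "Pass"),
    ("Test B", datetime(2024, 1, 1), "Pass"),
    ("Test A", datetime(2024, 1, 2), "Fail"),
    ("Test B", datetime(2024, 1, 2), "Pass"),
]
histories = build_histories(rows)
```

Sorting before grouping matters here: fluctuations are only meaningful between successive executions, so an unordered history would give a misleading count.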
Example test sequences:
- Test A: Pass, Fail, Pass, Pass, Fail, Pass → This shows fluctuations. (flaky)
- Test B: Pass, Pass, Pass → No fluctuations. (not flaky)
- Test C: Fail, Fail, Fail → Also no fluctuations, though it has a consistent failure. (not flaky)
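The three sequences above can be checked with a short sketch that counts how often consecutive outcomes differ. The `count_fluctuations` name is hypothetical, and treating any non-zero count as "flaky" is an assumption made only to reproduce the labels in the examples.

```python
def count_fluctuations(history):
    """Count how many times consecutive outcomes differ."""
    return sum(1 for prev, curr in zip(history, history[1:]) if prev != curr)


examples = {
    "Test A": ["Pass", "Fail", "Pass", "Pass", "Fail", "Pass"],
    "Test B": ["Pass", "Pass", "Pass"],
    "Test C": ["Fail", "Fail", "Fail"],
}

fluctuations = {name: count_fluctuations(h) for name, h in examples.items()}
# Assumption: a test is labelled flaky as soon as it fluctuates at least once.
flaky = {name: n > 0 for name, n in fluctuations.items()}
```

Test C shows why a consistently failing test is not counted as flaky: its outcome never changes, so its fluctuation count is zero.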
Calculating Flakiness Score
- Track the status changes of a particular test within its test suite. For example:
  - First Run: Pass
  - Second Run: Fail
  - Third Run: Pass
  - Fourth Run: Pass
- The total number of status changes is 2 (Pass → Fail, then Fail → Pass).
- The flakiness score is calculated as:
Flakiness Score = (Number of fluctuations / (Total Runs − 1)) × 100
A test needs at least two runs for the score to be defined. So in this example,
Flakiness Score = (2 / (4 − 1)) × 100 ≈ 66.7%
This means the test has a flakiness score of about 66.7%, indicating significant volatility in its behaviour.
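The calculation can be sketched as follows. The `flakiness_score` name is hypothetical, and raising an error for fewer than two runs is an assumption, since the text does not say how such tests are handled.

```python
def flakiness_score(history):
    """Return the flakiness score as a percentage between 0 and 100."""
    if len(history) < 2:
        # Assumption: with fewer than two runs there is nothing to compare.
        raise ValueError("at least two runs are needed to compute a score")
    # A fluctuation is any pair of consecutive runs with different outcomes.
    changes = sum(1 for prev, curr in zip(history, history[1:]) if prev != curr)
    return changes / (len(history) - 1) * 100


score = flakiness_score(["Pass", "Fail", "Pass", "Pass"])
print(f"{score:.1f}%")  # prints "66.7%"
```

Dividing by the number of runs minus one normalises the score: a history of N runs has at most N − 1 transitions, so a test that alternates on every run scores 100.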
The severity of the volatility is defined by the table below:
Flakiness Score Range | Severity | Colour |
---|---|---|
More than 75 | Critical | Red |
50 to less than 75 | Concerning | Orange |
25 to less than 50 | Moderate | Yellow |
More than 0 to less than 25 | Stable | Green |
0 | No Flaky Tests | No Flaky Tests |
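A minimal sketch of how a score could be mapped to a severity band. It reads the lowest row as a score of exactly 0, and it assumes that a score of exactly 75 counts as Critical, because the table does not say which band that boundary belongs to. The `severity` name is hypothetical.

```python
def severity(score):
    """Map a flakiness score (0 to 100) to a (severity, colour) pair."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 0:
        return ("No Flaky Tests", "No Flaky Tests")
    if score < 25:
        return ("Stable", "Green")
    if score < 50:
        return ("Moderate", "Yellow")
    if score < 75:
        return ("Concerning", "Orange")
    # Assumption: exactly 75 is treated as Critical (not stated in the table).
    return ("Critical", "Red")


# The worked example above (2 fluctuations over 4 runs).
example = severity(2 / 3 * 100)
```

Under these assumptions, the worked example with a score of about 66.7% falls in the Concerning band.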