interpreting test results

interpreting test results

Stefan Seefeld
Hello,

I'm trying to use the test results posted online to get a grasp of the
state of a library (and specifically, whether a recent check-in has
fixed a previous bug, or caused a regression). However, I find this to
be quite a daunting task, for multiple reasons, so I want to see whether
I'm missing anything or should use a different approach. Here are some
issues I observe:


* the amount of data posted on
http://www.boost.org/development/tests/develop/developer/summary.html is
huge, so even finding the relevant bits is non-trivial

* more specifically, when I look at
http://www.boost.org/development/tests/develop/developer/python.html, I
find mostly failures (the vast majority of cells are yellow). Is that
really representative of the state of Boost.Python on the develop branch?

* It seems more likely that there are fundamental issues with the
tester's setup rather than the tested Boost code being that buggy. So
instead of flagging all the tests as "failed", it would be more useful
to have some simple "meta-tests" that check the testing environment
itself, and then just not display a runner's test results unless those
meta-tests have validated it (see the first sketch after this list).

* Clicking on a specific cell (i.e., a 'fail' of a particular test), I'm
taken to
http://www.boost.org/development/tests/develop/developer/CrystaX-apilevel-21-armeabi-v7a-gnu-libstdc++-python-gcc-4-9-andreas_beyer-variants_.html,
which contains "Output by test variants:", followed by two
(build-variant-specific) links. Both of these take me to pages
containing "fail" and another link, which then contains "succeed".
Nowhere do I see an actual failure message. Is this intentional? How am
I to interpret these data?

* Columns on the test matrix correspond to individual builds, which are
often a bit dated. It might be useful to have some logic (JavaScript?)
to mask test results prior to a given date/revision (see the second
sketch after this list).

* It would also be useful to know the schedule of the builds, i.e. how
long to wait after a check-in before the corresponding test results
show up.
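
To make the "meta-test" idea from the third bullet concrete, here is a
rough sketch of the kind of environment check I have in mind. It is
entirely hypothetical (Python, with made-up check names and commands);
nothing like it exists in the current Boost regression tooling:

    #!/usr/bin/env python
    # Hypothetical "meta-test" for a test runner's environment.
    # A sketch of the idea only; not part of the Boost regression tools.
    import shutil
    import subprocess
    import sys

    def compiler_works(cxx="g++"):
        """Can the runner's C++ compiler build a trivial program?"""
        if shutil.which(cxx) is None:
            return False
        probe = "int main() { return 0; }\n"
        result = subprocess.run([cxx, "-x", "c++", "-", "-o", "/dev/null"],
                                input=probe, text=True, capture_output=True)
        return result.returncode == 0

    def python_dev_present():
        """For Boost.Python: are Python development tools on the PATH?"""
        return shutil.which("python3-config") is not None

    def main():
        checks = {"compiler": compiler_works(),
                  "python-dev": python_dev_present()}
        for name, ok in sorted(checks.items()):
            print("%-10s %s" % (name, "ok" if ok else "MISSING"))
        # A report generator could refuse to show this runner's columns
        # unless all meta-tests pass.
        sys.exit(0 if all(checks.values()) else 1)

    if __name__ == "__main__":
        main()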

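And for the date/revision masking from the fifth bullet, a sketch of
the filtering step. Again hypothetical: the run-record layout below is
invented, the real reports are built from XML uploaded by each runner,
and the actual masking would more likely live in the report generator
or in client-side JavaScript on the rendered table:

    # Sketch of hiding stale runner columns.  The record layout is
    # invented for illustration; it is not the actual report format.
    from datetime import datetime, timedelta

    runs = [
        {"runner": "old-runner",    "uploaded": datetime(2016, 2, 10)},
        {"runner": "recent-runner", "uploaded": datetime(2016, 3, 6)},
    ]

    def recent_runs(runs, max_age_days=7, now=None):
        """Keep only runs newer than the cutoff; older columns get masked."""
        now = now or datetime.now()
        cutoff = now - timedelta(days=max_age_days)
        return [r for r in runs if r["uploaded"] >= cutoff]

    for r in recent_runs(runs, now=datetime(2016, 3, 7)):
        print(r["runner"])      # only "recent-runner" survives the filter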

Thanks,
        Stefan

--

      ...ich hab' noch einen Koffer in Berlin...


Re: interpreting test results

Steven Watanabe
AMDG

On 03/07/2016 06:26 AM, Stefan Seefeld wrote:

>
> * Clicking on a specific cell (i.e., a 'fail' of a particular test), I'm
> taken to
> http://www.boost.org/development/tests/develop/developer/CrystaX-apilevel-21-armeabi-v7a-gnu-libstdc++-python-gcc-4-9-andreas_beyer-variants_.html,
> which contains "Output by test variants:", followed by two
> (build-variant-specific) links. Both of these take me to pages
> containing "fail" and another link, which then contains "succeed".
> Nowhere do I see an actual failure message. Is this intentional? How am
> I to interpret these data?
>

  It's not intentional.  It's a bug that appears
when a library fails to build.  process_jam_log
assumes that every target has a single source,
so when there is more than one, it ends up picking
randomly.
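
Roughly, the assumption looks like this (process_jam_log itself is a
C++ tool; this is only an illustrative Python sketch of the behaviour,
not its actual code):

    # Illustrative sketch only -- not the real process_jam_log code.
    def reported_source(target, sources):
        # Assumes every target has exactly one source.  When a library
        # build fails and a target ends up with several candidate
        # sources, whichever happens to come first is reported, which
        # is effectively random.
        return sources[0]

    print(reported_source("test_a", ["test_a.cpp"]))              # fine
    print(reported_source("bpl", ["module.cpp", "errors.cpp"]))   # arbitrary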

In Christ,
Steven Watanabe


Re: interpreting test results

Stefan Seefeld
On 07.03.2016 11:28, Steven Watanabe wrote:

> AMDG
>
> On 03/07/2016 06:26 AM, Stefan Seefeld wrote:
>> * Clicking on a specific cell (i.e., a 'fail' of a particular test), I'm
>> taken to
>> http://www.boost.org/development/tests/develop/developer/CrystaX-apilevel-21-armeabi-v7a-gnu-libstdc++-python-gcc-4-9-andreas_beyer-variants_.html,
>> which contains "Output by test variants:", followed by two
>> (build-variant-specific) links. Both of these take me to pages
>> containing "fail" and another link, which then contains "succeed".
>> Nowhere do I see an actual failure message. Is this intentional? How am
>> I to interpret these data?
>>
>   It's not intentional.  It's a bug that appears
> when a library fails to build.  process_jam_log
> assumes that every target has a single source,
> so when there is more than one, it ends up picking
> randomly.

Well, I have to admit that at present
http://www.boost.org/development/tests/develop/developer/python.html is
unusable for me. I basically have to (try to) ignore all columns but
those that are "mostly green", as only those may contain "real" failures.

I think having a way to clearly differentiate failed tests from failed
(library) builds is a very important usability requirement. Even better
if, by some simple criterion, failed builds could be excluded from this
test result matrix. (It might still be useful to see the build logs for
those failed library builds, but definitely not as part of the test
results matrix.)
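
A sketch of the kind of separation I mean, with an invented record
layout (the real data would come from the XML the runners upload):

    # Sketch only: the field names are invented for illustration.
    results = [
        {"test": "builtin_converters", "status": "fail", "cause": "library-build"},
        {"test": "args",               "status": "fail", "cause": "test"},
        {"test": "keywords",           "status": "pass", "cause": None},
    ]

    def split_failures(results):
        """Separate real test failures from casualties of a failed build."""
        test_failures = [r for r in results
                         if r["status"] == "fail" and r["cause"] == "test"]
        build_failures = [r for r in results
                          if r["status"] == "fail" and r["cause"] == "library-build"]
        return test_failures, build_failures

    tests, builds = split_failures(results)
    print("show in the matrix:  ", [r["test"] for r in tests])
    print("build-log links only:", [r["test"] for r in builds])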

        Stefan

--

      ...ich hab' noch einen Koffer in Berlin...
