seeking endorsement for histogram library

seeking endorsement for histogram library

Boost - Dev mailing list
Dear Boost developers,

I believe the histogram library is now ready to be presented here and I am looking for endorsement.

Histogram is a C++11 header-only library that provides a safe, convenient, and fast multi-dimensional histogram for statistical analysis and visualisation. The library has a unique feature set, including a safety guarantee that the counts in the histogram cannot overflow. It is easily customisable for power users, while providing defaults that just work for the occasional user. Meta-programming is used to provide an especially fast histogram implementation for the case where the histogram configuration is known at compile-time. A dynamic implementation is also provided for the case where the configuration is only known at run-time. The two implementations share a common interface, so it is easy to switch between them. Python bindings are included for the dynamic implementation. The Python interface supports NumPy arrays, which greatly speeds up the exchange of data between the Python and C++ sides. I benchmarked the library against other libraries, which have fewer features, and this library beats them in almost all cases.
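To make the no-overflow guarantee more concrete, here is a minimal sketch of one way counter storage could widen itself on demand. It is an illustration only, not the library's implementation: the class name `GrowingCounts` and all of its members are hypothetical, and it stops at 64-bit cells, so a strict guarantee would still need a further step (for example an arbitrary-precision type).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper, not part of the histogram library's API.
// Cells start as 8-bit counters; when one of them is full, all cells
// are copied into 64-bit storage before the increment is applied.
class GrowingCounts {
public:
    explicit GrowingCounts(std::size_t cells)
        : promoted_(false), small_(cells, 0) {}

    void increment(std::size_t i) {
        assert(i < size());
        if (!promoted_) {
            if (small_[i] < std::numeric_limits<std::uint8_t>::max()) {
                ++small_[i];
                return;
            }
            // The 8-bit cell is full: move every cell to 64-bit storage.
            wide_.assign(small_.begin(), small_.end());
            std::vector<std::uint8_t>().swap(small_);  // release the narrow buffer
            promoted_ = true;
        }
        ++wide_[i];
    }

    std::uint64_t at(std::size_t i) const {
        assert(i < size());
        return promoted_ ? wide_[i] : small_[i];
    }

    std::size_t size() const {
        return promoted_ ? wide_.size() : small_.size();
    }

    std::size_t bytes_per_cell() const {
        return promoted_ ? sizeof(std::uint64_t) : sizeof(std::uint8_t);
    }

private:
    bool promoted_;
    std::vector<std::uint8_t> small_;
    std::vector<std::uint64_t> wide_;
};
```

The point of the design is that a sparsely filled histogram pays one byte per cell, and only a histogram that actually needs large counts pays for wide cells.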

I am stealing the style of the rest of the email from Antony Polukhin.

Library: https://github.com/HDembinski/histogram
Docs: https://htmlpreview.github.io/?https://raw.githubusercontent.com/HDembinski/histogram/html/doc/html/index.html
Boost Library Incubator: http://blincubator.com/bi_library/histogram-2/?gform_post_id=1582

Library changes since last mail:

* Support for efficient adding of multiple histograms and scaling
* Support for reduce transformation
* Re-design of category axis as a general mapping between unique values and bins
* Re-design of the bin description an axis returns upon element access
* Regular axis can accept bijective (user-provided) transformations
* Interface cleanup, refactoring, and simplification
* Finished documentation
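
As an illustration of the "bijective transformation" item in the list above, the following sketch shows how a regular axis could bin values uniformly in a transformed coordinate (here the logarithm). The names `LogTransform`, `RegularAxis`, `forward`, `inverse`, `index`, and `lower_edge` are assumptions made for this example and are not taken from the library's actual interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical transform: a bijection between x > 0 and the real line.
struct LogTransform {
    static double forward(double x) { return std::log(x); }
    static double inverse(double y) { return std::exp(y); }
};

// Hypothetical axis with equal-width bins in the transformed coordinate.
template <class Transform>
class RegularAxis {
public:
    RegularAxis(int bins, double lower, double upper)
        : bins_(bins),
          lo_(Transform::forward(lower)),
          hi_(Transform::forward(upper)) {
        assert(bins_ > 0 && lo_ < hi_);
    }

    // Returns -1 for underflow (and NaN), bins() for overflow.
    int index(double x) const {
        const double z = (Transform::forward(x) - lo_) / (hi_ - lo_);
        if (!(z >= 0.0)) return -1;
        if (z >= 1.0) return bins_;
        const int i = static_cast<int>(z * bins_);
        return i < bins_ ? i : bins_ - 1;  // guard against rounding up
    }

    // The inverse is what lets the axis report bin edges in the
    // original coordinate, which is why the transform must be bijective.
    double lower_edge(int i) const {
        return Transform::inverse(lo_ + (hi_ - lo_) * i / bins_);
    }

    int bins() const { return bins_; }

private:
    int bins_;
    double lo_;
    double hi_;
};
```

With `RegularAxis<LogTransform> axis(3, 1.0, 1000.0)`, the three bins cover one decade each, so 5, 50, and 500 fall into bins 0, 1, and 2.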

Best regards,
Hans

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: seeking endorsement for histogram library

Boost - Dev mailing list
On Monday, November 13, 2017 2:23:54 PM CST Hans Dembinski via Boost wrote:
> Dear Boost developers,
>
> I believe the histogram library is now ready to be presented here and I am
> looking for endorsement.

I'm not sure what "endorsement" entails for Boost, but I think a histogram
would be a great addition.  I've had to write simple 1d histograms many times
over the years and this library goes way beyond that so I'd look forward to
using it.

Regards,
-Steve



Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
> Dear Boost developers,
>
> I believe the histogram library is now ready to be presented here and
> I am looking for endorsement.
>
> Histogram is a C++11 header-only library that provides a safe,
> convenient, and fast multi-dimensional histogram for statistical
> analysis and visualisation. The library has a unique feature set,
> among it a safety guarantee that the counts in the histogram cannot
> overflow. It is easily customisable for power users, while providing
> defaults that just work for the occasional user. Meta-programming is
> used to provide an especially fast histogram implementation that can
> be used when the histogram configuration is known at compile-time. A
> dynamic implementation is also provided for the other case when the
> configuration is only known at run-time. The two implementations
> share a common interface, so it is easy to switch between them.
> Python bindings are included for the dynamic implementation. The
> Python interface supports Numpy arrays to greatly speed up the
> exchange of data between the Python and C++ side. I tested the
> performance of the library in benchmarks against other libraries,
> which have fewer features, and this library beats them in almost
> all cases.

[snip]

The histogram classes look quite useful. In particular, the property
that you can add histograms, and thus accumulate data in parallel, is
important in many scientific contexts.

Did you also intend to provide estimates of mean, variance and possibly
higher order statistical moments of the whole distribution? This can be
achieved with so-called on-line algorithms (even in the multi-variate
case) and would help to get more information about your data. This can
also be implemented such that the summation of histograms still works
(and probably also the scaling).
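
For readers unfamiliar with on-line algorithms, the idea can be sketched as follows: a Welford-style update keeps a running mean and sum of squared deviations, and a pairwise merge formula combines two partial results, so the summation property carries over. This is a standard-library-only illustration under assumed names (`RunningMoments`, `add`, `merge`, `scale`); it is neither the histogram library's nor Boost.Accumulators' interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulator for mean and variance in a single pass.
struct RunningMoments {
    std::size_t n;
    double mean;
    double m2;  // sum of squared deviations from the current mean

    RunningMoments() : n(0), mean(0.0), m2(0.0) {}

    // Welford-style update for one new sample.
    void add(double x) {
        ++n;
        const double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Combine with another partial result, e.g. from a parallel worker.
    void merge(const RunningMoments& other) {
        if (other.n == 0) return;
        const double na = static_cast<double>(n);
        const double nb = static_cast<double>(other.n);
        const double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * (na * nb / (na + nb));
        mean += delta * nb / (na + nb);
        n += other.n;
    }

    // Rescale every sample by the constant c.
    void scale(double c) {
        mean *= c;
        m2 *= c * c;
    }

    // Unbiased sample variance; needs at least two samples.
    double variance() const {
        assert(n > 1);
        return m2 / (n - 1);
    }
};
```

Filling one accumulator with 1, 2, 3 and another with 4, 5, 6 and then merging them gives the same mean and variance (both 3.5) as a single pass over all six values.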

Best,
Fabian


Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Hi Steve,

> On 13. Nov 2017, at 18:19, Steve Robbins via Boost <[hidden email]> wrote:
>
> On Monday, November 13, 2017 2:23:54 PM CST Hans Dembinski via Boost wrote:
>> Dear Boost developers,
>>
>> I believe the histogram library is now ready to be presented here and I am
>> looking for endorsement.
>
> I'm not sure what "endorsement" entails for Boost, but I think a histogram
> would be a great addition.  I've had to write simple 1d histograms many times
> over the years and this library goes way beyond that so I'd look forward to
> using it.

That is cool, thank you. :) If you have any feedback to share about the library, please let me know.

Here is the bit about endorsement in the submission guidelines (http://www.boost.org/development/submissions.html):

"When you feel that your library is ready for entry into Boost, you need to find at least one member (but preferably several) of the Boost community who is willing to publicly endorse your library for entry into Boost. A simple method of achieving this is to post to the Boost developers mailing list <http://www.boost.org/community/groups.html> a short description of your library, links to its github and documentation, and a request for endorsements."

Best regards,
Hans


Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Dear Fabian,

> On 13. Nov 2017, at 18:59, Fabian Bösch via Boost <[hidden email]> wrote:
>
> The histogram classes look quite useful. In particular, the property
> that you can add histograms, and thus accumulate data in parallel, is
> important in many scientific contexts.

yes, right! :) I also mention that use case in the user guide. :)

> Did you also intend to provide estimates of mean, variance and possibly
> higher order statistical moments of the whole distribution? This can be
> achieved with so-called on-line algorithms (even in the multi-variate
> case) and would help to get more information about your data. This can
> also be implemented such that the summation of histograms still works
> (and probably also the scaling).

Assuming I understand you correctly, I think the functionality for computing moments on-line is already in Boost.Accumulators, see
http://www.boost.org/doc/libs/1_65_1/doc/html/accumulators/user_s_guide.html#accumulators.user_s_guide.the_statistical_accumulators_library

The histogram library has a weak overlap in scope with Boost.Accumulators, but mostly they are complementary. There is a section in the rationale on this point:

https://htmlpreview.github.io/?https://github.com/HDembinski/histogram/html/doc/html/index.html#histogram.rationale.comparison_to_boost_accumulators

Best regards,
Hans


Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Dear Francesco,

> On 13. Nov 2017, at 19:09, Francesco Guerrieri <[hidden email]> wrote:
>
> Hi, this proposal is of great interest to me, thanks for sharing your work with us. Which compilers/OSs are supported?

I tested the code on OSX with Apple clang, and via TravisCI with several versions of gcc on Linux. The CMake build works on both OSX and Linux. I also wanted to test Windows via Appveyor, but couldn't get my CMake build script to work with msvc. I am passing compiler flags directly, which works for clang and gcc, but naturally not for msvc. Debugging this non-interactively (committing a change to github, letting Appveyor run, seeing it break again...) was a pain, and I stopped at some point.

If a Windows expert had a look at my CMakeLists.txt, this could probably be fixed easily.

Best regards,
Hans




Re: seeking endorsement for histogram library

Boost - Dev mailing list
Dear Hans, thanks for the clarification. If nobody else does it sooner, I
will take a look on Windows over the weekend.

Best,
Francesco

On Tue, 14 Nov 2017, 14:31 Hans Dembinski <[hidden email]> wrote:

> Dear Francesco,
>
> On 13. Nov 2017, at 19:09, Francesco Guerrieri <[hidden email]>
> wrote:
>
> Hi this proposal is of great interest to me, thanks for sharing your work
> with us. Which compilers/OSs are supported?
>
>
> I tested the code on OSX with Apple clang, and via TravisCI on several
> versions of gcc on Linux. The CMake build works on both OSX and Linux. I
> wanted to also test Windows via Appveyor, but couldn't get my CMake build
> script to work with msvc. I am passing compiler flags directly, which works
> for clang and gcc, but naturally not for msvc. Debugging this
> non-interactively, by committing a change to github, letting Appveyor run,
> seeing it breaking again..., was a pain and I stopped at some point.
>
> If an expert for Windows had a look at my CMakeLists.txt, it is probably
> something that could be easily fixed.
>
> Best regards,
> Hans
>
>
>


Re: seeking endorsement for histogram library

Boost - Dev mailing list

> On 14. Nov 2017, at 22:02, Francesco Guerrieri <[hidden email]> wrote:
> Dear Hans thanks for the clarification. If nobody else will do it sooner, I will give it a look on Windows in the weekend.
>
Cool, that would be awesome! :)


Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Dear Fabian,

I am CCing boost, because maybe some others want to read this as well and/or correct me.

> On 15. Nov 2017, at 14:21, Fabian Bösch <[hidden email]> wrote:
>
> Thanks for clarifying. While it is true that Boost.Accumulators does
> this, it is for the uni-variate case only, afaik. Please correct me, if
> I'm wrong.

I believe you can compute a multi-dimensional mean if you feed an appropriate accumulator from Boost.Accumulators with std::valarray samples. Maybe there is an easier way.

> Now you have multi-variate histograms for which one would probably also
> like to know the Covariance (rank-2 tensor), for example. Maybe this is
> out of scope but a general solution for this would be nice. But perhaps
> this would rather be addressed in another generalized accumulator
> library. What do you think?

I think you are right. You can compute the covariance of two variables with Boost.Accumulators:
http://www.boost.org/doc/libs/1_65_0/doc/html/accumulators/user_s_guide.html#accumulators.user_s_guide.the_statistical_accumulators_library.covariance

but it does not give you a full matrix for N covariates. Maybe you could contact the maintainer of Accumulators, Eric Niebler. It is a reasonable request that would fit better into Accumulators than in Histogram.
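
To show what a full matrix for N covariates would involve, here is a minimal single-pass sketch that keeps a running mean vector and a matrix of co-moments. It uses only the standard library; the class `RunningCovariance` and its members are hypothetical names for this example and do not exist in Boost.Accumulators or in the histogram library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical on-line accumulator for an N x N covariance matrix.
class RunningCovariance {
public:
    explicit RunningCovariance(std::size_t dim)
        : dim_(dim), n_(0), mean_(dim, 0.0), comoment_(dim * dim, 0.0) {}

    // Add one N-dimensional sample.
    void add(const std::vector<double>& x) {
        assert(x.size() == dim_);
        ++n_;
        std::vector<double> delta(dim_);
        for (std::size_t i = 0; i < dim_; ++i) {
            delta[i] = x[i] - mean_[i];  // deviation from the old mean
            mean_[i] += delta[i] / n_;   // mean_ now holds the new mean
        }
        for (std::size_t i = 0; i < dim_; ++i)
            for (std::size_t j = 0; j < dim_; ++j)
                comoment_[i * dim_ + j] += delta[i] * (x[j] - mean_[j]);
    }

    double mean(std::size_t i) const {
        assert(i < dim_);
        return mean_[i];
    }

    // Unbiased sample covariance of variables i and j.
    double covariance(std::size_t i, std::size_t j) const {
        assert(i < dim_ && j < dim_ && n_ > 1);
        return comoment_[i * dim_ + j] / (n_ - 1);
    }

private:
    std::size_t dim_;
    std::size_t n_;
    std::vector<double> mean_;
    std::vector<double> comoment_;  // row-major dim_ x dim_ matrix
};
```

For the three samples (1, 2), (2, 4), (3, 6) this yields variances 1 and 4 on the diagonal and a covariance of 2 off the diagonal.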

Best regards,
Hans


Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
I endorse boost.histogram.


Re: seeking endorsement for histogram library

Boost - Dev mailing list
Thank you, Bjørn! :)

> On 18. Nov 2017, at 16:56, Bjorn Reese via Boost <[hidden email]> wrote:
>
> I endorse boost.histogram.

Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
Thank you Bjorn!

Best,
Ron

> On Nov 18, 2017, at 7:56 AM, Bjorn Reese <[hidden email]> wrote:
>
> I endorse boost.histogram.



Re: seeking endorsement for histogram library

Boost - Dev mailing list
Dear Ron, dear John,

since "histogram" has been endorsed by Bjorn, I have been looking for a review manager. Klemens is willing to manage the review some time in March 2018. I list all the information again for convenience:

Submission: histogram
Submitter: Hans Dembinski
Source: https://github.com/hdembinski/histogram
Documentation: http://hdembinski.github.io/histogram/doc/html
Endorsers: Bjorn Reese
Review Manager: Klemens Morgenstern
Review Dates: Some date in March 2018

Best regards,
Hans

> On 21. Nov 2017, at 14:20, Ronald Garcia via Boost <[hidden email]> wrote:
>
> Thank you Bjorn!
>
> Best,
> Ron
>
>> On Nov 18, 2017, at 7:56 AM, Bjorn Reese <[hidden email]> wrote:
>>
>> I endorse boost.histogram.

Re: seeking endorsement for histogram library

Boost - Dev mailing list
Dear Hans,

Thank you for providing the information.  Now you and Klemens should agree on a 10-day period in March in which to hold the review.  Let me know and I will add it to the review schedule.

Best,
Ron



> On Dec 13, 2017, at 2:29 AM, Hans Dembinski <[hidden email]> wrote:
>
> Dear Ron, dear John,
>
> since "histogram" has been endorsed by Bjorn, I have been looking for a review manager. Klemens is willing to manage the review some time in March 2018. I list all the information again for convenience:
>
> Submission: histogram
> Submitter: Hans Dembinski
> Source: https://github.com/hdembinski/histogram
> Documentation: http://hdembinski.github.io/histogram/doc/html
> Endorsers: Bjorn Reese
> Review Manager: Klemens Morgenstern
> Review Dates: Some date in March 2018
>
> Best regards,
> Hans
>
>> On 21. Nov 2017, at 14:20, Ronald Garcia via Boost <[hidden email]> wrote:
>>
>> Thank you Bjorn!
>>
>> Best,
>> Ron
>>
>>> On Nov 18, 2017, at 7:56 AM, Bjorn Reese <[hidden email]> wrote:
>>>
>>> I endorse boost.histogram.

Re: seeking endorsement for histogram library

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On 13/12/2017 at 11:29, Hans Dembinski via Boost wrote:
> Dear Ron, dear John,
>
> since "histogram" has been endorsed by Bjorn, I have been looking for a review manager. Klemens is willing to manage the review some time in March 2018. I list all the information again for convenience:

Hi,

I don't know if having a single endorsement (+ the review manager) is
enough.

Vicente


Re: seeking endorsement for histogram library

Boost - Dev mailing list


> -----Original Message-----
> From: Boost [mailto:[hidden email]] On Behalf Of Vicente J. Botet Escriba via Boost
> Sent: 15 December 2017 17:57
> To: [hidden email]
> Cc: Vicente J. Botet Escriba; [hidden email]
> Subject: Re: [boost] seeking endorsement for histogram library
>
> On 13/12/2017 at 11:29, Hans Dembinski via Boost wrote:
> > Dear Ron, dear John,
> >
> > since "histogram" has been endorsed by Bjorn, I have been looking for a review manager. Klemens is willing to manage the
> > review some time in March 2018. I list all the information again for convenience:
>
> Hi,
>
> I don't know if having a single endorsement (+ the review manager) is
> enough.

I think one is enough, but I endorse the proposed histogram library in case this helps.

Paul


---
Paul A. Bristow
Prizet Farmhouse
Kendal UK LA8 8AB
+44 (0) 1539 561830



