Status of development /Benchmarks

Status of development /Benchmarks

Oswin Krause
Hi,

I am interested in the current development status of the library. Is there a way to get the latest version somewhere? I tried to find a working branch on https://github.com/boostorg/ublas but could only find changes older than half a year.

I am interested in benchmarking the "new" uBLAS against "my" reimplemented subset (think of it as my personal summer of code :) )

Greetings,
Oswin
_______________________________________________
ublas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/ublas
Sent to: [hidden email]

Re: Status of development /Benchmarks

David Bellot
Hi,

the latest version is indeed on GitHub. We haven't committed many things this year so far. We are talking (slowly, I must admit; please blame me, not the others) about new features and changes:

(1) integrating the GSoC 2013 work into the main branch
(2) removing the vector object and replacing it with a new one that will be a matrix, and therefore implements a real row or column vector
(3) matrix multiplication with operator* valid for all standard operations (vector, matrix, etc.); this is related to (2)
(4) an algorithms infrastructure (so that we can have really useful features)
(5) matrix/vector views for interoperability. I think this is ultra-critical, because right now uBLAS is monolithic in the sense that you have to use it everywhere you manipulate data. Views would really help, for example letting people keep a list of vectors (that they are plotting) with uBLAS working on top of that data to do transformations
(6) NEW DOCUMENTATION, examples and all the rest; the current one is ... old
(7) incorporating some critical bindings (e.g. MUMPS bindings; MUMPS is currently probably the most efficient SMP and distributed open-source linear algebra solver)
(8) Matlab bindings? Matlab I/O, etc. The same goes for R, CSV files, etc.; bindings and I/O in general are pretty poor
(9) distributed uBLAS (GSoC'13)


I'm really interested in seeing your benchmark. Maybe you can join the GitHub uBLAS project as well. If you give me your GitHub id, I can associate it with the project and you can push your code. Practically speaking, there is a uBLAS project on GitHub which is a copy of the main boostorg/ublas project, and there are many branches, one for each project, that people can use.

Links you may be interested in:
https://github.com/uBLAS/ublas/wiki

The workflow we will follow (as also suggested on the Boost mailing list) is http://nvie.com/posts/a-successful-git-branching-model/
with the difference that we have two "masters": one in ublas_develop (which is our working master) and one that will always be in sync with the latest Boost release.

And last but not least, there are a few changes for supporting IDEs. In fact, it would be good if people could contribute more here. I'm a vi user only, so I need others' skills for that. QtCreator support is on its way. Ideally we would have some sort of support for the main IDEs; off the top of my head: Eclipse, NetBeans, Visual Studio, Anjuta, KDevelop, etc.

Best regards,
David




Re: Status of development /Benchmarks

Oswin Krause
Hi David,

Thanks for the fast reply. There might be a misunderstanding: I don't have any code here that is ready for uBLAS. The story goes back to approximately spring, when it was decided to improve uBLAS. I decided to part ways because of our differences over what uBLAS should look like (and since it is your library, it is of course your right to choose!). So I rewrote the whole thing (or rather: the parts that I needed, that is, dense and sparse algebra, no banded matrices) and now have a 16,000-LOC subset of uBLAS with most of the functionality (and sometimes more) and automatic ATLAS bindings for fast_prod etc.

Now I wanted to come back, compare the different approaches, and see whether I could adopt some of your changes, as well as maybe port some of my code back to uBLAS.

So once most of the new features are there, I would be happy to compare! You would also get a direct benchmark against ATLAS if you like.

My version is unfortunately not standalone yet because of $REASONS. Also, looking at your changes, I doubt that the libraries are compatible, for example because I overloaded operator* and operator/ for element-wise operations (I need them so often; element_prod and element_div just don't cut it...).

But if you like, you can have a look at it here:

(most of ublas functionality minus systems of equations and algorithms)
http://sourceforge.net/p/shark-project/code/HEAD/tree/trunk/Shark/include/shark/LinAlg/BLAS/

(and systems solver + algorithms (wip))
http://sourceforge.net/p/shark-project/code/HEAD/tree/trunk/Shark/include/shark/LinAlg/

Best regards,
Oswin


Re: Status of development /Benchmarks

David Bellot
Hi Oswin,

Well, I'm just the maintainer of uBLAS, not the owner, and for sure not the first developer, as it was already an old project when I took it over. I believe any contribution is welcome, and honestly, so many things can be changed in uBLAS. It's getting a little bit old.
So that's why I'm always happy to have new contributions. That's also the reason I wanted to have GSoC projects this summer, and I think it has been a good thing.

I didn't know Shark was your library! Well done, it's a good library.
I'm especially interested in your implementation of RBMs.

So please, feel free to propose anything you think is relevant to uBLAS. At some point we even thought about rewriting a big part of uBLAS and calling it uBLAS 2 (and going through the peer review again, etc.). Still, I think uBLAS can evolve smoothly rather than going through a big revolution.

The automatic ATLAS binding is definitely something I want!

Best,
David



Re: Status of development /Benchmarks

Oswin Krause
Hi David,

Yes, at one point I thought about contributing to uBLAS. But there are two problems with it that make it hard to develop.

1. There is no real consensus on what uBLAS should be targeted at. Right now I have the feeling that it should be more or less like Eigen: applicable to every use case out there. And this is hard. Most of Eigen looks to me like a big mess, because it must behave differently for small, big, and very big problems. Right now uBLAS only has good performance for small dense problems, and I am mostly interested in the middle category. The last time I proposed changes, I got nearly no support, but plenty of fire from the "small" and "very big" guys. This makes it quite unattractive for me to propose further changes.

2. You wrote: "So please, feel free to propose anything you think is relevant to uBLAS. At some point we even thought about rewriting a big part of uBLAS and call it uBLAS 2 (and go through the peer review again, etc...). Well, I think uBLAS can evolve smoothly rather than having a big revolution."
I don't think this is possible without repeatedly annoying all the uBLAS users who actually use things like loops over matrix elements and the other low-level interfaces. I had to rewrite the sparse interface completely (and I am still not sure that I have done it right), as well as the iterator interface. I tried to do it smoothly, but I nevertheless broke it three times.
The next problem, again with sparse matrices, is that uBLAS supports three different formats. It is hard to write algorithms for them, because the three formats need to be treated differently. But in a smooth evolution we can't really throw stuff out, especially if there is no 1:1 replacement.
I think you can find such issues in a lot of places; as you said, uBLAS is old. There are more things, for example you might want to change the way expressions are evaluated so that things like

noalias(A)+=prod(B,C)+D;

are fast and don't require intermediate results in prod().

I think a full rewrite from scratch is required to get rid of all the old burdens. While a review process is of course a bit annoying, it might actually help develop uBLAS into something that is actually useful :). (And honestly, I doubt that uBLAS as-is would pass a review.)

Best,
Oswin


Re: Status of development /Benchmarks

Joaquim Duran
Dear all,

I think that all the stuff pending merge listed by David should be merged and become uBLAS 2.0, and while uBLAS 2.0 is in development/maintenance, uBLAS 3.0 can be designed from scratch.

My two cents!

Joaquim Duran


2013/12/2 oswin krause <[hidden email]>
Hi David,

Yes at a point i thought about contributing to uBLAS. But there are two problems with it that are making it hard to develop.

1. There is no real consensus of what uBLAS should be targeted at. Right now I have the feeling that it should be more or less like Eigen: be applicable to every use case out there. And this is hard. Most of Eigen looks to me like a big mess. This is because it must behave differently in the cases of small, big and very big problems. Right now uBLAS has only a good performance for small dense problems and I am mostly only interested in the middle category. And the last time i proposed changes I got nearly no support but fire from the "small" and "very big" guys. This makes it quite unattractive for me to propose further changes.


2.

So please, feel free to propose anything you think is relevant to uBLAS. At some point we even thought about rewriting a big part of uBLAS and call it uBLAS 2 (and go through the peer review again, etc...). Well, I think uBLAS can evolve smoothly rather than having a big revolution.
I don't think that this is possible without iteratively annoying all uBLAS users who actually use things like loops over matrix elements and the other low level interfaces. I had to rewrite the sparse interface completely(and i am still not sure that i have done that right) as well as the iterator interface. I tried to do it smoothly but I nevertheless broke it 3 times.
The next problem, again sparse, is that there are 3 different formats supported by uBLAS. It is hard to write algorithms for it, because the three formats need to be treated differently. But in a smooth evolution we can't really throw out stuff, especially if there is no 1:1 replacement.
I think you can find such issues in a lot of parts, as you said, uBLAS is old. There are more things, like that you might want to change the way expressions are evaluated so that things like

noalias(A)+=prod(B,C)+D;

are fast and don't require intermediate results in prod().

I think a full rewrite from scratch is required to get rid of all the old burdens. While a review process of course is a bit annoying it might actually help in developing it to something that is actually useful :). (And honestly i doubt that uBLAS as is would pass a review.)

Best,
Oswin


On 29.11.2013 13:55, David Bellot wrote:
Hi Oswin,

well, I'm just the maintainer of uBLAS, not the owner and for sure not the first developer as it was already an old project when I took over it. I believe any contributions is welcome and honnestly so many things can be changed in uBLAS. It's getting a little bit old.
So that's why I'm always happy to have new contributions. That's also the reason I wanted to have GSOC projects this summer and I think it has been a good thing.

I didn't know Shark was your library ! Well done. It's a good library.
I'm especially interested in your implementaion of RBM.

So please, feel free to propose anything you think is relevant to uBLAS. At some point we even thought about rewriting a big part of uBLAS and call it uBLAS 2 (and go through the peer review again, etc...). Well, I think uBLAS can evolve smoothly rather than having a big revolution.

The automatic ATLAS binding is definitively something I want !!!!

Best,
David


On Fri, Nov 29, 2013 at 12:04 PM, oswin krause <[hidden email]> wrote:
Hi David,

Thanks for the fast reply. There might be a misunderstanding: I don't have any code here that is ready for uBLAS. The story goes back to approximately spring where it was decided to improve uBLAS. I decided to part ways because of our differences in how uBLAS should look like (and since it is your library it is of course your right to choose!). So I rewrote the whole thing (or better: the parts that i needed, that is dense and sparse algebra, no banded matrices) and now got a 16000 LOC subset of uBLAS with most of the functionality (and sometimes more) and automatic ATLAS bindings for fast_prod etc.

Now i wanted to come back and compare the different approaches and see whether I could adopt some of your changes as well as maybe port some of my code back to uBLAS.

So once most of the new features are there, i would be happy to compare! You would also get a direct benchmark against atlas if you like.

My version is unfortunately not standalone yet because of $REASONS. Also looking at your changes, i doubt that the libraries are compatible, for example because I overloaded op* and op/ for element-wise operations ( I need them so often, element_prod and element_div just does not cut it...)

But if you like you can have a look at it here

(most of ublas functionality minus systems of equations and algorithms)
http://sourceforge.net/p/shark-project/code/HEAD/tree/trunk/Shark/include/shark/LinAlg/BLAS/

(and systems solver + algorithms (wip))
http://sourceforge.net/p/shark-project/code/HEAD/tree/trunk/Shark/include/shark/LinAlg/

Best  Regards,
Oswin


On 29.11.2013 12:40, David Bellot wrote:
Hi,

the latest version is indeed on github. We didn't commit many things this year so far. We are talking (slowly, I must admit, please blame me not the others) about new features and changes:

(1) integrating GSOC 2013 in the main branch
(2) removing the vector object to replace it by a new one that will be a matrix and therefore implement a real row of column vector
(3) Matrix multiplication with operator* valid for all standard operation (vector, matrix, etc...). It is related to (2)
(4) Algorithms infrastructure (so that we can have real useful features)
(5) Matrix/vector views for interoperability <- I think this is ultra critical because now ublas is monolithic in the sense that you have to use it everywhere you manipulate data. This would really help into letting people for example have a list of vectors (they are plotting) and ublas working on top of that to do for example transformations
(6) NEW DOCUMENTATION - examples and the rest. The current one is ... old
(7) Incorporate some critical bindings (i.e. mumps bindings which is currently probably the most efficient smp and distributed open source linalg solver)
(8) Matlab binding? Matlab I/O etc... Well same thing with R, csv files, etc... Bindings and I/O in general are pretty poor
(9) Distributed ublas (GSOC'13)


I'm really interested seeing your benchmark. Maybe you can join the github ublas project as well. If you give me your github id I can associate it with the project and you can push your code. Practically speaking, there is a ublas project on github which is a copy a the main boostorg/ublas project. And there are many branches for each project that people can use.

Links you can be interested in:
https://github.com/uBLAS/ublas/wiki

The workflow that we will follow (as also suggested in boost mailing list) is: http://nvie.com/posts/a-successful-git-branching-model/
with the difference that we have two "masters" i.e. one in ublas_develop (which is our working master) and one that will be always in sync with the latest boost release.

And last but not least, there are a few changes for supporting IDE. In fact, that would be good if people can contribute more. I'm a vi user only, so I need others' skills for that. QTCreator is on its way. Ideally, if we can have some sort of support for the main IDE, on top of my mind: Eclipse, Netbeans, Visual Studio, Anjuta, KDevelop, etc...

Best regards,
David



_______________________________________________
ublas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/ublas
Sent to: [hidden email]




Re: Status of development /Benchmarks

David Bellot
Hi,

It has been a long discussion we have all had for many months now: should we rewrite ublas from scratch or simply improve it?
Joaquim and Oswin want a brand new ublas.
Nasos was more in favor of improving it.

I personally find the idea of writing something new very exciting, but ublas is very well known now. On the other hand, Eigen and Armadillo have taken the crown of the main C++ linear algebra library in users' hearts.

On my todo list for ublas there are things that will require ublas to be deeply changed. At this stage, we can almost talk about a new library.

Christmas is very close now, so maybe it's a good time to start talking about the features we wish for ublas, and to see whether they can be implemented with the current version or whether a new uBLAS 2.0 is necessary.
After all, Boost::signal did the same a few years ago. We can definitely do the transition.


I begin:

- a unified representation of vectors and matrices, to reflect the fact that a vector IS a matrix. Matlab does the same
- automated selection among different algorithms, letting the compiler "choose" the best implementation (if possible) and switch to SSE, distributed computation, or whatever we need
- implementation of solvers and decomposition algorithms

and this is what Nasos and I think should be integrated too:
1. Matrix multiplication
2. Algorithms infrastructure (so that we can have real useful features)
3. Matrix/vector views for interoperability <- I think this is ultra critical: ublas is currently monolithic in the sense that you have to use it everywhere you manipulate data. Views would really help by letting people, for example, keep a list of vectors (say, for plotting) and have ublas work on top of that data to apply transformations
4. NEW DOCUMENTATION - examples and the rest
5. Incorporate some critical bindings (e.g. MUMPS bindings; MUMPS is currently probably the most efficient SMP and distributed open-source linear algebra solver)
6. Matlab bindings?
7. Distributed ublas


Please add to this list and, ESPECIALLY, please tell me your views on the current infrastructure of uBLAS. It seems many people are not happy with the current "expression template" grammar.

I'm open to everything

Best,
David


On Thu, Dec 5, 2013 at 11:14 AM, Joaquim Duran <[hidden email]> wrote:
I think that all the stuff pending merge listed by David should be merged and migrated to uBlas 2.0, and while uBlas 2.0 is in development/maintenance, uBlas 3.0 should be designed from scratch.




Re: Status of development /Benchmarks

Riccardo Rossi
Dear David,

All of the things you point out are nice... If I could add something to the wish list, I would like to add a few items:
1 - please maintain some sort of backward compatibility for people who do not use the internals. Functions such as noalias, prod, inner_prod, etc., while not strictly needed, do no harm, and maintaining them would make people's lives easier during the migration.
2 - clearly separate where OpenMP is going to be used (for example sparse and large dense matrices) and where it is not ("tiny" matrices, both on the stack and on the heap). This is important because one needs to avoid nested OpenMP parallelism.
3 - it would be nice to have more C interoperability with the internals of the sparse matrices; this would make it much easier to write wrappers.
4 - consider PaStiX as a direct solver apart from MUMPS. It is really good and actively developed...

good luck with your effort
ciao
Riccardo





--
Dr. Riccardo Rossi, Civil Engineer
Member of Kratos Team
International Center for Numerical Methods in Engineering - CIMNE
Campus Norte, Edificio C1
c/ Gran Capitán s/n
08034 Barcelona, España
Tel: (+34) 93 401 56 96
Fax: (+34) 93.401.6517
web: www.cimne.com


Re: Status of development /Benchmarks

Oswin Krause
In reply to this post by David Bellot
Hi,

I like your ideas, but I am not sure what you mean by 1. and 2.
The problem I see is that uBLAS right now doesn't offer the desired quality and speed for single-core use, let alone multi-core. Therefore I will focus on the internal infrastructure instead of proposing new algorithms.

short description:
1. Rock-solid implementation of the BLAS algorithms ("compute kernels")
2. Rewriting of the expression mechanics
2.1 automatic use of the compute kernels for the operations
2.2 automatic optimisation of expressions


long description
1. Rock-solid implementation of the BLAS algorithms: matrix x vector and matrix x matrix in all useful combinations (sparse/sparse, dense/dense, dense/sparse, ..., triangular/dense, etc. - there are a lot), but also matrix assignment. uBLAS will be compared with Armadillo/Eigen/ATLAS/GotoBLAS not on the number of features but on runtime. Having fast implementations of these is also crucial for multi-core and distributed computations, and all the higher-level decompositions (Cholesky, LU, SVD, rank decomposition, ...) and triangular solvers rely on fast BLAS.
 
2. Rewriting of the expression mechanics. The way uBLAS is implemented right now, writing

noalias(A) += prod(B,C) + D
or
vec = prod(prod<matrix>(B,C), vec2);

is in most cases a bad decision. The second expression wouldn't even compile without the template parameter.
What I would like to have is the following:

2.1 automatic use of the fast algorithms. That is, the first expression above should be translated to
axpy_prod(B,C,A,false);
noalias(A) += D;

Right now the matrix_assign algorithms are used instead, and they do a terrible job on prod(B,C).

2.2 automatic optimization of expressions. The second expression above is really inefficient: first the matrix-matrix prod is evaluated, and the result is then used for a single matrix-vector prod.

Now compare this to:
prod(B, prod<vector>(C,vec2));

Only 2 matrix-vector products are performed here, which might save 99% of the FLOPs of the first expression. So it would be better if this transformation happened automatically. Normally the user could do it himself, but sometimes that is not possible. Consider a conjugate gradient solver:

x = CG(A,b);

A usual pattern for A is A = XX^T + c*Identity. CG internally uses only matrix-vector products - and possibly not many of them. Therefore you don't want to compute A explicitly, because it is (a) slow and (b) might require more memory than is available. Without expression optimizations it would be impossible to write a good CG that exploits this.

Greetings,
Oswin




Re: Status of development /Benchmarks

Riccardo Rossi
Dear Oswin,

My first point is really trivial: I do not doubt that the current expression template mechanism is suboptimal.

The only thing I propose is to maintain a small backward compatibility layer, so that code that works with the old ublas is not unnecessarily broken.

For example, if you assume that the novel syntax is

C += A*B

I believe it could be useful to define some wrapper between the old and the new syntax. The easiest (dirty) way would be to define two macros,

"noalias"                   and                  "prod"

so that

noalias(C) += prod(A,B)

expands to the new format... Indeed, using macros is not a good idea, but it gets the point across.






My second point is mostly about matrices of size, say, 30x30, whose size is NOT known at compile time.
For such matrix sizes OpenMP is not useful; however, one may perform a lot of operations on different small matrices in parallel, parallelizing some sort of outer iteration.
For this outer parallelization to be useful, one needs guarantees that OpenMP is not used in the inner loop, otherwise problems may arise with nested parallelism.
...I only implied this...


ciao
Riccardo










Re: Status of development /Benchmarks

Sathyam Vellal
In reply to this post by David Bellot
Hi all,

I'm very new here and don't know how much value I can add to the discussion! But if a rewrite clearly wins over improving the current implementation, then it could very well be a good choice. Python did the same when they transitioned to Python 3.0; they had no other way to remove some fundamental design flaws in Python 2.x.
I'm no expert in the area, just pitching in an opinion.

Best,
--
__________________________________________________
Sathyam M Vellal | Computer Science | PES Institute of Technology | Bangalore, India



Re: Status of development /Benchmarks

Oswin Krause
In reply to this post by Riccardo Rossi
Hi,

I agree with you, and I think a rewrite wouldn't change the "high-level" syntax. I like the way expressions are formed right now. There is no real way to get rid of noalias, as there is no cheap solution for alias tests. And honestly, I like prod() over operator* because prod() screams in your face: "this is going to take a while".

for this outer parallelization to be useful, one needs guarantees that openmp is not used in the inner loop... otherwise problems may arise with nested parallelism.
I don't see the issue here. OpenMP in most configurations has a fixed-size thread pool and would never spawn more threads than allowed. Furthermore, an OpenMP thread is allowed to spawn more threads.

The real issue is that OpenMP does not, and will never, support C++. There is OpenMP support in gcc, but in fact there is no guarantee that it works, as the C++11 memory model is not compatible with the OpenMP model. AFAIK there are no plans for a C++11-compatible OpenMP standard. Therefore we can't expect that there will ever be OpenMP support in clang or other compilers, and I really think uBLAS needs to use a proper threading model, for example an asio thread_pool.



On 07.12.2013 23:35, Riccardo Rossi wrote:
Dear Oswin,

my first point is really trivial. i do not doubt that the current expression template mechanism is not optimal.

the only thing i propose is to mantain a small backward compatibility layer so that code that works with the old ublas is not unnecessarily broken.

for example if you assume that the novel syntax is

C += A*B

i believe that it could be useful to define some wraper between the old and new syntax. the easiest way (dirty) would be to define two macros

"noalias"                   and                  " prod"

so that
noalias(C) += prod(A,B)
 
expands to the new format... indeed using macros is not a good idea but it gives the point






my second point is mostly about  matrices of size say 30*30, whose size is NOT known at compile time.
for such matrix size openmp is not useful, however one may perform in parallel a lot of operations on different  small matrices, parallelizing some sort of outer iteration.
for this outer parallelization to be useful, one needs guarantees that openmp is not used in the inner loop... otherwise problems may arise with nested parallelism.
...i only implied this...


ciao
Riccardo









On Sat, Dec 7, 2013 at 10:10 PM, oswin krause <[hidden email]> wrote:
Hi,

I like your ideas. But i am not sure what you mean with 1. and 2.
The problem I see is, that uBLAS right now doesn't offer the desired quality and speed for single core, let alone multi core. Therefore I will focus on the internal infrastructure instead of motivating new algorithms.

short description:
1. Rock solid implementation of the BLAS algorithms( "compute kernels")
2. Rewriting of the expression mechanic.
2.1 automatic use of the compute kernels for the operations
2.2 automatic optimisation of expressions


long description
1. Rock solid implementation of the BLAS algorithms: matrix x vector and matrix x matrix in all useful combinations (sparse/sparse dense/dense dense/sparse ... triangular/dense...etc there are a lot), but also: matrix assignment. uBLAS will not be compared in #Features but in runtime compared to Armadillo/Eigen/Atlas/GotoBlas. Having a fast implementation of this is also crucial for multi-core and distributed computations. Also all higher level decompositions (Cholesky, LU, SVD, RankDecomp,...) and triangular solvers rely on fast BLAS.
 
2. Rewriting of the expression mechanic. The way uBLAS is implemented right now, writing

noalias(A)+=prod(B,C)+D
or
vec = prod(prod<matrix>(B,C),vec2);

is in most cases a bad decision. The second expression wouldn't even compile without the template parameter.
What I would like to have is the following:

2.1 automatic use of the fast algorithms. That is the above expression should be translated to
axpy_prod(B,C,A,false);
noalias(A) +=D;

right now the matrix_assign algorithms are used which do a terrible job on prod(B,C).

2.2 automatic optimization of expressions. The second expression above is really inefficient. First the matrix-matrix prod is evaluated and the result is afterwards used for a single matrix-vector prod.

now compare to this:
prod(B,prod<vector>(C,d));

only 2 matrix-vector products are performed which might save 99% of the FLOPS of the first expression. So it would be better if this would be transformed automatically. Normally the user could do that himself but sometimes it is not possible. Consider a Conjugate Gradient Solver:

x=CG(A,b);

a usual pattern of A is A=XX^T+c*Identity. CG uses internally only matrix-vector products - and possibly not many of them. Therefore you don't want to compute A because it is a) slow b) might require more memory than is available. Without expression optimizations it would be impossible to write a good CG that could do that.

Greetings,
Oswin




On 07.12.2013 11:38, David Bellot wrote:
Hi,

it has been a long discussion we all had for many months now. Should we rewrite ublas from scratch or simply improve it.
Joaquim and Oswin wants a brand new ublas
Nasos was more in favor of improving it.

I personally find very exciting the idea of writing something new, but ublas is very well known now. On the other hand, Eigen and Armadillo took the crown of the main C++ blas library in users' hearts.

On my todo list for ublas, there are things that will require ublas to be deeply changed. At this stage, we can almost talk about a new library.

Christmas is very close now, so maybe it's a good time to start talking about the features we wish for ublas and see if they can be implemented with the current version or if a new uBLAS 2.0 is necessary.
After all, Boost::signal did the same a few years ago. We can definitively do the transition.


I begin:

- unified representation of vectors and matrices to represent the fact that a vector IS a matrix. Matlab does the same
- automated use of different algorithm to let the compiler "chooses" the best implementation (if possible) and switch on SSE, distributed or whateve we need
- implementation of solvers and decompositions algorithms

and this is what Nasos and I think should be integrated too:
1. Matrix multiplication
2. Algorithms infrastructure (so that we can have real useful features)
3. Matrix/vector views for interoperability <- I think this is ultra critical because now ublas is monolithic in the sense that you have to use it everywhere you manipulate data. This would really help into letting people for example have a list of vectors (they are plotting) and ublas working on top of that to do for example transformations
4. NEW DOCUMENTATION - examples and the rest
5. Incorporate some critical bindings (i.e. mumps bindings which is currently probably the most efficient smp and distributed open source linalg solver)
6. matlab binding?
7. distributed ublas


Please add and ESPECIALLY, please tell me your view on the current infrastructure of uBLAS. It seems many people are not happy with the current "expression template" grammar.

I'm open to everything

Best,
David


On Thu, Dec 5, 2013 at 11:14 AM, Joaquim Duran <[hidden email]> wrote:
I think that al stuff pending of merge listed by David, should be merged and migrate to uBlas 2.0 and while uBlas 2.0 is in development/maintenance then design from scratch uBlas 3.0.




_______________________________________________
ublas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/ublas
Sent to: [hidden email]


_______________________________________________
ublas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/ublas
Sent to: [hidden email]



--

Dr. Riccardo Rossi, Civil Engineer
Member of Kratos Team
International Center for Numerical Methods in Engineering - CIMNE
Campus Norte, Edificio C1
c/ Gran Capitán s/n
08034 Barcelona, España
Tel: (+34) 93 401 56 96
Fax: (+34) 93 401 65 17
web: www.cimne.com



Re: Status of development /Benchmarks

pmamales
Hi,
I would like to suggest:
- the ability to use TBB as a threading model. OpenMP is basically a Fortran tool; it does not integrate well with other threading models, and OpenMP at the library level is going to become a maintenance nightmare.
- the creation of solvers and more general LAPACK wrappers as classes wrapping the LAPACK calls (i.e. something like what the RogueWave libraries provided).
- also, especially for the LAPACK Fortran-style matrices, the incorporation of BLAS calls as member functions: instead of axpy-type calls, overloads of operator= and operator+= with the specific template expressions. This cannot be generic, i.e. work for all kinds of matrices, but it is still very useful for the ones for which BLAS should be called.
- traits classes that tell whether the matrix is dense, banded, etc., whether it is row/column major, what the LAPACK leading dimension is, and so on.
Thank you for your consideration,
Petros P. Mamales
 
Sent: Sunday, December 08, 2013 4:18 AM
Subject: Re: [ublas] Status of development /Benchmarks
 
Hi,

I agree with you, and I think a rewrite wouldn't change the "high-level" syntax. I like the way expressions are formed right now. There is no real way to get rid of noalias, as there is no cheap solution for alias tests. And honestly, I like prod() over operator* because prod() screams in your face "this is going to take a while".

> for this outer parallelization to be useful, one needs guarantees that openmp is not used in the inner loop... otherwise problems may arise with nested parallelism.
I don't see the issue here. OpenMP has, in most configurations, a fixed-size thread pool and would never spawn more threads than allowed. Furthermore, an OpenMP thread is allowed to spawn more threads.

The real issue is that OpenMP does not and will never support C++. There is OpenMP support in gcc, but in fact there is no guarantee that it works, as the C++11 memory model is not compatible with the OpenMP model. AFAIK there are no plans for a C++11-compatible OpenMP standard. Therefore we can't expect that there will ever be OpenMP support in clang or other compilers, and I really think that uBLAS needs to use a proper threading model, for example an asio-based thread pool.



On 07.12.2013 23:35, Riccardo Rossi wrote:
Dear Oswin,

my first point is really trivial. I do not doubt that the current expression template mechanism is suboptimal.

The only thing I propose is to maintain a small backward-compatibility layer, so that code that works with the old ublas is not unnecessarily broken.

For example, if you assume that the novel syntax is

C += A*B

I believe it could be useful to define some wrapper between the old and the new syntax. The easiest (dirty) way would be to define two macros,

"noalias" and "prod",

so that

noalias(C) += prod(A,B)

expands to the new format... indeed, using macros is not a good idea, but it makes the point.






my second point is mostly about matrices of size, say, 30x30, whose size is NOT known at compile time.
For such matrix sizes OpenMP is not useful; however, one may perform a lot of operations on different small matrices in parallel, parallelizing some sort of outer iteration.
For this outer parallelization to be useful, one needs guarantees that OpenMP is not used in the inner loop... otherwise problems may arise with nested parallelism.
...I only implied this...

Ciao,
Riccardo

On Sat, Dec 7, 2013 at 10:10 PM, oswin krause <[hidden email]> wrote:
Hi,

I like your ideas, but I am not sure what you mean by 1. and 2.
The problem I see is that uBLAS right now doesn't offer the desired quality and speed on a single core, let alone multiple cores. Therefore I would focus on the internal infrastructure instead of proposing new algorithms.

short description:
1. Rock solid implementation of the BLAS algorithms( "compute kernels")
2. Rewriting of the expression mechanic.
2.1 automatic use of the compute kernels for the operations
2.2 automatic optimisation of expressions


long description:
1. Rock-solid implementation of the BLAS algorithms: matrix x vector and matrix x matrix in all useful combinations (sparse/sparse, dense/dense, dense/sparse, ..., triangular/dense, etc. - there are a lot), but also matrix assignment. uBLAS will be compared with Armadillo/Eigen/ATLAS/GotoBLAS not on the number of features but on runtime. Having a fast implementation of these kernels is also crucial for multi-core and distributed computations, and all higher-level decompositions (Cholesky, LU, SVD, rank decompositions, ...) and triangular solvers rely on fast BLAS.
 
2. Rewriting of the expression mechanic. The way uBLAS is implemented right now, writing

noalias(A)+=prod(B,C)+D
or
vec = prod(prod<matrix>(B,C),vec2);

is in most cases a bad decision. The second expression wouldn't even compile without the template parameter.
What I would like to have is the following:

2.1 automatic use of the fast algorithms. That is, the first expression above should be translated to:
axpy_prod(B,C,A,false);
noalias(A) += D;

Right now the matrix_assign algorithms are used instead, and they do a terrible job on prod(B,C).

2.2 automatic optimization of expressions. The second expression above is really inefficient: first the matrix-matrix product is evaluated, and the result is then used for a single matrix-vector product.

Now compare this to:
prod(B,prod<vector>(C,vec2));

Only two matrix-vector products are performed, which might save 99% of the FLOPs of the first expression. So it would be better if this transformation happened automatically. Normally the user could do it himself, but sometimes that is not possible. Consider a conjugate gradient solver:

x=CG(A,b);

A usual pattern for A is A=XX^T+c*Identity. CG internally uses only matrix-vector products - and possibly not many of them. Therefore you don't want to compute A explicitly, because a) it is slow and b) it might require more memory than is available. Without expression optimizations it would be impossible to write a good CG that handles this.

Greetings,
Oswin





Re: Status of development /Benchmarks

Oswin Krause
Hi,

On 08.12.2013 22:04, Petros wrote:

> - traits classes that would provide for whether the matrix is dense, banded etc, row/column major, what is the LAPACK leading dimension etc.
This is already there. Given a matrix M, we have the following:

M::storage_category is
- dense_tag if the matrix has a dense storage format
- sparse_tag if the matrix has one of the many sparse formats
- packed_tag if the matrix has packed (packed, triangular) storage
- unknown_tag for everything else (i.e. A+B has unknown_tag even if A and B are dense)

There are a few more tags regarding proxies, but these are the most important ones.

M::orientation_category is either row_major_tag or column_major_tag;
if the storage_category is not unknown, this gives the orientation of the storage in memory.

The leading dimension is a bit tricky, but there are functions in detail/raw.hpp.

But maybe you should take a look at the numeric bindings package. I think they have complete LAPACK bindings and the traits already implemented.


Re: Status of development /Benchmarks

Rutger ter Borg-2
In reply to this post by David Bellot

Hey David,

the whole problem of most numerical packages, IMHO, is that everything is tied together. I would encourage a very loosely coupled system, i.e. a system that might even be able to switch storage layout, algorithms, etc. at run-time, perhaps guided by some simple numerical tests. Of course, only if this is enabled at compile-time.

Data storage and manipulation should be fully loosely coupled, from low abstraction to high abstraction:

* the storage of the data. This is about memory efficiency, and hence speed of computation. Storage engines might be linear, sparse, etc.
* the functional shape of the data: dense, triangular, etc.
* the numerical properties of the data: positive definite, etc.
* loading and saving the data. Why not support a whole bunch of data formats?
* a unified matrix/vector view on the data
* procedural manipulation of the data, shallow views on the data, views on views, etc.

Algorithms should be loosely coupled too, from low abstraction to high abstraction:

* execution/memory models (synchronous, asynchronous, single-core, multi-core, shared memory, distributed memory, GPU memory)
* computational kernels (our own C++ CPU-optimized implementations, OpenBLAS, ATLAS, GotoBLAS2, LAPACK, MUMPS, OpenCL, CUDA, FFT, etc.)
* expression trees and the syntax of expression trees (Boost.Proto stuff)
* optimal assignment of computational kernels to expression trees (maybe partly at runtime!)
* the ability to extend the expression tree machinery (e.g. with hooks) with one's own algorithms

Maybe a new library name would be appropriate?

Cheers,

Rutger





Re: Status of development /Benchmarks

Marco Guazzone



On Mon, Dec 9, 2013 at 10:13 AM, Rutger ter Borg <[hidden email]> wrote:


+1 for a loosely coupled library, even if this means completely rewriting uBLAS.

For the integration of computational kernels, why don't we use the boost-numeric_bindings lib?

Also, I'd like to have more MATLAB-like functionality (I've implemented some of it in boost-ublasx, but it relies on boost-numeric_bindings and LAPACK functions).

Best,

-- Marco



Re: Status of development /Benchmarks

Rutger ter Borg-2
On 2013-12-09 11:18, sguazt wrote:


If there is enough interest to integrate (parts of) the numeric bindings
library, I'm willing to help.

Cheers,

Rutger




Re: Status of development /Benchmarks

Nasos Iliopoulos
In reply to this post by David Bellot
I would just like to make a few remarks here. I also integrated some of the ideas that people posted on the list up to now.

On 12/07/2013 05:38 AM, David Bellot wrote:
Hi,

it has been a long discussion we all had for many months now. Should we rewrite ublas from scratch or simply improve it.
Joaquim and Oswin wants a brand new ublas
Nasos was more in favor of improving it.

This is not correct. I only suggested that in order to be able to meet the GSOC deadlines.


Since I aided in compiling this list I will just point out some more details on the basic features I believe uBLAS 2.0 should focus on:

1. Expression templates mechanism rewrite: probably go with boost::proto

2. Unified matrix and vector classes. We could even make this a tensor super-class, which I would prefer in the end. The critical issue would be to make a graceful traits model (maybe a traits object that bundles together all relevant information, rather than the bunch of separate properties the current traits model supports)

3. Separation of data storage from the matrix/vector/tensor classes, to break the current monolithic model and enable interoperability with other libraries (i.e. graphics, storage, GUI, algebra, etc.). A model like the one used in boost::gil is very appealing to me. Doing that would, for example, make it ultra easy to write a ublas/octave/matlab wrapper.

4. Parallel implementation: the C++11 threading model can be adopted instead of OpenMP. It will be a little more involved to set up, and it may require some state in the form of a singleton. To avoid this we will need some sort of interface to initialize thread pools and provide algorithms with the thread data.

5. Distributed uBlas. I consider this critical, since it seems that large computations will move to clusters in the future (especially with the availability of all those inexpensive cloud computational cluster services).

6. Algorithms infrastructure (from fast BLAS to general-purpose algebra and beyond): I think this is very critical.

7. DENSE MATRIX MULTIPLICATION: we already have a good implementation from GSOC that needs some face-lifting before it gets into production.

8. Bindings (Migration of the current ones and extension to other popular packages like MUMPS, PARDISO and other solvers)

9. Enable syntax similar to MATLAB's, or even extend it by using the summation convention (Einstein notation). Under such a model, algorithms will be ultra easy to write (although in many cases not optimal with respect to the CPU architecture). Imagine for example the multiplication:

ublas2::index i,j,k;
A(i,j) = B(i,k)*C(k,j); // Automatic looping over i,j and summation over the rhs 'k' index.
 and that would be it!

// note indexes can be reused since they are merely placeholders
ublas2::index m(1,10);
A(m,j) = B(m+20, j)*C(m,j);

For such a plan the critical path would be:

0a. Decide what type of data uBlas will be operating on and start with the traits mechanism. Because all algorithm dispatching will be based on it, it should be carefully designed. Of course, a very first trivial interface would be to build on top of the current one:

typedef ublas2::matrix<double> my_matrix;
typedef my_matrix::traits traits;

with
template <typename Storage, typename Orientation, std::size_t Order>
struct traits {
    typedef Storage storage_category;
    typedef Orientation orientation_category;
    static constexpr std::size_t tensor_order = Order;
};


0b. Make sure the traits model is easily extendable

1a. Dense tensor_view class and appropriate template specializations for the matrix and vector classes that people are more accustomed to. Do not forget static containers!
1b. Sparse tensor_view and template specializations for matrix and vector classes if applicable. Otherwise implement only specialized sparse containers.
1c. tensor, matrix, vector classes (own their data instead of being views; probably wrap around views)

2a. Development of the new expression template mechanism and introduction of a computational-kernel concept, along with a threaded implementation.
2b. Range and/or Summation convention/Einstein notation mechanism

3. Everything else.
 
Ideas/thoughts are welcomed.

-Nasos


Please add and ESPECIALLY, please tell me your view on the current infrastructure of uBLAS. It seems many people are not happy with the current "expression template" grammar.

I'm open to everything

Best,
David


On Thu, Dec 5, 2013 at 11:14 AM, Joaquim Duran <[hidden email]> wrote:
I think that all the stuff pending merge listed by David should be merged and migrated to uBlas 2.0, and while uBlas 2.0 is in development/maintenance, uBlas 3.0 can be designed from scratch.




_______________________________________________
ublas mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/ublas
Sent to: [hidden email]



Re: Status of development /Benchmarks

Nasos Iliopoulos
In reply to this post by Riccardo Rossi
All of Riccardo's remarks have my vote.

On 12/07/2013 01:50 PM, Riccardo Rossi wrote:
Dear David,

all of the things you point out are nice... if I could add something to the wish list, I would like to add a few items:
1 - please maintain some sort of backward compatibility for people who do not use internals... functions such as noalias, prod, inner_prod, etc., while not strictly needed, do no harm, and keeping them would make people's lives easier during the migration...
2 - clearly separate where OpenMP is going to be used (for example, sparse and large dense matrices) and where not ("tiny" matrices, both on the stack and the heap). This is important, as one needs to avoid nested OpenMP parallelism.
3 - it would be nice to have more C interoperability with the internals of sparse matrices. This would make it much easier to write wrappers.
4 - consider PaStiX as a direct solver apart from MUMPS. It is really good and actively developed...

good luck with your effort
ciao
Riccardo


On Sat, Dec 7, 2013 at 11:38 AM, David Bellot <[hidden email]> wrote:
Hi,

it has been a long discussion we have all had for many months now: should we rewrite ublas from scratch or simply improve it?
Joaquim and Oswin want a brand new ublas.
Nasos was more in favor of improving it.

I personally find the idea of writing something new very exciting, but ublas is very well known now. On the other hand, Eigen and Armadillo have taken the crown of the main C++ BLAS library in users' hearts.

On my todo list for ublas, there are things that will require ublas to be deeply changed. At this stage, we can almost talk about a new library.

Christmas is very close now, so maybe it's a good time to start talking about the features we wish for ublas and see if they can be implemented with the current version or if a new uBLAS 2.0 is necessary.
After all, Boost::signal did the same a few years ago. We can definitely do the transition.


I begin:

- unified representation of vectors and matrices, to represent the fact that a vector IS a matrix. MATLAB does the same
- automated use of different algorithms, letting the compiler "choose" the best implementation (if possible) and switch to SSE, distributed computation, or whatever we need
- implementation of solvers and decomposition algorithms

and this is what Nasos and I think should be integrated too:
1. Matrix multiplication
2. Algorithms infrastructure (so that we can have real useful features)
3. Matrix/vector views for interoperability <- I think this is ultra critical, because right now ublas is monolithic in the sense that you have to use it everywhere you manipulate data. Views would really help by letting people, for example, keep a list of vectors (which they are plotting) and have ublas work on top of that data to do transformations
4. NEW DOCUMENTATION - examples and the rest
5. Incorporate some critical bindings (e.g. MUMPS bindings; MUMPS is currently probably the most efficient SMP and distributed open-source linalg solver)
6. matlab binding?
7. distributed ublas


Please add and ESPECIALLY, please tell me your view on the current infrastructure of uBLAS. It seems many people are not happy with the current "expression template" grammar.

I'm open to everything

Best,
David


On Thu, Dec 5, 2013 at 11:14 AM, Joaquim Duran <[hidden email]> wrote:
I think that all the stuff pending merge listed by David should be merged and migrated to uBlas 2.0, and while uBlas 2.0 is in development/maintenance, uBlas 3.0 can be designed from scratch.






--

Dr. Riccardo Rossi, Civil Engineer
Member of Kratos Team
International Center for Numerical Methods in Engineering - CIMNE
Campus Norte, Edificio C1
c/ Gran Capitán s/n
08034 Barcelona, España
Tel: (+34) 93 401 56 96
Fax: (+34) 93.401.6517
web: www.cimne.com





Re: Status of development /Benchmarks

Athanasios Iliopoulos
In reply to this post by Oswin Krause

Oswin,
what do you have in mind for 2.2?

On 12/07/2013 04:10 PM, oswin krause wrote:
Hi,

I like your ideas, but I am not sure what you mean by 1. and 2.
The problem I see is that uBLAS right now doesn't offer the desired quality and speed for single core, let alone multi-core. Therefore I will focus on the internal infrastructure instead of motivating new algorithms.

short description:
1. Rock-solid implementation of the BLAS algorithms ("compute kernels")
2. Rewriting of the expression mechanics.
2.1 automatic use of the compute kernels for the operations
2.2 automatic optimisation of expressions

long description
1. Rock-solid implementation of the BLAS algorithms: matrix x vector and matrix x matrix in all useful combinations (sparse/sparse, dense/dense, dense/sparse, triangular/dense, etc.; there are a lot), but also matrix assignment. uBLAS will not be compared on number of features but on runtime against Armadillo/Eigen/ATLAS/GotoBLAS. Having a fast implementation of this is also crucial for multi-core and distributed computations. All higher-level decompositions (Cholesky, LU, SVD, rank decompositions, ...) and triangular solvers also rely on fast BLAS.
 
2. Rewriting of the expression mechanics. The way uBLAS is implemented right now, writing

noalias(A)+=prod(B,C)+D
or
vec = prod(prod<matrix>(B,C),vec2);

is in most cases a bad decision. The second expression wouldn't even compile without the template parameter.
What I would like to have is the following:

2.1 automatic use of the fast algorithms. That is, the first expression above should be translated to
axpy_prod(B,C,A,false);
noalias(A) +=D;

right now the matrix_assign algorithms are used, which do a terrible job on prod(B,C).

2.2 automatic optimization of expressions. The second expression above is really inefficient: first the matrix-matrix prod is evaluated, and the result is afterwards used for a single matrix-vector prod.

now compare to this:
prod(B,prod<vector>(C,d));

only 2 matrix-vector products are performed, which might save 99% of the FLOPs of the first expression. So it would be better if this transformation happened automatically. Normally the user could do it himself, but sometimes that is not possible. Consider a conjugate gradient (CG) solver:

x=CG(A,b);

a usual pattern for A is A = X*X^T + c*Identity. CG internally uses only matrix-vector products, and possibly not many of them. Therefore you don't want to form A explicitly, because doing so is a) slow and b) might require more memory than is available. Without expression optimizations it would be impossible to write a good CG solver that exploits this.

Greetings,
Oswin





-- 
Dr Athanasios Iliopoulos
Research Assistant Professor

George Mason University
Resident at Computational Multiphysics Systems Lab. Code 6394
Center of Computational Material Science
Naval Research Laboratory
4555 Overlook Ave. SW, Washington DC 20375
Tel: +1 202 767 2165
e-mail: [hidden email]
URL: https://www.nrl.navy.mil/mstd/ccms/cmsl/ 
