Adding my sorting algorithm, ska_sort, to boost, attempt 2

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
Hi everyone,

I started a discussion about this in October, but at the time the maintainer of the sort library was busy, then I was busy, so now it's January...

I wrote a generalized version of radix sort a couple years ago and gave a talk about it here:
https://www.youtube.com/watch?v=zqs87a_7zxw

I would like to get a new version of that sorting algorithm into boost. It's faster than what I showed several years ago. It's also even more general and has a new interface that should be easier to extend. I uploaded the code here:

https://github.com/skarupke/sort/tree/master

The documentation is also already updated with new benchmark results.

Let me know what's required to get this into boost. Would you like this in the form of a pull request?

One of the big things that probably needs review is the interface, since radix sorting works differently than comparison based sorting. For most normal uses you just need to provide a sort key, but more complicated cases can get more complicated. I tried very hard to make everything easy, but I'd also be curious if others can come up with better ways of doing the interface.

Thanks,

Malte

--
  Malte Skarupke
  [hidden email]

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
On Wednesday, January 22, 2020, Malte Skarupke  wrote:

> I started a discussion about this in October, but at the time the
> maintainer of the sort library was busy, then I was busy, so now it's
> January...
>
> I wrote a generalized version of radix sort a couple years ago and gave a
> talk about it here:
> https://www.youtube.com/watch?v=zqs87a_7zxw
>
> I would like to get a new version of that sorting algorithm into boost.
>
> Let me know what's required to get this into boost.
>


Try e-mailing the maintainer of Boost.Sort again. If he does not approve,
you don't want to spend your time working on a pull request.

Glen

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
On Wed, 22 Jan 2020 at 11:44, Glen Fernandes via Boost <
[hidden email]> wrote:

> If he does not approve, you don't want to spend your time working on a
> pull request.
>

 Then some other way has to be found. I did not try V2 (faster, says
author), but V1, what I do use, gives any sort a run for it's money, it's a
no-brainer.

On the other hand, personally I'd rather see the library as stand alone
(V2). V1 is/was a 1 file header, and so easy to integrate.

degski
--
@realdegski
https://brave.com/google-gdpr-workaround/
"We value your privacy, click here!" Sod off! - degski
"Anyone who believes that exponential growth can go on forever in a finite
world is either a madman or an economist" - Kenneth E. Boulding
"Growth for the sake of growth is the ideology of the cancer cell" - Edward
P. Abbey

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
On Thu, Jan 23, 2020 at 8:13 AM degski wrote:
>  Then some other way has to be found.

There are other ways that don't involve Boost.Sort. One way is:
https://www.boost.org/community/reviews.html

> On the other hand, personally I'd rather see the library as stand alone (V2). V1 is/was a 1 file header, and so easy to integrate.

You could ask Matle to provide (on his GitHub) a single header
ska_sort.hpp (that has no dependencies besides the C++ standard
library).

Glen

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list

> On 22. Jan 2020, at 18:23, Malte Skarupke via Boost <[hidden email]> wrote:
>
> Let me know what's required to get this into boost. Would you like this in the form of a pull request?
>
> One of the big things that probably needs review is the interface, since radix sorting works differently than comparison based sorting. For most normal uses you just need to provide a sort key, but more complicated cases can get more complicated. I tried very hard to make everything easy, but I'd also be curious if others can come up with better ways of doing the interface.

I am not an expert on Boost.Sort and only had a quick look at your changes, but:

If I understood correctly, your new algorithm increases the requirement for Boost.Sort to C++17 from C++11. I don't see a good reason for doing that. You can replace if constexpr and the generic lambdas in your implementation with other code constructs that work on C++11.

Regarding the interface, shouldn't it follow the example of spreadsort?

Regarding having an extra single-header version: that seems overkill to me. Boost.Sort already claims to have no dependencies on other Boost libraries and is header-only, which should make it extremely easy to include in any project.

"These algorithms do not use any other library or utility. The parallel algorithms need a C++11 compliant compiler."

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
>
> Regarding having an extra single-header version: that seems overkill to me. Boost.Sort already claims to have no dependencies on other Boost libraries and is header-only, which should make it extremely easy to include in any project.
>
> "These algorithms do not use any other library or utility. The parallel algorithms need a C++11 compliant compiler."
>

FYI:
Apparently that statement seems to be not quite up to date:

According to boostdep, sort depends on config, core, range, serialization and static_assert.
However, that seems to be restricted to the spreadsort headers and transitively to the
<boost/sort/sort.hpp> header (pdqsort also needs type traits)

Best

Mike

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
> If I understood correctly, your new algorithm increases the requirement
> for Boost.Sort to C++17 from C++11. I don't see a good reason for doing
> that. You can replace if constexpr and the generic lambdas in your
> implementation with other code constructs that work on C++11.

It only requires C++17 if you want to use ska_sort. The other algorithms remain at C++11. I agree that this is a little unfortunate, but when I briefly tried to get rid of "if constexpr" in the code, it turned out to be a giant pain. It would also probably make the code compile much more slowly. If you are trying to sort nested types, like vector<vector<pair<string, int>>>, compile time can explode without some of the short circuiting that I'm doing using if constexpr. (I can explain why it explodes if you're interested, but for now I'll just say that it works different from comparison based sorting and you have to instantiate new code at every level of depth) I know I can do the same short circuiting without if constexpr if I put in enough work, but I think the code quality would suffer a lot, making the code much harder to read and maintain. (not that it's simple as is, but it would go from "hard to read" to "impossible to read")

> Regarding the interface, shouldn't it follow the example of spreadsort?

The main benefit of this over spreadsort is that it's more generic. The spreadsort interface is hard to generalize to other types. For example it allows you to provide your own shift operator. But how do you provide a shift operator when trying to sort a pair<string, int>? When you get to the end of the string, you'd have to start providing bits from the int, but that's going to be annoyingly complicated to write. (especially since I'm not sure if spreadsort works on byte boundaries)

It gets even more complicated when trying to sort a variant<string, int>. Because now the bits depend on which value of the variant is active, but you don't want to have to check that over and over again.

The interface was very important for making this more general. When it comes to this kind of decisions, I have to make as many of them invisible for the user as possible, because only the internals of the sorting algorithm can know which bits to sort on next.



About providing a single header: That is definitely what I would do as a standalone library, but I think in boost it's OK to split this into multiple headers. Users can include boost/sort/ska_sort/ska_sort.hpp which includes all necessary headers. So usually they don't care that it's multiple headers internally. And if you want slightly faster compile time, you can just include ska_sort_base.hpp without the headers that provide tuple or variant implementations. (this is currently not part of the documentation because I'm not sure if I want to provide that forever)

So why make this part of boost instead of a standalone header? Basically I like having the assurance that it'll be kept working for a long time. I'm willing to do that and to put future work into this, but having it in boost ensures that that continues to happen even after I can no longer provide support.


--
  Malte Skarupke
  [hidden email]

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
Hello,


The first thing is to apologize for the delay. I have medical problems in
my family, and all my free time is gone.


I didn't remember that in February 2019 we evaluated this same algorithm.
In the message that Steven wrote, he mentioned a list of major deficiencies
in the algorithm.


*It looks like ska_sort is a radix sort designed mostly for the best-case,
and that if you feed it the alrbreaker or binaryalrbreaker distributions it
will perform poorly (though not as poorly as radix implementations with no
fallback or that fallback to insertionsort).*

*There might be some counting and swapping optimizations here to copy into
spreadsort (I see block operations), though those might not work if the
items being sorted are references or pointers.*

*Mostly-sorted data is a common case to take seriously, and it took only
modest tweaking of spreadsort to do well there.*

*Based on Francisco's numbers, I should retune spreadsort to take advantage
of the higher speed of pdqsort vs std::sort, at least if I want to optimize
for 64-bit integer sorting on modern processors.*

 *https://www.boost.org/doc/libssort/spreadsort/*
<https://www.boost.org/doc/libs/1_67_0/boost/sort/spreadsort/detail/constants.hpp>

*A higher log_min_split_count (especially) and log_finishing_count is
probably appropriate.*


Every so often we get requests for evaluation of sorting algorithms, many
of which are "world's fastest algorithm". We try to evaluate such requests.
I think everyone deserves respect for their work. But so do we who spend
our time doing the evaluation.


I did the evaluation of the algorithm, and I was surprised, because the
recommendations we made in February, have not been followed. So the tests
I've done, have been to corroborate what we already said in February. In
other words, repeat work.


The comments I have to make to the algorithm are as follows:


1- The algorithm needs C++17. This is a big problem because the library is
made to be compatible with C++11. In many companies and workplaces, the
compilers that are being used are old, at most compatible with C++11. In
the same way that for production you don't usually use the latest version
of the Operating System and SW, you don't usually use the latest version of
the compiler either.

It is not possible to pass a part of the library to C++17 and another part
to remain with C++11. Passing it all to C++17 means leaving out many people
and many companies, and that goes against the spirit of the library which
is to reach as many users as possible.


2.- The algorithm has not implemented worst-case control, nor when all
elements are the same or are ordered, as it was indicated in February.


3.- The algorithm is incomplete. It is to be expected from the algorithm
that it orders, not only integers but also signed and unsigned integers, 32
64 and 128-bit floating-point numbers and strings.


4.- The only argument of the algorithm is its speed. About this, I have
several comments

a) When the implementation incorporates worst-case control of repeated and
equal elements, the speed may change


b) The speed of an algorithm is a factor to be taken into account, but it
is not the only one. For example, the Timsort algorithm is slower than any
other algorithm with random data. But it is extremely fast when the data
are almost ordered, a situation that occurs with much more frequency than
we imagine. This algorithm is highly valued by programmers and companies,
and even the Java language uses it for its sort function.

Ska_sort is very fast with the random data, but it is quite mediocre when
the data are almost ordered.


5) Having a hybrid algorithm like Spreadsort, having another hybrid
algorithm is not very interesting, because with small data sets, the time
differences, if any, will be small, and with large data sets, parallel
algorithms are used, which provide much more speed than any single-stranded
algorithm


In addition to all this, I would like to express my discomfort at having to
waste my time, which, as I said before, is very scarce, in repeating a
test, which does not contribute anything, because it has not followed any
of our recommendations.


Francisco Tapia

El jue., 23 ene. 2020 a las 19:36, Malte Skarupke via Boost (<
[hidden email]>) escribió:

> > If I understood correctly, your new algorithm increases the requirement
> > for Boost.Sort to C++17 from C++11. I don't see a good reason for doing
> > that. You can replace if constexpr and the generic lambdas in your
> > implementation with other code constructs that work on C++11.
>
> It only requires C++17 if you want to use ska_sort. The other algorithms
> remain at C++11. I agree that this is a little unfortunate, but when I
> briefly tried to get rid of "if constexpr" in the code, it turned out to be
> a giant pain. It would also probably make the code compile much more
> slowly. If you are trying to sort nested types, like
> vector<vector<pair<string, int>>>, compile time can explode without some of
> the short circuiting that I'm doing using if constexpr. (I can explain why
> it explodes if you're interested, but for now I'll just say that it works
> different from comparison based sorting and you have to instantiate new
> code at every level of depth) I know I can do the same short circuiting
> without if constexpr if I put in enough work, but I think the code quality
> would suffer a lot, making the code much harder to read and maintain. (not
> that it's simple as is, but it would go from "hard to read" to "impossible
> to read")
>
> > Regarding the interface, shouldn't it follow the example of spreadsort?
>
> The main benefit of this over spreadsort is that it's more generic. The
> spreadsort interface is hard to generalize to other types. For example it
> allows you to provide your own shift operator. But how do you provide a
> shift operator when trying to sort a pair<string, int>? When you get to the
> end of the string, you'd have to start providing bits from the int, but
> that's going to be annoyingly complicated to write. (especially since I'm
> not sure if spreadsort works on byte boundaries)
>
> It gets even more complicated when trying to sort a variant<string, int>.
> Because now the bits depend on which value of the variant is active, but
> you don't want to have to check that over and over again.
>
> The interface was very important for making this more general. When it
> comes to this kind of decisions, I have to make as many of them invisible
> for the user as possible, because only the internals of the sorting
> algorithm can know which bits to sort on next.
>
>
>
> About providing a single header: That is definitely what I would do as a
> standalone library, but I think in boost it's OK to split this into
> multiple headers. Users can include boost/sort/ska_sort/ska_sort.hpp which
> includes all necessary headers. So usually they don't care that it's
> multiple headers internally. And if you want slightly faster compile time,
> you can just include ska_sort_base.hpp without the headers that provide
> tuple or variant implementations. (this is currently not part of the
> documentation because I'm not sure if I want to provide that forever)
>
> So why make this part of boost instead of a standalone header? Basically I
> like having the assurance that it'll be kept working for a long time. I'm
> willing to do that and to put future work into this, but having it in boost
> ensures that that continues to happen even after I can no longer provide
> support.
>
>
> --
>   Malte Skarupke
>   [hidden email]
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
On Thu, 23 Jan 2020 at 18:34, Malte Skarupke via Boost
<[hidden email]> wrote:
>
> > If I understood correctly, your new algorithm increases the requirement
> > for Boost.Sort to C++17 from C++11. I don't see a good reason for doing
> > that. You can replace if constexpr and the generic lambdas in your
> > implementation with other code constructs that work on C++11.
>
> It only requires C++17 if you want to use ska_sort. The other algorithms remain at C++11. I agree that this is a little unfortunate, but when I briefly tried to get rid of "if constexpr" in the code, it turned out to be a giant pain. It would also probably make the code compile much more slowly. If you are trying to sort nested types, like vector<vector<pair<string, int>>>, compile time can explode without some of the short circuiting that I'm doing using if constexpr. (I can explain why it explodes if you're interested, but for now I'll just say that it works different from comparison based sorting and you have to instantiate new code at every level of depth) I know I can do the same short circuiting without if constexpr if I put in enough work, but I think the code quality would suffer a lot, making the code much harder to read and maintain. (not that it's simple as is, but it would go from "hard to read" to "impossible to read")

My experience with looking at C++17 codebases (which are still quite
limited) suggests that generalized use of "if constexpr" is a code
smell that suggests bad design, making code difficult to maintain and
extend and also causing unnecessarily long compilation times.

Designing generic components and combining them in a more traditional
fashion leads to more sharing of template instantiations as
special-casing happen at the declaration boundary rather than in the
middle of a definition.
If constexpr also encourages placing local specializations at random
points in the generic code instead of looking at the big picture,
rationalizing requirements and deferring any special action to a
policy that models a more or less refined concept, which definition is
in a single place within the code. Local changes will eventually pile
up, and since they probably have some relation to each other but are
not together in the code, they end up being fragile.

I haven't looked at your code in particular, but that might be
something to consider.

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Reply | Threaded
Open this post in threaded view
|

Re: Adding my sorting algorithm, ska_sort, to boost, attempt 2

Boost - Dev mailing list
In reply to this post by Boost - Dev mailing list
> Francisco Tapia, [hidden email] wrote:
> In addition to all this, I would like to express my discomfort at having to
> waste my time, which, as I said before, is very scarce, in repeating a
> test, which does not contribute anything, because it has not followed any
> of our recommendations.

I'm sorry as well that you had to waste your time, but I also don't know what I could have done here... I wasn't on the mailing list at the time and didn't see those recommendations until you just pointed them out to me. But a simple "hey, I think we already looked at this a year ago and here is a link to that discussion" would have been perfectly fine.

Alternatively, if you are having a discussion about one of my algorithms, include me in the discussion. I would have loved to have known about the discussion when it first happened.

So, having now finally heard your recommendations, here are my responses:

> It looks like ska_sort is a radix sort designed mostly for the best-case,
> and that if you feed it the alrbreaker or binaryalrbreaker distributions it
> will perform poorly (though not as poorly as radix implementations with no
> fallback or that fallback to insertionsort).

I can try to optimize more for these cases. I did of course handle those cases without degrading performance too much (and when sorting containers I skip shared prefixes to make this less likely) but I didn't try at all to optimize this. I didn't think that these were common cases, so I figured "just make sure performance doesn't degrade too much" would be enough. If you happen to remember, do you know what spreadsort does for cases that are traditionally really bad for radix sort? Maybe I can use the same optimization.

> It is not possible to pass a part of the library to C++17 and another part
> to remain with C++11. Passing it all to C++17 means leaving out many people
> and many companies, and that goes against the spirit of the library which
> is to reach as many users as possible.

Can you elaborate on that? Why is this not possible? Is there anything I can do to help? For example I could remove the ska_sort.hpp include from the common boost/sort/sort.hpp header. That way anyone who includes that header will continue to compile with C++11, and to get ska_sort you would have to directly include boost/sort/ska_sort/ska_sort.hpp.

> 2.- The algorithm has not implemented worst-case control, nor when all
> elements are the same or are ordered, as it was indicated in February.

It looks like spreadsort handles both of these by essentially doing std::is_sorted() on the input before doing any other work. I wouldn't want to do that in ska_sort because it slows down all the partially sorted cases to first have to do this check. But if you want, I can add an overload:

template<typename It>
void ska_sort_check_sorted(It begin, It end)
{
    if (!std::is_sorted(begin, end))
        ska_sort(begin, end);
}

With that overload it would be up to the user to decide if they want this check or not. They could either call this overload or ska_sort directly. But I still would prefer to not do that. My philosophy on this follows iammilind's response on this stack overflow question:
https://stackoverflow.com/questions/6567326/does-stdsort-check-if-a-vector-is-already-sorted

> 3.- The algorithm is incomplete. It is to be expected from the algorithm
> that it orders, not only integers but also signed and unsigned integers, 32
> 64 and 128-bit floating-point numbers and strings.

It already sorts all kinds of integers as well as float, double and long double, strings, tuples, vectors, variants, optionals, lists, maps and other types that I'm forgetting. I can add support for 128 bit floating point numbers (it's exactly the same logic as for long doubles) but it doesn't seem that widely used. Microsoft doesn't even support it in their compiler. Also spreadsort does not support it.

>4.- The only argument of the algorithm is its speed. About this, I have
> several comments
>
> a) When the implementation incorporates worst-case control of repeated and
> equal elements, the speed may change

See discussion about why I don't do this above. I expect it to slow down if I call std::is_sorted() first and if the user already knows that it's not sorted, they don't want that overhead.

> b) The speed of an algorithm is a factor to be taken into account, but it
> is not the only one. For example, the Timsort algorithm is slower than any
> other algorithm with random data. But it is extremely fast when the data
> are almost ordered, a situation that occurs with much more frequency than
> we imagine. This algorithm is highly valued by programmers and companies,
> and even the Java language uses it for its sort function.
>
> Ska_sort is very fast with the random data, but it is quite mediocre when
> the data are almost ordered.

According to your own benchmark results from a year ago, ska_sort does better on almost ordered data than spreadsort and pdqsort. But I confess that it's not a case that I optimized for, mainly because I don't know a good way to make radix sorts behave well on almost sorted data. But if this is a requirement for all new sorting algorithms, I would probably look into wrapping ska_sort in vergesort:

https://github.com/Morwenn/vergesort

However if I were to do that, it almost feels like it should be a separate algorithm. Just like pdqsort and spinsort are separate algorithms. Between those two, if users want something that works fast on almost sorted data, they can use spinsort, otherwise they can use pdqsort.

And if we follow that thinking, and we say that the version that works well on almost sorted data is a separate algorithm, then what's the benefit of doing that over just telling people to use spinsort on almost sorted data? (which is what the documentation currently does)

> 5) Having a hybrid algorithm like Spreadsort, having another hybrid
> algorithm is not very interesting, because with small data sets, the time
> differences, if any, will be small, and with large data sets, parallel
> algorithms are used, which provide much more speed than any single-stranded
> algorithm

I am also thinking about providing a parallel algorithm, but it's a large amount of work. These algorithms take me months to write. So before I do that, this submission of ska_sort is kind of my testing ground to see if this is something I'm interested in doing again. If this goes well, I may work on a parallel version in the future.

If you're worried about having to maintain two different sorting algorithms, I think that we might be able to deprecate spreadsort and to make it just call ska_sort internally. Since the ska_sort interface can sort more than the spreadsort interface, I think it should be possible to keep all old users working. (with the one complication that ska_sort uses C++17 of course...)

As for whether people care about single threaded performance: In my environment almost all sorts are single threaded. Most cores already have work allocated on them, so if you want to "go wide" for a piece of code, you need to know that there are gaps on other threads running in parallel. If you're not sure about that, just use a single threaded algorithm. Because if the other cores are busy, your parallel tasks get put at the end of the task queue and you've just increased your tail latency a lot. And the performance does really matter. We have used ska_sort to speed things up to run in 1.8 milliseconds instead of 3.3 milliseconds, and that was a big deal. In video games you only have 16ms to do all your work to get something on the screen, so a speedup by a whole millisecond is rare.

But there is a bigger point for why we should have a new algorithm: I believe that in the future everyone will use radix sort. There is a unique opportunity here for boost to define the interface that those sorting algorithms will use in the future. The spreadsort interface is certainly not the right interface for a generic radix sort, so we should use this opportunity to come up with one.

--
  Malte Skarupke
  [hidden email]

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost