[Math] float16 on ARM

[Math] float16 on ARM

Boost - Dev mailing list
Hi,

There seems to be support for 16-bit floats on iOS, when using
Apple's Xcode clang.  This seems to be a storage-only format
with the only instruction set support being conversions to
and from 32- and 64-bit floats.  Quick test:

#if __ARM_FP & 2
#warning "Have 16-bit FP"
#endif

void test()
{
  __fp16 a = 1.234;
  static_assert(sizeof(a)==2);
  __fp16 b = 2.345;
  auto sum = a+b;
  static_assert(sizeof(sum)==4);
}

There doesn't seem to be a std::numeric_limits specialisation.

I suspect that other platforms have something similar.  It would be
good to have a boost::float16_t typedef and a portable feature-test
macro (and maybe a software fallback).  As far as I can see,
boost/math/cstdfloat/cstdfloat_types.hpp checks the sizes of
float, double and long double, and checks for 128-bit floats provided
by the compiler.  Can this be extended to check for __fp16?

(P.S. it seems that gcc also has support, see
https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html )

Cheers, Phil.


Re: [Math] float16 on ARM

Boost - Dev mailing list


> -----Original Message-----
> From: Boost <[hidden email]> On Behalf Of Phil Endecott via Boost
> Sent: 14 October 2019 20:15
> Subject: [boost] [Math] float16 on ARM
>
> [snip]
>
> Can this be extended to check for __fp16?

I:\boost\libs\math\include\boost\math\cstdfloat\cstdfloat_types.hpp has some
mention of 16-bit floating-point.

Is this the only 16-bit floating-point format in use?

The storage-only nature of the type is another complication.

But in principle, yes, I support this (but am uncertain how to implement it in
detail).

Paul

Paul A. Bristow
Prizet Farmhouse
Kendal, Cumbria
LA8 8AB           UK


Re: [Math] float16 on ARM

Boost - Dev mailing list
> Is this the only 16-bit floating-point format in use?

IEEE 16-bit (fp16) and bfloat16 are both around, but bfloat16 seems to be
the new leader in modern implementations thanks to ML use.  I haven't
seen both used together, but I wouldn't rule it out, given that bfloat16
may be accelerator-specific.  Google and Intel have support for bfloat16
in some hardware.  bfloat16 makes it easy to move to fp32, as they have
the same exponent size.

Refs: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format
https://nickhigham.wordpress.com/2018/12/03/half-precision-arithmetic-fp16-versus-bfloat16/
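
For reference, the two layouts side by side (sign / exponent / mantissa
bits; the numbers are as given in the references above):

  IEEE fp16:  1 / 5 / 10    bias 15,  max 65504,    ~3.3 decimal digits
  bfloat16:   1 / 8 / 7     bias 127, max ~3.4e38,  ~2.4 decimal digits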

--Matt.


Re: [Math] float16 on ARM

Boost - Dev mailing list


> -----Original Message-----
> From: Boost <[hidden email]> On Behalf Of Matt Hurd via Boost
> Sent: 15 October 2019 11:31
> Subject: Re: [boost] [Math] float16 on ARM
>
> [snip]

Thanks for these useful references.

Are bfloat16 and IEEE float16 the two layouts that we need to consider
in I:\boost\libs\math\include\boost\math\cstdfloat\cstdfloat_types.hpp?

Paul




Re: [Math] float16 on ARM

Boost - Dev mailing list
> Thanks for these useful references.

n.p.

> Are bfloat16 and IEEE float16 the two layouts that we need to consider
> in I:\boost\libs\math\include\boost\math\cstdfloat\cstdfloat_types.hpp?

Arm adds yet another format, and it also has two similar types, __fp16
and _Float16 :-(

"ARM processors support (via a floating point control register
<https://en.wikipedia.org/wiki/Control_register> bit) an "alternative
half-precision" format, which does away with the special case for an
exponent value of 31 (111112).[10]
<https://en.wikipedia.org/wiki/Half-precision_floating-point_format#cite_note-10>
It
is almost identical to the IEEE format, but there is no encoding for
infinity or NaNs; instead, an exponent of 31 encodes normalized numbers in
the range 65536 to 131008." from wiki:
https://en.wikipedia.org/wiki/Half-precision_floating-point_format
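
(For the record, the arithmetic behind 131008: with exponent 31 freed
up for normal numbers, the maximum is (2 - 2^-10) x 2^16
= 1.9990234375 x 65536 = 131008.)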

Arm reference:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100067_0612_00_en/sex1519040854421.html
gcc ref: https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html

Strange complexity within Arm: "The ARM target provides hardware support
for conversions between __fp16 and float values as an extension to VFP and
NEON (Advanced SIMD), and from ARMv8-A provides hardware support for
conversions between __fp16 and double values. GCC generates code using
these hardware instructions if you compile with options to select an FPU
that provides them; for example, -mfpu=neon-fp16 -mfloat-abi=softfp, in
addition to the -mfp16-format option to select a half-precision format."

Unpleasant, sorry.

More bfloat16 FWIW (hardware support list is longer):
https://en.wikichip.org/wiki/brain_floating-point_format

So that makes at least 3 I guess.

Facebook has been experimenting with an alternative format, Gustafson's
posits, which is quite neat and rational, and perhaps better than IEEE
at 64 bits too:
https://www.nextplatform.com/2019/07/08/new-approach-could-sink-floating-point-computation/
Facebook reference:
https://engineering.fb.com/ai-research/floating-point-math/
posit "land": https://posithub.org/news

but posits only live in FPGA land AFAICT, and are not yet something to
worry about.


--Matt.


Re: [Math] float16 on ARM

Boost - Dev mailing list


> -----Original Message-----
> From: Boost <[hidden email]> On Behalf Of Matt Hurd via Boost
> Sent: 15 October 2019 13:05
> Subject: Re: [boost] [Math] float16 on ARM
>
> [snip]
>
> Unpleasant, sorry.

I really, really don't wish to know all that! 😉  Paul





Re: [Math] float16 on ARM

Boost - Dev mailing list
Matt Hurd wrote:
> IEEE 16-bit (fp16) and bfloat16 are both around, but bfloat16 seems to be
> the new leader in modern implementations thanks to ML use.
> [snip]

According to section 4.1.2 of this ARM document:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf

implementations support both the IEEE format (1 sign, 5 exponent and 10
mantissa) and an alternative format which is similar except that it doesn't
support Inf and NaN, and gains slightly more range.  Apparently the bfloat16
format is supported in ARMv8.6-A, but I don't believe that is deployed anywhere
yet.

The other place where I've used 16-bit floats is in OpenGL textures
(https://www.khronos.org/registry/OpenGL/extensions/OES/OES_texture_float.txt),
which use the 1-5-10 format.

I was a bit surprised by the 1-5-10 choice; the maximum value that can
be represented is only 65504, i.e. less than the maximum value of an
unsigned integer of the same size.
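
(The arithmetic: 65504 = (2 - 2^-10) x 2^15 = 65536 - 32, whereas a
16-bit unsigned integer reaches 65535.)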

bfloat16 can be trivially implemented (as a storage-only type) simply
by truncating a 32-bit float; perhaps support for that would be useful
too?
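
Something like this minimal sketch, say (untested; assumes float is
IEEE binary32, and that truncation, i.e. rounding toward zero, is
acceptable):

#include <cstdint>
#include <cstring>

// Storage-only bfloat16: keep the top 16 bits of a float32.
struct bfloat16
{
    std::uint16_t bits;

    explicit bfloat16(float f)
    {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);               // safe type-pun
        bits = static_cast<std::uint16_t>(u >> 16);  // truncate mantissa
    }

    explicit operator float() const
    {
        std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
        float f;
        std::memcpy(&f, &u, sizeof f);
        return f;
    }
};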


Regards, Phil.


Re: [Math] float16 on ARM

Boost - Dev mailing list
BFloat16 conversion to Float32 is not 100% trivial, because of NaN and
rounding modes.  I think
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/lib/bfloat16/bfloat16.h#L178
does a good job of documenting this (the linked code is Apache 2.0
licensed, in case you worry about such issues).  However, if one ignores
these complexities, conversion is as simple as bit-shifting.
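
For the harder direction (float32 to bfloat16), here is a rough sketch
of the usual round-to-nearest-even trick with NaN handling (my own
paraphrase of the idea, not the licensed TensorFlow code, and
untested):

#include <cstdint>
#include <cstring>

// float32 -> bfloat16 bits, round-to-nearest-even; NaN kept quiet.
std::uint16_t float_to_bfloat16(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffffu) > 0x7f800000u)  // NaN: truncate and force
        return static_cast<std::uint16_t>((u >> 16) | 0x0040);  // a mantissa bit
    std::uint32_t bias = 0x7fffu + ((u >> 16) & 1u);  // ties go to even
    return static_cast<std::uint16_t>((u + bias) >> 16);
}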

The public Intel BF16 spec is
https://software.intel.com/sites/default/files/managed/40/8b/bf16-hardware-numerics-definition-white-paper.pdf,
which describes the details of the Intel definition.  There are some
implementation-specific details in https://reviews.llvm.org/D60550.  I
can't comment on other hardware implementations.

The important distinction between __fp16 and _Float16 is that the
former is a storage-format type, not an arithmetic type, whereas the
latter is an arithmetic type.  The former is more easily implementable:
e.g. Intel CPUs since Ivy Bridge use the F16C instructions to do fast
conversion between the 16-bit storage format and float32, but all
arithmetic is done with float32 hardware.
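
(A sketch of what that looks like from C++, assuming GCC/Clang's
<immintrin.h> and compiling with -mf16c; _cvtss_sh and _cvtsh_ss are
the scalar F16C conversion intrinsics:)

#include <immintrin.h>   // F16C intrinsics; build with -mf16c
#include <cstdint>

std::uint16_t to_half(float f)
{
    // round-to-nearest-even, floating-point exceptions suppressed
    return _cvtss_sh(f, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
}

float from_half(std::uint16_t h)
{
    return _cvtsh_ss(h);
}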

Folks who are interested in this topic may enjoy reading
https://arxiv.org/abs/1904.06376.  The methods described therein are not
necessarily applicable to Boost Multiprecision, but may be relevant if
uBLAS gets involved.

Jeff, who works for Intel

On Tue, Oct 15, 2019 at 7:04 AM Phil Endecott via Boost
<[hidden email]> wrote:

> [snip]


--
Jeff Hammond
[hidden email]
http://jeffhammond.github.io/


Re: [Math] float16 on ARM

Boost - Dev mailing list

>> Strange complexity within Arm: [snip]
>>
>> Unpleasant, sorry.
> I really, really don't wish to know all that! 😉  Paul

Me neither.

My first reaction was "yes of course we should do that", but on
reflection I'm not so sure:

The intention of <boost/cstdfloat.hpp> was to provide a *portable* set of
typedefs such that each is mapped to the corresponding IEEE defined type.

And indeed, if numeric_limits<float32_t>::is_iec559 happens to be false,
then the header will trigger a static_assert and complain that it's
incorrectly configured.

So... we could provide a float16_t typedef, but if we're being
consistent (and we should be), then it would only be for IEEE float16
compatible types.

My concern is quite how we detect/configure/test that.

I also don't think we currently have any tester running on ARM (for
example), and we certainly don't have ARM CI (is there any?).

Mostly thinking out loud yours, John.



Re: [Math] float16 on ARM

Boost - Dev mailing list


> -----Original Message-----
> From: Boost <[hidden email]> On Behalf Of John Maddock via Boost
> Sent: 15 October 2019 18:05
> Subject: Re: [boost] [Math] float16 on ARM
>
> [snip]
>
> My concern is quite how we detect/configure/test that.

I'm also worried about the multiple 16-bit layouts, and even more about the distinction between a storage type and a computation type.

Mainly worrying out loud ☹

Paul


PS Perhaps we should focus on what you really want to gain from this.

  Is an implementation of numeric_limits the most important thing?


Paul A. Bristow
Prizet Farmhouse
Kendal, Cumbria
LA8 8AB           UK

Re: [Math] float16 on ARM

Boost - Dev mailing list
John Maddock wrote:

> So... we could provide a float16_t typedef, but if we're being
> consistent (and we should be), then it would only be for IEEE float16
> compatible types.
>
> My concern is quite how we detect/configure/test that.

#if (__ARM_FP & 2) && defined(__ARM_FP16_FORMAT_IEEE)
using float16_t = __fp16;
template <>
struct std::numeric_limits<float16_t> {
  static constexpr bool is_iec559 = true;
  // etc. etc.
};
#endif

I think the difficult issues are
(a) Do we need a richer vocabulary than floatNN_t to describe the
different formats, when more than one is supported at the same time?
(b) Can we provide a software fallback for platforms where there
is no hardware support?

Right now, I'm trying to build, on an x86 system, a binary file
containing these 16-bit floats that will be read (memory-mapped) on an
ARM system.
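
For that job a software encoder would do.  A rough sketch (untested;
truncating rather than rounding, and assuming float is IEEE binary32):

#include <cstdint>
#include <cstring>

// float32 -> IEEE binary16 bits, truncating the mantissa.
// Handles zero, subnormals, Inf and NaN.
std::uint16_t float_to_half(float f)
{
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    std::uint16_t sign = static_cast<std::uint16_t>((u >> 16) & 0x8000u);
    std::uint32_t man  = u & 0x007fffffu;
    std::int32_t  exp  = static_cast<std::int32_t>((u >> 23) & 0xffu) - 127 + 15;

    if (((u >> 23) & 0xffu) == 255u)     // Inf or NaN
        return sign | 0x7c00 | (man ? 0x0200 : 0);
    if (exp >= 31)                       // finite overflow -> Inf
        return sign | 0x7c00;
    if (exp <= 0)                        // half subnormal or zero
    {
        if (exp < -10) return sign;      // too small: flush to zero
        man |= 0x00800000u;              // restore the implicit leading 1
        return sign | static_cast<std::uint16_t>(man >> (14 - exp));
    }
    return sign | static_cast<std::uint16_t>(exp << 10)
                | static_cast<std::uint16_t>(man >> 13);
}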

> I also don't think we currently have any tester running on ARM (for
> example)

I was running ARM64 testers when I had an account with Scaleway, but
I don't think anyone was looking at the results; certainly no-one
ever asked me anything about them or seemed to notice when I turned
them off.  I have considered reviving this now that AWS has ARM
instances, but their smallest ARM instances are quite a lot larger
(and more expensive) than their smallest x86 instances for some reason.
That may change eventually.  If anyone's interested, get in touch.


Regards, Phil.


