Re: (x * x) % y


Bill Stewart (bill.stewart@pobox.com)
Mon, 26 Apr 1999 00:22:25 -0700


At 10:45 AM 4/23/99 +0200, Mok-Kong Shen wrote:
>Olivier Langlois wrote:
>> My question is: Is it possible to write a legal C construction that would
>> generate the correct assembler code?

If you want code that's both optimally fast and portable across
hardware platforms and compilers, no.
Many C compilers now have 64-bit "long long" or similar types,
and some compilers for some platforms have 64-bit longs.
For those cases, you can write legal C that's dependable,
just not portable.
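
As a minimal sketch of the "dependable, just not portable" case: the
following assumes a compiler that provides a 64-bit unsigned type,
spelled uint64_t here (on other compilers it may be "unsigned long long"
or a vendor-specific name). The function name square_mod is made up for
illustration. Whether the compiler turns this into a single widening
multiply and a single divide instruction is not guaranteed; that part
still has to be checked by reading the assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Compute (x * x) % y without losing the high half of the product.
 * Assumption: the compiler supplies a 64-bit unsigned type (uint64_t). */
uint32_t square_mod(uint32_t x, uint32_t y)
{
    uint64_t wide;

    assert(y != 0);              /* modulus must be nonzero */

    /* Cast BEFORE multiplying so the multiply itself is done in 64 bits;
     * (uint64_t)(x * x) would truncate to 32 bits first. */
    wide = (uint64_t)x * x;

    /* The remainder is always < y, so it fits back into 32 bits. */
    return (uint32_t)(wide % y);
}
```

The cast on the first operand is the whole trick: it is what tells the
compiler that the full double-width product is wanted.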

If you want a non-portable solution, you can play games with
your environment by trying various combinations of variable declarations,
casts, parentheses, operations order, etc., which can sometimes
be a big performance win, especially if you know things about
the range of input and output values that there isn't a good
way to express in C. Instead of doing "edit, compile, test",
you do "edit, compile, read assembler" :-)

A long time ago I was writing Mandelbrot set calculators that
ran in the background on the AT&T Blit terminals (10MHz 68000-based
1Kx1K graphics screen intelligent terminal with a really dumb C compiler).
Portable code was pretty slow - the compiler called a subroutine
to do the fixed-point multiplication, incurring subroutine call times and
doing lots of manipulation to be correct in the general case.
(Floating-point was even slower, since there was no float hardware :-)
The 68000 has 32-bit registers, but the compiler's ints were 16 bits.
By putting the critical variables in registers and doing the
multiplies in the correct order, it was possible to get the compiler
to emit just the 16x16->32 multiply, without calling the subroutine or
truncating back down to 16 bits, then accumulate in 32 bits and shift
to renormalize.
Part of the process of playing around with it was to make sure
I could put enough variables in registers to do the fast work
without running out of registers - a much nicer process on 68000 than 8086.
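
The pattern described above (16x16->32 multiply, accumulate in 32 bits,
shift to renormalize) looks roughly like this in C. This is a sketch,
not the original Blit code: the Q4.12 fixed-point format, the FRAC_BITS
constant and the function names are all assumptions made for the
example. It also assumes that right-shifting a negative value is an
arithmetic shift, which is implementation-defined in C but true of most
compilers.

```c
#include <stdint.h>

#define FRAC_BITS 12   /* assumed Q4.12 format: 1.0 is represented as 4096 */

/* 16x16->32 multiply: widen the operands so the product keeps all 32 bits. */
static int32_t fx_mul_wide(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Real part of z*z + c: x*x - y*y + cx.
 * Both products are accumulated in 32 bits, then shifted once. */
int16_t mandel_re(int16_t x, int16_t y, int16_t cx)
{
    int32_t acc = fx_mul_wide(x, x) - fx_mul_wide(y, y);
    return (int16_t)((acc >> FRAC_BITS) + cx);
}

/* Imaginary part of z*z + c: 2*x*y + cy.
 * Shifting by one bit less folds the factor of 2 into the renormalize. */
int16_t mandel_im(int16_t x, int16_t y, int16_t cy)
{
    int32_t acc = fx_mul_wide(x, y);
    return (int16_t)((acc >> (FRAC_BITS - 1)) + cy);
}
```

For example, with x = 1.5 (6144), y = 0.5 (2048) and cx = 0.25 (1024),
mandel_re returns 9216, which is 2.25 in this format. Results outside
the 16-bit range are not handled; the caller is expected to bail out
before the values grow that large.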

>32 bit hardware usually provides an extra register that can be utilized
>for multiplication of two 32 bit operands. Since however high-level
>programming languages don't (and can't) specify the availability
>of results outside of the domain of 'integers', compilers don't
>take that extra register into account. Hence the answer to your
>question is NO. But you can certainly write (slow running)
>high-level code to do multi-precision arithmetics.
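
For comparison, the slow but portable route mentioned in the quoted text
can be sketched with nothing wider than unsigned long, which standard C
guarantees to be at least 32 bits. This is one common technique
(binary shift-and-add multiplication with a reduction at every step),
not necessarily what any particular library does; the names addmod and
mulmod_portable are made up for the example.

```c
#include <assert.h>

/* (a + b) % m for a, b < m, without ever forming a + b,
 * which could overflow an unsigned long. */
static unsigned long addmod(unsigned long a, unsigned long b, unsigned long m)
{
    return (a >= m - b) ? a - (m - b) : a + b;
}

/* (a * b) % m using only single-width arithmetic.
 * One loop iteration per bit of b, so it is much slower than a
 * hardware double-width multiply followed by a divide. */
unsigned long mulmod_portable(unsigned long a, unsigned long b, unsigned long m)
{
    unsigned long result = 0;

    assert(m != 0);                 /* modulus must be nonzero */
    a %= m;                         /* keep a in the range [0, m) */

    while (b != 0) {
        if (b & 1UL)
            result = addmod(result, a, m);  /* add this bit's contribution */
        a = addmod(a, a, m);                /* a = 2a mod m */
        b >>= 1;
    }
    return result;
}
```

Calling mulmod_portable(x, x, y) gives (x * x) % y on any conforming
compiler, at the cost of a loop that the hardware could replace with a
couple of instructions.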

In many cases, the MPI package will already contain a wide
selection of tricks to get high performance out of Intel hardware.
Either reuse it or recycle the code...
                                Thanks!
                                        Bill
Bill Stewart, bill.stewart@pobox.com
PGP Fingerprint D454 E202 CBC8 40BF 3C85 B884 0ABE 4639



The following archive was created by hippie-mail 7.98617-22 on Thu May 27 1999 - 23:44:22