especially for those...

???
04/18/05 09:56

#91810 - especially for those...
Responding to: ???'s previous message

...who gave me -1 with the reason "not useful".

I did not set out to be useful or otherwise; I simply answered Jan's question. Anyway, if somebody for whatever reason did not understand what I said and explained there, here are some pictures for those who find my English hard to understand.

The first picture shows the Altera multiplier. The upper one is the unclocked, non-pipelined, purely combinational multiplier. The lower one is the clocked multiplier, which uses its clock input for pipelined operation.

To use the pipeline, the user must provide a clock and enable its use. The next example shows a 2-stage pipeline setup:

Now, about the differences between these two ways of multiplying.

Non-pipelined multiplier.
See the picture above. When one or more of the input pins A[] and/or B[] change, the output Q[] reflects the change after a delay. This delay depends on the device, its speed, signed/unsigned mode, the bus widths, etc. Because this multiplier is not clocked, the delay does not depend on the system clock. Such a multiplier can be used in a CPLD design that has no clock at all.
Here is the timing simulation of 0x3FF x 0x3FF:

You can see that the input data change at the 10 ns mark and the valid result appears at about the 30.6 ns mark. Before that time the result is not valid and changes many times, because of the complex cascade of adders and carry chains.
Typical delays for a non-pipelined 10x10-bit multiplier implemented in an Altera ACEX EP1K10-1 device are:

Clocked multiplier with pipeline.
See the picture above once again. When one or more of the input pins A[] and/or B[] change, the output Q[] does not follow immediately but only after a rising clock edge. Depending on how many stages deep the pipeline is, the valid result appears on the output lines after the corresponding number of clock cycles.
One idea behind pipelining is that, once the number of stages is fixed, the whole process can be divided into small parts, one per stage. The small parts execute quickly and do not need the huge, complex logic of the purely combinational implementation. For example, a multiplication can be done not as 10x10 bits but as 10x5 + (10x5)<<5 instead. This shortens the carry chains and gives a faster response after the final clock.
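The 10x5 + (10x5)<<5 split can be checked in software. What follows is only a minimal Python sketch of the arithmetic, not the Altera implementation: the function name `split_multiply` and its parameters are made up for illustration, and the operands are assumed to be unsigned.

```python
def split_multiply(a: int, b: int, width: int = 10, low_bits: int = 5) -> int:
    """Multiply two unsigned `width`-bit values using two narrower partial products.

    Illustrative only: B is cut into a low and a high part, each part is
    multiplied by the full A (two 10x5 products for the defaults), and the
    high partial product is shifted left before the final addition.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    b_low = b & ((1 << low_bits) - 1)   # lower 5 bits of B
    b_high = b >> low_bits              # upper 5 bits of B
    partial_low = a * b_low             # first 10x5 product
    partial_high = a * b_high           # second 10x5 product
    return partial_low + (partial_high << low_bits)


# Same operands as in the simulations: 0x3FF x 0x3FF
print(hex(split_multiply(0x3FF, 0x3FF)))  # prints 0xff801
```

In hardware the two partial products would be computed in separate, shorter logic paths; here the split only demonstrates that the result is identical to a direct 10x10 multiplication.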
Here is the timing simulation of 0x3FF x 0x3FF for the multiplier with a 2-stage pipeline:

As you can see, the first clock at the 20 ns mark does not change the output. Only after the second clock (30 ns mark), plus a delay, does the valid result appear at the output pins.
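The clock-by-clock behaviour described above can be imitated with a very small behavioural model. This is a rough sketch under stated assumptions, not a description of the Altera logic: the class `PipelinedMultiplier` is hypothetical, it computes the whole product in the first stage and merely delays it through registers, and it ignores the propagation delay after the last clock.

```python
class PipelinedMultiplier:
    """Behavioural model: the product becomes visible after `stages` clock edges."""

    def __init__(self, stages: int = 2):
        if stages < 1:
            raise ValueError("a clocked multiplier needs at least one stage")
        # One register per stage; None means "no valid result yet".
        self.regs = [None] * stages

    def clock(self, a: int, b: int):
        """Simulate one rising clock edge and return the value now on Q[]."""
        # The new product enters the first register, older values move one stage on.
        self.regs = [a * b] + self.regs[:-1]
        return self.regs[-1]


mul = PipelinedMultiplier(stages=2)
print(mul.clock(0x3FF, 0x3FF))  # first clock: prints None, output not valid yet
print(mul.clock(0x3FF, 0x3FF))  # second clock: prints 1046529
```

With `stages=1` the same model returns the product on the first clock, which corresponds to a one-stage pipeline.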
With a pipelined multiplier, the result time is the sum of the pipeline delay, defined by the number of its clocks, plus the final delay. In fact, the final delay also depends somewhat on the number of pipeline stages. This is not obvious from the current example, but if anybody is interested I can show a real table/diagram for a divider where it is seen very clearly.
For a clocked 10x10-bit multiplier implemented in an Altera ACEX EP1K10-1 device with a 2-stage pipeline, the last-clock-to-result delay is:

As you see, the max. delay is about 15 ns, whereas the delay of the non-pipelined multiplier is about 22 ns. But the full time from data being clocked into the pipeline to a valid result at the output also includes one clock period (i.e. between the 1st and 2nd clocks), so it is about 10+15=25 ns for the example above, where a 100 MHz system clock is used.
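The 10+15=25 ns arithmetic can be written as a small helper. This is only a sketch: the function name `pipelined_latency_ns` is made up, and the general form (stages - 1 clock periods plus the final delay, counted from the clock edge that captures the data) is my reading of the 2-stage example above, not a vendor formula.

```python
def pipelined_latency_ns(clock_mhz: float, stages: int, final_delay_ns: float) -> float:
    """Time from the clock edge that captures the inputs to a valid output.

    Assumption: the data needs (stages - 1) further clock periods to reach
    the last stage, then `final_delay_ns` to settle on the output pins.
    """
    period_ns = 1000.0 / clock_mhz
    return (stages - 1) * period_ns + final_delay_ns


# 2-stage multiplier, 100 MHz clock, about 15 ns last-clock-to-result delay
print(pipelined_latency_ns(100, 2, 15))  # prints 25.0
```

Compare this with about 22 ns for the non-pipelined multiplier: for this small 10x10 case the 2-stage pipeline does not shorten the total time to the result.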

Now some notes I need to add.

1. The examples above use a 2-stage pipeline. In fact, for a 10x10-bit multiplication a one-stage pipeline is enough. In that case the one-stage pipelined multiplier produces a valid result about 14.2 ns after the rising clock edge. I used a two-stage pipeline here only to demonstrate its features. But there are many applications where functions with 2, 3 or even 6 pipeline stages produce the result faster than 1-stage ones. Need a little example? Well, take a 16:16-bit divider which produces a 16-bit quotient and a 16-bit remainder.

System clock: 50 MHz
-----------------+-----------------------------+---------
 Pipeline stages | Valid result time, ns       | LE used
                 | (including pipeline delay)  |
-----------------+-----------------------------+---------
 0               | 210                         | 285
 1               | 175                         | 326
 2               | 130                         | 367
 3               | 115                         | 410
 4               | 135                         | 451
-----------------+-----------------------------+---------
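To make the trend in the divider table easier to see, here is a short Python sketch that finds the fastest configuration and shows what each one costs in logic elements. The numbers are copied from the table; the names `DIVIDER_RESULTS` and `fastest_stage_count` are only illustrative.

```python
# stages -> (valid result time in ns, logic elements used), copied from the table
DIVIDER_RESULTS = {
    0: (210, 285),
    1: (175, 326),
    2: (130, 367),
    3: (115, 410),
    4: (135, 451),
}


def fastest_stage_count(results: dict) -> int:
    """Return the number of pipeline stages with the shortest valid result time."""
    return min(results, key=lambda stages: results[stages][0])


best = fastest_stage_count(DIVIDER_RESULTS)
base_time, base_le = DIVIDER_RESULTS[0]
for stages, (time_ns, le) in sorted(DIVIDER_RESULTS.items()):
    speedup = base_time / time_ns
    print(f"{stages} stages: {time_ns} ns, {speedup:.2f}x vs. no pipeline, {le - base_le:+d} LEs")
print("fastest:", best, "stages")
```

For these measurements the minimum is at 3 stages; with 4 stages the result time goes up again while the LE count keeps growing.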

Regards,
Oleg

List of 66 messages in thread

Topic Author Date
Fast Square.              01/01/70 00:00
   Square dancing              01/01/70 00:00
      table lookup???              01/01/70 00:00
   code & algorithm              01/01/70 00:00
      16*16 bit is slower than what I want.              01/01/70 00:00
         How fast?              01/01/70 00:00
            Re: How Fast              01/01/70 00:00
               ... probably impossible in 15 cycles              01/01/70 00:00
                  why cycles ?              01/01/70 00:00
                     Re: Microseconds              01/01/70 00:00
                  table lookup              01/01/70 00:00
   Natsemi appnote or CORDIC              01/01/70 00:00
      Natsemi link to appnote              01/01/70 00:00
   (a+b)^2=a^2+2*a*b+b^2              01/01/70 00:00
      Thats Slow.              01/01/70 00:00
         faster need hardware              01/01/70 00:00
         How fast do you need?              01/01/70 00:00
            Re: How Fast.              01/01/70 00:00
               Just?              01/01/70 00:00
                  Incorrect              01/01/70 00:00
                     Correct?              01/01/70 00:00
                        Whooooopa... Sorry.              01/01/70 00:00
                           Thanks              01/01/70 00:00
                        I tried...              01/01/70 00:00
                  optimum? table driven              01/01/70 00:00
      Jan metod              01/01/70 00:00
   Hardware?              01/01/70 00:00
      CPLD?              01/01/70 00:00
   SILabs f12x does it in hardware              01/01/70 00:00
      Re: SiLabs F12x              01/01/70 00:00
   Price              01/01/70 00:00
      F12x price              01/01/70 00:00
         F12x MAC              01/01/70 00:00
            provided in the datasheet              01/01/70 00:00
   Just out of interest              01/01/70 00:00
      clarification              01/01/70 00:00
      CPLD?              01/01/70 00:00
         too expensive              01/01/70 00:00
            Absolute rubbish Oleg              01/01/70 00:00
               explain              01/01/70 00:00
               your right              01/01/70 00:00
            especially for those...              01/01/70 00:00
               I need to say this....              01/01/70 00:00
               By the way.....              01/01/70 00:00
   just a demo              01/01/70 00:00
      Hang on.              01/01/70 00:00
   Oh bollocks              01/01/70 00:00
   Well oleg              01/01/70 00:00
      Please check my answer.              01/01/70 00:00
         Here you go              01/01/70 00:00
            You're having me on.              01/01/70 00:00
               Pascal?              01/01/70 00:00
               Pascal?              01/01/70 00:00
            Why ?              01/01/70 00:00
               It was changed because,,,              01/01/70 00:00
               Its because              01/01/70 00:00
   For Jez              01/01/70 00:00
      For Michael              01/01/70 00:00
   simulation              01/01/70 00:00
   Re: Fast Square              01/01/70 00:00
   Prahlad, waithing for a conclusion              01/01/70 00:00
      just an exercise...              01/01/70 00:00
      Tricky              01/01/70 00:00
         Jez asked his cat, I asked my sheep              01/01/70 00:00
      Conclusion.              01/01/70 00:00
      SPI EEPROM              01/01/70 00:00
