??? 04/18/05 09:56 Read: times Msg Score: +1 +1 Informative |
#91810 - especially for those... Responding to: ???'s previous message |
...who gave me a -1 with "not useful" as the reason.

I did not set out to be either useful or not; I just replied to Jan's question. Anyway, if somebody for some reason did not understand what I said and explained there, here are some pictures for those who find my English hard to understand.

The first picture shows an Altera multiplier. The upper one is the unclocked, non-pipelined, purely combinatorial multiplier. The bottom one is the clocked multiplier, which uses the clock input for pipelined operation. To use the pipeline, the user must provide a clock and enable its use. The next example shows a 2-stage pipeline setup:

Now about the differences between these two ways of multiplying.

Non-pipelined multiplier. See the picture above. When one or more input pins of A[] and/or B[] change, the output Q[] reflects the change after a delay. This delay depends on the device derivative, its speed, signed/unsigned mode, the bus widths, etc. Because the multiplier is not clocked, this delay does not depend on the system clock. Such a multiplier may be used in a CPLD design that does not use a clock at all. Here is the timing simulation of 0x3FF x 0x3FF:

You can see that the input data change at the 10ns mark and the valid result is available at about the 30.6ns mark. Before that time the result is not valid and changes many times because of the complex cascade of adders and carry nets. Typical delays for a non-pipelined 10x10-bit multiplier implemented in an Altera Acex EP1K10-1 device are:

Clocked multiplier with pipeline. See the picture above once again. When one or more input pins of A[] and/or B[] change, the output Q[] does not reflect the change immediately, but only after a rising clock edge. Depending on how many stages deep the pipeline is, the valid result appears on the output lines after the corresponding number of clock cycles. One idea behind a pipeline is that, once the number of stages is defined, the whole operation can be divided into small parts, one per stage.
Small parts execute quickly and do not need the large, complex logic of the purely combinatorial implementation. For example, a multiplication may be done not as 10x10 bits but as two 10x5 partial products, 10x5 + ((10x5)<<5), instead. This shortens the carry chains and gives a faster response after the final clock. Here is the timing simulation of 0x3FF x 0x3FF for the multiplier with a 2-stage pipeline:

As you can see, the first clock at the 20ns mark does not change the output. Only after the second clock (30ns mark) plus a delay does the valid result appear on the output pins. With a pipelined multiplier, the result time is the sum of the pipeline delay, set by its number of clocks, and the final delay. In fact, the final delay also depends somewhat on the number of pipeline stages. It is not obvious in the current example, but if somebody is interested I can show a real table/diagram for a divider where it is seen very well. For the clocked 10x10-bit multiplier implemented in an Altera Acex EP1K10-1 device with a 2-stage pipeline, the last-clock-to-result delay is:

As you see, the max. delay is about 15ns, whereas the delay for the non-pipelined multiplier is about 22ns. But we know that the full time from data being clocked into the pipeline to a valid result at the output includes one clock period (i.e. between the 1st and 2nd clock), so it is about 10+15=25ns for the example above, where a 100MHz system clock is used.

Now some notes I need to add.

1. The examples above use a 2-stage pipeline. In fact, for 10x10-bit multiplication a one-stage pipeline is enough. In that case the one-stage pipelined multiplier produces a valid result about 14.2ns after the rising clock edge. I used a two-stage pipeline here only to demonstrate its features. But there are many applications where 2-, 3- or even 6-stage pipelined functions produce the result faster than 1-stage ones. Need a little example? Well, take a 16:16-bit divider which produces a 16-bit quotient and a 16-bit remainder.
System clock: 50MHz

----------+---------------------+---------+
Number of | Valid result time   | LE used
pipeline  | (including          |
stages    | pipeline delay)     |
----------+---------------------+---------+
    0     |       210ns         |   285
    1     |       175ns         |   326
    2     |       130ns         |   367
    3     |       115ns         |   410
    4     |       135ns         |   451
----------+---------------------+---------+

Regards,
Oleg
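
For readers who prefer code to waveforms, here is a minimal behavioral sketch in Python of the 2-stage pipelined 10x10-bit multiplier discussed in the post. It is only an illustration under assumptions: the class name `TwoStageMultiplier` and the helper `result_time_ns` are hypothetical, and putting the two 10x5 partial products in stage 1 and their sum in stage 2 is just one possible way to split the work, not necessarily how the Altera multiplier is built internally. The timing helper simply restates the 10+15=25ns arithmetic from the post.

```python
class TwoStageMultiplier:
    """Behavioral model (not real hardware) of a 10x10-bit, 2-stage pipelined multiplier."""

    def __init__(self):
        self.stage1 = (0, 0)  # registered 10x5 partial products: (low half, high half)
        self.q = 0            # registered output Q[]

    def clock(self, a, b):
        """Model one rising clock edge with A[] = a and B[] = b on the input pins."""
        # Stage 2: sum the partial products that were captured on the previous edge.
        lo, hi = self.stage1
        self.q = lo + (hi << 5)
        # Stage 1: capture two 10x5 partial products for the current inputs.
        a &= 0x3FF
        b &= 0x3FF
        self.stage1 = (a * (b & 0x1F), a * (b >> 5))
        return self.q


def result_time_ns(stages, clock_period_ns, clock_to_out_ns):
    """Time from the first clock edge to a valid result.

    Assumes one full clock period between consecutive stages plus the
    last-clock-to-result delay, as in the 2-stage example of the post.
    """
    return (stages - 1) * clock_period_ns + clock_to_out_ns


mult = TwoStageMultiplier()
first = mult.clock(0x3FF, 0x3FF)    # first edge: output still holds the old value (0)
second = mult.clock(0x3FF, 0x3FF)   # second edge: valid product appears
print(first, hex(second))
print(result_time_ns(stages=2, clock_period_ns=10, clock_to_out_ns=15))  # 100MHz clock
```

In this model the first call to `clock()` leaves the output unchanged and the second call delivers 0x3FF x 0x3FF, which mirrors the simulation described above where only the second clock edge produces the valid result.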