This paper shows how a general class of algorithms, each
consisting of a loop with a loop-carried dependency of distance
one, can be mapped to a parametric hardware design that supports
pipelining and replication. A technology-independent
parametric model of the proposed design is developed to
capture the variations of area and throughput with the number
of pipeline stages and replications. This model allows design
parameters to be optimised rapidly using only a few pre-synthesis operations.
We present an optimisation process based on this model,
and apply it to a Montgomery multiplier implementation on
a Xilinx XC5VLX50T FPGA. Our approach is shown to be
capable of accurately predicting the values of the parameters
that maximise the throughput of the multiplier. In particular,
it requires up to 6.5 times fewer synthesis operations than
a complete search through the design space of
the multiplier. This would speed up the synthesis process by
a factor of 14, saving more than 23 hours of development time.
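
As an illustration of the class of algorithms described above, the following is a minimal software sketch of a textbook radix-2 (bit-serial) Montgomery multiplication. It is not the hardware design from the paper; the function name montgomery_multiply and its parameters are chosen here for illustration only. Each iteration reads the accumulator s written by the iteration immediately before it, which is the loop-carried dependency of distance one.

```python
def montgomery_multiply(a, b, m, n_bits):
    """Return a * b * 2**(-n_bits) mod m.

    Assumes m is odd and 0 <= a, b < m < 2**n_bits.
    """
    s = 0  # accumulator carried from one iteration to the next
    for i in range(n_bits):
        a_i = (a >> i) & 1   # i-th bit of a, least significant first
        s += a_i * b         # conditionally add the multiplicand
        if s & 1:            # make s even so the halving below is exact
            s += m
        s >>= 1              # divide by 2; s stays below 2 * m
    if s >= m:               # single final correction into [0, m)
        s -= m
    return s


if __name__ == "__main__":
    # Example with m = 13 and n_bits = 4, so the implicit factor is 2**-4 mod 13.
    print(montgomery_multiply(7, 9, 13, 4))
```

Because iteration i cannot start until iteration i - 1 has produced s, a pipelined datapath for a single multiplication cannot be kept full by that multiplication alone, which is why the number of pipeline stages and the number of replicated units both affect throughput.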