This paper describes CHIPS, a system based on integer linear programming (ILP) that identifies custom instructions for critical code segments, given the available data bandwidth and transfer latencies between custom logic and a baseline processor with architecturally visible state registers. Our approach lets designers optionally constrain the number of input and output operands for custom instructions. We describe a design flow to identify promising area, performance, and code size tradeoffs. We study the effect of input/output constraints, register file ports, and compiler transformations such as if-conversion. Our experiments show that, in most cases, the solutions with the highest performance are identified when the input/output constraints are removed. However, input/output constraints help our algorithms identify frequently used code segments, reducing the overall area overhead. We report results for 11 benchmarks covering cryptography and multimedia, with speed-ups between 1.7 and 6.6 times, code size reductions between 6% and 72%, and area costs ranging between 12 and 256 adders for maximum speed-up. Our ILP-based approach scales well: benchmarks with basic blocks consisting of more than 1000 instructions can be solved optimally, in most cases within a few seconds.