Multiplier & Divider Guide
Note
This guide helps you choose a multiplier or divider style for your Sky130 design. Each option has different trade-offs in speed, area, and implementation complexity.
Multipliers
Serial (Shift-and-Add)
How it works: Iteratively shifts the multiplicand and conditionally adds into an accumulator.
Unsigned: Straightforward; each bit of the multiplier is tested.
Signed: Requires sign-extension and two’s complement handling.
Hardware cost:
- Very small (1 adder + shifter + control FSM).
- Slow (N cycles for an N-bit multiply).
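As a rough behavioral sketch of the shift-and-add loop described above (not RTL, and not taken from any particular Sky130 core), the following Python model treats each loop iteration as one clock cycle. The function name `shift_add_multiply` and the default width `n=8` are illustrative assumptions.

```python
def shift_add_multiply(a, b, n=8):
    """Behavioral model of an unsigned n-bit shift-and-add multiplier.

    Returns (product, cycles). One loop iteration stands for one clock cycle.
    The name and the default width are hypothetical, chosen for illustration.
    """
    mask = (1 << n) - 1
    multiplicand = a & mask
    multiplier = b & mask
    acc = 0
    cycles = 0
    for _ in range(n):
        if multiplier & 1:        # test the current multiplier bit
            acc += multiplicand   # conditional add into the accumulator
        multiplicand <<= 1        # shift the multiplicand left by one
        multiplier >>= 1          # expose the next multiplier bit
        cycles += 1
    return acc, cycles
```

Calling `shift_add_multiply(13, 11)` walks through eight iterations, which mirrors the N-cycle latency listed under hardware cost.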
Array Multiplier
How it works: Generates all partial products in parallel and sums them with adders.
Unsigned: Direct grid of AND gates feeding adders.
Signed: Needs two’s complement partial product correction.
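Hardware cost:
- Large area (O(N^2) AND gates and adder cells).
- Single-cycle latency when built as pure combinational logic, but the critical path grows roughly linearly with N, which limits the achievable clock frequency.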
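The partial-product grid can be illustrated with a small Python model. This is only a sketch under the assumption of unsigned operands; the name `array_multiply` is hypothetical, and Python's `+` stands in for each row of hardware adders.

```python
def array_multiply(a, b, n=8):
    """Behavioral model of an unsigned n-bit array multiplier.

    Returns (product, and_gate_count). Hypothetical helper for illustration.
    """
    # Grid of AND gates: pp[i][j] = a[j] AND b[i]
    pp = [[(a >> j) & (b >> i) & 1 for j in range(n)] for i in range(n)]
    product = 0
    for i, row in enumerate(pp):
        row_value = sum(bit << j for j, bit in enumerate(row))
        product += row_value << i   # row i carries weight 2**i
    return product, n * n
```

The second return value makes the quadratic growth visible: an 8-bit multiply needs 64 AND gates, a 16-bit multiply needs 256.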
Wallace Tree / Dadda Tree
How it works: Partial products are reduced quickly using carry-save adders in a tree.
Unsigned: Direct implementation from AND partial products.
Signed: Adds complexity in partial product generation.
Hardware cost:
- Area: Medium–High.
- Speed: Faster than array; the reduction depth grows logarithmically with N rather than linearly.
- Often pipelined for high Fmax.
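To make the carry-save reduction concrete, here is a minimal Python sketch of a Wallace-style tree. It assumes unsigned operands and groups rows three at a time with a word-wide 3:2 compressor; the names `csa` and `tree_multiply` are illustrative, and a Dadda tree would schedule the compressors differently.

```python
def csa(x, y, z):
    """3:2 carry-save adder: three words in, a sum word and a carry word out."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def tree_multiply(a, b, n=8):
    """Unsigned multiply using Wallace-style carry-save reduction.

    Returns (product, levels), where levels counts the reduction stages.
    """
    # One partial-product row per multiplier bit, already shifted to its weight.
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(n)]
    levels = 0
    while len(rows) > 2:
        full = len(rows) // 3 * 3           # rows that fit into groups of three
        nxt = []
        for k in range(0, full, 3):
            nxt.extend(csa(rows[k], rows[k + 1], rows[k + 2]))
        nxt.extend(rows[full:])             # leftover rows pass through unchanged
        rows = nxt
        levels += 1
    return sum(rows), levels                # final carry-propagate addition
```

For eight partial products the row count shrinks 8, 6, 4, 3, 2, so the model reports four reduction levels before the final adder.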
Booth Encoded Multiplier
How it works: Reduces the number of partial products using Booth recoding.
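Unsigned: Also applicable if the operands are zero-extended so the recoding sees a non-negative value; radix-4 recoding still roughly halves the number of partial products.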
Signed: Very effective for signed multiplications.
Hardware cost:
- Area: Medium.
- Speed: High when pipelined.
- Popular for DSPs and CPUs.
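The sketch below shows one common form of Booth recoding, radix-4 (modified Booth), as a Python model. The guide does not say which radix it has in mind, so radix-4 is an assumption here, as are the names `booth_radix4_digits` and `booth_multiply` and the requirement that `n` be even.

```python
def booth_radix4_digits(b, n=8):
    """Recode an n-bit two's complement multiplier (n even) into n/2 digits.

    Each digit is in {-2, -1, 0, 1, 2}; digit k has weight 4**k.
    """
    b &= (1 << n) - 1
    digits = []
    prev = 0                                # implicit bit b[-1] = 0
    for i in range(0, n, 2):
        b0 = (b >> i) & 1
        b1 = (b >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)   # -2*b[i+1] + b[i] + b[i-1]
        prev = b1
    return digits


def booth_multiply(a, b, n=8):
    """Signed a * b, where b fits in n-bit two's complement."""
    product = 0
    for k, d in enumerate(booth_radix4_digits(b, n)):
        product += (d * a) << (2 * k)       # each digit selects 0, +/-a, or +/-2a
    return product
```

An 8-bit multiplier yields four recoded digits, so only four partial products need to be summed instead of eight.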
—
Dividers
Iterative Restoring Divider
How it works: Subtract divisor from partial remainder, shift, repeat.
Unsigned: Simple subtract-shift loop.
Signed: Handle signs separately, apply two’s complement rules.
Hardware cost:
- Small (one subtractor + FSM).
- Slow (N cycles for an N-bit division).
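The subtract-shift loop can be sketched in Python as follows. This is a behavioral model for unsigned operands only; the name `restoring_divide` and the default width are assumptions made for illustration.

```python
def restoring_divide(dividend, divisor, n=8):
    """Behavioral model of an unsigned n-bit restoring divider.

    Returns (quotient, remainder). One loop iteration stands for one cycle.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = 0
    quotient = 0
    for i in range(n - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor          # trial subtraction
        if trial >= 0:
            remainder = trial                # keep the result, quotient bit = 1
            quotient |= 1 << i
        # Otherwise the old remainder is kept (the "restore"), quotient bit = 0.
    return quotient, remainder
```

In hardware the restore is usually a mux that discards the subtractor output rather than a literal add-back, but the quotient and remainder come out the same.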
Non-Restoring Divider
How it works: Improves on restoring by avoiding unnecessary add-back steps.
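Hardware cost:
- Similar area to restoring (one adder/subtractor + FSM).
- Still N cycles, but each cycle performs a single add or subtract with no restore step, so the per-cycle critical path is typically shorter. A final remainder correction may be needed.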
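A minimal Python sketch of the non-restoring scheme, assuming unsigned operands, is shown below. The name `nonrestoring_divide` is hypothetical. The partial remainder is allowed to go negative, and the next step adds the divisor instead of subtracting it.

```python
def nonrestoring_divide(dividend, divisor, n=8):
    """Behavioral model of an unsigned n-bit non-restoring divider.

    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    r = 0
    q = 0
    for i in range(n - 1, -1, -1):
        bit = (dividend >> i) & 1
        if r >= 0:
            r = 2 * r + bit - divisor   # previous remainder non-negative: subtract
        else:
            r = 2 * r + bit + divisor   # previous remainder negative: add back while shifting
        if r >= 0:
            q |= 1 << i                 # quotient bit is 1 when the remainder is non-negative
    if r < 0:
        r += divisor                    # single correction step at the end
    return q, r
```

Each iteration does exactly one add or subtract, and the only extra work is the single correction after the loop.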
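Newton–Raphson / Reciprocal Multiply
How it works: Iteratively refines an approximation of the divisor's reciprocal, then multiplies it by the dividend (good for floating-point).
Unsigned: A LUT supplies the initial reciprocal estimate; multipliers perform the refinement steps.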
Signed: Extra sign/normalization handling.
Hardware cost:
- Requires multipliers + LUTs.
- High speed, higher area.
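The reciprocal refinement can be sketched in floating-point Python. The iteration is x_{k+1} = x_k * (2 - d * x_k) with d scaled into [0.5, 1). In place of a LUT, this sketch uses the well-known linear seed 48/17 - (32/17) * d; that substitution, the iteration count, and the names `nr_reciprocal` and `nr_divide` are assumptions for illustration.

```python
import math


def nr_reciprocal(d, iterations=3):
    """Approximate 1/d for d > 0 using Newton-Raphson refinement."""
    if d <= 0:
        raise ValueError("this sketch handles positive divisors only")
    # Normalize: d = mantissa * 2**exponent with mantissa in [0.5, 1).
    mantissa, exponent = math.frexp(d)
    # Linear seed standing in for a hardware LUT lookup.
    x = 48.0 / 17.0 - (32.0 / 17.0) * mantissa
    for _ in range(iterations):
        x = x * (2.0 - mantissa * x)   # each step roughly squares the relative error
    return math.ldexp(x, -exponent)    # undo the normalization


def nr_divide(a, d, iterations=3):
    """Divide by multiplying with the approximate reciprocal."""
    return a * nr_reciprocal(d, iterations)
```

Every iteration costs two multiplications and one subtraction, which is why this style leans on fast multipliers. An integer divider built this way would also need a final rounding or correction step, which the sketch omits.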
—
Signed vs Unsigned: Hardware Costs
Unsigned arithmetic:
- Simpler logic.
- Direct AND/OR/XOR building blocks.
Signed arithmetic (two’s complement):
- Needs sign extension.
- Extra correction logic (e.g., partial product negation in Booth).
- Roughly 5–15% more area depending on architecture.
- Same cycle count, but often slightly lower max clock frequency.
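The extra steps on the signed path can be illustrated with two small Python helpers. This is a sketch only: `sign_extend` and `signed_divide` are hypothetical names, `divmod` on the magnitudes stands in for whichever unsigned divider is used, and truncation toward zero is assumed as the rounding convention.

```python
def sign_extend(value, from_bits):
    """Interpret the low from_bits of value as a two's complement number."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign


def signed_divide(a, b):
    """Signed division built around an unsigned magnitude divider.

    Assumes truncation toward zero; the remainder takes the dividend's sign.
    """
    if b == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    q, r = divmod(abs(a), abs(b))   # stand-in for any unsigned divider core
    if (a < 0) != (b < 0):
        q = -q                      # quotient sign is the XOR of the operand signs
    if a < 0:
        r = -r                      # remainder follows the dividend
    return q, r
```

The absolute-value step before the core and the conditional negations after it are the kind of wrapper logic that accounts for the additional area on the signed path.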
—
Choosing the Right Style
Guidelines
For small area (tiny cores, low-power): use serial multiplier or iterative divider.
For balanced area/speed: use Booth multiplier and non-restoring divider.
For maximum speed (DSP/CPU): use Wallace tree / pipelined Booth multiplier and Newton–Raphson divider.
Always check whether you really need signed division; it adds sign-handling and result-correction logic on top of the unsigned datapath.
—
Next Steps
Check timing/area trade-offs using Yosys + OpenSTA before finalizing.