Bounds on the minimum number of data transfers (i.e., loads, stores, and copies) required by WFTA and FFT programs are presented. The analysis is applicable to those general-purpose computers with M general processor registers (e.g., the IBM 370, PDP-11, etc.) where M is much smaller than the
transform length. It is shown that the 1008-point WFTA requires about 21 percent more data transfers than the 1024-point radix-4 FFT; on the other hand, the 120-point WFTA has about the same number of data transfers as the mixed-radix (4 × 4 × 4 × 2) version of the 128-point FFT and 22 percent fewer than the radix-2 version. Finally, comparisons of the "total" program execution times (multiplications, additions, and data transfers, but not indexing or permutations) are presented.