
Floating-Point Arithmetic in VDSP (BF561) (12): fract16 Addition and Subtraction


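Since subtraction can be treated as adding a negative number, we only need to look at addition. Addition of fract16 values is performed by the add_fr1x16 function: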

    #pragma inline
    #pragma always_inline
    static fract16 add_fr1x16(fract16 __a, fract16 __b)
    {
        fract16 __rval = __builtin_add_fr1x16(__a, __b);
        return __rval;
    }

This shows that add_fr1x16 is just a thin wrapper, so we can call __builtin_add_fr1x16 directly.

Here is a very simple test program:

    typedef fract16 ft;

    ft calc(ft x, ft y)
    {
        ft r;
        r = __builtin_add_fr1x16(x, y);
        return r;
    }

The assembly code generated for this function is:

    _calc:
    .LN_calc:
    //-------------------------------------------------------------------
    //   Procedure statistics:
    //   Frame size                     = 8
    //   Scratch registers used:{R0.L,R0.H,R1.L,ASTAT0-ASTAT1}
    //   Call preserved registers used:{FP,SP,RETS}
    //-------------------------------------------------------------------
    // line ".\float_test.c":27
        LINK 0;
        W[FP + 12] = R1;
        W[FP + 8] = R0;
    .LN0:
    // line 29
        R0.L = R1.L + R0.L (S);
        R0 = R0.L (X);
        W[FP + 16] = R0;
    .LN1:
    // line 30
        UNLINK;
        RTS;
    .LN._calc.end:
    ._calc.end:
        .global _calc;
        .type _calc,STT_FUNC;

As we can see, __builtin_add_fr1x16 expands to a single instruction:

R0.L = R1.L + R0.L (S);

The addition itself therefore takes only one cycle.

The VDSP documentation describes this instruction as follows:

    Data Registers - 16-Bit Operands, 16-Bit Result
    Dreg_lo_hi = Dreg_lo_hi + Dreg_lo_hi (sat_flag) ;    /* (b) */

    Dreg_lo_hi: R7–0.L, R7–0.H
    sat_flag: nonoptional saturation flag, (S) or (NS)

    In the syntax, where sat_flag appears, substitute one of the following values.
    (S) – saturate the result
    (NS) – no saturation

Because this is a signed add instruction, saturation comes into play:

Saturation is a technique used to contain the quantity within the values that the destination register can represent. When a value is computed that exceeds the capacity of the destination register, then the value written to the register is the largest value that the register can hold with the same sign as the original.

If an operation would otherwise cause a positive value to overflow and become negative, instead, saturation limits the result to the maximum positive value for the size register being used.

Conversely, if an operation would otherwise cause a negative value to overflow and become positive, saturation limits the result to the maximum negative value for the register size.

The maximum positive value in a 16-bit register is 0x7FFF. The maximum negative value is 0x8000. For a signed two's-complement 1.15 fractional notation, the allowable range is –1 through (1 – 2^–15).

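When the sum of two numbers exceeds the largest value fract16 can represent, the instruction simply returns that largest value, 0x7fff, which is 0.999969482421875. When the result is less than -1, it returns -1 (0x8000).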
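The saturation rule above can be modelled in portable C, which is handy for checking expected values on a host PC. This is only a sketch under assumptions: the names fract16_model, sat_add_fr16 and fr16_to_double are hypothetical, and the code imitates the documented behaviour rather than reproducing how VisualDSP++ implements __builtin_add_fr1x16.

```c
#include <stdint.h>

/* Hypothetical stand-in for fract16: a raw 1.15 value in a 16-bit integer. */
typedef int16_t fract16_model;

/* Hypothetical host-side model of a saturating 1.15 add.
   The sum is computed exactly in 32 bits, then clamped to the
   16-bit range 0x8000..0x7FFF, as the (S) flag is documented to do. */
static fract16_model sat_add_fr16(fract16_model a, fract16_model b)
{
    int32_t sum = (int32_t)a + (int32_t)b;

    if (sum > INT16_MAX) {
        sum = INT16_MAX;   /* positive overflow: 0x7FFF */
    }
    if (sum < INT16_MIN) {
        sum = INT16_MIN;   /* negative overflow: 0x8000 */
    }
    return (fract16_model)sum;
}

/* Interpret a raw 1.15 value as a real number in [-1, 1 - 2^-15]. */
static double fr16_to_double(fract16_model x)
{
    return (double)x / 32768.0;
}
```

In this model, sat_add_fr16(0x4000, 0x4000) (0.5 + 0.5) yields 0x7FFF, and fr16_to_double(0x7FFF) gives 0.999969482421875, matching the value quoted above.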

Since fract16 is not a built-in type but is simply defined as short, let's see what happens if we use the plain + operator instead of __builtin_add_fr1x16:

    typedef fract16 ft;

    ft calc(ft x, ft y)
    {
        ft r;
        r = x + y;
        return r;
    }

The assembly output becomes:

    // line 29
        R0 = R0 + R1 (NS);
        R0 = R0.L (X);
        W[FP + 16] = R0;

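The difference is significant. The compiler turns the expression into a 32-bit addition and then keeps only the sign-extended low 16 bits, so the result is not saturated. On overflow it wraps around to a value we do not want.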
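To make the wrap-around concrete, here is a small host-side sketch of the two-instruction sequence (32-bit add, then sign-extend the low half). The function name wrap_add16 is hypothetical, and the code is an illustration of the arithmetic, not compiler output.

```c
#include <stdint.h>

/* Hypothetical model of "R0 = R0 + R1 (NS); R0 = R0.L (X);":
   add in 32 bits, keep the low 16 bits, sign-extend them. */
static int16_t wrap_add16(int16_t a, int16_t b)
{
    /* Unsigned arithmetic so the 32-bit add is well defined in C. */
    uint32_t sum = (uint32_t)(int32_t)a + (uint32_t)(int32_t)b;
    uint16_t low = (uint16_t)(sum & 0xFFFFu);

    /* Manual sign extension of the low half. */
    if (low & 0x8000u) {
        return (int16_t)((int32_t)low - 65536);
    }
    return (int16_t)low;
}
```

With this model, wrap_add16(0x4000, 0x4000) returns 0x8000 as a signed value, so 0.5 + 0.5 comes out as -1.0 rather than the saturated 0.999969482421875. Likewise, adding 0x8000 and 0xFFFF (-1.0 plus the smallest negative step) wraps to 0x7FFF, a large positive number.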

Subtraction works the same way, except that it calls the following function instead:

    #pragma inline
    #pragma always_inline
    static fract16 sub_fr1x16(fract16 __a, fract16 __b)
    {
        fract16 __rval = __builtin_sub_fr1x16(__a, __b);
        return __rval;
    }
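For checking expected results on a host PC, a saturating 1.15 subtraction can be sketched in portable C as below. The name sat_sub_fr16 is hypothetical, and the code assumes that the built-in clamps to 0x8000..0x7FFF in the same way as the addition. It is a model, not the VisualDSP++ implementation.

```c
#include <stdint.h>

/* Hypothetical host-side model of a saturating 1.15 subtract.
   The difference is computed exactly in 32 bits and then clamped
   to the 16-bit range, mirroring the saturating add. */
static int16_t sat_sub_fr16(int16_t a, int16_t b)
{
    int32_t diff = (int32_t)a - (int32_t)b;

    if (diff > INT16_MAX) {
        diff = INT16_MAX;  /* positive overflow: 0x7FFF */
    }
    if (diff < INT16_MIN) {
        diff = INT16_MIN;  /* negative overflow: 0x8000 */
    }
    return (int16_t)diff;
}
```

Computing the difference directly in 32 bits also sidesteps one corner case in this model: b = 0x8000 (-1.0) has no positive 16-bit counterpart, so it cannot be negated in 16 bits before adding.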