
Title: [Topic Discussion] Emulator Hardcore Research Series 4: Nightmare on Floating-Point Street

Author: SSforME    Time: 2021-7-13 13:46     Title: Emulator Hardcore Research Series 4: Nightmare on Floating-Point Street

Note: the original article was published on July 25, 2006.

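It is very hard to emulate the R5900 floating-point unit (FPU) and the Vector Units (VU) on an x86 CPU, because the PlayStation 2 does not follow the IEEE standard. Multiplying two numbers on the FPU, the VU, and an x86 processor can give three different results that differ by a couple of bits. Square root and division are even less accurate.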

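Rounding mode is a problem, but floating-point infinities are the real nightmare. The IEEE standard states that when a number overflows (that is, its magnitude exceeds 3.4028234663852886E+38), the result is infinity. Any nonzero number multiplied by infinity is still infinity, and 0 x infinity is not even a number (NaN). That sounds fine until you discover that the VUs do not support infinities at all. Instead, they clamp every large number to the largest representable floating-point value. This discrepancy makes many games misbehave.
The simplest solution is to clamp the written vector of the current instruction. This requires two SSE operations, is slow, and sometimes does not even solve the problem. Above all, you can never ignore the possibility that game developers loaded overflowed floating-point values into the VUs to begin with. Some games also clear vectors by multiplying them by zero; the VU does not care about the overflowed value in that case, but x86 does.
These two problems make it hard for floating-point emulation to be both fast and accurate. The resulting bugs vary widely, from screen flickering while a character fades out to the common spiky polygon syndrome (widely known as SPS).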

Moral of the blog: when comparing two floating-point numbers a and b, never use a == b. Use something along these lines instead:
fabs(a-b) < epsilon
Author: SSforME    Time: 2021-7-13 13:48

Nightmare on Floating-Point Street
It is very hard to emulate the floating-point calculations of the R5900 FPU and the Vector Units on an x86 CPU because the PlayStation 2 does not follow the IEEE standard. Multiplying two numbers on the FPU, VU, and an x86 processor can give you three different results, all differing by a couple of bits! Operations like square root and division are even more imprecise.

Originally, we thought that a couple of bits shouldn't matter, and that game developers would be crazy to rely on such precise calculation. Floating-point values are mostly used for world transformations or interpolation calculations, so no one would care if their Holy Sword of Armageddon was 0.00001 meters off from the main player's hand. In short, we were wrong, and game developers are crazier than we thought. Games started breaking just by changing the floating-point rounding mode!
While rounding mode is a problem, the bigger nightmare is the floating-point infinities. The IEEE standard states that when a number overflows (meaning that its magnitude is larger than 3.4028234663852886E+38), the result will be infinity. Any nonzero number multiplied by infinity is infinity, and 0 * infinity is not even a number (NaN). That sounds great until you figure out that the VUs don't support infinities. Instead, they clamp all large numbers to the largest floating-point value possible. This discrepancy breaks a lot of games!
For example, let's say a game developer tries to normalize a zero vector by dividing by its length, which is 0. On the VU, the end result will be (0,0,0). On x86/IEEE, every component turns into a non-finite value (strictly speaking NaN, since 0/0 and 0 * infinity are both NaN). Now if the game developer uses this vector to perturb some faces for artificial hair or some type of animation, all final positions on the PS2 will remain the same. All final positions on x86 will become non-finite... and there goes the game's graphics. Now figure out where the problem occurred.
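The zero-vector case can be reproduced in a few lines of C. This is only a minimal sketch, not Pcsx2 code: `Vec3`, `clamp_like_vu`, `normalize_ieee` and `normalize_clamped` are hypothetical names, and the clamp is a simplified stand-in for the VU behaviour described above. The vector length is passed in by the caller to keep the sketch short. Note that under strict IEEE arithmetic the bad components come out as NaN.

```c
#include <float.h>
#include <math.h>

typedef struct { float x, y, z; } Vec3;   /* hypothetical vector type */

/* Assumed VU-like behaviour: anything that is NaN or beyond the finite
   range is replaced by +/-FLT_MAX, keeping the sign bit. */
static float clamp_like_vu(float v)
{
    if (isnan(v))
        return signbit(v) ? -FLT_MAX : FLT_MAX;
    if (v > FLT_MAX)
        return FLT_MAX;
    if (v < -FLT_MAX)
        return -FLT_MAX;
    return v;
}

/* Plain IEEE arithmetic: for a zero vector, 1/0 is +inf and 0 * inf is NaN. */
static Vec3 normalize_ieee(Vec3 v, float length)
{
    float inv_len = 1.0f / length;
    Vec3 r = { v.x * inv_len, v.y * inv_len, v.z * inv_len };
    return r;
}

/* Same computation with every intermediate clamped: 1/0 becomes FLT_MAX,
   and 0 * FLT_MAX is simply 0. */
static Vec3 normalize_clamped(Vec3 v, float length)
{
    float inv_len = clamp_like_vu(1.0f / length);
    Vec3 r = {
        clamp_like_vu(v.x * inv_len),
        clamp_like_vu(v.y * inv_len),
        clamp_like_vu(v.z * inv_len)
    };
    return r;
}
```

Calling both functions with a zero vector and a length of 0 shows the split: the IEEE version poisons every component, while the clamped version returns (0,0,0), matching what the post says the PS2 produces.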

The simplest solution is to clamp the written vector of the current instruction. This requires 2 SSE operations and is SLOW, and sometimes it doesn't even work. To top it off, you can never dismiss the fact that game developers can be loading bad floating-point data to the VUs to begin with! Some games zero out vectors by multiplying them by zero, so the VU doesn't care at all what kind of garbage the original vector's data has, but x86 does care.
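The two-operation clamp could look roughly like the following with SSE intrinsics. This is a hedged sketch of the idea, not the emulator's actual code: `clamp_vector` and `clamp4` are hypothetical names, and it assumes an x86 target with SSE available.

```c
#include <float.h>
#include <xmmintrin.h>   /* SSE intrinsics */

/* Clamp all four lanes to [-FLT_MAX, +FLT_MAX] using one MINPS and one MAXPS.
   MINPS returns its second operand when a lane is NaN, so a NaN lane ends up
   as +FLT_MAX here regardless of its sign bit. */
static __m128 clamp_vector(__m128 v)
{
    const __m128 pos_max = _mm_set1_ps(FLT_MAX);
    const __m128 neg_max = _mm_set1_ps(-FLT_MAX);
    v = _mm_min_ps(v, pos_max);   /* +inf -> +FLT_MAX */
    v = _mm_max_ps(v, neg_max);   /* -inf -> -FLT_MAX */
    return v;
}

/* Convenience wrapper for plain float arrays (hypothetical helper). */
static void clamp4(const float in[4], float out[4])
{
    _mm_storeu_ps(out, clamp_vector(_mm_loadu_ps(in)));
}
```

Even this cheap pair of instructions has to run after every vector write, which gives a feel for why the post calls the approach slow. The lost sign on NaN lanes is one example of how it can still give the wrong answer.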

These two problems make floating-point emulation very hard to do quickly and accurately. The bugs range from screen flickering when a fade occurs, to disappearing characters, to spiky polygon syndrome (the most common problem, widely known as SPS).
In the end Pcsx2 does all its floating-point operations with SSE since it is easier to cache the registers. Two different rounding modes are used for the FPU and VUs. Whenever a divide or rsqrt occurs on the FPU, overflow is checked. Overflow is checked much more frequently with the VUs. The fact that VUs handle both integer and floating-point data in the same SSE register makes the checking take a little longer. In the future, Pcsx2 will read the rounding mode and overflow settings from the patch files, so that every game can be accommodated with the best/fastest settings.
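To see how much the rounding mode alone can change a result, here is a small sketch that switches the SSE rounding mode around a single division. It assumes an x86-64 build, where scalar float arithmetic is done with SSE and therefore obeys the MXCSR rounding bits. `divide_with_rounding` is a hypothetical helper, and the sketch does not claim which modes the emulator actually picks for the FPU or the VUs.

```c
#include <xmmintrin.h>   /* _MM_GET_ROUNDING_MODE, _MM_SET_ROUNDING_MODE */

/* Divide a by b under the given SSE rounding mode (for example
   _MM_ROUND_NEAREST or _MM_ROUND_TOWARD_ZERO) and restore the previous
   mode afterwards. The volatile variables keep the compiler from folding
   the division at compile time or moving it across the mode switches. */
static float divide_with_rounding(float a, float b, unsigned int mode)
{
    volatile float va = a, vb = b, result;
    unsigned int saved = _MM_GET_ROUNDING_MODE();

    _MM_SET_ROUNDING_MODE(mode);
    result = va / vb;
    _MM_SET_ROUNDING_MODE(saved);

    return result;
}
```

Dividing 1.0f by 3.0f with `_MM_ROUND_TOWARD_ZERO` gives a value one unit in the last place below the `_MM_ROUND_NEAREST` result. That is exactly the kind of one-bit difference the post says was enough to break games.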

Moral of the blog: when comparing two floating-point numbers a and b, never use a == b. Instead, use something along the lines of

fabs(a-b) < epsilon

where epsilon is some very small number.
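Wrapped into functions, the comparison might look like this. The names `nearly_equal` and `nearly_equal_rel` are made up for illustration. The second, relative variant is an addition that is not in the post; it is included because a fixed epsilon only suits values of roughly known magnitude.

```c
#include <math.h>

/* Absolute comparison, exactly the fabs(a-b) < epsilon idea from the post. */
static int nearly_equal(float a, float b, float epsilon)
{
    return fabsf(a - b) < epsilon;
}

/* Relative variant (an assumption beyond the post): scale the tolerance by
   the larger magnitude so it also works for very large or very small values. */
static int nearly_equal_rel(float a, float b, float rel_epsilon)
{
    float abs_a = fabsf(a);
    float abs_b = fabsf(b);
    float largest = (abs_a > abs_b) ? abs_a : abs_b;
    return fabsf(a - b) <= rel_epsilon * largest;
}
```

For example, `nearly_equal(1.0f, 1.0f + 1e-7f, 1e-5f)` is true. For values around 1e30, only the relative form can treat two neighbouring floats as equal, because their absolute difference is far larger than any small epsilon.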
Author: SSforME    Time: 2021-7-13 13:50




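The main point is that when a float's exponent exceeds 127, the PC treats the value as a special, non-normalized encoding (infinity or NaN), whereas the PS2 simply treats the exponent as 128.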
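One way to picture this is to decode the raw bits both ways. The sketch below is an interpretation of the reply above, not verified hardware behaviour: `ps2_style_value` is a hypothetical function that reads a biased exponent of 255 as an ordinary exponent of 128. It returns a double so that the out-of-range value can be represented, and it does not model any other PS2 differences.

```c
#include <stdint.h>
#include <string.h>

/* Decode single-precision bits, treating biased exponent 255 (unbiased 128)
   as a normal exponent instead of the IEEE infinity/NaN encoding.
   All other encodings are returned as their usual IEEE value. */
static double ps2_style_value(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;
    double magnitude;

    if (exponent != 255u) {
        float f;
        memcpy(&f, &bits, sizeof f);   /* reinterpret the bits as a float */
        return (double)f;
    }

    /* (1 + fraction / 2^23) * 2^128 */
    magnitude = (1.0 + fraction / 8388608.0) * 0x1p128;
    return (bits >> 31) ? -magnitude : magnitude;
}
```

Under this reading, the bit pattern 0x7F800000, which is +infinity on a PC, is just 2^128, and 0x7FFFFFFF, which is a NaN on a PC, becomes the largest representable value.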




