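Originally posted by hourousha on 2011-2-4 01:34: That's laughable. The link you gave only shows that early-z is disabled when alpha test is on (in most normal cases it is not disabled). What is the logical connection between that and your claims that "G70's HSR is a historical joke" or that "fps drops to 1/7"? If anything, you would be better off worrying about the TB you are so proud of in PVR ...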
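Originally posted by JimmyC on 2011-2-4 14:30: Not only that. It is also disabled in at least these cases (fps drops to 1/10~1/30):
-use kill/clip in pixelshader
-change compare func
-modify depth
Fine, if you insist that this still counts as complete HSR, there is nothing I can do. In that case the official G80 documentation and the Nvidia GPU Programming Guide were written for nothing.
USSE2's TBDR performance is already double that of USSE (16z:8z). On the same MBX generation, Sega's Aurora (a 2008 product) has dedicated optimization for transparent/partial triangles. Back in the PowerVR 2nd-generation days, the Dreamcast also did alpha test with HW front, running twice as fast as the PC version at the same clock. Who is to say SGX543MP4+ won't have hardware-accelerated alpha test? Even if it doesn't, it still has 64z, i.e. eight times the Galaxy S.
How large is the performance gap between the 200MHz Galaxy S (SGX540) and the 240MHz Tegra2 GPU? Even if you are not an Nvidia fan, you can refer to the promotional PDF Nvidia released on January 26 this year, which says 110~150%; in practice it is about 110~125%. Nvidia also claims that Tegra2's GPU performance is that of a low-end G80 (Tegra1 being a low-end GeForce 6). If you want to flame, flame NV as well; after all, the per-clock performance of SGX543MP4+ is more than eight times that of this "low-end G80".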
For sprites with transparent areas, create polygons that are optimal for the visible area and exclude fragments that are completely transparent. If an application were to render a simple triangular shaped tree texture on a quad polygon, there would be large, empty areas that would need to be blended. A better approach in this situation would be to use a triangle that tightly fits the shape of the texture. By doing so, most of the empty area that would have to be blended when using a quad to render the tree sprite can be removed, which means there are fewer fragments to blend. Geometry used to tightly fit sprites in a given application should be kept as simple as possible while eliminating as many unwanted fragments as possible. The balance between geometric complexity and the empty space that more complex geometry removes is very application and platform specific. A tool such as the one described here: http://www.humus.name/index.php?page=Cool&ID=8 can be used to generate the geometry required.
For further optimisation, when rendering sprites with partially transparent areas, break each object down into an area that can be rendered as an opaque sprite and a second area of partial transparency that can be blended. By taking this approach, the number of fragments that need to be blended for each sprite can be significantly reduced, which allows the HSR process to provide a "super" fill rate. In order to maintain sprite ordering, use of the depth buffer will be required - each sprite will need a unique offset to avoid artefacts. Generating the areas for this technique can be done with a similar tool to that mentioned above, but this time looking for opaque pixels instead of completely transparent ones. As stated previously, the opaque objects should be drawn first followed by the blended objects, as this will allow the blended objects to gain the most benefit possible from the hardware's HSR process.
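To make the quoted advice concrete, here is a minimal sketch that counts how many fragments would need blending under the three strategies the passage describes: a full quad, tightly fitted geometry, and an opaque region plus a separately blended region. The sprite data and function names are hypothetical, and the counts are an idealised per-texel bound; real fitted geometry is coarser and would save less.

```python
def classify_fragments(alpha_rows):
    """Count fully transparent, partially transparent and opaque texels."""
    counts = {"transparent": 0, "partial": 0, "opaque": 0}
    for row in alpha_rows:
        for alpha in row:
            if alpha == 0:
                counts["transparent"] += 1
            elif alpha == 255:
                counts["opaque"] += 1
            else:
                counts["partial"] += 1
    return counts


def blended_fragment_counts(alpha_rows):
    """Fragments that must be blended under each strategy (ideal bound)."""
    c = classify_fragments(alpha_rows)
    total = c["transparent"] + c["partial"] + c["opaque"]
    return {
        # Whole quad drawn with blending enabled.
        "full_quad": total,
        # Geometry fitted to the visible area, still drawn fully blended.
        "tight_fit": c["partial"] + c["opaque"],
        # Opaque part drawn without blending, only the rest is blended.
        "opaque_plus_blended": c["partial"],
    }


# Hypothetical 4x4 sprite: alpha values in the range 0..255.
SPRITE = [
    [0, 0, 0, 0],
    [0, 128, 128, 0],
    [128, 255, 255, 128],
    [128, 255, 255, 128],
]

print(blended_fragment_counts(SPRITE))
```

For this made-up sprite the blended fragment count falls from 16 to 10 to 6, which is the effect the passage is after.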
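Originally posted by hourousha on 2011-2-4 15:33: 1: Please give the source for fps dropping to 1/10 to 1/30. Your conclusion is quite something: what is the baseline that the drop to 1/10 is measured against? What are the preconditions? If HSR merely being disabled during alpha test makes fps drop to 1/10, doesn't that mean alpha test accounts for more than 90% of the total rendering cost and alp ...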
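Originally posted by GTFC on 2011-2-4 17:09: Can the tech experts explain how a cut-down 7800 produces the visuals of God of War 3, GT5 and KZ3? What is the point of arguing purely over these technical specs?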
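Originally posted by JimmyC on 2011-2-4 16:39: early-z exists since gf3, like mentioned before. it is disabled if you
-enable alpha test
-use kill/clip in pixelshader
-change compare func
in order to get speed again on G70, you need to work around your alpha-testing. this is critical, otherwise you pretty much run without optimization and then you're easily 10 to 30 times slower.
Search the documentation of any Dreamcast emulator yourself. The commands of the PowerVR2 used in the DC are divided into ZWrite, Alpha ZWrite and so on. The latter can raise fps several-fold. This hardware-accelerated command exists only in the DC version of PowerVR2; the Neon250 graphics card does not have it. The MBX used in Sega's arcade boards also has this command, but the one used in the iPhone 2G/3G does not. This shows that Imgtec had a solution early on but did not adopt it everywhere.
Isn't it too early to flame on this point before the SGX543MP4+ specification is known? The PowerVR Insider material has nothing on SGX543, let alone SGX543MP4+, and nothing on home console chips either. The most recent item is the SGX540 development recommendations published in 2007. Compared with USSE, USSE2 doubles per-pipeline shader/TBDR/hidden surface performance, 8z>16z, 1D>2D, Vec2>Vec4, and supports more hardware acceleration. It is impressive that you can, without blushing, use 2005 USSE material to flame the 2009 USSE2.
How is this off topic?
RSX: cut-down G70 (7800) (8:24:24:8)
240MHz Tegra2, with a clock 20% higher than SGX543MP4+ and 10~25% higher performance: low-end G80; the lowest-end G80 is the 8300GS (8:8:4)
You don't dare flame the former point, but as soon as someone says that SGX543MP4+, with more than eight times the per-clock performance of Tegra2, approaches 8600GT (32:16:8)/RSX, you start flaming. The laughable part is that the clock of SGX543MP4+ is not even known yet. The 2011Q1 OMAP4440 (45nm) already runs at 380MHz, and you are still flaming with 200MHz figures.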
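The conditions listed in this thread for early-z being disabled can be written down as a simple predicate. This is only a sketch that encodes the claims made in the posts (alpha test, kill/clip in the pixel shader, changing the compare function, modifying depth); the `DrawState` type and its field names are hypothetical and do not correspond to any real driver or graphics API.

```python
from dataclasses import dataclass


@dataclass
class DrawState:
    """Hypothetical per-draw render state, named after the thread's list."""
    alpha_test: bool = False           # alpha test enabled
    shader_discards: bool = False      # kill/clip used in the pixel shader
    depth_func_changed: bool = False   # compare function changed mid-frame
    shader_writes_depth: bool = False  # pixel shader modifies depth


def early_z_blockers(state):
    """Return the reasons, per the posts above, that early-z would be lost."""
    blockers = []
    if state.alpha_test:
        blockers.append("alpha test enabled")
    if state.shader_discards:
        blockers.append("kill/clip in pixel shader")
    if state.depth_func_changed:
        blockers.append("compare func changed")
    if state.shader_writes_depth:
        blockers.append("pixel shader modifies depth")
    return blockers


def early_z_usable(state):
    """True when none of the listed conditions applies."""
    return not early_z_blockers(state)


print(early_z_usable(DrawState()))
print(early_z_blockers(DrawState(alpha_test=True, shader_writes_depth=True)))
```

Whether a given GPU actually loses early rejection in each case, and how much that costs, is exactly what the thread disputes; the sketch only restates the list.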
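Originally posted by hourousha on 2011-2-4 18:40: So that is where 1/10-1/30 comes from. Utterly laughable. Someone simply asserts it on a forum, with no data to back it up, no description of the environment, and no way to prove the problem was caused by HSR being disabled, and you promote it as truth. Well done…… The one saying that RSX's HSR is a joke and a fake HSR is you, not me; ...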
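Originally posted by qjw363924793 on 2011-2-4 18:45: NGP can lead phones by two and a half years? Sure, because NGP launches two and a half years from now, so whatever gets fantasized about is always incredibly powerful. Early 2014: if NGP's performance leads the highest-end phone, I die; if it doesn't, the poster above dies. Does the idiot above dare to bet his life on it? Dig this thread up in 2014.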
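Originally posted by JimmyC on 2011-2-4 20:21: G70 and earlier can only do coarse level Z and Stencil culling. G80 and later can do fine-grained Z and Stencil culling.
Coarse-grained Z: Coarse Z, Hierarchical Z, Hi-Z, or ZCULL
Fine-grained Z: Fine Z, Early Z, Early Z Checking, Early Z Out
Fine, so this is not a cut-down, and fine-grained Z and Stencil culling is superfluous. "skip the shading of occluded pixels" is really just a useless junk feature. The G70 without it is already complete HSR. The G70 without it is the real HSR. The G80 with it is the fake HSR. Have I got that right?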
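1/7 and 1/10-30 are both results that other people got from actually programming with HSR on G70. Nvidia naturally won't say outright how much slower it is, but a quick search turns up plenty of discussion on this.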
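When I post a link to a discussion, I get flamed because it is something I searched up and unofficial sources don't count. But I can't write code; why don't you write a little yourself and see? Also, MBX is a product from five years ago. Aren't you the one using the 2005 USSE to flame the 2009 USSE2?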
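Originally posted by hourousha on 2011-2-4 22:50: Laughable. Your logic really is a problem. G70's early-z has limitations, but it is not fake HSR, much less a joke. It is very simple: the test results given in post #37 prove it, which is far better than you arbitrarily declaring here what is real or fake HSR and what is or isn't a joke. As for your saying that G80 is fake HSR, I can only ...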
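The way to show whether early-Z is present is to make z-cull ineffective. The method is simple: just reverse the z test. The results show that G8x is almost completely unaffected by the z reversal, while G70, after the test is reversed, performs the same as it does with no occlusion at all.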
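As a rough illustration of why losing early rejection costs so much under heavy overdraw, here is a toy model that counts pixel-shader runs for a single pixel covered by several opaque layers. It is an assumption-laden sketch: it does not model coarse versus fine-grained culling, the reversed z test used in the benchmark, or any real G70/G8x behaviour, and the function name is hypothetical.

```python
def shaded_fragments(layer_depths, early_reject):
    """Count pixel-shader runs for one pixel covered by opaque layers.

    layer_depths: depths in draw order, smaller means nearer the camera.
    The depth test is "less than". With early_reject, a failing fragment
    is dropped before shading; without it, every fragment is shaded first
    and depth-tested afterwards.
    """
    depth_buffer = float("inf")
    shaded = 0
    for z in layer_depths:
        passes = z < depth_buffer
        if early_reject and not passes:
            continue  # rejected before the shader runs
        shaded += 1
        if passes:
            depth_buffer = z
    return shaded


front_to_back = [1.0, 2.0, 3.0, 4.0]
back_to_front = [4.0, 3.0, 2.0, 1.0]

print(shaded_fragments(front_to_back, early_reject=True))
print(shaded_fragments(front_to_back, early_reject=False))
print(shaded_fragments(back_to_front, early_reject=True))
```

With four layers drawn front to back, early rejection shades one fragment instead of four; drawn back to front, every fragment passes the test and nothing is saved, which is the "no occlusion" baseline mentioned above.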