Optimizing Shader Info Loading, or Look at Yer Data!
A story about a million shader variants, optimizing using Instruments and looking at the data to optimize some more.
The Bug Report
The bug report I was looking into was along the lines of "when we put these shaders into our project, then building a game becomes much slower – even if shaders aren't being used".
Indeed it was. A quick look revealed that for ComplicatedReasons(tm) we load information about all shaders during the game build – that explains why the slowdown was happening even if shaders were not actually used.
This issue must be fixed! There’s probably no really good reason we must know about all the shaders for a game build. But to fix it, I’ll need to pair up with someone who knows anything about game data build pipeline, our data serialization and so on. So that will be someday in the future.
Meanwhile… another problem was that loading the "information for a shader" was slow in this project. Did I say slow? It was very slow.
That’s a good thing to look at. Shader data is not only loaded while building the game; it’s also loaded when the shader is needed for the first time (e.g. clicking on it in Unity’s project view); or when we actually have a material that uses it etc. All these operations were quite slow in this project.
Turns out this particular shader had a massive internal variant count. In Unity, what looks like "a single shader" to the user often has many variants inside (to handle different lights, lightmaps, shadows, HDR and whatnot – typical ubershader setup). Usually shaders have from a few dozen to a few thousand variants. This shader had 1.9 million. And there were about ten shaders like that in the project.
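As a sanity check on that number: variant counts multiply across keyword axes, so a handful of independent on/off features is enough to reach millions. A minimal sketch (the feature count here is an illustration, not the actual shader's setup):

```cpp
#include <cassert>
#include <cstdint>

// n independent on/off keywords alone yield 2^n variants; 2^21 is already
// about 2.1 million, in the ballpark of the 1.9M-variant shader above.
// Multi-way keyword sets (pick one of k) multiply the count further.
uint64_t VariantCount(int onOffFeatures)
{
    return 1ull << onOffFeatures;
}
```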
The Setup
Let’s create several shaders with different variant counts for testing: 27 thousand, 111 thousand, 333 thousand and 1 million variants. I’ll call them 27k, 111k, 333k and 1M respectively. For reference, the new “Standard” shader in Unity 5.0 has about 33 thousand internal variants. I’ll do tests on MacBook Pro (2.3 GHz Core i7) using 64 bit Release build.
Things I’ll be measuring:
Import time. How much time it takes to reimport the shader in the Unity editor. Since Unity 4.5 this doesn't do much of actual shader compilation; it just extracts information about shader snippets that need compiling, the variants that are there, etc.
Imported data size. How large the imported shader data is (serialized representation of the actual shader asset; i.e. the files that live in the Library/metadata folder of a Unity project).
Load time. How much time it takes to load this imported shader data when it is needed.
So the data is:
Shader   Import    Load     Size
27k       420ms   120ms    6.4MB
111k     2013ms   492ms   27.9MB
333k     7779ms  1719ms   89.2MB
1M      16192ms  4231ms  272.4MB
Enter Instruments
Last time we used xperf to do some profiling. We're on a Mac this time, so let's use Apple Instruments. Just like xperf, Instruments can show a lot of interesting data. We're looking at the most simple one, "Time Profiler" (though profiling Zombies is very tempting!). You pick that instrument, attach to the executable, start recording, and get some results out.
You then select the time range you're interested in, and expand the stack trace. Protip: Alt-Click (ok ok, Option-Click you Mac peoples) expands the full tree.
So far the whole stack is just going deep into Cocoa stuff. “Hide System Libraries” is very helpful with that:
Another very useful feature is inverting the call tree, where the results are presented from the heaviest “self time” functions (we won’t be using that here though).
When hovering over an item, an arrow is shown on the right (see image above). Clicking on that does "focus on subtree", i.e. ignores everything outside of that item, and time percentages are shown relative to the item. Here we've focused on ShaderCompilerPreprocess (which does the majority of shader "importing" work).
Looks like we’re spending a lot of time appending to strings. That usually means strings did not have enough storage buffer reserved and are causing a lot of memory allocations. Code change:
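The actual Unity code isn't shown here, but the nature of the fix can be sketched like this (the function name and the capacity estimate are made up for illustration): reserve the final size once, instead of letting the string regrow on every append.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: concatenating many macro strings into one buffer.
// Without reserve(), each += may reallocate and copy the whole string;
// with a capacity estimate up front there is a single allocation.
std::string JoinMacros(const std::vector<std::string>& macros)
{
    size_t total = 0;
    for (const std::string& m : macros)
        total += m.size() + 1;   // +1 for the separator
    std::string result;
    result.reserve(total);       // the one-line fix
    for (const std::string& m : macros)
    {
        result += m;
        result += ' ';
    }
    return result;
}
```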
This small change has cut down shader importing time by 20-40%!?Very nice!
I did a couple of other small tweaks from looking at this profiling data – none of them resulted in any significant benefit though.
Profiling shader load time also shows that most of the time ends up being spent loading editor-related data, which is arrays of arrays of strings and so on:
I could have picked functions from the profiler results, gone through each of them and optimized, and perhaps achieved a solid 2-3x improvement over the initial results. Very often that's enough to be proud!
However…
Taking a step back
Or like Mike Acton would say, "look at your data!" (check his CppCon2014 slides or video). Another saying is also applicable: "think!"
Why do we have this problem to begin with?
For example, in 333k variant shader case, we end up sending 610560 lines of shader variant information between shader compiler process & editor, with macro strings in each of them. In total we’re sending 91 megabytes of data over RPC pipe during shader import.
One possible area for improvement: the data we send over and store in imported shader data is a small set of macro strings repeated over and over and over again. Instead of sending or storing the strings, we could just send the set of strings used by a shader once, assign numbers to them, and then send & store the full set as lists of numbers (or fixed size bitmasks). This should cut down on the amount of string operations we do (massively cut down on number of small allocations), size of data we send, and size of data we store.
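A sketch of that first idea, with hypothetical names (the real serialization code differs): intern each macro string once, then represent a variant as a fixed-size bitmask of indices into the table.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical macro string table: each distinct string is stored once and
// gets a small index; variants then travel as fixed-size bitmasks instead
// of repeated lists of strings.
struct MacroTable
{
    std::unordered_map<std::string, uint32_t> indexOf;
    std::vector<std::string> names;

    uint32_t Intern(const std::string& name)
    {
        auto it = indexOf.find(name);
        if (it != indexOf.end())
            return it->second;   // seen before: no new string stored
        uint32_t idx = (uint32_t)names.size();
        names.push_back(name);
        indexOf.emplace(name, idx);
        return idx;
    }
};

// A variant as a bitmask over interned macros (handles up to 64 macros here).
uint64_t VariantKey(MacroTable& table, const std::vector<std::string>& macros)
{
    uint64_t key = 0;
    for (const std::string& m : macros)
        key |= 1ull << table.Intern(m);
    return key;
}
```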
Another possible approach: right now we have source data in shader that indicate which variants to generate. This data is very small: just a list of on/off features, and some built-in variant lists (“all variants to handle lighting in forward rendering”). We do the full combinatorial explosion of that in the shader compiler process, send the full set over to the editor, and the editor stores that in imported shader data.
But the way we do the "explosion of source data into full set" is always the same. We could just send the source data from shader compiler to the editor (a very small amount!), and furthermore, just store that in imported shader data. We can rebuild the full set when needed at any time.
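The expansion itself is deterministic, so only its inputs need to be stored. Sketching it with hypothetical types: each source line contributes one keyword choice per variant, and the full set is the cartesian product of all lines, rebuildable at any time.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical expansion: each line is a set of mutually exclusive keywords
// (exactly one is picked per variant); the full variant set is the cartesian
// product of all lines, so the variant count is the product of line sizes.
std::vector<std::vector<std::string>> ExpandVariants(
    const std::vector<std::vector<std::string>>& lines)
{
    std::vector<std::vector<std::string>> result(1); // one empty variant
    for (const auto& line : lines)
    {
        std::vector<std::vector<std::string>> next;
        next.reserve(result.size() * line.size());
        for (const auto& partial : result)
            for (const auto& keyword : line)
            {
                auto v = partial;
                v.push_back(keyword);
                next.push_back(std::move(v));
            }
        result = std::move(next);
    }
    return result;
}
```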
Changing the data
So let’s try to do that. First let’s deal with RPC only, without changing serialized shader data. A few commits later…
This made shader importing over twice as fast!
Shader   Import
27k       419ms ->  200ms
111k     1702ms ->  791ms
333k     5362ms -> 2530ms
1M      16784ms -> 8280ms
Let's do the other part too, where we change the serialized shader variant data representation. Instead of storing the full set of possible variants, we only store the data needed to generate the full set:
Shader   Import               Load                 Size
27k       200ms ->   285ms    103ms ->   396ms     6.4MB -> 55kB
111k      791ms ->  1229ms    426ms ->  1832ms    27.9MB -> 55kB
333k     2530ms ->  3893ms   1410ms ->  5892ms    89.2MB -> 56kB
1M       8280ms -> 12416ms   4498ms -> 18949ms   272.4MB -> 57kB
Everything seems to work, and the serialized file size got massively decreased. But, both importing and loading got slower?! Clearly I did something stupid. Profile!
Right. So after importing or loading the shader (now a small file on disk), we generate the full set of shader variant data. Right now that results in a lot of string allocations, since it is generating arrays of arrays of strings or somesuch.
But we don't really need the strings at this point; for example, after loading the shader we only need the internal representation of the "shader variant key", which is a fairly small bitmask. A couple of tweaks to fix that, and we're at:
Shader  Import   Load
27k       42ms    7ms
111k      47ms   27ms
333k      94ms   76ms
1M       231ms  225ms
Look at that! Importing a 333k variant shader got 82 times faster; loading its metadata got 22 times faster, and the imported file size got over a thousand times smaller!
One final look at the profiler, just because:
Weird, time is spent in memory allocation, but there shouldn't be any at this point in that function; we aren't creating any new strings there. Ahh, implicit std::string to UnityStr (our own string class with better memory reporting) conversion operators (long story…). Fix that, and we've got another 2x improvement:
Shader  Import   Load
27k       42ms    5ms
111k      44ms   18ms
333k      53ms   46ms
1M       130ms  128ms
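The UnityStr class itself is internal, but the pitfall is a general C++ one and can be reproduced with a stand-in type: an implicit converting constructor silently copies (and allocates) every time a std::string is passed where the custom type is expected.

```cpp
#include <cassert>
#include <string>

static int g_hiddenCopies = 0;

// Stand-in for a custom string class. The non-explicit constructor lets the
// compiler convert std::string -> MyStr at every call site, copying the
// contents each time; marking it `explicit` surfaces those spots at
// compile time so they can be removed.
struct MyStr
{
    std::string data;
    MyStr(const std::string& s) : data(s) { ++g_hiddenCopies; } // implicit!
};

static size_t Length(const MyStr& s) { return s.data.size(); }

size_t SumLengths(const std::string& a, const std::string& b)
{
    // Looks allocation-free, but each call converts to MyStr behind our back.
    return Length(a) + Length(b);
}
```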
The code could still be optimized further, but there ain’t no easy fixes left I think. And at this point I’ll have more important tasks to do…
What we've got
So in total, here’s what we have so far:
Shader   Import                  Load                   Size
27k       420ms ->  42ms (10x)    120ms ->   5ms (24x)     6.4MB -> 55kB (119x)
111k     2013ms ->  44ms (46x)    492ms ->  18ms (27x)    27.9MB -> 55kB (519x)
333k     7779ms ->  53ms (147x)  1719ms ->  46ms (37x)    89.2MB -> 56kB (this is getting)
1M      16192ms -> 130ms (125x)  4231ms -> 128ms (33x)   272.4MB -> 57kB (ridiculous!)
And a fairly small pull request to achieve all this (~400 lines of code changed, ~400 new lines added – out of which half were new unit tests I wrote to feel safer before I started changing things):
Overall I've probably spent something like 8 hours on this – hard to say exactly, since I took some breaks and did other things. Also I was writing down notes & making screenshots for the blog too :) The fix/optimization is already in Unity 5.0 beta 20, by the way.
Conclusion
Apple's Instruments is a nice profiling tool (and unlike xperf, the UI is not intimidating…).
However, Profiler Is Not A Replacement For Thinking! I could have just looked at the profiling results and tried to optimize "what's at the top of the profiler" one by one, and maybe achieved 2-3x better performance. But by thinking about the actual problem and why it happens, I got a way, way better result.
Happy thinking!
Translated from: https://blogs.unity3d.com/2015/01/18/optimizing-shader-info-loading-or-look-at-yer-data/