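Multithreaded Rendering in Unreal Engine
1. Preface
In Unreal Engine Programming Basics (2), I briefly introduced the three kinds of multithreading available in Unreal Engine:
- the standard thread implementation, FRunnable;
- AsyncTask, which runs on a thread pool;
- TaskGraph.
The threads in UE that take part in drawing are:
- the game thread (Game Thread);
- the render thread (Render Thread);
- the RHI thread (RHI Thread).
Of these:
- the game thread is mainly responsible for the logic of scene objects and the like;
- the render thread generates the rendering commands for primitives;
- the RHI thread submits those rendering commands to the GPU.
This post gives a brief walkthrough of how these threads interact. These are my own notes, so please bear with me if there are mistakes.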
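2. Interaction between the game thread and the render thread
2.1 The ENQUEUE_RENDER_COMMAND macro
The ENQUEUE_RENDER_COMMAND macro is how the game thread (the main thread) enqueues a command for the render thread.
It is used as follows:
ENQUEUE_RENDER_COMMAND(Type)(lambda expression)
How does this macro work?
First, the macro definition:
- It pastes the Type token that was passed in into the name of a new struct (Type##Name).
- It calls the function
EnqueueUniqueRenderCommand<Type##Name>
Next, what does EnqueueUniqueRenderCommand do?
- It first checks whether the caller is already on the render thread; if so, the lambda is executed immediately.
- Otherwise, it creates a task through TGraphTask.
How does the lambda that was passed in get called?
- A named thread such as the render thread pulls the next task to run from its own queue.
- When TGraphTask's Execute function runs, it calls the TTask's DoTask, which in turn invokes the lambda.
Summary:
- ENQUEUE_RENDER_COMMAND wraps the lambda in a graph task and adds it to one of TaskGraph's queues.
- The render thread keeps fetching graph tasks and running them.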
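The flow summarized above can be sketched in standalone C++ with no engine dependency. This is only a minimal illustration under my own assumptions: every name ending in Sketch is hypothetical, a plain global flag stands in for the "am I on the render thread" check, and the queue is drained on the calling thread rather than by TaskGraph.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for engine state; none of these are engine APIs.
static bool GOnRenderThreadSketch = false;                    // "am I on the render thread?"
static std::deque<std::function<void()>> GRenderQueueSketch;  // the render thread's task queue
static std::vector<std::string> GExecutedNamesSketch;         // names of queued commands that ran

// Runs the lambda at once when already on the render thread,
// otherwise wraps it in a task and appends it to the queue.
template <typename TName>
void EnqueueUniqueCommandSketch(std::function<void()> Lambda)
{
    if (GOnRenderThreadSketch)
    {
        Lambda();
        return;
    }
    GRenderQueueSketch.push_back([Lambda]()
    {
        GExecutedNamesSketch.push_back(TName::Str());
        Lambda();
    });
}

// Pastes the Type token into a new struct name, then calls the template with it.
#define ENQUEUE_COMMAND_SKETCH(Type) \
    struct Type##Name { static const char* Str() { return #Type; } }; \
    EnqueueUniqueCommandSketch<Type##Name>

// Stand-in for the render thread loop: drain the queue in FIFO order.
void PumpRenderQueueSketch()
{
    GOnRenderThreadSketch = true;
    while (!GRenderQueueSketch.empty())
    {
        std::function<void()> Task = std::move(GRenderQueueSketch.front());
        GRenderQueueSketch.pop_front();
        Task();
    }
    GOnRenderThreadSketch = false;
}

int RunEnqueueDemoSketch()
{
    int Value = 0;
    ENQUEUE_COMMAND_SKETCH(FSetValue)([&Value]() { Value = 10; });
    ENQUEUE_COMMAND_SKETCH(FDoubleValue)([&Value]() { Value *= 2; });
    assert(Value == 0);          // both commands are queued, neither has run
    PumpRenderQueueSketch();
    return Value;                // set to 10, then doubled
}
```

Because the macro declares a struct named after Type, the same Type cannot be used twice in one scope in this sketch.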
2.2 The render thread
The render thread is created with the standard FRunnable approach.
In LaunchEngineLoop.cpp, FEngineLoop::PreInitPreStartupScreen calls StartRenderingThread, which creates the thread:
// Turn on the threaded rendering flag.
GIsThreadedRendering = true;
// Create the rendering thread.
GRenderingThreadRunnable = new FRenderingThread();
Trace::ThreadGroupBegin(TEXT("Render"));
PRAGMA_DISABLE_DEPRECATION_WARNINGS
GRenderingThread =
PRAGMA_ENABLE_DEPRECATION_WARNINGS
FRunnableThread::Create(GRenderingThreadRunnable,
*BuildRenderingThreadName(ThreadCount), 0,
FPlatformAffinity::GetRenderingThreadPriority(),
FPlatformAffinity::GetRenderingThreadMask(), FPlatformAffinity::GetRenderingThreadFlags());
FRenderingThread's Run function calls the key function RenderingThreadMain.
/** The rendering thread main loop */
void RenderingThreadMain( FEvent* TaskGraphBoundSyncEvent )
{
LLM_SCOPE(ELLMTag::RenderingThreadMemory);
ENamedThreads::Type RenderThread = ENamedThreads::Type(ENamedThreads::ActualRenderingThread);
ENamedThreads::SetRenderThread(RenderThread);
ENamedThreads::SetRenderThread_Local(ENamedThreads::Type(ENamedThreads::ActualRenderingThread_Local));
// Attach the current thread to TaskGraph as the render thread
FTaskGraphInterface::Get().AttachToThread(RenderThread);
FPlatformMisc::MemoryBarrier();
// Inform main thread that the render thread has been attached to the taskgraph and is ready to receive tasks
if( TaskGraphBoundSyncEvent != NULL )
{
TaskGraphBoundSyncEvent->Trigger();
}
// set the thread back to real time mode
FPlatformProcess::SetRealTimeMode();
#if STATS
if (FThreadStats::WillEverCollectData())
{
FThreadStats::ExplicitFlush(); // flush the stats and set update the scope so we don't flush again until a frame update, this helps prevent fragmentation
}
#endif
FCoreDelegates::PostRenderingThreadCreated.Broadcast();
check(GIsThreadedRendering);
// Tell the TaskGraph system to keep processing render tasks on this thread until a return is requested
FTaskGraphInterface::Get().ProcessThreadUntilRequestReturn(RenderThread);
FPlatformMisc::MemoryBarrier();
check(!GIsThreadedRendering);
FCoreDelegates::PreRenderingThreadDestroyed.Broadcast();
#if STATS
if (FThreadStats::WillEverCollectData())
{
FThreadStats::ExplicitFlush(); // Another explicit flush to clean up the ScopeCount established above for any stats lingering since the last frame
}
#endif
ENamedThreads::SetRenderThread(ENamedThreads::GameThread);
ENamedThreads::SetRenderThread_Local(ENamedThreads::GameThread_Local);
FPlatformMisc::MemoryBarrier();
}
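Two calls here are key:
// Attach the current thread to TaskGraph as the render thread
FTaskGraphInterface::Get().AttachToThread(ENamedThreads::RenderThread);
//...
// Tell the TaskGraph system to keep processing render tasks on this thread until a return is requested
FTaskGraphInterface::Get().ProcessThreadUntilRequestReturn(ENamedThreads::RenderThread);
The first call:
- registers the thread as a named thread, so that it can be addressed through ENamedThreads::Type.
The second call is ProcessThreadUntilRequestReturn: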
virtual void ProcessThreadUntilRequestReturn(ENamedThreads::Type CurrentThread) final override
{
int32 QueueIndex = ENamedThreads::GetQueueIndex(CurrentThread);
CurrentThread = ENamedThreads::GetThreadIndex(CurrentThread);
check(CurrentThread >= 0 && CurrentThread < NumNamedThreads);
check(CurrentThread == GetCurrentThread());
Thread(CurrentThread).ProcessTasksUntilQuit(QueueIndex);
}
The ProcessTasksUntilQuit it calls is a do-while loop:
virtual void ProcessTasksUntilQuit(int32 QueueIndex) override
{
check(Queue(QueueIndex).StallRestartEvent); // make sure we are started up
Queue(QueueIndex).QuitForReturn = false;
verify(++Queue(QueueIndex).RecursionGuard == 1);
const bool bIsMultiThread = FTaskGraphInterface::IsMultithread();
do
{
const bool bAllowStall = bIsMultiThread;
// The actual processing logic
ProcessTasksNamedThread(QueueIndex, bAllowStall);
} while (!Queue(QueueIndex).QuitForReturn && !Queue(QueueIndex).QuitForShutdown && bIsMultiThread); // @Hack - quit now when running with only one thread.
verify(!--Queue(QueueIndex).RecursionGuard);
}
ProcessTasksNamedThread does the actual processing:
// Core logic
while (!Queue(QueueIndex).QuitForReturn)
{
//...
// Pop a task from the queue
FBaseGraphTask* Task = Queue(QueueIndex).StallQueue.Pop(0, bStallQueueAllowStall);
// Execute the task; this ends up calling DoTask
Task->Execute(NewTasks, ENamedThreads::Type(ThreadId | (QueueIndex << ENamedThreads::QueueIndexShift)));
}
So the render thread fetches a task and executes it, over and over.
Eventually this reaches the render commands that were pushed in one by one through the ENQUEUE_RENDER_COMMAND macro.
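To make the shape of that loop concrete, here is a rough standalone sketch built only on the C++ standard library. It is not the engine's named-thread implementation: FNamedThreadSketch and its members are names I made up, and a mutex plus condition variable stand in for the engine's stall queue.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical sketch of a named thread's queue; not the engine's implementation.
class FNamedThreadSketch
{
public:
    // Called from any thread: append a task and wake the worker if it is stalled.
    void Enqueue(std::function<void()> Task)
    {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Tasks.push_back(std::move(Task));
        }
        Wake.notify_one();
    }

    // The quit request is itself a task, so it runs after everything queued before it.
    void RequestQuit()
    {
        Enqueue([this]() { bQuitForReturn = true; });
    }

    // Called on the worker thread: pop and execute tasks until a quit is requested.
    void ProcessTasksUntilQuit()
    {
        bQuitForReturn = false;
        while (!bQuitForReturn)
        {
            std::function<void()> Task;
            {
                std::unique_lock<std::mutex> Lock(Mutex);
                Wake.wait(Lock, [this]() { return !Tasks.empty(); });  // stall while idle
                Task = std::move(Tasks.front());
                Tasks.pop_front();
            }
            Task();  // run outside the lock so producers are not blocked
        }
    }

private:
    std::mutex Mutex;
    std::condition_variable Wake;
    std::deque<std::function<void()>> Tasks;
    bool bQuitForReturn = false;  // only touched on the worker thread
};

int RunNamedThreadDemoSketch()
{
    FNamedThreadSketch RenderThread;
    int Sum = 0;
    std::thread Worker([&RenderThread]() { RenderThread.ProcessTasksUntilQuit(); });
    for (int i = 1; i <= 4; ++i)
    {
        RenderThread.Enqueue([&Sum, i]() { Sum += i; });
    }
    RenderThread.RequestQuit();
    Worker.join();   // join also makes the worker's writes to Sum visible here
    assert(Sum == 10);
    return Sum;      // 1 + 2 + 3 + 4
}
```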
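2.3 Data exchange
To decouple the game thread from the render thread, UE copies data: the render thread owns its own copy, independent of the game thread's.
This data includes, but is not limited to, light data, geometry data, and material data.
So how does the game thread's data get propagated to the render thread?
That is covered next.
When the game thread ticks, it calls into the renderer module:
FRendererModule::BeginRenderingViewFamily is the entry point through which the game thread kicks off the rendering pipeline.
This function does the following:
- pushes the game thread's primitive and light data to the render thread;
- creates the scene renderer;
- kicks off the rendering pipeline.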
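The copy-instead-of-share idea can be illustrated with a small standalone sketch. All types here are hypothetical simplifications of mine: a component that only the game thread touches, a proxy that only render commands touch, and an update that travels as a by-value snapshot inside a queued command. The queue is drained on the same thread to keep the example deterministic, and a shared_ptr keeps the proxy alive, which is not how the engine manages proxy lifetime.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical types: a game-thread object and the render thread's private copy.
struct FTransformSketch { float X; float Y; float Z; };

struct FProxySketch                 // written only by "render thread" commands
{
    FTransformSketch Transform{0.f, 0.f, 0.f};
};

struct FComponentSketch             // written only by the "game thread"
{
    FTransformSketch Transform{0.f, 0.f, 0.f};
    bool bRenderTransformDirty = false;
    std::shared_ptr<FProxySketch> Proxy = std::make_shared<FProxySketch>();

    void SetLocation(float X, float Y, float Z)
    {
        Transform = FTransformSketch{X, Y, Z};
        bRenderTransformDirty = true;
    }
};

using FRenderCommandSketch = std::function<void()>;

// Game-thread side: copy the dirty state into a command instead of sharing memory.
void SendEndOfFrameUpdateSketch(FComponentSketch& Component,
                                std::vector<FRenderCommandSketch>& Queue)
{
    if (!Component.bRenderTransformDirty)
    {
        return;
    }
    const FTransformSketch Snapshot = Component.Transform;   // copied by value
    std::shared_ptr<FProxySketch> Proxy = Component.Proxy;
    Queue.push_back([Proxy, Snapshot]() { Proxy->Transform = Snapshot; });
    Component.bRenderTransformDirty = false;
}

float RunProxyDemoSketch()
{
    FComponentSketch Component;
    std::vector<FRenderCommandSketch> Queue;

    Component.SetLocation(1.f, 2.f, 3.f);
    SendEndOfFrameUpdateSketch(Component, Queue);

    // The game thread keeps simulating; the queued snapshot is unaffected.
    Component.SetLocation(9.f, 9.f, 9.f);
    assert(Component.Proxy->Transform.X == 0.f);   // render copy not touched yet

    for (FRenderCommandSketch& Command : Queue)     // "render thread" drains the queue
    {
        Command();
    }
    return Component.Proxy->Transform.X;            // the snapshot value, not 9
}
```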
2.3.1 Updating data
The code is as follows:
FScene* const Scene = ViewFamily->Scene->GetRenderScene();
if (Scene)
{
World = Scene->GetWorld();
if (World)
{
//guarantee that all render proxies are up to date before kicking off a BeginRenderViewFamily.
World->SendAllEndOfFrameUpdates();
}
}
SendAllEndOfFrameUpdates iterates over the components and updates their render state.
auto GTWork =
[this]()
{
QUICK_SCOPE_CYCLE_COUNTER(STAT_PostTickComponentUpdate_ForcedGameThread);
for (UActorComponent* Component : ComponentsThatNeedEndOfFrameUpdate_OnGameThread)
{
if (Component)
{
if (Component->IsRegistered() && !Component->IsTemplate() && !Component->IsPendingKill())
{
// Update the render state
Component->DoDeferredRenderUpdates_Concurrent();
}
check(Component->IsPendingKill() || Component->GetMarkedForEndOfFrameUpdateState() == EComponentMarkedForEndOfFrameUpdateState::MarkedForGameThread);
FMarkComponentEndOfFrameUpdateState::Set(Component, INDEX_NONE, EComponentMarkedForEndOfFrameUpdateState::Unmarked);
}
}
ComponentsThatNeedEndOfFrameUpdate_OnGameThread.Reset();
ComponentsThatNeedEndOfFrameUpdate.Reset();
};
DoDeferredRenderUpdates_Concurrent is where the data update happens.
if(bRenderStateDirty)
{
SCOPE_CYCLE_COUNTER(STAT_PostTickComponentRecreate);
// Recreate the render state
RecreateRenderState_Concurrent();
checkf(!bRenderStateDirty, TEXT("Failed to route CreateRenderState_Concurrent (%s)"), *GetFullName());
}
else
{
SCOPE_CYCLE_COUNTER(STAT_PostTickComponentLW);
if(bRenderTransformDirty)
{
// Update the component's transform if the actor has been moved since it was last updated.
// i.e. send the component's new transform to the render thread
SendRenderTransform_Concurrent();
}
if(bRenderDynamicDataDirty)
{
SendRenderDynamicData_Concurrent();
}
}
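For components whose render state must be recreated:
- RecreateRenderState_Concurrent is called.
- It first destroys the old render state with DestroyRenderState_Concurrent.
- It then creates the new render state with CreateRenderState_Concurrent.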
void UActorComponent::RecreateRenderState_Concurrent()
{
if(bRenderStateCreated)
{
check(IsRegistered()); // Should never have render state unless registered
DestroyRenderState_Concurrent();
checkf(!bRenderStateCreated, TEXT("Failed to route DestroyRenderState_Concurrent (%s)"), *GetFullName());
}
if(IsRegistered() && WorldPrivate->Scene)
{
CreateRenderState_Concurrent(nullptr);
checkf(bRenderStateCreated, TEXT("Failed to route CreateRenderState_Concurrent (%s)"), *GetFullName());
}
}
Lights and primitives implement render-state creation differently.
For a light component, a dirty render state is rebuilt as follows:
void ULightComponent::CreateRenderState_Concurrent(FRegisterComponentContext* Context)
{
Super::CreateRenderState_Concurrent(Context);
if (bAffectsWorld)
{
UWorld* World = GetWorld();
const bool bHidden = !ShouldComponentAddToScene() || !ShouldRender() || Intensity <= 0.f;
if (!bHidden)
{
InitializeStaticShadowDepthMap();
// Add the light to the scene.
World->Scene->AddLight(this);
bAddedToSceneVisible = true;
}
//...
}
}
AddLight is implemented as follows:
- It creates an FLightSceneProxy and an FLightSceneInfo.
- It then uses the ENQUEUE_RENDER_COMMAND macro to hand that data from the game thread to the render thread.
void FScene::AddLight(ULightComponent* Light)
{
LLM_SCOPE(ELLMTag::SceneRender);
// Create the light's scene proxy.
FLightSceneProxy* Proxy = Light->CreateSceneProxy();
if(Proxy)
{
// Associate the proxy with the light.
Light->SceneProxy = Proxy;
// Update the light's transform and position.
Proxy->SetTransform(Light->GetComponentTransform().ToMatrixNoScale(), Light->GetLightPosition());
// Create the light scene info.
Proxy->LightSceneInfo = new FLightSceneInfo(Proxy, true);
INC_DWORD_STAT(STAT_SceneLights);
// Adding a new light
++NumVisibleLights_GameThread;
// Send a command to the rendering thread to add the light to the scene.
FScene* Scene = this;
FLightSceneInfo* LightSceneInfo = Proxy->LightSceneInfo;
ENQUEUE_RENDER_COMMAND(FAddLightCommand)(
[Scene, LightSceneInfo](FRHICommandListImmediate& RHICmdList)
{
CSV_SCOPED_TIMING_STAT_EXCLUSIVE(Scene_AddLight);
FScopeCycleCounter Context(LightSceneInfo->Proxy->GetStatId());
Scene->AddLightSceneInfo_RenderThread(LightSceneInfo);
});
}
}
For a primitive component, a dirty render state is rebuilt as follows:
- AddPrimitive is called to pass the data to the render thread.
void UPrimitiveComponent::CreateRenderState_Concurrent(FRegisterComponentContext* Context)
{
//...
Super::CreateRenderState_Concurrent(Context);
UpdateBounds();
// If the primitive isn't hidden and the detail mode setting allows it, add it to the scene.
if (ShouldComponentAddToScene())
{
if (Context != nullptr)
{
Context->AddPrimitive(this);
}
else
{
GetWorld()->Scene->AddPrimitive(this);
}
}
// ...
}
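AddPrimitive is implemented as follows:
- It creates an FPrimitiveSceneProxy and an FPrimitiveSceneInfo. (Each kind of primitive creates its own FPrimitiveSceneProxy, which is how different data gets passed across.)
- It then uses the ENQUEUE_RENDER_COMMAND macro to hand that data from the game thread to the render thread.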
void FScene::AddPrimitive(UPrimitiveComponent* Primitive)
{
// ...
// Create the primitive's scene proxy.
FPrimitiveSceneProxy* PrimitiveSceneProxy = Primitive->CreateSceneProxy();
Primitive->SceneProxy = PrimitiveSceneProxy;
if(!PrimitiveSceneProxy)
{
// Primitives which don't have a proxy are irrelevant to the scene manager.
return;
}
// Create the primitive scene info.
FPrimitiveSceneInfo* PrimitiveSceneInfo = new FPrimitiveSceneInfo(Primitive, this);
PrimitiveSceneProxy->PrimitiveSceneInfo = PrimitiveSceneInfo;
// Cache the primitives initial transform.
FMatrix RenderMatrix = Primitive->GetRenderMatrix();
FVector AttachmentRootPosition(0);
AActor* AttachmentRoot = Primitive->GetAttachmentRootActor();
if (AttachmentRoot)
{
AttachmentRootPosition = AttachmentRoot->GetActorLocation();
}
struct FCreateRenderThreadParameters
{
FPrimitiveSceneProxy* PrimitiveSceneProxy;
FMatrix RenderMatrix;
FBoxSphereBounds WorldBounds;
FVector AttachmentRootPosition;
FBoxSphereBounds LocalBounds;
};
FCreateRenderThreadParameters Params =
{
PrimitiveSceneProxy,
RenderMatrix,
Primitive->Bounds,
AttachmentRootPosition,
Primitive->CalcBounds(FTransform::Identity)
};
// ...
// Verify the primitive is valid (this will compile away to a nop without CHECK_FOR_PIE_PRIMITIVE_ATTACH_SCENE_MISMATCH)
VerifyProperPIEScene(Primitive, World);
// Increment the attachment counter, the primitive is about to be attached to the scene.
Primitive->AttachmentCounter.Increment();
// Create any RenderThreadResources required and send a command to the rendering thread to add the primitive to the scene.
FScene* Scene = this;
// If this primitive has a simulated previous transform, ensure that the velocity data for the scene representation is correct
TOptional<FTransform> PreviousTransform = FMotionVectorSimulation::Get().GetPreviousTransform(Primitive);
ENQUEUE_RENDER_COMMAND(AddPrimitiveCommand)(
[Params = MoveTemp(Params), Scene, PrimitiveSceneInfo, PreviousTransform = MoveTemp(PreviousTransform)](FRHICommandListImmediate& RHICmdList)
{
FPrimitiveSceneProxy* SceneProxy = Params.PrimitiveSceneProxy;
FScopeCycleCounter Context(SceneProxy->GetStatId());
SceneProxy->SetTransform(Params.RenderMatrix, Params.WorldBounds, Params.LocalBounds, Params.AttachmentRootPosition);
// Create any RenderThreadResources required.
SceneProxy->CreateRenderThreadResources();
Scene->AddPrimitiveSceneInfo_RenderThread(PrimitiveSceneInfo, PreviousTransform);
});
}
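For components that only need a transform update:
- SendRenderTransform_Concurrent is called.
The details depend on how each kind of component implements it.
The most common one, UPrimitiveComponent, looks like this: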
void UPrimitiveComponent::SendRenderTransform_Concurrent()
{
UpdateBounds();
// If the primitive isn't hidden update its transform.
const bool bDetailModeAllowsRendering = DetailMode <= GetCachedScalabilityCVars().DetailMode;
if( bDetailModeAllowsRendering && (ShouldRender() || bCastHiddenShadow))
{
// Update the scene info's transform for this primitive.
GetWorld()->Scene->UpdatePrimitiveTransform(this);
}
Super::SendRenderTransform_Concurrent();
}
UpdatePrimitiveTransform uses ENQUEUE_RENDER_COMMAND to pass the data that needs updating into the render thread's update table.
ENQUEUE_RENDER_COMMAND(UpdateTransformCommand)(
[UpdateParams](FRHICommandListImmediate& RHICmdList)
{
FScopeCycleCounter Context(UpdateParams.PrimitiveSceneProxy->GetStatId());
UpdateParams.Scene->UpdatePrimitiveTransform_RenderThread(UpdateParams.PrimitiveSceneProxy, UpdateParams.WorldBounds, UpdateParams.LocalBounds, UpdateParams.LocalToWorld, UpdateParams.AttachmentRootPosition, UpdateParams.PreviousTransform);
});
2.3.2 Creating the renderer
Every frame, the game thread creates a new scene renderer.
// Construct the scene renderer. This copies the view family attributes into its own structures.
FSceneRenderer* SceneRenderer = FSceneRenderer::CreateSceneRenderer(ViewFamily, Canvas->GetHitProxyConsumer());
2.3.3 Kicking off the rendering pipeline
Once the renderer has been created, ENQUEUE_RENDER_COMMAND is used to kick off the rendering pipeline.
ENQUEUE_RENDER_COMMAND(FDrawSceneCommand)(
[SceneRenderer, DrawSceneEnqueue](FRHICommandListImmediate& RHICmdList)
{
const float StartDelayMillisec = FPlatformTime::ToMilliseconds(FPlatformTime::Cycles() - DrawSceneEnqueue);
CSV_CUSTOM_STAT_GLOBAL(DrawSceneCommand_StartDelay, StartDelayMillisec, ECsvCustomStatOp::Set);
// Kick off the rendering pipeline
RenderViewFamily_RenderThread(RHICmdList, SceneRenderer);
FlushPendingDeleteRHIResources_RenderThread();
});
The core job of RenderViewFamily_RenderThread is to call the renderer's Render function:
- deferred shading pipeline: FDeferredShadingSceneRenderer::Render;
- mobile platforms: FMobileSceneRenderer::Render.
static void RenderViewFamily_RenderThread(FRHICommandListImmediate& RHICmdList, FSceneRenderer* SceneRenderer)
{
//...
SceneRenderer->Render(RHICmdList);
}
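Summary:
- Through the ENQUEUE_RENDER_COMMAND macro, the game thread pushes commands to the render thread one at a time.
- When the game thread ticks, it first updates primitive and light data, then creates the renderer, and finally kicks off the rendering pipeline.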
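2.4 Synchronization
The game thread must not get more than one frame ahead of the render thread, so the two have to be synchronized.
The game thread and the render thread synchronize through TaskGraph.
The logic is:
- Add an empty task to the render thread's task queue.
- Wait for that task to finish; once it has, the render thread must have processed every task queued before it.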
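The two steps above amount to a fence. Below is a minimal standalone sketch of the idea using std::promise and std::future; FFenceWorkerSketch and BeginFence here are hypothetical names of mine and only mimic the pattern, not the engine's fence class. Note that this sketch waits on the fence immediately, so the "game thread" never runs ahead at all; I assume that waiting on the previous frame's fence instead is what would permit one frame of lead.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical worker that plays the role of the render thread.
class FFenceWorkerSketch
{
public:
    FFenceWorkerSketch() : Worker([this]() { Run(); }) {}

    ~FFenceWorkerSketch()
    {
        Enqueue([this]() { bQuit = true; });   // quit after everything already queued
        Worker.join();
    }

    void Enqueue(std::function<void()> Task)
    {
        {
            std::lock_guard<std::mutex> Lock(Mutex);
            Tasks.push_back(std::move(Task));
        }
        Wake.notify_one();
    }

    // Enqueue an empty task whose only effect is to signal completion.
    std::future<void> BeginFence()
    {
        auto Done = std::make_shared<std::promise<void>>();
        std::future<void> Completion = Done->get_future();
        Enqueue([Done]() { Done->set_value(); });
        return Completion;
    }

private:
    void Run()
    {
        while (!bQuit)
        {
            std::function<void()> Task;
            {
                std::unique_lock<std::mutex> Lock(Mutex);
                Wake.wait(Lock, [this]() { return !Tasks.empty(); });
                Task = std::move(Tasks.front());
                Tasks.pop_front();
            }
            Task();
        }
    }

    std::mutex Mutex;
    std::condition_variable Wake;
    std::deque<std::function<void()>> Tasks;
    bool bQuit = false;      // only touched on the worker thread
    std::thread Worker;      // declared last so the members above exist before it starts
};

int RunFenceDemoSketch()
{
    int FramesRendered = 0;
    FFenceWorkerSketch RenderThread;
    for (int Frame = 0; Frame < 3; ++Frame)
    {
        RenderThread.Enqueue([&FramesRendered]() { ++FramesRendered; });  // this frame's work
        std::future<void> Fence = RenderThread.BeginFence();
        Fence.wait();                          // the "game thread" blocks here
        assert(FramesRendered == Frame + 1);   // everything queued before the fence has run
    }
    return FramesRendered;
}
```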
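The synchronization is done at the end of the Tick function by calling the Sync function of the FFrameEndSync class.
FFrameEndSync
- wraps FRenderCommandFence and synchronizes by calling its BeginFence function.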
void FRenderCommandFence::BeginFence(bool bSyncToRHIAndGPU)
{
if (!GIsThreadedRendering)
{
return;
}
else
{
// Render thread is a default trigger for the CompletionEvent
TriggerThreadIndex = ENamedThreads::ActualRenderingThread;
if (BundledCompletionEvent.GetReference() && IsInGameThread())
{
CompletionEvent = BundledCompletionEvent;
return;
}
int32 GTSyncType = CVarGTSyncType.GetValueOnAnyThread();
if (bSyncToRHIAndGPU)
{
// Don't sync to the RHI and GPU if GtSyncType is disabled, or we're not vsyncing
//@TODO: do this logic in the caller?
static auto CVarVsync = IConsoleManager::Get().FindConsoleVariable(TEXT("r.VSync"));
check(CVarVsync != nullptr);
if ( GTSyncType == 0 || CVarVsync->GetInt() == 0 )
{
bSyncToRHIAndGPU = false;
}
}
if (bSyncToRHIAndGPU)
{
if (IsRHIThreadRunning())
{
// Change trigger thread to RHI
TriggerThreadIndex = ENamedThreads::RHIThread;
}
// Create a task graph event which we can pass to the render or RHI threads.
CompletionEvent = FGraphEvent::CreateGraphEvent();
FGraphEventRef InCompletionEvent = CompletionEvent;
ENQUEUE_RENDER_COMMAND(FSyncFrameCommand)(
[InCompletionEvent, GTSyncType](FRHICommandListImmediate& RHICmdList)
{
if (IsRHIThreadRunning())
{
ALLOC_COMMAND_CL(RHICmdList, FRHISyncFrameCommand)(InCompletionEvent, GTSyncType);
RHICmdList.ImmediateFlush(EImmediateFlushType::DispatchToRHIThread);
}
else
{
FRHISyncFrameCommand Command(InCompletionEvent, GTSyncType);
Command.Execute(RHICmdList);
}
});
}
else
{
// Sync Game Thread with Render Thread only
DECLARE_CYCLE_STAT(TEXT("FNullGraphTask.FenceRenderCommand"),
STAT_FNullGraphTask_FenceRenderCommand,
STATGROUP_TaskGraphTasks);
CompletionEvent = TGraphTask<FNullGraphTask>::CreateTask(NULL, ENamedThreads::GameThread).ConstructAndDispatchWhenReady(
GET_STATID(STAT_FNullGraphTask_FenceRenderCommand), ENamedThreads::GetRenderThread());
}
}
}
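3. The render thread and the RHI thread
3.1 Creating the RHI thread
The RHI thread is created in StartRenderingThread.
FRHIThread::Get().Start();
Its thread body is shown below. Like the render thread, it simply starts a thread that keeps running the RHI tasks in TaskGraph.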
virtual uint32 Run() override
{
LLM_SCOPE(ELLMTag::RHIMisc);
#if CSV_PROFILER
FCsvProfiler::Get()->SetRHIThreadId(FPlatformTLS::GetCurrentThreadId());
#endif
FMemory::SetupTLSCachesOnCurrentThread();
FTaskGraphInterface::Get().AttachToThread(ENamedThreads::RHIThread);
FTaskGraphInterface::Get().ProcessThreadUntilRequestReturn(ENamedThreads::RHIThread);
FMemory::ClearAndDisableTLSCachesOnCurrentThread();
return 0;
}
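3.2 The RHI thread's tasks
The RHI thread is responsible for submitting rendering commands to the GPU.
So where do the tasks it executes come from?
The answer lies in the FRHICommandList class.
The commands that the render thread ultimately submits go through FRHICommandList's drawing interfaces.
For example:
DrawPrimitive
DrawIndexedPrimitive
//...
Taking DrawPrimitive as an example: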
FORCEINLINE_DEBUGGABLE void DrawPrimitive(uint32 BaseVertexIndex, uint32 NumPrimitives, uint32 NumInstances)
{
//check(IsOutsideRenderPass());
if (Bypass())
{
// Branch 1
GetContext().RHIDrawPrimitive(BaseVertexIndex, NumPrimitives, NumInstances);
return;
}
// Branch 2
ALLOC_COMMAND(FRHICommandDrawPrimitive)(BaseVertexIndex, NumPrimitives, NumInstances);
}
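There are two branches:
- When Bypass is true, the command is issued immediately: the call goes straight to the RHI context on the current thread.
- When Bypass is false, the ALLOC_COMMAND macro allocates the corresponding command, which is then handed to the RHI thread.
Branch 1:
- With the RHI thread disabled, the render thread submits commands to the GPU itself, so I will not go into it further.
Branch 2:
Look at the ALLOC_COMMAND macro.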
#define ALLOC_COMMAND(...) new ( AllocCommand(sizeof(__VA_ARGS__), alignof(__VA_ARGS__)) ) __VA_ARGS__
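This uses placement new syntax:
new (address) (type) initializer
where:
- address is the memory location at which the object is constructed;
- type is the type of the object;
- initializer is the constructor call.
In other words, placement new lets us specify the address at which a new object of the given type is constructed.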
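As a small illustration of the language feature on its own, independent of the engine, the sketch below constructs an object inside storage that already exists and later destroys it by hand. FDrawSketch and its LiveCount counter are hypothetical and exist only to make construction and destruction observable.

```cpp
#include <cassert>
#include <new>

// Hypothetical type, used only to show when construction and destruction happen.
struct FDrawSketch
{
    int NumPrimitives;
    static int LiveCount;

    explicit FDrawSketch(int InNumPrimitives) : NumPrimitives(InNumPrimitives) { ++LiveCount; }
    ~FDrawSketch() { --LiveCount; }
};
int FDrawSketch::LiveCount = 0;

int RunPlacementNewDemoSketch()
{
    // Raw storage obtained up front; no constructor has run yet.
    alignas(FDrawSketch) unsigned char Storage[sizeof(FDrawSketch)];
    assert(FDrawSketch::LiveCount == 0);

    // new (address) type initializer: construct in the storage, no heap allocation.
    FDrawSketch* Cmd = new (Storage) FDrawSketch(3);
    assert(static_cast<void*>(Cmd) == static_cast<void*>(Storage));
    assert(FDrawSketch::LiveCount == 1);

    const int Result = Cmd->NumPrimitives;

    // There is no matching delete for this storage: the destructor is called explicitly.
    Cmd->~FDrawSketch();
    assert(FDrawSketch::LiveCount == 0);
    return Result;
}
```

The explicit destructor call is the same pattern as the ThisCmd->~TCmd() line in FRHICommand's ExecuteAndDestruct.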
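Accordingly, the AllocCommand function allocates a block of memory for an FRHICommandBase, links it into the command list, and returns its address.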
FORCEINLINE_DEBUGGABLE void* AllocCommand(int32 AllocSize, int32 Alignment)
{
checkSlow(!IsExecuting());
FRHICommandBase* Result = (FRHICommandBase*) MemManager.Alloc(AllocSize, Alignment);
// Count the new command
++NumCommands;
// Append at the tail of the linked list
*CommandLink = Result;
CommandLink = &Result->Next;
return Result;
}
So DrawPrimitive essentially constructs an FRHICommandDrawPrimitive, which is defined as follows.
FRHICOMMAND_MACRO(FRHICommandDrawPrimitive)
{
uint32 BaseVertexIndex;
uint32 NumPrimitives;
uint32 NumInstances;
FORCEINLINE_DEBUGGABLE FRHICommandDrawPrimitive(uint32 InBaseVertexIndex, uint32 InNumPrimitives, uint32 InNumInstances)
: BaseVertexIndex(InBaseVertexIndex)
, NumPrimitives(InNumPrimitives)
, NumInstances(InNumInstances)
{
}
RHI_API void Execute(FRHICommandListBase& CmdList);
};
It uses the FRHICOMMAND_MACRO macro.
#define FRHICOMMAND_MACRO(CommandName) \
struct PREPROCESSOR_JOIN(CommandName##String, __LINE__) \
{ \
static const TCHAR* TStr() { return TEXT(#CommandName); } \
}; \
struct CommandName final : public FRHICommand<CommandName, PREPROCESSOR_JOIN(CommandName##String, __LINE__)>
FRHICommand in turn derives from FRHICommandBase.
- Its ExecuteAndDestruct calls the command's Execute function and then runs the command's destructor.
template<typename TCmd, typename NameType = FUnnamedRhiCommand>
struct FRHICommand : public FRHICommandBase
{
#if RHICOMMAND_CALLSTACK
uint64 StackFrames[16];
FRHICommand()
{
FPlatformStackWalk::CaptureStackBackTrace(StackFrames, 16);
}
#endif
void ExecuteAndDestruct(FRHICommandListBase& CmdList, FRHICommandListDebugContext& Context) override final
{
TRACE_CPUPROFILER_EVENT_SCOPE_ON_CHANNEL_STR(NameType::TStr(), RHICommandsChannel);
TCmd *ThisCmd = static_cast<TCmd*>(this);
#if RHI_COMMAND_LIST_DEBUG_TRACES
ThisCmd->StoreDebugInfo(Context);
#endif
// Execute the command
ThisCmd->Execute(CmdList);
ThisCmd->~TCmd();
}
virtual void StoreDebugInfo(FRHICommandListDebugContext& Context) {};
};
For example, FRHICommandDrawPrimitive's Execute function is:
void FRHICommandDrawPrimitive::Execute(FRHICommandListBase& CmdList)
{
RHISTAT(DrawPrimitive);
INTERNAL_DECORATOR(RHIDrawPrimitive)(BaseVertexIndex, NumPrimitives, NumInstances);
}
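So the mechanism above turns each rendering call into a corresponding FRHICommand.
RHICommandList.h declares and implements a large number of ready-made FRHICommand types.
The next question is: how do these commands get executed?
I have not looked into this part carefully.
The key pieces should be these two tasks:
- FDispatchRHIThreadTask
- FExecuteRHIThreadTask
These tasks are created at the appropriate time, and the RHI thread eventually calls the following function:
FRHICommandListExecutor::ExecuteInner_DoExecute(*RHICmdList);
This function walks the commands in a while loop and performs the actual submission.
// The function below uses NextCommand to fetch the next command.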
FORCEINLINE_DEBUGGABLE FRHICommandBase* NextCommand()
{
FRHICommandBase* RHICmd = CmdPtr;
CmdPtr = RHICmd->Next;
NumCommands++;
return RHICmd;
}
void FRHICommandListExecutor::ExecuteInner_DoExecute(FRHICommandListBase& CmdList)
{
// ...
while (Iter.HasCommandsLeft())
{
FRHICommandBase* Cmd = Iter.NextCommand();
GCurrentCommand = Cmd;
//FPlatformMisc::Prefetch(Cmd->Next);
Cmd->ExecuteAndDestruct(CmdList, DebugContext);
}
}
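To summarize:
- The render thread generates the corresponding FRHICommand through FRHICommandList's interfaces.
- Each FRHICommand is constructed with placement new and linked into a list.
- At the appropriate time, the corresponding tasks (FDispatchRHIThreadTask, FExecuteRHIThreadTask) are created, which makes the RHI thread execute the queued FRHICommands.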
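The record-then-execute pattern in this summary can be sketched in standalone C++ as follows. Everything here is a hypothetical simplification of mine: a fixed-size arena replaces the engine's memory manager, a vector of ints stands in for "what reached the GPU", and the command is linked into the list after it is constructed rather than before. Execution happens on the calling thread; no RHI thread or task is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical, simplified analogue of a command base class with an intrusive link.
struct FCommandBaseSketch
{
    FCommandBaseSketch* Next = nullptr;
    virtual void ExecuteAndDestruct(std::vector<int>& Gpu) = 0;
    virtual ~FCommandBaseSketch() = default;
};

// CRTP layer: forwards to the concrete command's Execute, then destroys it.
template <typename TCmd>
struct TCommandSketch : FCommandBaseSketch
{
    void ExecuteAndDestruct(std::vector<int>& Gpu) override final
    {
        TCmd* ThisCmd = static_cast<TCmd*>(this);
        ThisCmd->Execute(Gpu);
        ThisCmd->~TCmd();   // storage belongs to the arena, so only the destructor runs
    }
};

struct FDrawCommandSketch final : TCommandSketch<FDrawCommandSketch>
{
    int NumPrimitives;
    explicit FDrawCommandSketch(int InNumPrimitives) : NumPrimitives(InNumPrimitives) {}
    void Execute(std::vector<int>& Gpu) { Gpu.push_back(NumPrimitives); }
};

class FCommandListSketch
{
public:
    explicit FCommandListSketch(bool bInBypass) : bBypass(bInBypass) {}

    void DrawPrimitive(int NumPrimitives)
    {
        if (bBypass)
        {
            Submitted.push_back(NumPrimitives);   // branch 1: issue immediately
            return;
        }
        // Branch 2: record the command in the arena and append it at the tail.
        void* Memory = AllocRaw(sizeof(FDrawCommandSketch), alignof(FDrawCommandSketch));
        FDrawCommandSketch* Cmd = new (Memory) FDrawCommandSketch(NumPrimitives);
        *CommandLink = Cmd;
        CommandLink = &Cmd->Next;
        ++NumCommands;
    }

    // Walk the list in recording order, executing and destroying each command.
    void ExecuteAll()
    {
        FCommandBaseSketch* Cmd = Root;
        while (Cmd != nullptr)
        {
            FCommandBaseSketch* NextCmd = Cmd->Next;   // read before the command is destroyed
            Cmd->ExecuteAndDestruct(Submitted);
            Cmd = NextCmd;
        }
        Root = nullptr;
        CommandLink = &Root;
        Used = 0;
        NumCommands = 0;
    }

    std::vector<int> Submitted;   // stand-in for work that reached the GPU
    int NumCommands = 0;

private:
    void* AllocRaw(std::size_t Size, std::size_t Alignment)
    {
        const std::size_t Offset = (Used + Alignment - 1) / Alignment * Alignment;
        assert(Offset + Size <= sizeof(Arena));
        Used = Offset + Size;
        return Arena + Offset;
    }

    bool bBypass;
    alignas(std::max_align_t) unsigned char Arena[1024];
    std::size_t Used = 0;
    FCommandBaseSketch* Root = nullptr;
    FCommandBaseSketch** CommandLink = &Root;
};

int RunCommandListDemoSketch()
{
    FCommandListSketch Deferred(false);
    Deferred.DrawPrimitive(1);
    Deferred.DrawPrimitive(2);
    assert(Deferred.Submitted.empty() && Deferred.NumCommands == 2);   // recorded only
    Deferred.ExecuteAll();
    assert(Deferred.Submitted.size() == 2 && Deferred.Submitted[0] == 1);

    FCommandListSketch Immediate(true);
    Immediate.DrawPrimitive(7);
    assert(Immediate.Submitted.size() == 1);   // bypass: executed at call time
    return Deferred.Submitted[1];
}
```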