Deep Video Portraits
Synthesizing and editing video portraits—i.e., videos framed to show a person’s head and upper body—is an important problem in computer graphics, with applications in video editing and movie postproduction, visual effects, visual dubbing, virtual reality, and telepresence, among others.
The problem of synthesizing a photo-realistic video portrait of a target actor that mimics the actions of a source actor—and especially where the source and target actors can be different subjects—is still an open problem.
Until now, there has been no approach that enables full control of the rigid head pose, facial expressions, and eye motion of a target actor, let alone one with which even the face identity can be modified to some extent.
In this post, I’m going to review “Deep Video Portraits”, which presents a novel approach that enables photo-realistic re-animation of portrait videos using only an input video.
In this post, I’ll cover two things: First, a short definition of a DeepFake. Second, an overview of the paper “Deep Video Portraits” in the words of the authors.
1. Defining DeepFakes
The word DeepFake combines the terms “deep learning” and “fake”, and refers to manipulated videos or other digital representations that produce fabricated images and sounds that appear to be real but have in fact been generated by deep neural networks.
DeepFake這個(gè)詞結(jié)合了“深度學(xué)習(xí)”和“偽造”兩個(gè)術(shù)語,指的是經(jīng)過操縱的視頻或其他數(shù)字表示形式,它們產(chǎn)生的偽造圖像和聲音看上去是真實(shí)的,但實(shí)際上是由深度神經(jīng)網(wǎng)絡(luò)生成的。
2. Deep Video Portraits

2.1 Overview
The core method presented in the paper provides full control over the head of a target actor by transferring the rigid head pose, facial expressions, and eye motion of a source actor, while preserving the target’s identity and appearance.
On top of that, full video of the target is synthesized, including consistent upper body posture, hair, and background.
Figure 1. Facial reenactment results from “DVP”. Expressions are transferred from the source to the target actor, while retaining the head pose (rotation and translation) as well as the eye gaze of the target actor.

The overall architecture of the paper’s framework is illustrated below in Figure 2.
First, the source and target actors are being tracked using a state-of-the-art face reconstruction approach from a single image, and a 3D morphable model (3DMM) is derived to best fit the source and target actors.
首先,使用最先進(jìn)的人臉重構(gòu)方法從單個(gè)圖像中跟蹤源角色和目標(biāo)角色,然后導(dǎo)出3D可變形模型(3DMM)以最適合源角色和目標(biāo)角色。
The resulting sequence of low-dimensional parameter vectors represents the actor’s identity, head pose, expression, eye gaze, and the scene lighting for every video frame.
Then, the head pose, expressions and/or eye gaze parameters from the source are taken and mixed with the illumination and identity parameters of the target. This allows the network to generate a full-head reenactment while preserving the actor’s identity and look.
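The parameter-mixing step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code; the field names and vector dimensions are assumptions based on the parameter groups the paper mentions.

```python
import numpy as np

def mix_parameters(source, target):
    """Combine per-frame face parameters for full-head reenactment.

    Head pose, expression, and eye gaze come from the source actor;
    identity and scene illumination are kept from the target, so the
    synthesized frames preserve the target's look.
    (Field names and sizes are illustrative, not the paper's notation.)
    """
    return {
        "pose": source["pose"],              # rigid head rotation + translation
        "expression": source["expression"],  # expression coefficients
        "gaze": source["gaze"],              # eye gaze parameters
        "identity": target["identity"],      # geometry + reflectance coefficients
        "illumination": target["illumination"],
    }

source = {"pose": np.zeros(6), "expression": np.ones(64),
          "gaze": np.zeros(4), "identity": np.zeros(160),
          "illumination": np.zeros(27)}
target = {"pose": np.ones(6), "expression": np.zeros(64),
          "gaze": np.ones(4), "identity": np.ones(160),
          "illumination": np.ones(27)}

mixed = mix_parameters(source, target)
```

For facial reenactment only (Section 3.2), the same function would copy just the expression entry from the source and take everything else from the target.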
Next, new synthetic renderings of the target actor are generated based on the mixed parameters. These renderings are the input to the paper’s novel “rendering-to-video translation network”, which is trained to convert the synthetic input into photo-realistic output.
Figure 2. Deep video portraits enable a source actor to fully control a target video portrait. First, a low-dimensional parametric representation (left) of both videos is obtained using monocular face reconstruction. The head pose, expression, and eye gaze can then be transferred in parameter space (middle). Finally, rendered conditioning input images are converted to a photo-realistic video portrait of the target actor (right). Obama video courtesy of the White House (public domain).

2.2 Face Reconstruction from a Single Image
3D morphable models are used for face analysis because the intrinsic properties of 3D faces provide a representation that’s immune to intra-personal variations, such as pose and illumination. Given a single facial input image, a 3DMM can recover 3D face (shape and texture) and scene properties (pose and illumination) via a fitting process.
The authors employ a state-of-the-art dense face reconstruction approach that fits a parametric model of the face and illumination to each video frame. It obtains a meaningful parametric face representation for both the source and the target, given an input video sequence.
Equation 1. Source actor video sequence, where N_s denotes the total number of source frames.

The meaningful parametric face representation consists of a set of parameters P, which can be stacked into the corresponding parameter sequence that fully describes the source or target facial performance.
Equation 2. A meaningful parametric face representation best describes each frame in the input video sequence.

The set of reconstructed parameters P encodes the rigid head pose, facial identity coefficients, expression coefficients, gaze direction for both eyes, and spherical harmonics illumination coefficients. Overall, the face reconstruction process estimates 261 parameters per video frame.
重建參數(shù)集P編碼了剛性頭部的姿勢(shì),面部識(shí)別系數(shù),表情系數(shù),兩只眼睛的注視方向以及球諧照明度系數(shù)。 總體而言,人臉重建過程估計(jì)每個(gè)視頻幀261個(gè)參數(shù)。
Below are more details on the parametric face representation and the fitting process.
2.2.1 Parametric Face Representation
The paper represents the space of facial identity based on a parametric head model, and the space of facial expressions via an affine model. Mathematically, they model geometry variation through an affine model v∈ R^(3N) that stacks per-vertex deformations of the underlying template mesh with N vertices, as follows:
Equation 3. Per-vertex deformations of the underlying template mesh with N vertices.

Here a_{geo} ∈ R^(3N) stores the average facial geometry. The basis vectors b_k for the geometry were computed by applying principal component analysis (PCA) to 200 high-quality face scans, and the b_k for the expressions were obtained in the same manner from blendshapes.
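The affine model of Equation 3 is just an average shape plus weighted basis vectors, which can be evaluated as a couple of matrix-vector products. The toy dimensions below are for illustration only.

```python
import numpy as np

def face_geometry(a_geo, B_geo, alpha, B_exp, delta):
    """Evaluate the affine face model of Equation 3.

    a_geo : (3N,) average facial geometry (stacked x, y, z per vertex)
    B_geo : (3N, K_geo) PCA geometry basis
    alpha : (K_geo,) identity coefficients
    B_exp : (3N, K_exp) blendshape expression basis
    delta : (K_exp,) expression coefficients
    Returns the deformed vertex positions v in R^(3N).
    """
    return a_geo + B_geo @ alpha + B_exp @ delta

# Tiny toy example: N = 2 vertices, one identity and one expression basis vector.
N = 2
a_geo = np.zeros(3 * N)
B_geo = np.ones((3 * N, 1))
B_exp = 2 * np.ones((3 * N, 1))
v = face_geometry(a_geo, B_geo, np.array([0.5]), B_exp, np.array([0.25]))
# every coordinate: 0 + 0.5 * 1 + 0.25 * 2 = 1.0
```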
2.2.2 Image Formation Model
To render synthetic head images, a full perspective camera is assumed that maps model-space 3D points v via camera space to 2D points on the image plane. The perspective mapping Π contains the multiplication with the camera intrinsics and the perspective division.
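The perspective mapping Π amounts to multiplying a camera-space point by the intrinsic matrix and then dividing by depth. A minimal sketch, with illustrative intrinsic values that are not from the paper:

```python
import numpy as np

def project(v_cam, K):
    """Perspective mapping: multiply by the camera intrinsics K,
    then apply the perspective division by depth.

    v_cam : (3,) point in camera space (z > 0)
    K     : (3, 3) intrinsics [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    Returns the (2,) pixel coordinates on the image plane.
    """
    p = K @ v_cam
    return p[:2] / p[2]

# Example intrinsics: 500 px focal length, principal point at (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
uv = project(np.array([0.0, 0.0, 2.0]), K)
# a point on the optical axis projects to the principal point: (320, 240)
```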
In addition, based on a distant illumination assumption, spherical harmonics basis functions are used to approximate the incoming radiance B from the environment.
Equation 4. Spherical harmonics basis functions are used to approximate the incoming radiance B from the environment.

Here B is the number of spherical harmonics bands, γ_b the spherical harmonics coefficients, and r_i and n_i the reflectance and unit normal vector of the i-th vertex, respectively.
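To make the spherical harmonics shading concrete, here is a simplified per-vertex evaluation using only the first two real SH bands (four basis functions). This is a reduced sketch; the paper may use more bands, and the coefficient layout here is an assumption.

```python
import numpy as np

def sh_radiance(reflectance, normal, gamma):
    """Approximate the incoming radiance at one vertex (Equation 4),
    truncated to the first two real spherical-harmonics bands.

    reflectance : scalar albedo r_i of the vertex
    normal      : (3,) unit normal n_i
    gamma       : (4,) SH illumination coefficients
    """
    nx, ny, nz = normal
    y = np.array([
        0.282095,        # band 0: constant basis function
        0.488603 * ny,   # band 1, three linear basis functions
        0.488603 * nz,
        0.488603 * nx,
    ])
    return reflectance * float(gamma @ y)

# Pure ambient light: only the constant SH coefficient is non-zero,
# so the shading is independent of the normal direction.
shade = sh_radiance(1.0, np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0, 0.0]))
```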
2.3 Synthetic Conditioning Input
Using the face reconstruction approach described above, a face is reconstructed in each frame of the source and target video. Next, the rigid head pose, expression, and eye gaze of the target actor are modified. All parameters are copied in a relative manner from the source to the target.
Then the authors render synthetic conditioning images of the target actor’s face model under the modified parameters using hardware rasterization.
For each frame, three different conditioning inputs are generated: a color rendering, a correspondence image, and an eye gaze image.
Figure 3. The synthetic input used for conditioning the rendering-to-video translation network: (1) colored face rendering under target illumination, (2) correspondence image, and (3) eye gaze image.

The color rendering shows the modified target actor model under the estimated target illumination, while keeping the target identity (geometry and skin reflectance) fixed. This image provides a good starting point for the following rendering-to-video translation, since in the face region only the delta to a real image has to be learned.
A correspondence image encoding the index of the parametric model’s vertex that projects into each pixel is also rendered to keep the 3D information.
Finally, a gaze map supplies information about the eye gaze direction and blinking.
All of the images are stacked to obtain the input to the rendering-to-video translation network.
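The stacking step can be sketched at the shape level as follows. The resolution, per-image channel counts, and temporal window length below are assumptions for illustration, not values taken from the paper.

```python
import numpy as np

H, W = 256, 256  # frame resolution (assumed)
T = 3            # temporal window length (assumed; the paper stacks several frames)

def conditioning_frame():
    """One frame of synthetic conditioning input: color rendering (3ch)
    + correspondence image (3ch) + eye gaze image (3ch), concatenated
    along the channel axis."""
    color = np.zeros((H, W, 3), dtype=np.float32)
    correspondence = np.zeros((H, W, 3), dtype=np.float32)
    gaze = np.zeros((H, W, 3), dtype=np.float32)
    return np.concatenate([color, correspondence, gaze], axis=-1)  # (H, W, 9)

# Stack a short temporal window along the channel axis to form the
# space-time input tensor for the rendering-to-video translation network.
window = [conditioning_frame() for _ in range(T)]
net_input = np.concatenate(window, axis=-1)  # (H, W, 9 * T)
```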
2.4 Rendering-To-Video Translation
The generated conditioning space-time stacked images are the input to the rendering-to-video translation network.
The network learns to convert the synthetic input into full frames of a photo-realistic target video, in which the target actor now mimics the head motion, facial expression, and eye gaze of the synthetic input.
網(wǎng)絡(luò)學(xué)習(xí)如何將合成輸入轉(zhuǎn)換為逼真的目標(biāo)視頻的全幀,目標(biāo)演員現(xiàn)在可以模仿合成輸入的頭部運(yùn)動(dòng),面部表情和視線。
The network learns to synthesize the entire actor in the foreground, i.e., the face for which conditioning input exists, but also all other parts of the actor, such as hair and body, so that they comply with the target head pose.
網(wǎng)絡(luò)學(xué)習(xí)合成前景中的整個(gè)演員,即存在條件輸入的面部,以及演員的所有其他部分(例如頭發(fā)和身體),以使其符合目標(biāo)頭部姿勢(shì)。
It also synthesizes the appropriately modified and filled-in background, even including some consistent lighting effects between the foreground and background.
The network shown in Figure 4 follows an encoder-decoder architecture and is trained in an adversarial manner.
Figure 4. The rendering-to-video translation network follows an encoder-decoder architecture.

The training objective function comprises a conditioned adversarial loss and an L1 photometric loss.
訓(xùn)練目標(biāo)函數(shù)包括條件對(duì)抗損失和L1光度損失。
Equation 5. Rendering-to-video translation objective function.

During adversarial training, the discriminator D tries to get better at classifying given images as real or synthetic, while the transformation network T tries to improve at fooling the discriminator. The L1 loss penalizes the distance between the synthesized image T(x) and the ground truth image Y, which encourages sharpness in the synthesized output:
在對(duì)抗訓(xùn)練中,鑒別器D試圖在將給定圖像分類為真實(shí)或合成圖像方面做得更好,而變換網(wǎng)絡(luò)T試圖在欺騙鑒別器方面進(jìn)行改進(jìn)。 L1損失會(huì)懲罰合成圖像T(x)與地面真實(shí)圖像Y之間的距離,這會(huì)鼓勵(lì)合成輸出的清晰度:
Equation 6. L1 loss.
3.實(shí)驗(yàn)與結(jié)果 (3. Experiments & Results)
This approach enables us to take full control of the rigid head pose, facial expression, and eye motion of a target actor in a video portrait, thus opening up a wide range of video rewrite applications.
3.1 Reenactment Under Full Head Control
This approach is the first that can photo-realistically transfer the full 3D head pose (spatial position and rotation), facial expression, as well as eye gaze and eye blinking of a captured source actor to a target actor video.
Figure 5 shows some examples of full-head reenactment between different source and target actors. Here, the authors use the full target video for training and the source video as the driving sequence.
As can be seen, the output of their approach achieves a high level of realism and faithfully mimics the driving sequence, while still retaining the mannerisms of the original target actor.
Figure 5. Qualitative results of full-head reenactment.

3.2 Facial Reenactment and Video Dubbing
Besides full-head reenactment, the approach also enables facial reenactment. In this experiment, the authors replaced the expression coefficients of the target actor with those of the source actor before synthesizing the conditioning input to the rendering-to-video translation network.
Here, the head pose and position and eye gaze remain unchanged. Figure 6 shows facial reenactment results.
Figure 6. Facial reenactment results.

Video dubbing can also be realized by modifying the facial motion of actors originally speaking another language to match an English translation spoken by a professional dubbing actor in a dubbing studio.
More precisely, the captured facial expressions of the dubbing actor could be transferred to the target actor, while leaving the original target gaze and eye blinks intact.
更準(zhǔn)確地說,可以將捕獲的配音演員的面部表情轉(zhuǎn)移到目標(biāo)演員,同時(shí)保持原始目標(biāo)注視和眨眼完好無損。
4. Discussion
In this post, I presented Deep Video Portraits, a novel approach that enables photo-realistic re-animation of portrait videos using only an input video.
In contrast to existing approaches that are restricted to manipulations of facial expressions only, the authors are the first to transfer the full 3D head position, head rotation, face expression, eye gaze, and eye blinking from a source actor to a portrait video of a target actor.
The authors have shown, through experiments and a user study, that their method outperforms prior work, both in terms of model performance and expanded capabilities. This opens doors to many applications, like video reenactment for virtual reality and telepresence, interactive video editing, and visual dubbing.
作者通過實(shí)驗(yàn)和用戶研究表明,在模型性能和擴(kuò)展功能方面,他們的方法均優(yōu)于先前的工作。 這為許多應(yīng)用打開了大門,例如用于虛擬現(xiàn)實(shí)和遠(yuǎn)程呈現(xiàn)的視頻重現(xiàn),交互式視頻編輯以及可視化配音。
5。結(jié)論 (5. Conclusions)
As always, if you have any questions or comments feel free to leave your feedback below or you can always reach me on LinkedIn.
Till then, see you in the next post! 😄
For the enthusiastic reader: For more details on “Deep Video Portraits,” check out the formal project page or their video demo.
Editor’s Note: Heartbeat is a contributor-driven online publication and community dedicated to exploring the emerging intersection of mobile app development and machine learning. We’re committed to supporting and inspiring developers and engineers from all walks of life.
Editorially independent, Heartbeat is sponsored and published by Fritz AI, the machine learning platform that helps developers teach devices to see, hear, sense, and think. We pay our contributors, and we don’t sell ads.
If you’d like to contribute, head on over to our call for contributors. You can also sign up to receive our weekly newsletters (Deep Learning Weekly and the Fritz AI Newsletter), join us on Slack, and follow Fritz AI on Twitter for all the latest in mobile machine learning.
Translated from: https://heartbeat.fritz.ai/deep-video-portraits-f0f4a136546a