🐛 fix(GenerationContext.py): fix import statements and add support for captioning engine

feat(GenerationContext.py): add support for captioning engine in the GenerationContext class

`moviepy.editor` is now imported as `mp` and `gradio` as `gr`, both for readability. The `GenerationContext` class takes a new `captioningengine` parameter and stores it in a `captioningengine` attribute. `setup_dir` now creates the directory for the output files, and `get_file_path` returns the file path based on that output directory. `process` gains the captioning steps: the result of `ttsengine.synthesize` is stored in the new `timed_script` attribute, `captioningengine` generates the captions and stores them in the `captions` attribute, and the final video is rendered with `moviepy` and saved as "final.mp4" in the output directory.
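
The flow described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this commit: the stub engines, the `get_captions` method name, the `synthesize(text, path)` signature, the fields of `Word`, the `output` directory name, and the `audio.wav` file name are all assumptions made for the example. The `moviepy` rendering step is left out and only the target path is returned.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Tuple
import tempfile


@dataclass
class Word:
    """One spoken word with start/end times in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


class StubTTSEngine:
    """Stand-in for a TTS engine: returns fake word timings, writes no audio."""

    def synthesize(self, text: str, path: str) -> List[Word]:
        # Assumption: every word lasts 0.5 s. A real engine would write audio
        # to `path` and derive the timings from it.
        return [Word(w, i * 0.5, (i + 1) * 0.5) for i, w in enumerate(text.split())]


class StubCaptioningEngine:
    """Stand-in for a captioning engine: groups words into short captions."""

    def __init__(self, words_per_caption: int = 3):
        self.words_per_caption = words_per_caption

    def get_captions(self, words: List[Word]) -> List[Tuple[float, float, str]]:
        # Each caption is (start, end, text) covering a fixed-size run of words.
        captions = []
        for i in range(0, len(words), self.words_per_caption):
            chunk = words[i:i + self.words_per_caption]
            text = " ".join(w.text for w in chunk)
            captions.append((chunk[0].start, chunk[-1].end, text))
        return captions


class GenerationContext:
    """Simplified context holding the engines and the intermediate results."""

    def __init__(self, ttsengine, captioningengine, base_dir: str):
        self.ttsengine = ttsengine
        self.captioningengine = captioningengine
        self.base_dir = Path(base_dir)
        self.dir: Optional[Path] = None
        self.timed_script: List[Word] = []
        self.captions: List[Tuple[float, float, str]] = []

    def setup_dir(self) -> None:
        # Create the directory that receives every output file.
        self.dir = self.base_dir / "output"
        self.dir.mkdir(parents=True, exist_ok=True)

    def get_file_path(self, name: str) -> str:
        # Resolve a file name against the output directory.
        return str(self.dir / name)

    def process(self, script: str) -> str:
        self.setup_dir()
        audio_path = self.get_file_path("audio.wav")
        self.timed_script = self.ttsengine.synthesize(script, audio_path)
        self.captions = self.captioningengine.get_captions(self.timed_script)
        # The commit renders the video with moviepy at this point; the sketch
        # only returns where the final file would be written.
        return self.get_file_path("final.mp4")
```

A call such as `GenerationContext(StubTTSEngine(), StubCaptioningEngine(2), tempfile.mkdtemp()).process("one two three four five")` fills `timed_script` with five `Word` entries and `captions` with three `(start, end, text)` tuples. Keeping the engines behind plain attributes is what lets either one be swapped without touching `process`.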
2024-02-17 18:47:30 +01:00
parent eedbc99121
commit e3229518d4
12 changed files with 261 additions and 34 deletions

@@ -5,8 +5,9 @@ import os
import torch
-from .BaseTTSEngine import BaseTTSEngine
+from .BaseTTSEngine import BaseTTSEngine, Word
from ...utils.prompting import get_prompt
class CoquiTTSEngine(BaseTTSEngine):
voices = [
@@ -122,8 +123,10 @@ class CoquiTTSEngine(BaseTTSEngine):
)
if self.to_force_duration:
self.force_duration(float(self.duration), path)
+return self.time_with_whisper(path)
@classmethod
def get_options(cls) -> list:
options = [
@@ -131,7 +134,7 @@ class CoquiTTSEngine(BaseTTSEngine):
label="Voice",
choices=cls.voices,
max_choices=1,
-value=cls.voices[0],
+value="Damien Black",
),
gr.Dropdown(
label="Language",
@@ -145,6 +148,7 @@ class CoquiTTSEngine(BaseTTSEngine):
label="Force duration",
info="Force the duration of the generated audio to be at most the specified value",
value=False,
+show_label=True,
)
duration = gr.Number(
label="Duration [s]", value=57, step=1, minimum=10, visible=False