1️⃣Text-to-Speech: TTS


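The Text-to-Speech (TTS) task uses a machine learning model to convert written text into spoken words.

Because the model can generate natural-sounding speech from text input, it is useful for applications such as voice assistants, audiobooks, and accessibility tools.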

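from transformers import pipeline
from IPython.display import Audio

TEXT = "And you know what they call a... a... a Quarter Pounder with Cheese in Seoul?"

# Load the small Bark checkpoint; the task is named explicitly rather than inferred from the model.
pipe = pipeline("text-to-speech", model="suno/bark-small")

# The pipeline returns a dict holding the waveform ("audio") and its "sampling_rate".
output = pipe(TEXT)
print(output)

# In a notebook, render an audio player for the generated speech.
Audio(output["audio"], rate=output["sampling_rate"])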
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.


{'audio': array([[-0.00355643, -0.00254505, -0.00216578, ...,  0.00740955,
         0.00741918,  0.00732478]], dtype=float32), 'sampling_rate': 24000}
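The result shown above is a dict with a float32 waveform of shape (1, N) and a sampling rate of 24000 Hz. One way to keep the audio is to write it to a WAV file. The sketch below is a minimal illustration under those assumptions: `save_wav` is a hypothetical helper (not part of transformers), the file name is arbitrary, and a synthetic sine tone stands in for the model output so the snippet runs without downloading the checkpoint.

```python
import wave

import numpy as np


def save_wav(audio, sampling_rate, path):
    """Write float audio in [-1, 1] to a 16-bit mono PCM WAV file.

    Hypothetical helper for illustration; returns the duration in seconds.
    """
    # The pipeline output has shape (1, N); flatten it to (N,).
    samples = np.asarray(audio, dtype=np.float32).reshape(-1)
    # Clip to the valid range, then scale to little-endian 16-bit integers.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm) / sampling_rate


# Stand-in for the pipeline output (assumption: same keys, dtype, and shape).
sampling_rate = 24000
t = np.arange(sampling_rate) / sampling_rate
tone = 0.1 * np.sin(2 * np.pi * 440.0 * t)
stand_in_output = {
    "audio": tone.astype(np.float32)[np.newaxis, :],
    "sampling_rate": sampling_rate,
}

duration = save_wav(
    stand_in_output["audio"], stand_in_output["sampling_rate"], "tts_output.wav"
)
print(f"Wrote tts_output.wav ({duration:.2f} s)")
```

With the real pipeline, the same call would presumably take `output["audio"]` and `output["sampling_rate"]` in place of the stand-in values.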
