yjyoon / whisper_streaming_Deprecated
whisper_streaming_Deprecated / mic_test_whisper_streaming.py
File name                      Commit message                                      Commit date
.gitignore                     Initial commit                                      2023-04-05
LICENSE                        Initial commit                                      2023-04-05
README.md                      README update auto language detection               2024-02-06
line_packet.py                 sending line packets without zero padding           2024-02-21
mic_test_whisper_simple.py     vad                                                 2023-12-10
mic_test_whisper_streaming.py  vad                                                 2023-12-10
microphone_stream.py           use of silero model instead of silero VadIterator   2023-12-07
voice_activity_controller.py   VAC                                                 2024-01-04
whisper_online.py              increasing timestamps fixed                         2024-02-07
whisper_online_server.py       bugfix                                              2024-05-28
whisper_online_vac.py          increasing timestamps fixed                         2024-02-07
Latest commit for this file: 9bf8954 "vad" by Rodrigo, 2023-12-10
from microphone_stream import MicrophoneStream
from voice_activity_controller import VoiceActivityController
from whisper_online import *

import io
import numpy as np
import librosa
import soundfile

SAMPLING_RATE = 16000
model = "large-v2"
src_lan = "en"  # source language
tgt_lan = "en"  # target language -- same as source for ASR; "en" if the translate task is used
use_vad_result = True
min_sample_length = 1 * SAMPLING_RATE  # buffer at least one second of audio before transcribing

asr = FasterWhisperASR(src_lan, model)       # loads and wraps the Whisper model
tokenizer = create_tokenizer(tgt_lan)        # sentence segmenter for the target language
online = OnlineASRProcessor(asr, tokenizer)  # create the streaming processing object

microphone_stream = MicrophoneStream()
vad = VoiceActivityController(use_vad_result=use_vad_result)

complete_text = ''
out = []     # buffered audio chunks
out_len = 0  # total buffered samples

# Processing loop: the VAD yields (raw_bytes, is_final) pairs.
for raw_bytes, is_final in vad.detect_user_speech(microphone_stream):
    if raw_bytes:
        # Decode raw little-endian 16-bit PCM into a float array.
        sf = soundfile.SoundFile(io.BytesIO(raw_bytes), channels=1, endian="LITTLE",
                                 samplerate=SAMPLING_RATE, subtype="PCM_16", format="RAW")
        audio, _ = librosa.load(sf, sr=SAMPLING_RATE)
        out.append(audio)
        out_len += len(audio)

    if (is_final or out_len >= min_sample_length) and out_len > 0:
        a = np.concatenate(out)
        online.insert_audio_chunk(a)

        if out_len > min_sample_length:
            o = online.process_iter()
            print('-----' * 10)
            complete_text += o[2]
            print('PARTIAL - ' + complete_text)  # do something with the current partial output
            print('-----' * 10)
            out = []
            out_len = 0

        if is_final:
            o = online.finish()
            print('-----' * 10)
            complete_text += o[2]
            print('FINAL - ' + complete_text)  # do something with the final output
            print('-----' * 10)
            online.init()  # reset the processor for the next utterance
            out = []
            out_len = 0
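The script decodes the VAD's raw bytes through soundfile and librosa. As a point of reference, the same conversion (raw little-endian 16-bit PCM to float samples in [-1, 1]) can be sketched with NumPy alone; the helper name `pcm16_to_float` is ours for illustration, not part of the repo:

```python
import numpy as np

SAMPLING_RATE = 16000

def pcm16_to_float(raw_bytes: bytes) -> np.ndarray:
    """Raw little-endian 16-bit PCM -> float32 samples in [-1.0, 1.0)."""
    samples = np.frombuffer(raw_bytes, dtype="<i2").astype(np.float32)
    return samples / 32768.0

# One second of silence: 16000 samples, 2 bytes each.
audio = pcm16_to_float(b"\x00\x00" * SAMPLING_RATE)
print(len(audio))  # 16000
```

This is only a sketch of the decoding step; the script itself still needs soundfile/librosa for resampling and format handling.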