

Removed duplicate variable self.last_chunked_at
I tried to work out the difference between self.last_chunked_at and self.buffer_time_offset, and it took me a while to realize that they always hold the same value. Removing one of the duplicates makes the code more readable.
@98c5dd4609dc892a7ff7309468e6fbb1acc647ce
--- whisper_online.py
+++ whisper_online.py
@@ -328,7 +328,6 @@
 
         self.transcript_buffer = HypothesisBuffer(logfile=self.logfile)
         self.commited = []
-        self.last_chunked_at = 0
 
         self.silence_iters = 0
 
@@ -340,7 +339,7 @@
         "context" is the commited text that is inside the audio buffer. It is transcribed again and skipped. It is returned only for debugging and logging reasons.
         """
         k = max(0,len(self.commited)-1)
-        while k > 0 and self.commited[k-1][1] > self.last_chunked_at:
+        while k > 0 and self.commited[k-1][1] > self.buffer_time_offset:
             k -= 1
 
         p = self.commited[:k]
@@ -451,7 +450,6 @@
         cut_seconds = time - self.buffer_time_offset
         self.audio_buffer = self.audio_buffer[int(cut_seconds*self.SAMPLING_RATE):]
         self.buffer_time_offset = time
-        self.last_chunked_at = time
 
     def words_to_sentences(self, words):
         """Uses self.tokenizer for sentence segmentation of words.
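To illustrate why the two attributes are interchangeable, here is a minimal sketch of the pre-change behavior. It is not the real class: BufferSketch, chunk_at, split_index, the 16000 Hz rate, and the (start, end, word) tuple layout are all assumptions made for illustration. It also assumes both attributes start at 0, as the description implies. Only the assignments and the while loop mirror the diff.

```python
class BufferSketch:
    """Hypothetical stand-in; keeps only the fields touched by the diff."""

    SAMPLING_RATE = 16000  # assumed value, for illustration only

    def __init__(self):
        self.audio_buffer = [0.0] * (self.SAMPLING_RATE * 10)  # 10 s of fake samples
        self.buffer_time_offset = 0
        self.last_chunked_at = 0  # the attribute this change removes
        # Assumed layout: (start, end, word). The spelling follows the source.
        self.commited = []

    def chunk_at(self, time):
        """Trim the buffer up to `time`; both attributes are set together."""
        cut_seconds = time - self.buffer_time_offset
        self.audio_buffer = self.audio_buffer[int(cut_seconds * self.SAMPLING_RATE):]
        self.buffer_time_offset = time
        self.last_chunked_at = time

    def split_index(self, boundary):
        """Walk back over committed words that end after `boundary`."""
        k = max(0, len(self.commited) - 1)
        while k > 0 and self.commited[k - 1][1] > boundary:
            k -= 1
        return k


sketch = BufferSketch()
sketch.commited = [(0, 1, "a"), (1, 2, "b"), (2, 3, "c"), (3, 4, "d")]
sketch.chunk_at(2.5)

# Either attribute yields the same split point, so one of them is redundant.
same_value = sketch.buffer_time_offset == sketch.last_chunked_at
same_split = (sketch.split_index(sketch.buffer_time_offset)
              == sketch.split_index(sketch.last_chunked_at))
```

Because every write to one attribute is paired with an identical write to the other, the loop condition reads the same number whichever name it uses, which is why the change should not alter behavior.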