Commit @4e7db33d055ff92d42881c2af839f6f6d49a3ca2 - yjyoon/whisper_server

Fedir Zadniprovskyi 2024-12-22

chore: format README.md

@4e7db33d055ff92d42881c2af839f6f6d49a3ca2

5ad8603

4e7db33

README.md

--- README.md

+++ README.md


 # Faster Whisper Server
+
 `faster-whisper-server` is an OpenAI API-compatible transcription server which uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper) as its backend.
 Features:
+
 - GPU and CPU support.
 - Easily deployable using Docker.
 - **Configurable through environment variables (see [config.py](./src/faster_whisper_server/config.py))**.

 Please create an issue if you find a bug, have a question, or a feature suggestion.
 
 ## OpenAI API Compatibility ++
+
 See [OpenAI API reference](https://platform.openai.com/docs/api-reference/audio) for more information.
+
 - Audio file transcription via `POST /v1/audio/transcriptions` endpoint.
-    - Unlike OpenAI's API, `faster-whisper-server` also supports streaming transcriptions (and translations). This is useful for when you want to process large audio files and would rather receive the transcription in chunks as they are processed, rather than waiting for the whole file to be transcribed. It works similarly to chat messages when chatting with LLMs.
+  - Unlike OpenAI's API, `faster-whisper-server` also supports streaming transcriptions (and translations). This is useful for when you want to process large audio files and would rather receive the transcription in chunks as they are processed, rather than waiting for the whole file to be transcribed. It works similarly to chat messages when chatting with LLMs.
 - Audio file translation via `POST /v1/audio/translations` endpoint.
--  Live audio transcription via `WS /v1/audio/transcriptions` endpoint.
-    - LocalAgreement2 ([paper](https://aclanthology.org/2023.ijcnlp-demo.3.pdf) | [original implementation](https://github.com/ufal/whisper_streaming)) algorithm is used for live transcription.
-    - Only transcription of a single channel, 16000 sample rate, raw, 16-bit little-endian audio is supported.
+- Live audio transcription via `WS /v1/audio/transcriptions` endpoint.
+  - LocalAgreement2 ([paper](https://aclanthology.org/2023.ijcnlp-demo.3.pdf) | [original implementation](https://github.com/ufal/whisper_streaming)) algorithm is used for live transcription.
+  - Only transcription of a single channel, 16000 sample rate, raw, 16-bit little-endian audio is supported.
 
 ## Quick Start
+
 [Hugging Face Space](https://huggingface.co/spaces/Iatalking/fast-whisper-server)
 
 ![image](https://github.com/fedirz/faster-whisper-server/assets/76551385/6d215c52-ded5-41d2-89a5-03a6fd113aa0)
 
-Using Docker Compose (Recommended)
+### Using Docker Compose (Recommended)
+
 NOTE: I'm using newer Docker Compsose features. If you are using an older version of Docker Compose, you may need need to update.
 
 ```bash

 docker compose --file compose.cpu.yaml up --detach
 ```
 
-Using Docker
+### Using Docker
+
 ```bash
 # for GPU support
 docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --detach fedirz/faster-whisper-server:latest-cuda

 docker run --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface --env WHISPER__MODEL=Systran/faster-whisper-small --detach fedirz/faster-whisper-server:latest-cpu
 ```
 
-Using Kubernetes: [tutorial](https://substratus.ai/blog/deploying-faster-whisper-on-k8s)
+### Using Kubernetes
+
+Follow [this tutorial](https://substratus.ai/blog/deploying-faster-whisper-on-k8s)
 
 ## Usage
+
 If you are looking for a step-by-step walkthrough, check out [this](https://www.youtube.com/watch?app=desktop&v=vSN-oAl6LVs) YouTube video.
 
 ### OpenAI API CLI
+
 ```bash
 export OPENAI_API_KEY="cant-be-empty"
 export OPENAI_BASE_URL=http://localhost:8000/v1/
 ```
+
 ```bash
 openai api audio.transcriptions.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format text
 
 openai api audio.translations.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format verbose_json
 ```
+
 ### OpenAI API Python SDK
+
 ```python
 from openai import OpenAI
 

 ```
 
 ### cURL
+
 ```bash
 # If `model` isn't specified, the default model is used
 curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"

 ```
 
 ### Live Transcription (using WebSocket)
+
 From [live-audio](./examples/live-audio) example
 
 https://github.com/fedirz/faster-whisper-server/assets/76551385/e334c124-af61-41d4-839c-874be150598f
 
 [websocat](https://github.com/vi/websocat?tab=readme-ov-file#installation) installation is required.
 Live transcription of audio data from a microphone.
+
 ```bash
 ffmpeg -loglevel quiet -f alsa -i default -ac 1 -ar 16000 -f s16le - | websocat --binary ws://localhost:8000/v1/audio/transcriptions
 ```
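
Note on the audio format mentioned in the README above: the live transcription endpoint is described as accepting only single-channel, 16000 Hz, raw, 16-bit little-endian audio, which is what the `ffmpeg ... -ac 1 -ar 16000 -f s16le` command produces. As a rough illustration of that byte layout, the sketch below builds such a buffer in plain Python. The function names `floats_to_s16le` and `sine_wave` are hypothetical helpers invented for this note and are not part of `faster-whisper-server`; the sketch covers only the encoding, not sending data to the server.

```python
import math
import struct

SAMPLE_RATE = 16000  # Hz, the sample rate the README says is required


def floats_to_s16le(samples):
    """Encode floats in [-1.0, 1.0] as raw mono 16-bit little-endian PCM.

    Hypothetical helper for illustration; values outside the range are clipped.
    """
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)


def sine_wave(freq_hz, seconds):
    """Generate a mono test tone as a list of floats (hypothetical helper)."""
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(count)]


# Half a second of a 440 Hz tone: 8000 samples, 2 bytes each.
pcm = floats_to_s16le(sine_wave(440.0, 0.5))
```

Each sample occupies two bytes with the low byte first, so one second of audio in this format is 32000 bytes.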


