[docs] update docs on gated models
chrisbrickhouse committed Apr 10, 2024
1 parent 7899076 commit b22f8e1
Showing 10 changed files with 884 additions and 61 deletions.
21 changes: 17 additions & 4 deletions README.md
@@ -68,7 +68,7 @@ buffer.close = lambda: None
tg.write(buffer)
print(buffer.getvalue())
```
## HuggingFace models used
## Using gated models
Artificial intelligence models are powerful and can be dangerous in the wrong hands. The models used by fave-asr are free of charge, but you must accept additional terms of use to access them.

To use these models:
@@ -80,10 +80,23 @@ To use these models:
Keep track of your token and keep it safe (e.g. don't accidentally upload it to GitHub).
We suggest creating an environment variable for your token so that you don't need to paste it into your files.

### Creating an environment variable for your token
#### Linux and Mac
1. Open `~/.bashrc` in a text editor
## Creating an environment variable for your token
Storing your token in an environment variable helps you avoid leaking it by accident. Instead of pasting the token into your code and remembering to delete it before you commit, you can read it from Python with `os.environ["HF_TOKEN"]`. This also makes your code more readable: the name `HF_TOKEN` says what the value is, while a raw string of letters and numbers does not.
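
As a minimal sketch of the approach described above, the token can be read through a small helper that fails with a readable message when the variable is missing, rather than a bare `KeyError`. The helper name `get_hf_token` is hypothetical and is not part of fave-asr; only `os.environ` and the variable name `HF_TOKEN` come from the text.

```python
import os


def get_hf_token(var_name: str = "HF_TOKEN") -> str:
    """Return the token stored in an environment variable.

    `get_hf_token` is an illustrative helper, not a fave-asr function.
    It raises RuntimeError if the variable is unset or blank.
    """
    # .get() avoids a KeyError; strip() removes stray whitespace from copy-paste
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set. Define it as an environment variable "
            "and start a new terminal session before running this script."
        )
    return token
```

The returned string can then be passed wherever a token is expected, for example as the `hf_token` argument of `fave_asr.transcribe_and_diarize`.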

### Linux and Mac
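On Linux and Mac, you can store your token in `.bashrc`. If your shell is zsh (the default on recent versions of macOS), use `$HOME/.zshrc` in the steps below instead.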

1. Open `$HOME/.bashrc` in a text editor
2. At the end of that file, add the line `HF_TOKEN='<your token>' ; export HF_TOKEN`, replacing `<your token>` with [your HuggingFace token](https://hf.co/settings/tokens)
3. Add the changes to your current session using `source $HOME/.bashrc`

### Windows
On Windows, use the `setx` command to create an environment variable.
```
setx HF_TOKEN <your token>
```

`setx` does not change the current session, so you need to restart the command line afterwards to make the environment variable available. If you try to use the variable in the same window where you set it, the new value will not be visible.
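
One way to confirm that a new session can see the variable, on any platform, is to check it from Python without printing the full token. This is only a sketch: `describe_token` is a hypothetical name, and showing the last four characters is an assumption about how much of the token is safe to display.

```python
import os


def describe_token(var_name: str = "HF_TOKEN") -> str:
    """Report whether the variable is set, revealing only its last four characters.

    `describe_token` is an illustrative helper, not a fave-asr function.
    """
    token = os.environ.get(var_name)
    if not token:
        return f"{var_name} is not set in this session"
    # Only the tail of the token is shown so the secret never lands in logs
    return f"{var_name} is set (ends in ...{token[-4:]})"


print(describe_token())
```

If this reports that the variable is not set right after running `setx` or editing `.bashrc`, opening a new terminal window (or running `source` on the edited file) is the first thing to try.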

### Other software required
* `ffmpeg`
26 changes: 14 additions & 12 deletions doc_src/_site/index.html
@@ -261,19 +261,20 @@ <h2 class="anchored" data-anchor-id="not-another-transcription-service">Not anot
<section id="example" class="level3">
<h3 class="anchored" data-anchor-id="example">Example</h3>
<p>As an example, we’ll transcribe an audio interview of Snoop Dogg by the 85 South Media podcast and output it as a TextGrid.</p>
<div id="f21cb3ee" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> fave_asr</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>data <span class="op">=</span> fave_asr.transcribe_and_diarize(</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> audio_file <span class="op">=</span> <span class="st">'usage/resources/SnoopDogg_85SouthMedia.wav'</span>,</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> hf_token <span class="op">=</span> <span class="st">''</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> model_name <span class="op">=</span> <span class="st">'small.en'</span>,</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> device <span class="op">=</span> <span class="st">'cpu'</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> )</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>tg <span class="op">=</span> fave_asr.to_TextGrid(data)</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>tg.write(<span class="st">'SnoopDogg_85SouthMedia.TextGrid'</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div id="e716fb8b" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> os</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> fave_asr</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>data <span class="op">=</span> fave_asr.transcribe_and_diarize(</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> audio_file <span class="op">=</span> <span class="st">'usage/resources/SnoopDogg_85SouthMedia.wav'</span>,</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> hf_token <span class="op">=</span> os.environ[<span class="st">"HF_TOKEN"</span>],</span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> model_name <span class="op">=</span> <span class="st">'small.en'</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> device <span class="op">=</span> <span class="st">'cpu'</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> )</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>tg <span class="op">=</span> fave_asr.to_TextGrid(data)</span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>tg.write(<span class="st">'SnoopDogg_85SouthMedia.TextGrid'</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
<div id="ffc61573" class="cell" data-execution_count="2">
<div id="e32edec6" class="cell" data-execution_count="2">
<div class="cell-output cell-output-stdout">
<pre><code>File type = "ooTextFile"
Object class = "TextGrid"
@@ -346,6 +347,7 @@ <h3 class="anchored" data-anchor-id="example">Example</h3>
<h2 class="anchored" data-anchor-id="for-more">For more</h2>
<ul>
<li>To start jumping in, check out <a href="./usage/index.html">the quickstart</a></li>
<li>To learn how to set up and use the gated models, check out <a href="./usage/gated_models.html">the gated model documentation</a></li>
</ul>
<p>You can also directly read up on <a href="./reference/index.html">the function and class references</a>.</p>

1 change: 0 additions & 1 deletion doc_src/_site/robots.txt

This file was deleted.

29 changes: 25 additions & 4 deletions doc_src/_site/search.json

Large diffs are not rendered by default.

19 changes: 0 additions & 19 deletions doc_src/_site/sitemap.xml

This file was deleted.
