Deploying to gh-pages from @ b9a60d6 🚀
vince62s committed Feb 22, 2024
1 parent f815c08 commit 37de1b2
Showing 28 changed files with 1,734 additions and 1,229 deletions.
269 changes: 209 additions & 60 deletions _modules/onmt/decoders/transformer.html

Large diffs are not rendered by default.

28 changes: 28 additions & 0 deletions _modules/onmt/encoders/transformer.html
@@ -232,6 +232,19 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="sd"> dropout (float): dropout probability(0-1.0).</span>
<span class="sd"> pos_ffn_activation_fn (ActivationFunction):</span>
<span class="sd"> activation function choice for PositionwiseFeedForward layer</span>
<span class="sd"> add_qkvbias (bool): whether to add bias to the Key/Value nn.Linear</span>
<span class="sd"> num_kv (int): number of heads for KV when different vs Q (multiquery)</span>
<span class="sd"> add_ffnbias (bool): whether to add bias to the FF nn.Linear</span>
<span class="sd"> parallel_residual (bool): Use parallel residual connections in each layer block, as used</span>
<span class="sd"> by the GPT-J and GPT-NeoX models</span>
<span class="sd"> layer_norm (string): type of layer normalization standard/rms</span>
<span class="sd"> norm_eps (float): layer norm epsilon</span>
<span class="sd"> use_ckpting (List): layers for which we checkpoint for backward</span>
<span class="sd"> parallel_gpu (int): Number of gpu for tensor parallelism</span>
<span class="sd"> rotary_interleave (bool): Interleave the head dimensions when rotary</span>
<span class="sd"> embeddings are applied</span>
<span class="sd"> rotary_theta (int): rotary base theta</span>
<span class="sd"> rotary_dim (int): rotary dim when different to dim per head</span>
<span class="sd"> &quot;&quot;&quot;</span>

<span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span>
@@ -252,6 +265,9 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="n">norm_eps</span><span class="o">=</span><span class="mf">1e-6</span><span class="p">,</span>
<span class="n">use_ckpting</span><span class="o">=</span><span class="p">[],</span>
<span class="n">parallel_gpu</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">rotary_interleave</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">rotary_theta</span><span class="o">=</span><span class="mf">1e4</span><span class="p">,</span>
<span class="n">rotary_dim</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="p">):</span>
<span class="nb">super</span><span class="p">(</span><span class="n">TransformerEncoderLayer</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>

@@ -262,6 +278,9 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="n">is_decoder</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span>
<span class="n">max_relative_positions</span><span class="o">=</span><span class="n">max_relative_positions</span><span class="p">,</span>
<span class="n">relative_positions_buckets</span><span class="o">=</span><span class="n">relative_positions_buckets</span><span class="p">,</span>
<span class="n">rotary_interleave</span><span class="o">=</span><span class="n">rotary_interleave</span><span class="p">,</span>
<span class="n">rotary_theta</span><span class="o">=</span><span class="n">rotary_theta</span><span class="p">,</span>
<span class="n">rotary_dim</span><span class="o">=</span><span class="n">rotary_dim</span><span class="p">,</span>
<span class="n">attn_type</span><span class="o">=</span><span class="s2">&quot;self&quot;</span><span class="p">,</span>
<span class="n">add_qkvbias</span><span class="o">=</span><span class="n">add_qkvbias</span><span class="p">,</span>
<span class="n">num_kv</span><span class="o">=</span><span class="n">num_kv</span><span class="p">,</span>
@@ -366,6 +385,9 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="n">norm_eps</span><span class="o">=</span><span class="mf">1e-6</span><span class="p">,</span>
<span class="n">use_ckpting</span><span class="o">=</span><span class="p">[],</span>
<span class="n">parallel_gpu</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">rotary_interleave</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">rotary_theta</span><span class="o">=</span><span class="mf">1e4</span><span class="p">,</span>
<span class="n">rotary_dim</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="p">):</span>
<span class="nb">super</span><span class="p">(</span><span class="n">TransformerEncoder</span><span class="p">,</span> <span class="bp">self</span><span class="p">)</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span>

@@ -389,6 +411,9 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="n">norm_eps</span><span class="o">=</span><span class="n">norm_eps</span><span class="p">,</span>
<span class="n">use_ckpting</span><span class="o">=</span><span class="n">use_ckpting</span><span class="p">,</span>
<span class="n">parallel_gpu</span><span class="o">=</span><span class="n">parallel_gpu</span><span class="p">,</span>
<span class="n">rotary_interleave</span><span class="o">=</span><span class="n">rotary_interleave</span><span class="p">,</span>
<span class="n">rotary_theta</span><span class="o">=</span><span class="n">rotary_theta</span><span class="p">,</span>
<span class="n">rotary_dim</span><span class="o">=</span><span class="n">rotary_dim</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_layers</span><span class="p">)</span>
<span class="p">]</span>
@@ -426,6 +451,9 @@ <h1>Source code for onmt.encoders.transformer</h1><div class="highlight"><pre>
<span class="n">parallel_gpu</span><span class="o">=</span><span class="n">opt</span><span class="o">.</span><span class="n">world_size</span>
<span class="k">if</span> <span class="n">opt</span><span class="o">.</span><span class="n">parallel_mode</span> <span class="o">==</span> <span class="s2">&quot;tensor_parallel&quot;</span>
<span class="k">else</span> <span class="mi">1</span><span class="p">,</span>
<span class="n">rotary_interleave</span><span class="o">=</span><span class="n">opt</span><span class="o">.</span><span class="n">rotary_interleave</span><span class="p">,</span>
<span class="n">rotary_theta</span><span class="o">=</span><span class="n">opt</span><span class="o">.</span><span class="n">rotary_theta</span><span class="p">,</span>
<span class="n">rotary_dim</span><span class="o">=</span><span class="n">opt</span><span class="o">.</span><span class="n">rotary_dim</span><span class="p">,</span>
<span class="p">)</span></div>

<div class="viewcode-block" id="TransformerEncoder.forward"><a class="viewcode-back" href="../../../onmt.modules.html#onmt.encoders.TransformerEncoder.forward">[docs]</a> <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">src</span><span class="p">,</span> <span class="n">src_len</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
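
Note on the three new rotary options: the hunks above only thread `rotary_interleave`, `rotary_theta` and `rotary_dim` through to the attention module; how that module consumes them is not shown here. The following is a minimal pure-Python sketch of the general rotary position embedding scheme these names usually refer to, not OpenNMT-py's implementation. The function names are hypothetical, and treating `rotary_dim=0` as "rotate the whole head dimension" is an assumption based on the docstring wording and the default value.

```python
import math


def rotary_angles(position, rot_dim, theta=1e4):
    """Rotation angles for one position: position * theta**(-2i / rot_dim)."""
    return [position * theta ** (-2.0 * i / rot_dim) for i in range(rot_dim // 2)]


def apply_rotary(vec, position, theta=1e4, rotary_dim=0, interleave=True):
    """Rotate pairs of coordinates of one head vector (hypothetical helper).

    Assumption: rotary_dim == 0 means "use the full head dimension".
    interleave=True pairs coordinates (0, 1), (2, 3), ...
    interleave=False pairs coordinate i with i + rot_dim // 2.
    """
    rot_dim = rotary_dim if rotary_dim > 0 else len(vec)
    if rot_dim % 2 != 0 or rot_dim > len(vec):
        raise ValueError("rotary dimension must be even and fit in the vector")
    half = rot_dim // 2
    out = list(vec)  # coordinates beyond rot_dim are left untouched
    for i, angle in enumerate(rotary_angles(position, rot_dim, theta)):
        a, b = (2 * i, 2 * i + 1) if interleave else (i, i + half)
        cos, sin = math.cos(angle), math.sin(angle)
        out[a] = vec[a] * cos - vec[b] * sin
        out[b] = vec[a] * sin + vec[b] * cos
    return out


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))
```

Under this sketch, the dot product of a rotated query and a rotated key depends only on the distance between their positions, which is the property that makes rotary embeddings a relative position encoding. A larger `theta` gives slower-rotating pairs; a non-zero `rotary_dim` restricts the rotation to the first coordinates of each head.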
17 changes: 14 additions & 3 deletions _modules/onmt/inputters/dynamic_iterator.html
@@ -204,7 +204,7 @@ <h1>Source code for onmt.inputters.dynamic_iterator</h1><div class="highlight"><
<span></span><span class="sd">&quot;&quot;&quot;Module that contain iterator used for dynamic data.&quot;&quot;&quot;</span>
<span class="kn">import</span> <span class="nn">torch</span>
<span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">cycle</span>
<span class="kn">from</span> <span class="nn">onmt.constants</span> <span class="kn">import</span> <span class="n">CorpusTask</span>
<span class="kn">from</span> <span class="nn">onmt.constants</span> <span class="kn">import</span> <span class="n">CorpusTask</span><span class="p">,</span> <span class="n">ModelTask</span>
<span class="kn">from</span> <span class="nn">onmt.inputters.text_corpus</span> <span class="kn">import</span> <span class="n">get_corpora</span><span class="p">,</span> <span class="n">build_corpora_iters</span>
<span class="kn">from</span> <span class="nn">onmt.inputters.text_utils</span> <span class="kn">import</span> <span class="p">(</span>
<span class="n">text_sort_key</span><span class="p">,</span>
@@ -367,6 +367,10 @@ <h1>Source code for onmt.inputters.dynamic_iterator</h1><div class="highlight"><
<span class="bp">self</span><span class="o">.</span><span class="n">skip_empty_level</span> <span class="o">=</span> <span class="n">skip_empty_level</span>
<span class="bp">self</span><span class="o">.</span><span class="n">random_shuffler</span> <span class="o">=</span> <span class="n">RandomShuffler</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">bucket_idx</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">task</span> <span class="o">!=</span> <span class="n">CorpusTask</span><span class="o">.</span><span class="n">TRAIN</span> <span class="ow">and</span> <span class="n">vocabs</span><span class="p">[</span><span class="s2">&quot;data_task&quot;</span><span class="p">]</span> <span class="o">==</span> <span class="n">ModelTask</span><span class="o">.</span><span class="n">LANGUAGE_MODEL</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">left_pad</span> <span class="o">=</span> <span class="kc">True</span>
<span class="k">else</span><span class="p">:</span>
<span class="bp">self</span><span class="o">.</span><span class="n">left_pad</span> <span class="o">=</span> <span class="kc">False</span>

<div class="viewcode-block" id="DynamicDatasetIter.from_opt"><a class="viewcode-back" href="../../../onmt.inputters.html#onmt.inputters.DynamicDatasetIter.from_opt">[docs]</a> <span class="nd">@classmethod</span>
<span class="k">def</span> <span class="nf">from_opt</span><span class="p">(</span>
@@ -557,7 +561,9 @@ <h1>Source code for onmt.inputters.dynamic_iterator</h1><div class="highlight"><
<span class="c1"># within the batch</span>
<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">task</span> <span class="o">==</span> <span class="n">CorpusTask</span><span class="o">.</span><span class="n">TRAIN</span><span class="p">:</span>
<span class="n">minibatch</span><span class="o">.</span><span class="n">sort</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">sort_key</span><span class="p">(</span><span class="n">x</span><span class="p">[</span><span class="mi">0</span><span class="p">]),</span> <span class="n">reverse</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">tensor_batch</span> <span class="o">=</span> <span class="n">tensorify</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">vocabs</span><span class="p">,</span> <span class="n">minibatch</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">device</span><span class="p">)</span>
<span class="n">tensor_batch</span> <span class="o">=</span> <span class="n">tensorify</span><span class="p">(</span>
<span class="bp">self</span><span class="o">.</span><span class="n">vocabs</span><span class="p">,</span> <span class="n">minibatch</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">device</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">left_pad</span>
<span class="p">)</span>
<span class="k">yield</span> <span class="p">(</span><span class="n">tensor_batch</span><span class="p">,</span> <span class="n">bucket_idx</span><span class="p">)</span></div>
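
Note on `left_pad`: the hunks above set `left_pad` to `True` only when the task is not `CorpusTask.TRAIN` and the vocabs' `data_task` is `ModelTask.LANGUAGE_MODEL`, then pass it to `tensorify`. What `tensorify` does with the flag is not part of this diff. The sketch below assumes it selects which side of each sequence receives padding; `choose_left_pad` and `pad_batch` are hypothetical names used only for illustration.

```python
def choose_left_pad(is_training, is_language_model):
    """Mirror of the condition in the diff: left-pad only for non-training LM batches."""
    return (not is_training) and is_language_model


def pad_batch(sequences, pad_id=0, left_pad=False):
    """Pad token-id lists to a common length (assumed behaviour, not tensorify itself)."""
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        filler = [pad_id] * (max_len - len(seq))
        # Left padding keeps the last real token of every row in the final column.
        padded.append(filler + list(seq) if left_pad else list(seq) + filler)
    return padded
```

A likely motivation, stated here as an assumption: a decoder-only language model continues each prompt from its last token, so left padding aligns those last tokens in one column and lets the whole batch be extended step by step.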


@@ -569,7 +575,12 @@ <h1>Source code for onmt.inputters.dynamic_iterator</h1><div class="highlight"><
<span class="k">def</span> <span class="fm">__iter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">for</span> <span class="p">(</span><span class="n">tensor_batch</span><span class="p">,</span> <span class="n">bucket_idx</span><span class="p">)</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">data_iter</span><span class="p">:</span>
<span class="k">for</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">tensor_batch</span><span class="o">.</span><span class="n">keys</span><span class="p">():</span>
<span class="k">if</span> <span class="n">key</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span><span class="s2">&quot;src_ex_vocab&quot;</span><span class="p">,</span> <span class="s2">&quot;cid&quot;</span><span class="p">]:</span>
<span class="k">if</span> <span class="n">key</span> <span class="ow">not</span> <span class="ow">in</span> <span class="p">[</span>
<span class="s2">&quot;src_ex_vocab&quot;</span><span class="p">,</span>
<span class="s2">&quot;cid&quot;</span><span class="p">,</span>
<span class="s2">&quot;ind_in_bucket&quot;</span><span class="p">,</span>
<span class="s2">&quot;cid_line_number&quot;</span><span class="p">,</span>
<span class="p">]:</span>
<span class="n">tensor_batch</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">tensor_batch</span><span class="p">[</span><span class="n">key</span><span class="p">]</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">device</span><span class="p">)</span>
<span class="k">yield</span> <span class="p">(</span><span class="n">tensor_batch</span><span class="p">,</span> <span class="n">bucket_idx</span><span class="p">)</span>
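
Note on the extended skip list: the last hunk adds `"ind_in_bucket"` and `"cid_line_number"` to the keys that are not moved with `.to(self.device)`. Presumably these entries are plain Python metadata rather than tensors, so they have no `.to` method; that reading is an assumption, not something the diff states. A minimal sketch of the pattern, using a stand-in class instead of `torch.Tensor` so it runs without PyTorch:

```python
# Keys taken from the diff; treated here as non-tensor metadata (assumption).
NON_TENSOR_KEYS = {"src_ex_vocab", "cid", "ind_in_bucket", "cid_line_number"}


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor that only records its device."""

    def __init__(self, data, device="cpu"):
        self.data = data
        self.device = device

    def to(self, device):
        return FakeTensor(self.data, device)


def move_batch(batch, device, skip=NON_TENSOR_KEYS):
    """Return a copy of the batch with every non-skipped value moved to `device`."""
    return {
        key: value if key in skip else value.to(device)
        for key, value in batch.items()
    }
```

For example, `move_batch({"src": FakeTensor([1, 2]), "cid": ["corpus_a"]}, "cuda:0")` moves `src` and passes the `cid` list through unchanged.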
