<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>linux远程服务器</title>
<link href="/2019/11/02/linux%E8%BF%9C%E7%A8%8B%E6%9C%8D%E5%8A%A1%E5%99%A8/"/>
<url>/2019/11/02/linux%E8%BF%9C%E7%A8%8B%E6%9C%8D%E5%8A%A1%E5%99%A8/</url>
<content type="html"><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>做实验的时候,用自己实验室的台式机(windows系统)在公用的linux服务器上跑程序,记录一些常用的命令以及用起来更方便的配置方法。实为比较琐碎的心得体会。</p><h2 id="上传文件到远程服务器"><a href="#上传文件到远程服务器" class="headerlink" title="上传文件到远程服务器"></a>上传文件到远程服务器</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pscp -r localDir rootuser@remoteip:/fileDirectory # 拷贝文件夹</span><br></pre></td></tr></table></figure><a id="more"></a><h2 id="实现本地pycharm远程调试程序"><a href="#实现本地pycharm远程调试程序" class="headerlink" title="实现本地pycharm远程调试程序"></a>实现本地pycharm远程调试程序</h2><p><img src="https://ae01.alicdn.com/kf/H88850c9da71241f0ae6fb4aa706a1807b.png" alt=""></p><p><img src="https://ae01.alicdn.com/kf/He73b36d3f219464c8f51dbafab15a8fbx.png" alt=""></p><p>下面设置解释器。</p><p><img src="https://ae01.alicdn.com/kf/H9ef2bde9255145428bf8f26ab682058cr.png" alt=""></p><p><img src="https://ae01.alicdn.com/kf/H31df1603925340ecad60188aa798d949Q.png" alt=""></p><p>之后勾选Tools -> Deployment -> Automatic Upload (always)。由此实现在本地更改过代码之后,自动同步到远程服务器,运行的时候即为最新的代码。</p><h2 id="用户管理"><a href="#用户管理" class="headerlink" title="用户管理"></a>用户管理</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">adduser songfish # 添加新用户</span><br><span class="line">passwd songfish # 更改用户密码</span><br></pre></td></tr></table></figure><h2 id="实时查看gpu使用情况"><a href="#实时查看gpu使用情况" class="headerlink" title="实时查看gpu使用情况"></a>实时查看gpu使用情况</h2><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">nvidia-smi # 方法一</span><br><span class="line">watch --color -n1 gpustat -cpu # 方法二,实时查看</span><br></pre></td></tr></table></figure><h2 id="解决远程无法画图的问题"><a href="#解决远程无法画图的问题" class="headerlink" title="解决远程无法画图的问题"></a>解决远程无法画图的问题</h2><p><strong>参考:</strong></p><p><a href="https://blog.csdn.net/qq_22194315/article/details/77985441" target="_blank" rel="noopener">https://blog.csdn.net/qq_22194315/article/details/77985441</a></p><p><a href="https://blog.csdn.net/mkosto/article/details/80348626" target="_blank" rel="noopener">https://blog.csdn.net/mkosto/article/details/80348626</a></p><p><a href="https://blog.csdn.net/u013554213/article/details/79885792" target="_blank" rel="noopener">https://blog.csdn.net/u013554213/article/details/79885792</a></p><p><strong>总结如下:</strong></p><p>1.下载Xming,打开默认display是0</p><p>2.打开putty,enable X11 forwarding ,location可以不写</p><p>3.命令行env—–>查看DISPLAY的值</p><p>命令行 python—->import matplotlib </p><p> print(matplotlib.get_backend()) </p><p> ——>Qt5Agg</p><p>4.根据3的内容,设置pycharm几个地方</p><p>(1)settings—–> Python Scientific —–>取消show plots in toolwindow的勾选 </p><p>(2)Run—->Edit Configurations —–>Environment variables——>DISPLAY=localhost:10.0</p><p>(3)代码中 import matplotlib matplotlib.use(‘Qt5Agg’)</p>]]></content>
<categories>
<category> 一些摸索 </category>
</categories>
<tags>
<tag> linux </tag>
</tags>
</entry>
<entry>
<title>Next主题的简单优化(一)</title>
<link href="/2019/09/14/Next%E4%B8%BB%E9%A2%98%E4%BC%98%E5%8C%96/"/>
<url>/2019/09/14/Next%E4%B8%BB%E9%A2%98%E4%BC%98%E5%8C%96/</url>
<content type="html"><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>之前匆匆忙忙建站,没有加评论、搜索、数据统计与分析、搜索功能等等,这些功能对于搭建博客也是很重要的。参考了很多大佬的博客,受益匪浅,以下是我的一些摸索。</p><p>实际上<a href="https://theme-next.iissnan.com/getting-started.html" target="_blank" rel="noopener">Next主题的官方文档</a>非常详细了,建议多查看。</p><p>Next主题版本:Muse v6.3.0</p><h2 id="评论系统"><a href="#评论系统" class="headerlink" title="评论系统"></a>评论系统</h2><p>一开始按照<a href="https://theme-next.iissnan.com/getting-started.html" target="_blank" rel="noopener">Next主题的官方文档</a>配置<a href="https://livere.com/" target="_blank" rel="noopener">来必力</a>评论系统,但是后来发现来必力加载速度有点慢。于是转用基于<a href="https://leancloud.cn/dashboard/login.html#/signup" target="_blank" rel="noopener">LeanCloud</a>的评论系统Valine,Valine也是有<a href="https://valine.js.org/" target="_blank" rel="noopener">官方文档</a>的(看官方文档可是个好习惯)。</p><a id="more"></a><p><strong>简要步骤如下:</strong></p><p><strong>1.获取APP ID和APP Key</strong>。首先在<a href="https://leancloud.cn/dashboard/login.html#/signup" target="_blank" rel="noopener">LeanCloud</a>注册自己的账号。进入<a href="https://leancloud.cn/dashboard/applist.html#/apps" target="_blank" rel="noopener">控制台</a>创建应用。应用创建好以后,进入刚创建的应用,选择<code>设置</code>><code>应用Key</code>,就能看到<code>APP ID</code>和<code>APP Key</code>了:</p><p><img src="https://ae01.alicdn.com/kf/H7baa1fe5543c4aa6858745d9df7e91fbS.png" alt=""></p><p><strong>2.设置安全域名 :</strong></p><p><img src="https://ae01.alicdn.com/kf/Hc054263aa15c469dbce354c41b9ca25eW.png" alt=""></p><p><strong>3.修改<code>主题配置文件</code>中的Valine部分 :</strong></p><p>(未开邮件提醒)</p><p>文件位置:<code>themes/next/_config.yml</code></p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Valine.</span></span><br><span class="line"><span class="comment"># You can get your appid and appkey from https://leancloud.cn</span></span><br><span class="line"><span class="comment"># more info please open https://valine.js.org</span></span><br><span class="line"><span class="attr">valine:</span></span><br><span class="line"><span class="attr"> enable:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> appid:</span> <span class="string">your</span> <span class="string">APP</span> <span class="string">ID</span></span><br><span class="line"><span class="attr"> appkey:</span> <span class="string">your</span> <span class="string">Key</span></span><br><span class="line"><span class="attr"> notify:</span> <span class="literal">false</span> <span class="comment"># mail notifier , https://github.com/xCss/Valine/wiki</span></span><br><span class="line"><span class="attr"> verify:</span> <span class="literal">false</span> <span class="comment"># Verification code</span></span><br><span class="line"><span class="attr"> placeholder:</span> <span class="string">Just</span> <span class="string">go</span> <span class="string">go</span> <span class="comment"># comment box placeholder</span></span><br><span class="line"><span class="attr"> avatar:</span> <span class="string">monsterid</span> <span class="comment"># gravatar style</span></span><br><span 
class="line"><span class="attr"> guest_info:</span> <span class="string">nick,mail,link</span> <span class="comment"># custom comment header</span></span><br><span class="line"><span class="attr"> pageSize:</span> <span class="number">10</span> <span class="comment"># pagination size</span></span><br></pre></td></tr></table></figure><p><strong>4.如需取消某个页面/文章 的评论,在 md 文件的 <a href="https://hexo.io/docs/front-matter.html" target="_blank" rel="noopener">front-matter </a>中增加 <code>comments: false</code>。</strong></p><h2 id="数据统计与分析"><a href="#数据统计与分析" class="headerlink" title="数据统计与分析"></a>数据统计与分析</h2><h3 id="文章阅读量统计"><a href="#文章阅读量统计" class="headerlink" title="文章阅读量统计"></a>文章阅读量统计</h3><p>1.仍然使用LeanCloud。按下图创建<code>Class</code>,<code>Class</code>名称必须为<code>Counter</code>。</p><p><img src="https://ae01.alicdn.com/kf/He8aeab19cab94cab9548419204036df1Z.png" alt=""></p><p>2.修改<code>主题配置文件</code>中的<code>leancloud_visitors</code>配置项:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">leancloud_visitors:</span></span><br><span class="line"><span class="attr"> enable:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> app_id:</span> </span><br><span class="line"><span class="attr"> app_key:</span></span><br></pre></td></tr></table></figure><h3 id="博客访问量统计"><a href="#博客访问量统计" class="headerlink" title="博客访问量统计"></a>博客访问量统计</h3><p>用的是<code>不蒜子统计</code>,修改<code>主题配置文件</code>中的<code>busuanzi_count</code>的配置项,当<code>enable: true</code>时,代表开启全局开关。</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Show Views/Visitors of the website/page with busuanzi.</span></span><br><span class="line"><span class="comment"># Get more information on http://ibruce.info/2015/04/04/busuanzi/</span></span><br><span class="line"><span class="attr">busuanzi_count:</span></span><br><span class="line"><span class="attr"> enable:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> total_visitors:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> total_visitors_icon:</span> <span class="string">user</span></span><br><span class="line"><span class="attr"> total_views:</span> <span class="literal">true</span></span><br><span class="line"><span class="attr"> total_views_icon:</span> <span class="string">eye</span></span><br><span class="line"><span class="attr"> post_views:</span> <span class="literal">false</span></span><br><span class="line"><span class="attr"> post_views_icon:</span> <span class="string">eye</span></span><br></pre></td></tr></table></figure><h2 id="博客图标"><a href="#博客图标" class="headerlink" title="博客图标"></a>博客图标</h2><p>网站的默认图标不是特别好看,因此换成了现在的小鱼。</p><p><strong>修改方法:</strong></p><p>1.到这个神奇的网站<a href="http://www.easyicon.net/" target="_blank" rel="noopener">EasyIcon</a>找心仪的图标,下载<code>32PX</code>和<code>16PX</code>的<code>ICO</code>格式,并把它们放在<code>/themes/next/source/images</code>里。</p><p><img 
src="https://ae01.alicdn.com/kf/H3b2c1ae464404d6382c9d8cee52d9ce1P.png" alt=""></p><p>2.修改<code>主题配置文件</code>中的<code>favicon</code>配置项,其中<code>small</code>对应<code>16px</code>的图标路径,<code>medium</code>对应<code>32px</code>的图标路径。</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">favicon:</span></span><br><span class="line"><span class="attr"> small:</span> <span class="string">/images/favicon-16x16.ico</span></span><br><span class="line"><span class="attr"> medium:</span> <span class="string">/images/favicon-32x32.ico</span></span><br><span class="line"><span class="attr"> apple_touch_icon:</span> <span class="string">/images/apple-touch-icon-next.png</span></span><br><span class="line"><span class="attr"> safari_pinned_tab:</span> <span class="string">/images/logo.svg</span></span><br><span class="line"> <span class="comment">#android_manifest: /images/manifest.json</span></span><br><span class="line"> <span class="comment">#ms_browserconfig: /images/browserconfig.xml</span></span><br></pre></td></tr></table></figure><h2 id="博客运行时间"><a href="#博客运行时间" class="headerlink" title="博客运行时间"></a>博客运行时间</h2><p>来源<a href="https://reuixiy.github.io/" target="_blank" rel="noopener">reuixiy</a>的<a href="https://reuixiy.github.io/technology/computer/computer-aided-art/2017/06/09/hexo-next-optimization.html#%E5%A5%BD%E7%8E%A9%E7%9A%84%E5%86%99%E4%BD%9C%E6%A0%B7%E5%BC%8F" target="_blank" rel="noopener">博客</a>。</p><p>文件位置:<code>themes/next/layout/_custom/sidebar.swig</code>(其中的<code>BirthDay</code>改成自己的)</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag"><<span class="name">div</span> <span class="attr">id</span>=<span class="string">"days"</span>></span><span class="tag"></<span class="name">div</span>></span></span><br><span class="line"><span class="tag"><<span class="name">script</span>></span><span class="undefined"></span></span><br><span class="line"><span class="undefined">function show_date_time(){</span></span><br><span class="line"><span class="undefined">window.setTimeout("show_date_time()", 1000);</span></span><br><span class="line"><span class="undefined">BirthDay=new Date("05/20/2018 15:13:14");</span></span><br><span class="line"><span class="undefined">today=new Date();</span></span><br><span class="line"><span class="undefined">timeold=(today.getTime()-BirthDay.getTime());</span></span><br><span 
class="line"><span class="undefined">sectimeold=timeold/1000</span></span><br><span class="line"><span class="undefined">secondsold=Math.floor(sectimeold);</span></span><br><span class="line"><span class="undefined">msPerDay=24*60*60*1000</span></span><br><span class="line"><span class="undefined">e_daysold=timeold/msPerDay</span></span><br><span class="line"><span class="undefined">daysold=Math.floor(e_daysold);</span></span><br><span class="line"><span class="undefined">e_hrsold=(e_daysold-daysold)*24;</span></span><br><span class="line"><span class="undefined">hrsold=setzero(Math.floor(e_hrsold));</span></span><br><span class="line"><span class="undefined">e_minsold=(e_hrsold-hrsold)*60;</span></span><br><span class="line"><span class="undefined">minsold=setzero(Math.floor((e_hrsold-hrsold)*60));</span></span><br><span class="line"><span class="undefined">seconds=setzero(Math.floor((e_minsold-minsold)*60));</span></span><br><span class="line"><span class="undefined">document.getElementById('days').innerHTML="已运行"+daysold+"天"+hrsold+"小时"+minsold+"分"+seconds+"秒";</span></span><br><span class="line"><span class="undefined">}</span></span><br><span class="line"><span class="undefined">function setzero(i){</span></span><br><span class="line"><span class="undefined">if (i<10)</span></span><br><span class="line"><span class="undefined">{i="0" + i};</span></span><br><span class="line"><span class="undefined">return i;</span></span><br><span class="line"><span class="undefined">}</span></span><br><span class="line"><span class="undefined">show_date_time();</span></span><br><span class="line"><span class="undefined"></span><span class="tag"></<span class="name">script</span>></span></span><br></pre></td></tr></table></figure><p>文件位置:<code>themes/next/layout/_macro/sidebar.swig</code> (其中加上带加号的那句)</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"> {# Blogroll #}</span><br><span class="line"> {% if theme.links %}</span><br><span class="line"> <span class="tag"><<span class="name">div</span> <span class="attr">class</span>=<span class="string">"links-of-blogroll motion-element {{ "</span><span class="attr">links-of-blogroll-</span>" + <span class="attr">theme.links_layout</span> | <span class="attr">default</span>('<span class="attr">inline</span>') }}"></span></span><br><span class="line"> <span class="tag"><<span class="name">div</span> <span class="attr">class</span>=<span class="string">"links-of-blogroll-title"</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">i</span> <span class="attr">class</span>=<span class="string">"fa fa-fw fa-{{ theme.links_icon | default('globe') | lower }}"</span>></span><span class="tag"></<span class="name">i</span>></span></span><br><span class="line"> {{ theme.links_title }}&nbsp;</span><br><span class="line"> <span class="tag"><<span class="name">i</span> <span class="attr">class</span>=<span 
class="string">"fa fa-fw fa-{{ theme.links_icon | default('globe') | lower }}"</span>></span><span class="tag"></<span class="name">i</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">div</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">ul</span> <span class="attr">class</span>=<span class="string">"links-of-blogroll-list"</span>></span></span><br><span class="line"> {% for name, link in theme.links %}</span><br><span class="line"> <span class="tag"><<span class="name">li</span> <span class="attr">class</span>=<span class="string">"links-of-blogroll-item"</span>></span></span><br><span class="line"> <span class="tag"><<span class="name">a</span> <span class="attr">href</span>=<span class="string">"{{ link }}"</span> <span class="attr">title</span>=<span class="string">"{{ name }}"</span> <span class="attr">target</span>=<span class="string">"_blank"</span>></span>{{ name }}<span class="tag"></<span class="name">a</span>></span></span><br><span class="line"> <span class="tag"></<span class="name">li</span>></span></span><br><span class="line"> {% endfor %}</span><br><span class="line"> <span class="tag"></<span class="name">ul</span>></span></span><br><span class="line">+ {% include '../_custom/sidebar.swig' %} </span><br><span class="line"> <span class="tag"></<span class="name">div</span>></span></span><br><span class="line"> {% endif %}</span><br></pre></td></tr></table></figure><h2 id="搜索功能"><a href="#搜索功能" class="headerlink" title="搜索功能"></a>搜索功能</h2><p>文件位置:<code>themes/next/_config.yml</code></p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Local search</span></span><br><span class="line"><span class="comment"># Dependencies: https://github.com/theme-next/hexo-generator-searchdb</span></span><br><span class="line"><span class="attr">local_search:</span></span><br><span class="line"><span class="attr"> enable:</span> <span class="string">ture</span></span><br></pre></td></tr></table></figure><p>安装插件</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> npm install hexo-generator-search --save</span><br></pre></td></tr></table></figure><p>但是我在安装插件的时候一直报错</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">npm ERR! path /home/song/hexo/test/node_modules/babylon</span><br><span class="line">npm ERR! code ENOENT</span><br><span class="line">npm ERR! errno -2</span><br><span class="line">npm ERR! syscall access</span><br><span class="line">npm ERR! enoent ENOENT: no such file or directory, access '/home/song/hexo/test/node_modules/babylon'</span><br><span class="line">npm ERR! enoent This is related to npm not being able to find a file.</span><br><span class="line">npm ERR! enoent </span><br><span class="line"></span><br><span class="line">npm ERR! 
A complete log of this run can be found in:</span><br><span class="line">npm ERR! /home/song/.npm/_logs/2018-11-11T06_59_34_564Z-debug.log</span><br></pre></td></tr></table></figure><p><a href="https://blog.csdn.net/h416756139/article/details/50812109" target="_blank" rel="noopener">解决办法</a>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> npm install -g cnpm --registry=http://registry.npm.taobao.org</span><br><span class="line"><span class="meta">$</span> cnpm install hexo-generator-search --save</span><br></pre></td></tr></table></figure><h2 id="预告"><a href="#预告" class="headerlink" title="预告"></a>预告</h2><p>1.关于更新主题</p><p>2.关于如何推广博客</p><p>3.评论邮件提醒</p>]]></content>
<categories>
<category> 一些摸索 </category>
</categories>
<tags>
<tag> hexo </tag>
</tags>
</entry>
<entry>
<title>异常行为检测文献整理(顶会顶刊)</title>
<link href="/2019/05/11/%E5%BC%82%E5%B8%B8%E8%A1%8C%E4%B8%BA%E6%A3%80%E6%B5%8B%E6%96%87%E7%8C%AE%E6%95%B4%E7%90%86-%E9%A1%B6%E4%BC%9A%E9%A1%B6%E5%88%8A/"/>
<url>/2019/05/11/%E5%BC%82%E5%B8%B8%E8%A1%8C%E4%B8%BA%E6%A3%80%E6%B5%8B%E6%96%87%E7%8C%AE%E6%95%B4%E7%90%86-%E9%A1%B6%E4%BC%9A%E9%A1%B6%E5%88%8A/</url>
<content type="html"><![CDATA[<p><center>Timeline</center></p><p> <strong>2019.12.09 更新阅读笔记 来自CVPR2019《Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video》</strong></p><p> <strong>2019.12.10 更新阅读笔记 来自WACV2019 《Detecting Abnormal Events in Video Using Narrowed Normality Clusters》</strong></p><p><strong>2019.12.23 更新阅读笔记 来自ICCV2019 《Anomaly Detection in Video Sequence With Appearance-Motion Correspondence》</strong></p><p><strong>2020.3.26 更新阅读笔记 来自WACV2019 《Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection》</strong></p><hr><h2 id="2020"><a href="#2020" class="headerlink" title="2020"></a>2020</h2><h3 id="Multi-timescale-Trajectory-Prediction-for-Abnormal-Human-Activity-Detection"><a href="#Multi-timescale-Trajectory-Prediction-for-Abnormal-Human-Activity-Detection" class="headerlink" title="Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection"></a>Multi-timescale Trajectory Prediction for Abnormal Human Activity Detection</h3><p><strong>来源</strong>:WACV2020</p><p><strong>创新点</strong>:提出了一个multi-timescale的模型来捕获不同timescale(就翻译成时间尺度咯)的时间动态。因为不同的异常行为持续的时间不同,比如jumping和loitering。之前的研究要么是单帧要么是几个固定帧处理的。</p><p><strong>贡献</strong>:1. 提出了一个双向(过去和将来)预测的框架,输入姿态轨迹(pose trajectory),预测不同时间尺度的姿态轨迹。2.提了一个新的数据集,包括单人、多人和群体的异常。</p><a id="more"></a><p><strong>网络结构</strong>:</p><p><img src="https://pic.downk.cc/item/5e7c641d504f4bcb0494bddd.jpg" alt="image-20200326154551039"></p><p>在特定的时间尺度上,将来自两个模型(过去和将来)的预测组合在一起,生成每个时刻的预测。 例如,要生成时间尺度为1的未来预测(在作者的设置中,时间尺度1表示3步的持续时间),模型首先将序列分成较小的子序列(长度为3),然后进行对将来的预测(对于下个3 步)。 将这些预测组合起来,可以得出在此时间范围内完整输入序列的未来预测。 为了获得过去的预测,反转输入序列并将其传递给过去的预测模型。 两种模型具有相同的体系结构,但训练方式不同。</p><p>对于任何时间尺度的<strong>任何时刻</strong>,该模型都会生成多个预测,因为它会在输入信号上运行滑动窗口,对这多个预测取平均就获得这个时刻的最终预测。</p><p><img src="https://pic.downk.cc/item/5e7c643e504f4bcb0494d246.jpg" alt="image-20200326155328103"></p><p>作者设置的时间尺度为3,5,13,25。</p><p><strong>损失函数</strong>:损失包括两种类型,node level和layer level。有M_j个nodes的第j层的损失表示为:<img src="https://pic.downk.cc/item/5e7c6464504f4bcb0494e9e9.jpg" alt="image-20200326160317512"></p><p><strong>新数据集</strong>:</p><p><img src="https://pic.downk.cc/item/5e7c6482504f4bcb0494fac3.jpg" alt="image-20200326160729865"></p><p><strong>结果</strong>:</p><p><img src="https://pic.downk.cc/item/5e7c649a504f4bcb04950ab5.jpg" alt="image-20200326160748573"></p><p><strong>数据集</strong>:HR-ShanghaiTech、HR-Avenue(这些是和2019CVPRLearning Regularity in Skeleton Trajectories for Anomaly Detection in Videos保持一致的。)</p><h2 id="2019"><a href="#2019" class="headerlink" title="2019"></a>2019</h2><h3 id="Anomaly-Detection-in-Video-Sequence-With-Appearance-Motion-Correspondence"><a href="#Anomaly-Detection-in-Video-Sequence-With-Appearance-Motion-Correspondence" class="headerlink" title="Anomaly Detection in Video Sequence With Appearance-Motion Correspondence"></a>Anomaly Detection in Video Sequence With Appearance-Motion Correspondence</h3><p><strong>链接</strong>:<a href="https://github.com/nguyetn89/Anomaly_detection_ICCV2019" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>ICCV2019</p><p><strong>创新点:</strong>提出了一种深度卷积神经网络(CNN),学习常见对象外观(例如行人,背景,树木等)及其关联运动之间的对应关系。 所设计的网络由共享相同编码器的重建网络和图像翻译模型组成。 </p><p><strong>和前人的不同</strong>:和2016Hasan等人的方法不同点在于,本文的Conv-AE的输入是单帧,时间因素是通过u-net来考虑的,而Hasan等人的Conv-AE的输入要么是手工特征要么是10帧。和2017Ravanbakhsh等人不同的地方在于,本文的网络将帧翻译为光流(但是不用pix2pix GAN),同时用共享encoding 
flow的Conv-AE代替另外一个U-Net,而Ravanbaksh等人的方法的两个相同的CNN可能太冗余了。2018liuwen等人的方法用U-Net来预测帧,本文直接从单帧来预测光流为了确定场景外观和典型运动之间的关系。</p><p><strong>贡献</strong>:1.设计了一个结合Conv-AE和U-Net的CNN。可端到端训练。2.在输入层之后整合了一个Inception模块【Rethinking the inception architecture for computer vision】来减少网络深度的影响。3.提出了一个patch-based 方案来估计帧级别的normality分数,减少模型输出中出现的噪声影响。</p><p><strong>网络结构</strong>:</p><p><img src="https://pic.superbed.cn/item/5e0324ea76085c328902463d.jpg" alt="img"></p><p>【生成器部分】网络包括两个处理流,第一个是通过Conv-AE学习正常事件中的共同外观空间结构,第二个是确定每个输入的pattern和它的对应的运动(由三通道的光流表示)之间的关联。U-Net中的跳层连接对于图像翻译来说是有用的因为能够直接将低层次的特征从原始域转换为编码的特征。这样的跳层连接不用在外观流中,因为网络会让输入的信息通过这些连接,就不能通过bottleneck来强调underlying的属性了。网络不用全连接层,所以(理论上)可以处理任何分辨率的图像。</p><p><img src="https://pic1.superbed.cn/item/5e00bb3776085c3289de1384.jpg" alt="img"></p><p>【判别器部分】这个部分的网络不用在测试阶段。</p><p><strong>损失函数</strong>:</p><p>Appearance 部分和motion部分的loss function</p><p><img src="https://pic2.superbed.cn/item/5e00ba0476085c3289dd88d6.jpg" alt="img"></p><p><img src="https://pic.superbed.cn/item/5e00bec476085c3289dfafe3.jpg" alt="img"></p><p>最后GAN网络的loss function</p><p><img src="https://pic2.superbed.cn/item/5e00bf7276085c3289dffccf.jpg" alt="img"></p><p><img src="https://pic1.superbed.cn/item/5e00bfad76085c3289e01cbf.jpg" alt="img"></p><p><strong>结果</strong>:<img src="https://pic2.superbed.cn/item/5e00c22e76085c3289e1f135.jpg" alt="img"></p><p>最后一列可借鉴,很直观。第一列是输入的帧和原始的光流,第二列是重建帧和预测的光流。</p><p><img src="https://pic1.superbed.cn/item/5e00c35176085c3289e32f5b.jpg" alt="img"></p><p><strong>数据集</strong>:Avenue,UCSD Ped2, Subway Entrance and Exit gates, Traffic-Belleview, Traffic-Train。不用UCSD Ped1的原因是,FlowNet2不适合于距离摄像机太远的小而瘦的行人,另外就是在训练集中标注为正常的行为在测试集中却是异常。</p><p><strong>新的评估方法</strong>:【仅考虑小块而不是整个frame】frame-level score</p><p><img src="https://pic3.superbed.cn/item/5e00c88376085c3289e64567.jpg" alt="img"></p><h3 id="Object-centric-Auto-encoders-and-Dummy-Anomalies-for-Abnormal-Event-Detection-in-Video"><a href="#Object-centric-Auto-encoders-and-Dummy-Anomalies-for-Abnormal-Event-Detection-in-Video" class="headerlink" title="Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video"></a>Object-centric Auto-encoders and Dummy Anomalies for Abnormal Event Detection in Video</h3><p><strong>链接:</strong><a href="https://github.com/fjchange/object_centric_VAD" target="_blank" rel="noopener">unofficial code</a></p><p><strong>来源</strong>:CVPR2019</p><p><strong>作者</strong>:作者和Detecting Abnormal Events in Video Using Narrowed Normality Clusters【WACV2019】、Deep Appearance Features for Abnormal Behavior Detection in Video【ICIAP2017】、Unmasking the abnormal events in video【ICCV2017】为同一个团队。</p><p><strong>创新点</strong>:第一次将异常检测当成可区分的多类分类问题(discriminative multi-class classification problem)。</p><p><strong>贡献</strong>:1.引入了一种基于对象为中心的卷积自编码器 (object-centric convolutional auto-encoders) 的无监督特征学习框架,以编码运动和外观信息。2. 我们提出了一种基于训练样本聚类为正常簇 (normality clusters) 的监督分类方法 [即对正常行为分了类]。用一个one-versus-rest的异常事件分类器将每个正常聚类和其他的分开。</p><p><strong>数据集:</strong>Avenue、ShanghaiTech、UCSD ped2 和 UMN</p><p><strong>灵感来源</strong>:R. Hinami, T. Mei, and S. Satoh. Joint Detection and Recounting of Abnormal Events by Learning Deep Generic Knowledge. In <em>Proceedings of ICCV</em>, pages 3639–3647, 2017. 
和这篇文章的不同在于这篇文章用了ssd检测。在特征提取阶段,hinami等人在多个视觉任务上微调了Fast R-CNN模型的分类分支,以利用语义信息来检测和描述异常事件,而这篇论文用卷积自编码器来学习无监督的深度特征。</p><p><strong>方法</strong>:整个网络分成四个阶段:<strong>目标检测阶段</strong>、<strong>特征学习阶段</strong>、<strong>训练</strong>和<strong>测试</strong>阶段。<strong>目标检测阶段</strong>——用包围框将目标切下来,然后切下来的图片送到特征学习阶段,同时计算代表运动的梯度,然后把梯度也送到特征学习阶段,见下图。<strong>特征学习阶段</strong>——将两个梯度和剪裁后的图片分别送到三个卷积自编码器中。每个隐藏层为8x8x16,最后的特征向量为3072维。<strong>训练阶段</strong>——通过构建一个context,在这个context中,正常样本中的一个子集相对另一个子集相当于虚拟异常样本,来弥补真实异常样本缺少的问题,方法就是用k-means进行聚类。<strong>测试阶段</strong>—— 每个测试样本x被k个SVM模型分类,最高的被认为是异常样本,由下式计算。</p><p><img src="https://ae01.alicdn.com/kf/H790c645575554ff89e4876b53660a45bM.jpg" alt="img"></p><p><img src="https://ae01.alicdn.com/kf/H6fe6349c078f4a8e8a6573875aa72f0f6.jpg" alt="img"></p><p><strong>结果</strong>:frame-level AUC—-Avenue90.4 ShanghaiTech84.9 Ped2 97.8 UMN 99.6。帧率11fps(这个有点太慢了,时间用来检测了)。</p><p><strong>可以借鉴的地方</strong>:消融实验的部分—做了这么几个实验。1.frame-level自编码器+ocsvm【one-class SVM】,说明提取对象为中心的特征和用one-versus-rest SVM有用;2. frame-level自编码器+one-versus-rest SVM,说明one-versus-rest SVM确实有用。3.预训练的ssd特征+one-versus-rest SVM,说明自编码器学习特征的重要性;4.object-centric CAE只保留外观或者运动特征,说明运动特征和外观特征的相关性。5.object-centric CAE+ocSVM,说明把异常检测当成多分类任务有用。</p><p><strong>不足之处</strong>:检测的时候如果出现遮挡,如两人重叠,就会误报(正常被判断为异常,false positive)。</p><h3 id="Latent-Space-Autoregression-for-Novelty-Detection"><a href="#Latent-Space-Autoregression-for-Novelty-Detection" class="headerlink" title="Latent Space Autoregression for Novelty Detection"></a>Latent Space Autoregression for Novelty Detection</h3><p><strong>链接</strong>:<a href="https://arxiv.org/pdf/1807.01653.pdf" target="_blank" rel="noopener">paper</a>、<a href="https://github.com/aimagelab/novelty-detection" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>CVPR2019</p><p><strong>摘要:</strong>将一个参数密度估计器(parametric density estimator)加在自编码器上。参数密度估计器通过自回归过程(autoregressive procedure)学习潜在表示的概率分布。与正常样本的重建相结合进行优化的最大似然目标,通过最小化潜在向量分布的差分熵(differential entropy),有效地充当了调节器(regularizer)。</p><p><strong>创新点</strong>:第一次将entropy minimization的方法用在显著性检测中,之前是用在(deep neural<br>compression)中。</p><p><strong>数据集:</strong>UCSD Ped2、shanghaitech</p><p><strong>网络结构</strong>:</p><p><img src="https://ae01.alicdn.com/kf/H9497e10861f1499dbb1534ead70ff677Q.jpg" alt="img"></p><h3 id="Detecting-Abnormal-Events-in-Video-Using-Narrowed-Normality-Clusters"><a href="#Detecting-Abnormal-Events-in-Video-Using-Narrowed-Normality-Clusters" class="headerlink" title="Detecting Abnormal Events in Video Using Narrowed Normality Clusters"></a>Detecting Abnormal Events in Video Using Narrowed Normality Clusters</h3><p><strong>来源</strong>:WACV2019</p><p><strong>关于作者</strong>:作者和Deep Appearance Features for Abnormal Behavior Detection in Video【ICIAP2017】、Unmasking the abnormal events in video【ICCV2017】为同一个团队。</p><p><strong>创新点</strong>:把异常检测当成一个检测离群点(outlier detection)的任务,分为k-means聚类和ocSVM两个阶段。</p><p><strong>网络结构</strong>:</p><p><img src="https://ae01.alicdn.com/kf/H391d592e785646188e5145680960f4f9d.jpg" alt="img"></p><p><strong>创新点</strong>:1.<strong>增强</strong>有深度外观特征的时空视频块。2.综合了两种离群点的检测方法k-means和ocSVM。3.通过学习每个聚类周围的紧密边界来缩小正常簇。</p><p><strong>数据集</strong>:Avenue、Subway、UMN</p><p>(这篇的related work很全面哈)</p><p><strong>方法</strong>:训练和测试的输入都是时空视频块(spatio-temporal cubes)。在训练阶段,使用k-means对提取的时空视频块进行聚类,<strong>并消除了较小的聚类作为离群值</strong>。 在其余的每个聚类上,我们训练一个ocSVM模型以除去离群的视频块。 
在测试阶段,在每个ocSVM模型测试每个时空视频块,以获得一组normality得分。最大的分数(带有符号变化【这什么意思,后面第4节之前几行解释了,因为这个分数可能是负的】)作为这个块的异常分数。</p><p><strong>特征提取阶段</strong>:对所有的数据集都这样处理,先resize成120x160,再切成不重叠的10x10小块,五帧一起作为一个视频块,维度为10x10x5。之后从每个视频块儿获得3D gradient features。如果视频在相应区域中是静态的,将消除该区域中的视频块。之后通过位置、运动方向、外观等信息增强视频块。【这个部分中提取时空视频块的代码用的是<a href="https://alliedel.github.io/anomalydetection/,2016ECCV" target="_blank" rel="noopener">https://alliedel.github.io/anomalydetection/,2016ECCV</a> A discriminative framework for anomaly detection in large videos】</p><p><strong>为什么要有第二个阶段ocSVM</strong>:因为第一个阶段的k-means不能给剩下的clusters一个收紧的边界,仍然留出了很大的空间来容纳outliers。为了解决这个问题,对每个cluster都训练一个ocSVM。ocSVM学习到的边界更为tighter,因为ocSVM模型被迫将cluster中一小部分样本作为outliers。</p><p><strong>结果</strong>:<strong>Avenue frame-level 88.9 pixel-level 94.1, 24fps</strong>。还和ICCV2017的Joint detection and recounting of abnormal events by learning deep generic knowledge进行了单独的对比,因为这个作者用的Avenue数据集去掉了5个数据集,仅用了17个。</p><p><strong>消融实验</strong>:(未增强的)时空视频块+ocSVM;增强的时空视块+ocSVM;增强视频块+k-means+在所有k个cluster上的训练的ocSVM(没有消除小的clusters);增强视频块+k-means(消除了小的clusters)+1-NN。</p><p><strong>误检的情况</strong>(正常被检测为异常):Avenue中两个人同步行走;一个人被遮挡;包在空中,在人扔包之前;Subway中一个奔跑的人和两个同步行走的人;UMN中一个人打开门进来或出去(对光的变化不鲁棒)。</p><p><strong>疑惑的地方:</strong>3.1 这几个增强的方法具体是怎么操作的。</p><h3 id="Margin-Learning-Embedded-Prediction-for-Video-Anomaly-Detection-with-A-Few-Anomalies"><a href="#Margin-Learning-Embedded-Prediction-for-Video-Anomaly-Detection-with-A-Few-Anomalies" class="headerlink" title="Margin Learning Embedded Prediction for Video Anomaly Detection with A Few Anomalies"></a>Margin Learning Embedded Prediction for Video Anomaly Detection with A Few Anomalies</h3><p><strong>来源:</strong>IJCAI2019</p><p><strong>创新点:</strong>提出了Margin Learning Embedded Prediction (MLEP) framework for open-set supervised anomaly detection。所提出的网络有三个特征,第一,将每一帧的特征顺序输入到ConvLSTM中,更好地编码时间和空间信息。第二,把margin learning 嵌入网络结构中。有助于观测到和未观测到的异常的检测。第三,能够通过帧级别和视频级别的异常标记处理异常检测。</p><p><strong>贡献:</strong>1.设计了MLEP来进行open-set supervised 异常检测。2.设计了一个预测框架对预测正常行为有利。3.网络可通过帧级别以及视频级别的异常标记处理异常检测。</p><p><strong>针对问题:</strong>unet有利于异常行为的预测;传统的编码器没有足够的能力编码运动信息进而预测正常帧;卷积lstm用观测帧的历史运动信息,可能会预测异常行为。</p><p><strong>网络结构:</strong><br><img src="https://ae01.alicdn.com/kf/H3479b38018a6479f88ad590934ec1649L.jpg" alt="img"></p><p><img src="https://ae01.alicdn.com/kf/H8ec91df4835d4f02ab8c900c57e7caa5M.jpg" alt="img"><br>其中的margin learning模块的loss是triplet loss,灵感来自Person re- identification by multi-channel parts-based cnn with im- proved triplet loss function. 和A unified embedding for face recognition and clustering. 
</p><p><img src="https://ae01.alicdn.com/kf/Hf4649cb3fac0482bb45e18ec5e1ff2ebH.jpg" alt="img"><br>整个的loss就是两个loss加起来。<br><img src="https://ae01.alicdn.com/kf/H332f813492e644e488e6ec3b89a535f0G.jpg" alt="img"><br><strong>训练阶段:</strong>对于frame-level的标记,随机选择一个anchor,一个正样本和一个负样本来训练。对于video-level的标记,首先仅用正常数据训练一个基于预测的异常检测网络,这时不加triplet loss ,即λ=0。然后我们用训练好的模型预测正常和异常数据的正常分数。最后我们用sampled triplets来重新训练整个网络。测试阶段:仍然用PSNR来判断。越高表示越有可能是正常的。 </p><p><strong>实验细节:</strong>所有的帧resize成224X224。一个video snippet的长度是4。把测试集中的异常数据分成K折,每折仅包含一些异常事件,而不是全部的异常事件。K=10。训练的时候,把其中一折放进训练集中,然后把剩下的作为测试集。由此保证了测试集必须包含训练集没有包含的异常事件,测试集可能包含训练集中观察到的异常事件类型。为什么自己的预测的方法好也和别的方法进行了对比。用cyclegan的3个卷积层和6个残差块作为编码器,解码器用三个反卷积层。</p><p><strong>数据集:</strong>Avenue 和shanghaitech。不用ucf crime是因为正常和异常数据的比例是均衡的,另外摄像头的角度变化,对于预测来说不理想。另外他是一个closed-set supervised异常检测,所以不进行比较。</p><h3 id="Memorizing-Normality-to-Detect-Anomaly-Memory-augmented-Deep-Autoencoder-for-Unsupervised-Anomaly-Detection"><a href="#Memorizing-Normality-to-Detect-Anomaly-Memory-augmented-Deep-Autoencoder-for-Unsupervised-Anomaly-Detection" class="headerlink" title="Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection"></a>Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection</h3><p><strong>链接</strong>:<a href="https://github.com/donggong1/memae-anomaly-detection" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>ICCV2019</p><p><strong>作者</strong>:和CVPR2019Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos是一个团队的。</p><p><strong>创新点:</strong>提出用一个memory module来augment 自编码器,叫做MemAE。MemAE首先从encoder获得编码,然后在训练阶段,记忆内容是更新的,测试时,学习到的记忆固定,从正常数据中选择一些记录的记忆得到重建。</p><p><strong>针对问题:</strong>深度自编码器可能会很好地重建异常行为。比如,一些异常和正常的训练数据有相同的compositional patterns 或者decoder解码一些异常编码的时候太厉害了。</p><p><strong>关键词:</strong> attention based memory addressing</p><p><img src="https://ae01.alicdn.com/kf/H9238e8c430f543479d198412b9d711dbJ.jpg" alt=""></p><h3 id="Graph-Convolutional-Label-Noise-Cleaner-Train-a-Plug-and-play-Action-Classifier-for-Anomaly-Detection"><a href="#Graph-Convolutional-Label-Noise-Cleaner-Train-a-Plug-and-play-Action-Classifier-for-Anomaly-Detection" class="headerlink" title="Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection"></a>Graph Convolutional Label Noise Cleaner: Train a Plug-and-play Action Classifier for Anomaly Detection</h3><p><strong>链接:</strong><a href="https://github.com/jx-zhong-for-academic-purpose/GCN-Anomaly-Detection" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>CVPR2019</p><p><strong>创新点:[首次用GCN来纠正视频分析领域的label noise]</strong> a supervised learning task under noisy labels。只要清除label noise,就可以直接将完全监督的动作分类器(fully supervised action classifiers)应用于弱监督的异常检测,并最大限度地利用这些完善的分类器。为此,设计了一个图卷积网络(graph convolutional network)来校正noisy labels。基于特征相似性和时间一致性(视频的两个特性),网络将supervisory信号从高置信度的片段传播到低置信度的片段。通过这种方式,网络能够为动作分类器提供cleaned supervision。在测试阶段,我们只需要从动作分类器中获取片段预测,无需做任何后处理。</p><p><strong>针对问题:</strong>近年来,对新兴的二元分类范式进行了一些研究,训练数据包括异常和普通视频。只有视频级别的异常标签提供。之前的工作都把弱监督异常检测问题看做是多示例学习(multiple-instance learning)。我们换了角度,看作是noise labels下的监督学习任务。噪声标签指的是异常视频中正常片段的错误注释,因为标记“异常”的视频可能包含相当多的正常片段。因此,一旦清除了噪声标签,就可以直接训练完全监督的动作分类器。</p><p><strong>方法:</strong>两个阶段,清洁和分类。清洁阶段中,训练一个cleaner来校正classifier得到的预测噪声,并且提供了更少噪声的refined labels。在分类阶段,action classifier使用cleaned labels重新训练并且产生更可靠的预测。cleaner的主要想法是通过高可信度预测的噪声来消除低可信度预测的噪声。设计了一个GCN来建立高可信度片段和低可信度片段之间的关系。在图中,片段被抽象成顶点 (vertexes) ,异常信息通过边 (edges) 
传播。测试的时候,我们不需要cleaner,而是直接获得训练snippet-wise的异常结果。</p><p>对两种类型的主流动作分类器进行了大量实验:C3D和TSN。</p><p>(我们的label noise cleaner 的目标是在高可信度注释的监督下,在图(整个视频)中对节点(视频片段)进行分类。)</p><p>Y=0(只包含正常片段的negative bag)是noiseless。而Y=1是noisy,因为部分是异常的。这就叫做one-sided label noise。</p><p><strong>数据集:</strong>UCF-Crime Shanghai Tech UCSD-Peds</p><p><img src="https://ae01.alicdn.com/kf/HTB1GzbSdA5E3KVjSZFCq6zuzXXaQ.jpg" alt="图片"></p><p><strong>灵感来源:</strong>[Sivan Sabato and Naftali Tishby. Multi-instance learning with any hypothesis class. Journal of Machine Learning Research, 13(1):2999–3039, Oct. 2012.] MIL任务可以被视为在one-sided label noise下学习。</p><p> <img src="/Users/songyu/Desktop/He2d73e0096074fd2b4f8063626527f33e.png" alt="img"> </p><h3 id="Learning-Regularity-in-Skeleton-Trajectories-for-Anomaly-Detection-in-Videos"><a href="#Learning-Regularity-in-Skeleton-Trajectories-for-Anomaly-Detection-in-Videos" class="headerlink" title="Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos"></a>Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos</h3><p><strong>链接:</strong><a href="https://github.com/RomeroBarata/skeleton_based_anomaly_detection" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>CVPR2019</p><p><strong>创新点:</strong>用了dynamic skeleton features来建模人运动的正常模式。把skeletal movements分解成global body movement和local body posture。</p><p>(Message-Passing Encoder-Decoder Recurrent Neural Network)</p><p><strong>针对问题:</strong>1<strong>.</strong>现在的方法基于像素的外观特征和运动特征,然而基于像素的特征是高维非结构化信号,对噪声敏感,它掩盖了关于场景的重要信息;另外,这些特征中呈现的冗余的信息增加了对它们进行训练的模型的负担。2.另外一个现在方法的限制是由于视觉特征和事件真实含义之间存在语义鸿沟,缺乏可解释性。</p><p><strong>相关工作:</strong>1传统的用one-class分类的方法在处理具有各种异常类型的大规模数据的时候会获得suboptimal performance。2.基于intensity特征对外观噪声敏感。因此liu【cvpr2018】的工作用了预测的方法,但是光流提取成本高并且远离事件的语义性。</p><p><strong>方法:</strong>【在现实的监控视频中,人体骨骼的尺度在很大程度上取决于它们的位置和动作。 对于近场中的骨架,观察到的运动主要受局部因素的影响。 同时,对于远场中的骨架,运动主要受全局运动的影响,而局部变形则大多被忽略。因此进行了分解。】</p><p><img src="https://ae01.alicdn.com/kf/H7ed4edce4f85450fb58d79625736dcac5.jpg" alt="图片"></p><p>设置了一个附着在人体上的规范参考框架(称为局部框架)。全局分量被定义为原始图像帧内的局部框架中心的绝对位置,它基于骨架边界框的中心。 局部分量定义为从原始运动中减去全局分量后的残差。 它表示骨架关节相对于边界框的相对位置。 由于深度缺失,仅xy坐标不能很好地表示场景中的实际位置。 但是,骨架边界框的大小与场景中骨架的深度相关。 为了弥补这个差距,我们用骨架边界框的宽度和高度fg =(xg,yg,w,h)来扩充全局分量,并使用它们来规范化局部分量。</p><p>网络由两个 recurrent encoder-decoder network分支构成, 该模型的每个分支都具有单编码器 - 双解码器架构,具有三个RNN:编码器,重构解码器和预测解码器。网络是和[Unsupervised learning of video representations using LSTMs的工作相似。但是不同点在于,所提出的MPED-RNN不仅通过跨分支消息传递机制对每个单独组件的动态进行建模,还对它们之间的相互依赖性进行建模。用GRU代替LSTM。</p><p>输入是长度为T的skeleton segment,然后对于每个时间步长t,骨架ft分解成局部的和全局两个部分,分别送到局部和全局的编码器。</p><p><img src="https://ae01.alicdn.com/kf/H14f2ef3986bf42d1a74ba466bde56b9dZ.jpg" alt="img"></p><p>损失函数:</p><p> <img src="https://ae01.alicdn.com/kf/H12ff9c27c01240bda3b1cca75ad4fa805.jpg" alt="img"> </p><p>果然!作者用了AlphaPose来检测skeleton。</p><h2 id="2018"><a href="#2018" class="headerlink" title="2018"></a>2018</h2><h3 id="real-world-anomaly-detection-in-surveillance-videos"><a href="#real-world-anomaly-detection-in-surveillance-videos" class="headerlink" title="real-world anomaly detection in surveillance videos"></a>real-world anomaly detection in surveillance videos</h3><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/89215767.jpg" alt="图片"></p><p><strong>链接:</strong><a href="https://github.com/WaqasSultani/AnomalyDetectionCVPR2018" target="_blank" rel="noopener">code</a></p><p><strong>来源:</strong>CVPR2018</p><p><strong>使用的方法</strong>:MIL(multiple instance learning)多示例学习 </p><p><strong>方法步骤:</strong></p><p>(1) 
positive(某一部分包含异常),negative(不包含异常)视频。把positive和negative视频分别分成固定数量的segments。每个视频表示为一个包,每个temporal segment表示包里的一个instance。</p><p>(2) 对video segments提取C3D features。</p><p>(3) 用一个novel ranking loss function(positive bag和negative bag中,在最高分数的instances之间计算ranking loss)来训练一个全连接神经网络。</p><p>简言之就是数据处理、提特征【提取到的特征应该是时空特征吧】、训练网络、通过得分预测是否异常。</p><p><strong>创新点:</strong>同时利用正常和异常的视频来学习异常行为。不需要标记训练视频中的异常segments or clips(非常浪费时间),而是利用弱标记(weakly labeled)的训练视频,通过deep multiple instance ranking framework来学习异常。视频标记(异常或正常)是video-level,而不是clip-level的。我们把正常和异常的视频看作是bags,把video segments看作是instances。【采用MIL的方法引入到异常检测中来】</p><p>另外,对ranking loss function引入了<strong>sparsity</strong>和<strong>temporal smoothness constraints</strong> 来在训练中更好的定位异常。</p><p>(有新的数据集)</p><p><strong>关键词:</strong>weakly-supervised learning,MIL</p><p><strong>针对问题:</strong>1<strong>.</strong>其他的方法都是假设偏离正常的行为就是异常。但是这样假设是有问题的,因为把所有可能的正常行为考虑进去是不太可能的。2.正常和异常之间的界限是模糊的。在现实场景中,是否异常可能和条件的不同有关。</p><p><strong>Baseline methods:</strong>C3D,TCNN(这两种方法在数据集上的效果很差,证明提出来的数据集非常challenging)</p><p><strong>比较:</strong>主要和Learning temporal regularity in video sequences和Abnormal event detection at 150 fps in matlab的方法比较。</p><p>【the first to formulate the video anomaly detection problem in the context of MIL】</p><p><strong>个人感想与总结:</strong>(采用了什么方法,达到了什么效果,还有什么不太好的地方可以改进)作者采用MIL方法,同时利用正常和异常的视频,使用提出的deep MIL ranking loss来进行异常检测。【把异常检测作为一个回归问题】</p><h3 id="Future-Frame-Prediction-for-anomaly-detection-a-new-baseline"><a href="#Future-Frame-Prediction-for-anomaly-detection-a-new-baseline" class="headerlink" title="Future Frame Prediction for anomaly detection-a new baseline"></a>Future Frame Prediction for anomaly detection-a new baseline</h3><p><strong>链接:</strong><a href="https://github.com/StevenLiuWen/ano_pred_cvpr2018" target="_blank" rel="noopener">code</a></p><p><strong>来源</strong>:CVPR2018</p><p><strong>创新点:</strong>在视频预测框架中解决异常检测问题。除了加spatial(<strong>Appearance</strong>)约束还加了temporal(<strong>motion</strong>)约束(光流)。也用到了GAN。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/34727984.jpg" alt="图片"></p><p><strong>细节 :1.作者为什么用U-net?</strong>因为现在的工作进行帧预测或者图像生成的通常包含两个模块,一个编码器能够通过逐渐地降低spatial resolution来提取特征,一个解码器能够通过增加spatial resolution来恢复帧。这样的结构有梯度消失问题和信息不平衡。U-net可以抑制梯度消失的问题。</p><p><strong>2.</strong>Intensity 为了保证在RGB空间所有像素的相似性,gradient可以锐化产生的图。</p><p><strong>评价别人的工作:</strong>Learning temporal regularity in video sequences【16年】。Abnormal event detection at 150 FPS in MATLAB【13年】。Anomaly detection in crowded scenes【10年】。Of all these work, the idea of <strong>feature reconstruction for normal training data</strong> is a commonly used strategy. 
</p><p>Sparse reconstruction cost for abnormal event detection【11年】。Abnormal event detection at 150 FPS in MATLAB【13年】。hand-crafted features。</p><p><strong>思考:</strong>我觉得作者有漏洞的地方:1.【假设正常的事件可以被很好的预测。】可是就像之前作者说自编码器那种重建的思想假设正常的事件可以以较小的误差被重建出来,但是深度神经网络的容量很高,异常事件不一定有更大的重建误差。那么在这里,正常的事件可以被很好的预测,异常的事件就不能被很好的预测吗?</p><p>2.t帧高的PSNR表示这帧很有可能是正常的。人为设置阈值来判断是正常还是异常帧。</p><p>3.没有像素级的检测</p><p>4.作者自己说 there exsits some uncertainties in normal events</p><p>5.对于作者用的gap,计算的是正常帧的平均分数和异常帧的平均分数之间的gap。【这个平均分数是不是有一点点问题,就是有没有一些正常帧其实有很小的psnr,异常帧有很大的psnr,平均是不是抹去了一些差别..重要吗。会造成一定程度的漏检吧】。而且最后加constraint的多少,看到的gap不是差很多,就差一点,不过auc还是提高了一些的。</p><p>6.在和conv-AE对比的时候,可以看到,ped1场景和avenue数据集并没有比conv-AE好太多。而且这个是平均之后的结果,真的有变好吗?</p><p>7.在最后作者用toy dataset来评估效果的时候,出现了有时正常事件的运动方向也不确定的情况。【其实就是正常的事件有时也不能很好的预测呀】</p><h3 id="Adversarially-Learned-One-Class-Classifier-for-Novelty-Detection"><a href="#Adversarially-Learned-One-Class-Classifier-for-Novelty-Detection" class="headerlink" title="Adversarially Learned One-Class Classifier for Novelty Detection"></a>Adversarially Learned One-Class Classifier for Novelty Detection</h3><p><strong>链接:</strong><a href="https://github.com/khalooei/ALOCC-CVPR2018" target="_blank" rel="noopener">code</a></p><p><strong>来源</strong>:CVPR2018</p><p><strong>创新点:【enhance inlier,distort outlier】</strong>(1)提出来一个端到端的结构进行one-class classification。(2)几乎其他的所有基于GAN的方法在训练之后或者抛弃了生成器或者抛弃了判别器。但我们的方法更有效,<strong>能够从两个训练的模块受益</strong>。(3)用在图片或视频中。</p><p><strong>方法:</strong></p><p><img src="https://ae01.alicdn.com/kf/Hdac2389e6ad14d2587238c7c40f8bd9ff.jpg" alt="img"></p><h2 id="2017"><a href="#2017" class="headerlink" title="2017"></a>2017</h2><h3 id="Joint-detection-and-recounting-of-abnormal-events-by-learning-deep-generic-knowledge"><a href="#Joint-detection-and-recounting-of-abnormal-events-by-learning-deep-generic-knowledge" class="headerlink" title="Joint detection and recounting of abnormal events by learning deep generic knowledge"></a>Joint detection and recounting of abnormal events by learning deep generic knowledge</h3><p><strong>来源</strong>:ICCV2017</p><p><strong>创新点:</strong>把检测和描述视频中的异常事件联合起来。Recounting of abnormal events,就是解释为什么他们是异常的。我们把一个generic CNN model和environment-dependent anomaly detection融合起来。</p><p>(异常检测是有场景依赖性的)【动作理解动作识别的方法能不能用上?】</p><p><strong>关键词:</strong>anomaly detector</p><p><strong>方法:</strong>based on multi-task Fast R-CNN</p><ol><li>用大量带标签的数据集【不是anomaly detection的dataset】来学习multi-task Fast R-CNN,学习到generic model。这样提取出deep features 和visual concept classification scores(同时提出的)。</li><li>对每个环境在这些特征和分数上学习到anomaly detectors,建模了目标环境的正常行为并且预测测试样本的异常分数。anomaly detectors和classification scores分别用来做异常行为检测和描述。{anomaly detectors有几种,NN OC-SVM KDE}</li><li>之后就是用以上两个学到的模型做异常检测和描述。分为四个步骤:a) <strong>detect object proposal</strong>. 
b) <strong>extract features</strong>.这一步由multi-task Fast R-CNN从所有的object proposal同时提语义特征和分类分数。c) <strong>classify normal/abnormal</strong>.将anomaly detector用到proposal的语义特征计算出每个proposal的异常分数,高于设定阈值的被确定为异常行为的源域。d) <strong>recount abnormal events.</strong> 异常行为的三种类型(objects,action, attributes)的visual concepts通过分类分数预测。</li></ol><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/86054775.jpg" alt="图片">方法简言之就是提semantic特征和分类分数,特征用来判异常/正常,分类分数用来做描述,<strong>不需要使用motion features。</strong></p><p><strong>疑惑的地方:(1)</strong>The anomaly scores of each predicted concept are computed by the anomaly detector for classification scores to recount the evidence of anomaly detection.<strong>(2)(3.1)</strong>The bounding box regression was not used because it depends on the class to detect, which is not determined in abnormal event detection.</p><p><strong>关于数据集:</strong>只选取USCD ped2和avenue进行验证,因为ped1的分辨率比较低,所以不用。关于avenue的像素级的标记有些扯淡(比较主观),比如在扔包的异常事件中,包仅仅被标记为异常,因此仅在frame-level进行评估。另外,由于avenue把moving objects看做异常,而该paper研究static objects,因此评估除去了22个clips中的5个。</p><p><strong>结果:</strong>这个paper在<strong>avenue</strong>数据集上的auc达到<strong>89.2,ped2</strong>数据集上的auc达到90.8。FRCN的semantic feature总比HOG和SDAE特征表现好,并且不管用什么anomaly detector都比HOG和SDAE好。</p><p><img src="https://ae01.alicdn.com/kf/HTB1bl_1dBKw3KVjSZTE763uRpXaB.png" alt="img"></p><h3 id="unmasking-the-abnormal-events-in-video"><a href="#unmasking-the-abnormal-events-in-video" class="headerlink" title="unmasking the abnormal events in video"></a>unmasking the abnormal events in video</h3><p><strong>来源</strong>:ICCV2017</p><p><strong>创新点:</strong>不需要training sequences,我们的网络基于unmasking,是之前用来在文本文件中做授权认证的。<br>【the first work to apply unmasking for a computer vision task】<br>作者和【6 (2016)A Discriminative Framework for Anomaly Detection in Large Videos】还有一些监督的方法进行比较。<strong>【6】是主要借鉴的思想。</strong><br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/51723764.jpg" alt="图片"></p><p>提特征,训练分类器</p><p><strong>unmask是怎么用的:</strong>We retain the training accuracy of the classifier and repeat the training process by eliminating some of the best features.这个过程就叫做unmasking。如图中所示,After extracting motion or appearance features (step D), we apply unmasking (steps E to G) by training a classifier and removing the highly weighted features for a number of k loops.</p><p>【The unmasking technique [12] is based on testing the degradation rate of the cross-validation accuracy of learned models, as the best features are iteratively dropped from the learning process.We modify the original unmasking technique by considering the training accuracy instead of the cross-validation accuracy, in order to use this approach for online abnormal event detection in video.】 </p><p><strong>相关工作:</strong>很多方法不是完全无监督的。在此之前,唯一<strong>不用任何训练数据</strong>进行异常事件检测的是【 A Discriminative Framework for Anomaly Detection in Large Videos 2016】。作者的方法和这个很类似。但是作者的方法可以<strong>在线</strong>处理。作者的方法相当于在这个上面进行了改进。</p><p><strong>一些说明:</strong>对10x10x5的立方块算3D gradient feature,用的是【 A Discriminative Framework for Anomaly Detection in Large Videos 2016】和【 Abnormal Event Detection at 150 FPS in MATLAB. 
2013】的方法【<a href="https://github.com/alliedel/anomalyframework_python" target="_blank" rel="noopener">https://github.com/alliedel/anomalyframework_python</a>】来计算<strong>运动特征</strong>,并且不用PCA。对于<strong>外观特征</strong>,用的是VGG-f,用这个考虑的是实时性所以没有用比较深的CNN;在这儿也不对CNN进行fine-tuning,因为本文的方法不能用任何训练数据,所以只是用预训练的CNN提特征;并且,去掉全连接层,将conv5的结果作为外观特征。</p><p>对于评估指标,<strong>EER</strong>在真实的异常检测中可能是具有误导性的,所以不用这个。</p><p><strong>结论中提到:</strong>采用了<strong>融合运动和外观特征的方法</strong>,但是没有看到大量的改善,需要进一步改进融合的方法。比如,可以用方法来在一个相关任务如action recognition上训练无监督的深度特征,然后用这些特征来同时表示运动和外观信息。</p><p><strong>疑惑的地方:1.</strong>【related work】As the authors want to build an approach independent of temporal ordering, they create shuffles of the data by permuting the frames before running each instance of the change detection. <strong>2.【3】</strong>2x2 spatial bins.什么是bin,为什么不直接写成4。<strong>3.【3.2】</strong>为什么假设前w帧标记为正常,后w帧标记为异常,然后训练一个分类器。这样假设有什么用?并且为什么分类器的准确率高后w帧就是异常,低的话后w帧就是正常。<strong>我觉得分类器的准确率高是异常低是正常可以这样解释:</strong>因为分类器分的准确的标志应该是将某两个特征区分度很多的类分开,如果前w是正常,后w是异常,那么分类器的准确率此时应该高,反之应该低。那么前面的假设也可以说的通了,就是相当于一个起始条件吧。</p><p><strong>题外话:</strong>【这些作者在17年的时候还写了一篇论文:Deep Appearance Features for Abnormal Behavior Detection in Video,前面的提特征方法相同。】</p><h3 id="abnormal-event-detection-in-videos-using-generative-adversarial-nets"><a href="#abnormal-event-detection-in-videos-using-generative-adversarial-nets" class="headerlink" title="abnormal event detection in videos using generative adversarial nets"></a>abnormal event detection in videos using generative adversarial nets</h3><p><strong>来源</strong>:ICIP2017</p><p><strong>方法:</strong>用正常的帧和对应的光流图来训练GAN,来学习正常场景的internal representation。在测试的时候把真实的数据和GAN产生的外观和运动表示比较,通过计算local 不同来检测异常区域。<br>从raw-pixel frames<strong>产生光流图</strong>。</p><h3 id="a-revisit-of-sparse-coding-based-anomaly-detection-in-stacked-RNN-framework"><a href="#a-revisit-of-sparse-coding-based-anomaly-detection-in-stacked-RNN-framework" class="headerlink" title="a revisit of sparse coding based anomaly detection in stacked RNN framework"></a>a revisit of sparse coding based anomaly detection in stacked RNN framework</h3><p><strong>链接</strong>:<a href="https://github.com/StevenLiuWen/sRNN_TSC_Anomaly_Detection" target="_blank" rel="noopener">code</a></p><p><strong>来源</strong>:ICCV2017</p><p><strong>摘要:</strong>提出了TSC(<strong>Temporally-coherent</strong> sparse coding),enforce 相似的相邻帧用相似的重建系数编码。之后用srnn映射TSC,方便了参数优化加速了异常预测。<strong>用sRNN同时学习所有参数</strong>,能够避免TSC的non-trivial的超参数选择。另外用浅层的sRNN,重建稀疏系数可以在前向传播中推断出来,节约了计算成本。</p><p><strong>创新点:</strong>(1)提了TSC,可映射到sRNN方便了参数优化,加速了异常预测。(2)提了新数据集。作者这个新数据集的特点是不是特意设计异常事件,而是用在不同的spots安装的摄像头采集多种角度。</p><p>【为什么提出TSC?因为基于稀疏编码的异常检测方法不考虑相邻帧之间的temporal coherence。相似的特征也可能被编为不同的稀疏编码,丢失了位置信息。为了保留相邻帧之间的相似性,提出了TSC。】主要比较的</p><p><strong>baseline的缺点:1.</strong>基于词典学习的方法的sparse coefficients的优化非常耗时间。另外这些方法主要是基于人工特征的,对于视频表示可能不是最佳的。<strong>2.</strong>2016conv-AE基于3D ConvNet,但是之前的工作表明用双流网络分别提取外观和运动信息是视频中特征的提取的一个better solution。 而且conv-AE的输入是video cube,cube中的正常/异常帧可能会影响彼此的分类,因此需要在所有的帧上对数据集进行中心采样,计算代价大。【作者主要参考或者说关注的三篇论文:2013matlab,2010MPPCA,2011online】</p><p><strong>行文逻辑:</strong>相关工作(2)。方法(3):【什么是基于稀疏编码的异常检测(3.1);他有什么优势和缺点(即为什么用TSC),什么是TSC,如何进行优化(3.2);如何用sRNN解读TSC(3.3);如何用sRNN学习参数(3.4);多种尺度采样的多个patch(3.5);如何测试(3.6)】。我们的数据集(4)。实验(5):【设置参数和指标(5.1),用仿真数据集测试(5.2),实际数据集测试(5.3),不同超参数的影响(5.4),运行时间(5.5)。】</p><p><strong>方法:</strong>学习能够编码外观上的正常行为的字典,之后,为了提高在相邻帧的预测的平滑性,加上了一个temporally-coherent term。(作者说,有意思的是得到的TSC的公式可以看成是一个特殊的sRNN)。</p><p>稀疏编码的目标函数: <img src="https://ae01.alicdn.com/kf/H2059b341a83b497d80fb758cb78d7827f.jpg" alt="img"> </p><p> 
第一项对应重建误差,第二项对应sparsity项,lambda平衡了sparsity和重建误差。</p><p>TSC的目标函数:</p><p> <img src="https://ae01.alicdn.com/kf/H1174a34af2d44b7a90ef024c208eaf3fC.jpg" alt="img"> </p><p><strong>实验细节:</strong>我认为首先是通过UCF101数据集用ConvNet预训练提取空间特征(没有用运动特征,作者认为不能帮助异常预测),得到特征图,然后 partition the feature map into increasingly finer regions: 1×1, 2×2, and 4×4。然后最大池化。之后对这些不同尺度的特征学习同一个词典。</p><p>测试阶段:将对应于时间t的每块的特征喂给空间sRNN,通过一次前向传播得到αt,就可以计算出对应xt的重建误差,然后选择这帧的所有块的最大重建误差作为帧级别的重建误差,然后做归一化得到每帧的regularity score。</p><p> <img src="https://ae01.alicdn.com/kf/H906777bcca3743abbd727ade2508833ei.jpg" alt="img"> </p><p><strong>评估:</strong>作者有个<strong>挺好的</strong>想法,先用一个Synthesized Dataset评估自己的方法对于外观的突然变化导致的异常的表现如何。这个数据集是这样做的:从MINIST里面随意找两个数字,然后把他们放在225x225尺寸的黑背景中。然后在之后的19帧里,这两个数字随意的横向纵向运动。训练的时候用了10000个序列,对于每个测试的序列,5个连续的帧由随意插入的3x3白色的小方块随意的遮挡。测试集一共有3000个序列。如下图所示:<img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/24681967.jpg" alt="图片"></p><p><strong>挑选数据集的一些考虑:</strong>不用subway是因为有不同的真实标记。uscd ped1更通常用在像素级别的异常检测中,本文做的是帧级别。</p><h3 id="Remembering-History-With-Convolutional-LSTM-for-Anomaly-Detection"><a href="#Remembering-History-With-Convolutional-LSTM-for-Anomaly-Detection" class="headerlink" title="Remembering History With Convolutional LSTM for Anomaly Detection"></a>Remembering History With Convolutional LSTM for Anomaly Detection</h3><p><strong>链接</strong>:<a href="https://github.com/zachluo/convlstm_anomaly_detection" target="_blank" rel="noopener">code</a></p><p><strong>来源</strong>:ICME2017—IEEE International Conference on Multimedia and Expo</p><p><strong>摘要:</strong>用CNN对每一帧进行appearance encoding,用ConvLSTM来记忆过去的帧对应于运动信息。然后把cnn和convlstm和自编码器整合起来,称为ConvLSTM-AE学习正常的外观和运动信息。</p><p><strong>baseline的缺点:</strong>Learning temporal regularity in video sequences, in CVPR, 2016。<strong>而3D卷积不能很好的encode motion</strong>。作者的全文的关注点,或者说改进比较都是在这篇上的。</p><p><strong>实验细节:</strong>T’越大,表示更多的信息被记住了。所以对于有频繁变化的场景,我们可以用一个更小的T‘来保证更高的准确率。</p><p><strong>疑惑的地方:</strong>1.Learning temporal regularity in video sequences, in CVPR, 2016。In order to get a frame level anomaly prediction, it has to do the anomaly detection for multiple video clips and interpolate the degree of anomaly for each frame, which is time-consuming.<strong>2.(3.3节)</strong>In other words,we enforce the network to forget all history information every T’ frames to improve the anomaly detection accuracy。每T‘帧为单位忘掉所有的历史信息。</p><p><strong>网络结构:</strong></p><p> <img src="https://ae01.alicdn.com/kf/HTB1r_YIckxz61VjSZFt761DSVXal.png" alt="img"> </p><p><strong>公式:</strong></p><p>目标函数:</p><p> <img src="https://ae01.alicdn.com/kf/HTB1GOj2dBCw3KVjSZFl763JkFXaH.png" alt="img"> </p><p>重建误差:</p><p> <img src="https://ae01.alicdn.com/kf/H2df84d908df84b04a7bf162ef1856159s.jpg" alt="img"> </p><p><strong>神奇之处:</strong>作者说,和2016那个convae的异常检测方法不同,我们对不同的数据集分别进行训练 ,因为异常的定义不同。所以2016那个是训练出一个通用的模型吗?</p><h2 id="2016"><a href="#2016" class="headerlink" title="2016"></a>2016</h2><h3 id="Plug-and-Play-CNN-for-Crowd-Motion-Analysis-An-Application-in-Abnormal-Event-Detection"><a href="#Plug-and-Play-CNN-for-Crowd-Motion-Analysis-An-Application-in-Abnormal-Event-Detection" class="headerlink" title="Plug-and-Play CNN for Crowd Motion Analysis: An Application in Abnormal Event Detection"></a>Plug-and-Play CNN for Crowd Motion Analysis: An Application in Abnormal Event Detection</h3><p>【the first work to employ the existing CNN models for motion representation in crowd analysis】<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/95069565.jpg" alt="图片"></p><p><strong>来源</strong>:WACV 
2018</p><p><strong>创新点:</strong>随着时间跟踪CNN特征的变化。通过将semantic information(从已有的CNN中得到)和low-level optical-flow结合来measure local abnormality 。不需要fine-tuning阶段。</p><p>Track the changes in the CNN features <strong>across time.</strong></p><p>(1) 引入了一个新的Binary Quantization Layer</p><p>(2) 提出了一个Temporal CNN Pattern measure 来表示人群中的运动。</p><p>(无监督的方法比监督方法在异常检测上更好,因为标注的主观性和训练数据少)</p><p><strong>方法步骤:</strong></p><p> <img src="https://ae01.alicdn.com/kf/HTB1s82VdBiE3KVjSZFM762QhVXa1.png" alt="img"> </p><p><strong>1.从输入的视频帧序列中提取CNN-based binary maps.</strong>具体来说是所有的帧输入到一个FCN,把一个binary layer 插在FCN的顶部为了把高维的特征图量化成压缩的二值模式。这个binary layer是一个卷积层,其中的权重是用一个external hashing method来初始化的。对于每个对应于FCN的感受野的patch,binary layer产生binary patterns,叫做binary map。输出的binary maps保留了最初帧的空间关系。 <img src="https://ae01.alicdn.com/kf/H7ad08d98bc414fb1a0f2b35dbcfe33578.jpg" alt="img"> </p><p>(其中的FCN层用的是Alexnet)</p><p><strong>2.用提到的CNN-based binary maps来计算Temporal CNN Pattern值。</strong>先根据binary maps来计算histograms。然后根据这些histograms计算TCP。【TCP measure 是用来表示人群的运动的motion representation】</p><p>{实验中TCP measure的分数来衡量是否异常,公式如下}</p><p> <img src="https://ae01.alicdn.com/kf/H21567749ae354cfbbd51b1705529c265Z.jpg" alt="img"> </p><p><strong>3.将TCP值和低层次的运动特征(光流)来找到refined motion segments。</strong></p><p><strong>细节:</strong>Binary Quantization Layer——–为什么用这个层,因为聚类高维的特征图需要很大的成本,另外就是需要事先知道聚类中心,用hashing方法聚类高维特征得到小的binary codes是一种解决方法,24位的binary code可以address 2的24次方聚类中心,另外binary map可以简单表示为三通道的RGB图。我感觉这层的实现就是<img src="https://ae01.alicdn.com/kf/H120a89f6db754bcb8163c676c55c4380V.jpg" alt="img"> ,再通过sigmoid函数,最后通过阈值0.5编码为0或者1。 </p><p>Iterative Quantization Hashing(ITQ)——-是所用的hashing方法,训练这个ITQ是所提方法中的唯一训练成本,只需要在训练数据的子集中做一次,用从ITQ学到的weights来建立BQL层。</p><p><strong>关键词:</strong>BFCN、TCP</p><h3 id="learning-temporal-regularity-in-video-sequences"><a href="#learning-temporal-regularity-in-video-sequences" class="headerlink" title="learning temporal regularity in video sequences"></a>learning temporal regularity in video sequences</h3><p><strong>来源</strong>:CVPR2016</p><p><strong>创新点:</strong>学习正常行为模式用非常有限的监督。两种方法:第一种用传统的人工时空局部特征,并在这些特征上面学习一个全连接的自编码器,但是这些特征可能对于学习正常不是最优的;第二种建立一个fully convolutional feed-forward autoencoder来学习局部特征和分类,是一个端到端的框架。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/98412371.jpg" alt="图片">第一种Learning Motions on Handcrafted Features。首先用HOG和HOF作为时空appearance feature 描述子。为了提取HOG和HOF特征以及轨迹信息,用的是improved trajectory(IT)features。编码器的输入是204维的HOG+HOF特征。目标函数:</p><p><img src="https://ae01.alicdn.com/kf/H42606ae38757456da44b5699d7370daa3.jpg" alt="img"> </p><p><strong>xi是特征。</strong></p><p>第二种是全卷积自编码器,输入是temporal cuboid。<strong>【因为自编码器的参数数量太大,所以需要大量的数据。】</strong>因此做了这样的事:用不同的<strong>skipping</strong> strides连接帧来建立T-sized input cuboid。目标函数: <img src="https://ae01.alicdn.com/kf/HTB1WxYWdA9E3KVjSZFG76319XXaT.png" alt="img"> </p><p><strong>Xi是第i个cuboid。</strong></p><p><strong>baseline的缺点:</strong> “Abnormal event detection at 150 fps in matlab,” in ICCV, 2013; “Online detection of unusual events in videos via dynamic sparse coding,” in CVPR,2011; “Sparse Reconstruction Cost for Abnormal Event Detection,” in CVPR, 2011。稀疏编码的优化计算代价大。词袋不能保留单词的时空结构并且需要单词数的先验信息。</p><p><strong>有意思的地方:</strong>1.4.4节predicting the Regular Past and the Future。给中间的帧,能预测near过去的和未来的帧。(预测过去有什么用?)</p><p><strong>2.</strong>用了最大池化,空间信息就丢失了,所以在反卷积网络中用了unpooling。</p><p><strong>3</strong>.作者在4.1节画了这样的曲线:蓝色是表示在单一的数据集上训练的,红色表示在所有的数据集上训练的,黄色表示在除了要测试的数据集上训练的(足以表明transfer的能力) <img src="https://ae01.alicdn.com/kf/H5edf427c496f4978b538af92baba9a4fD.jpg" alt="img"> 
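</p><p>顺手记一下:这一类基于重建的方法(包括这篇的conv-AE)最后基本都是把每帧的重建误差做min-max归一化,得到regularity score再画上面这种曲线。下面是按我自己的理解写的一小段示意代码(假设每帧的重建误差err已经算好,这里用随机数代替,不是论文的官方实现):</p><pre><code># 示意:把每帧重建误差归一化成 regularity score,分数越低越可能是异常帧
# err 是假设的逐帧重建误差,这里随便造一条曲线代替
import numpy as np

def regularity_score(err):
    err = np.asarray(err, dtype=float)
    # s(t) = 1 - (e(t) - min e) / (max e - min e)
    return 1.0 - (err - err.min()) / (err.max() - err.min())

np.random.seed(0)
err = np.abs(np.random.randn(200)) + np.linspace(0, 3, 200)
score = regularity_score(err)
print(score.min(), score.max())    # 归一化后落在 [0, 1] 区间
</code></pre><p>测试时就是拿这条分数曲线配合阈值来判断哪些帧属于异常。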
</p><p><strong>疑惑的地方:1.(3.1.1)</strong>ReLU is not suitable for a network that has large receptive fields for each neuron as the sum of the inputs to a neuron can become very large.<strong>2.(3.2.1)</strong>The learned filters in the deconvolutional layers serve as bases to reconstruct the shape of an input motion cuboid. As we stack the convolutional layers at the beginning of the network, we stack the deconvolutional layers to capture different levels of shape details for building an autoencoder. The filters in early layers of convolutional and the later layers of deconvolutional layers tend to capture specific motion signature of input video frames while high level motion abstractions are encoded in the filters in later layers.(这段话的用意在哪里)<strong>3.(4.3)</strong>这节都不知道在说啥…</p><h3 id="A-discriminative-framework-for-anomaly-detection-in-large-videos"><a href="#A-discriminative-framework-for-anomaly-detection-in-large-videos" class="headerlink" title="A discriminative framework for anomaly detection in large videos"></a>A discriminative framework for anomaly detection in large videos</h3><p><strong>链接</strong>:<a href="https://github.com/alliedel/anomalyframework_ECCV2016" target="_blank" rel="noopener">code</a>、<a href="https://github.com/alliedel/anomalyframework_python" target="_blank" rel="noopener">python_code</a></p><p><strong>来源</strong>:ECCV2016</p><p><strong>创新点:</strong>异常的分数是独立于时间顺序的,不需要分开的训练序列。</p><h2 id="2015"><a href="#2015" class="headerlink" title="2015"></a>2015</h2><h3 id="crowd-motion-monitoring-using-tracklet-based-commotion-measure"><a href="#crowd-motion-monitoring-using-tracklet-based-commotion-measure" class="headerlink" title="crowd motion monitoring using tracklet-based commotion measure"></a>crowd motion monitoring using tracklet-based commotion measure</h3><p><strong>来源</strong>:ICIP2015</p><p><strong>贡献:1.</strong>提出用Motion Pattern来在magnitude和orientation上表示每一帧的tracklet的statistics。<strong>2.</strong>提出Tracklet Binary Code representation在空间和时间上建模一个异常点在其对应轨迹上的运动。<strong>3.</strong>我们提出了一个新的unsupervised measure来评估像素、帧和视频层级的群体场景的commotion。</p><p><strong>方法:1.</strong>tracklet extraction。先用SIFT算法检测异常点,再用跟踪的方法跟踪 L+1 帧。下图的(a)。</p><p><strong>2.</strong>Motion Pattern。用空间坐标表示对应的异常点,进而计算magnitude和orientation。之后画出来binary polar histogram(只有0、1二值)。把经过vectorized polar histogram叫做motion pattern。下图的(b)。</p><p><strong>3.</strong>Tracklet Binary Codes。把上一步的所有motion pattern连接起来计算一个tracklet histogram。H就是Tracklet Binary Codes。下图的(c)。<img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/85776732.jpg" alt="图片"></p><p><strong>细节:</strong>在做frame-level的对比实验时,由于和比较的方法HOT(有监督的),没有直接的可比性,所以把视频序列分成了两个子集,A和B。训练和测试两遍,分别是A或者B训练,B或者A测试。和HOT相比,两种方法都比较好,但是我们的方法更好因为我们是无监督的。</p><p><strong>疑惑的地方:2中的</strong>commotion measuring的那部分以及<strong>3中的</strong>video-level部分。</p><h3 id="Video-Anomaly-Detection-and-Localization-Using-Hierarchical-Feature-Representation-and-Gaussian-Process-Regression"><a href="#Video-Anomaly-Detection-and-Localization-Using-Hierarchical-Feature-Representation-and-Gaussian-Process-Regression" class="headerlink" title="Video Anomaly Detection and Localization Using Hierarchical Feature Representation and Gaussian Process Regression"></a>Video Anomaly Detection and Localization Using Hierarchical Feature Representation and Gaussian Process Regression</h3><p><strong>来源</strong>:CVPR2015</p><p><strong>创新点:</strong>通过层级框架来检测<strong>局部和全局</strong>的异常<strong>,</strong>通过层级特征表示和GPR(高斯过程回归)。为了同时检测局部异常和全局异常,我们提出了从训练视频中提取normal interactions 
的问题(??),即有效地找到附近稀疏时空兴趣点的频繁几何关系。用GPR建立并建模了interaction templates的codebook。另外提出了一个新的计算observed interaction的likelihood的inference方法。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/35454189.jpg" alt="图片"></p><h3 id="Learning-Deep-Representations-of-Appearance-and-Motion-for-Anomalous-Event-Detection"><a href="#Learning-Deep-Representations-of-Appearance-and-Motion-for-Anomalous-Event-Detection" class="headerlink" title="Learning Deep Representations of Appearance and Motion for Anomalous Event Detection"></a>Learning Deep Representations of Appearance and Motion for Anomalous Event Detection</h3><p><strong>来源</strong>:BMVC2015</p><p><strong>创新点:</strong>提出了Appearance and Motion DeepNet(AMDN)来自动学习特征表示。stacked denoising autoencoders用来分别学习外观和运动特征,以及联合表示。作者是第一个用无监督的深度学习框架来对视频异常检测自动construct discriminative representations 。设计了两次fusion。第一次叫做pixel-level early fusion,第二次叫做late fusion。</p><p><img src="https://ae01.alicdn.com/kf/HTB1J7k_aLBj_uVjSZFp7630SXXay.png" alt="img"> </p><p><strong>细节:1.</strong>训练AMDN用两个步骤,pretraining和fine-tuning。<strong>2.</strong>本文用了前景分割!</p><p><strong>总结:</strong>简单来说,就是SDAE+one-class SVMs</p><p><strong>疑惑的地方:2.2.1节</strong> One-class SVM Modeling <strong>2.2.2节</strong> Late Fusion for Anomaly Detection</p><h2 id="2014"><a href="#2014" class="headerlink" title="2014"></a>2014</h2><h3 id="Anomaly-Detection-and-Localization-in-Crowded-Scenes"><a href="#Anomaly-Detection-and-Localization-in-Crowded-Scenes" class="headerlink" title="Anomaly Detection and Localization in Crowded Scenes"></a>Anomaly Detection and Localization in Crowded Scenes</h3><p><strong>来源</strong>:PAMI</p><p>考虑异常行为的检测和定位。提出了同时检测时空异常的detector。</p><p>Temporal normalcy用MDT(mixtures of dynamic textures)建模,spatial normalcy由基于MDT的一个discriminant saliency 检测器来检测。</p><p>考虑了外观和动态,时间和空间和多种空间规模。提出了USCD数据集。数据集的相关介绍可以看看这篇的6.1.这伙人在cvpr2010anomaly detection in crowded scenes出现过。</p><p> 【就是同一篇论文吧。】</p><h2 id="2013"><a href="#2013" class="headerlink" title="2013"></a>2013</h2><h3 id="abnormal-event-detection-at-150-fps-in-matlab"><a href="#abnormal-event-detection-at-150-fps-in-matlab" class="headerlink" title="abnormal event detection at 150 fps in matlab"></a>abnormal event detection at 150 fps in matlab</h3><p><strong>链接:</strong><a href="https://github.com/kpandey008/Abnormal-Event-Detection" target="_blank" rel="noopener">unofficial code</a></p><p><strong>来源</strong>:ICCV2013</p><p><strong>里程碑:</strong>avenue数据集是他们弄的</p><p><strong>引言:</strong>影响高效率的一个阻碍是建立稀疏表示的inherently intensive computation 。</p><p><strong>优势:</strong>快~每秒140-150帧在平常的电脑上。有效地将原来的复杂问题转化为只涉及少量无代价的小尺度最小二乘优化步骤,从而保证了较短的运行时间。</p><p><strong>方法:</strong>Sparse combination learning。和子空间聚类subspace clustering有关系但是又和传统的方法大不相同。子空间聚类的方法的聚类数量k是提前知道或者固定的,我们的方法用允许的表示误差来建立组合,误差上限是显式表现的,具有统计意义.</p><p><img src="https://puui.qpic.cn/fans_admin/0/3_1655376438_1561086415091/0" alt="img"> </p><p><strong>稀疏组合学习有两个目标:一是有效的表示,即找到K个基底组合,有较小的重建误差。二是让组合的总数K足够小。因为K大的话会让重建误差总是接近0,对于异常的事件也是这样。这两个目标是矛盾的。</strong></p><p>训练的时候用了一个maximum representation的策略,自动寻找K但是不让重建误差大幅度增加。实际上对于每个训练特征的误差t都有一个上限。我们的方法以迭代的方式执行。在每个pass中,我们只更新<strong>一个</strong>组合,使它尽可能多地表示训练数据。这个过程可以快速找到编码重要和最常见特性的主要组合。不能很好地表示此组合的其余训练块特征将被送到下一轮以收集剩余的最大共性。</p><p><strong>其他的方法:</strong>降低字典的大小【 Sparse reconstruction costs for abnormal event detection. In CVPR 2011】和采用快的稀疏编码solvers【 Online detection of unusual events in videos via dynamic sparse coding. 
In CVPR, 2011】,但是他们仍然不够快。</p><p><strong>细节:</strong>每帧resize成3个不同的scale(20x20,30x40,120x160),每种scale的frame分成很多小块(10x10的不重叠小块),一共是208个子块(4+12+12x16=208),看图5就知道啦。之后连续5帧的对应regions堆叠起来组成时空块,对时空块计算3D gradient features,和【 Anomaly detection in extremely crowded scenes using spatio-temporal motion pattern models. CVPR2009 】的做法一样。这些特征在视频序列中是根据它们的空间坐标分别处理的。只有在视频帧中相同空间位置的特征才会被一起用于训练和测试。</p><p><strong>稀疏组合的检验:</strong>一共150个不同的video,收集了一共208*150 groups的cube features。每个group的组合数表示为K。如图4所示:组合数是10就足够用了。图5中很多region用1个组合就够了,因为它们是静态的。</p><p> <img src="https://ae01.alicdn.com/kf/HTB12Or5dBKw3KVjSZTE763uRpXaN.png" alt="img"> <img src="https://ae01.alicdn.com/kf/HTB1FIn0dEuF3KVjSZK9762VtXXaF.png" alt="img"> </p><p> <strong>疑惑的地方:1.</strong> <strong>2.3节</strong>update那里。<strong>2.</strong>table2和table3中的MISC是什么意思。</p><h2 id="2011"><a href="#2011" class="headerlink" title="2011"></a>2011</h2><h3 id="Video-Parsing-for-abnormality-detection"><a href="#Video-Parsing-for-abnormality-detection" class="headerlink" title="Video Parsing for abnormality detection"></a>Video Parsing for abnormality detection</h3><p><strong>来源</strong>:ICCV2011</p><p><strong>创新点:</strong>Parse video frames by establishing a set of hypotheses that jointly explain all the foreground while, at same time, trying to find normal training samples that explain the hypotheses.</p><p><strong>关键词:</strong>object hypotheses<img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/31332758.jpg" alt="图片"></p><h3 id="Sparse-Reconstruction-Cost-for-Abnormal-Event-Detection"><a href="#Sparse-Reconstruction-Cost-for-Abnormal-Event-Detection" class="headerlink" title="Sparse Reconstruction Cost for Abnormal Event Detection"></a>Sparse Reconstruction Cost for Abnormal Event Detection</h3><p><strong>来源</strong>:CVPR2011</p><p><strong>摘要:</strong>引入了sparse reconstruction cost。我们的方法提供了一个unified solution来同时检测local abnormal events和global abnormal events。(什么是全局异常呢?就是整个场景是异常的,即使individual local behavior can be normal;什么是局部异常呢?就是local behavior is different from its spatio-temporal neighborhoods.)<br><strong>引言:</strong>稀疏表示能够表示高维度的sample。<br><strong>贡献:</strong> <strong>1.</strong>support an efficient and robust estimation of SRC<br><strong>2.</strong>方便地处理LAE和GAE异常。<br><strong>3.</strong>通过逐步更新字典,我们的方法能够支持在线的异常检测。<br><strong>细节:</strong>UCSD ped1 数据集处理方法———把每帧分成了7x7的local patches,有4像素的重叠。用了Type C basis(spatio-temporal basis),dimension 7x16=102.<br>subway数据集处理方法———把帧从512x384大小resize成了320x240大小,并把新的视频帧分成了15x15的local patches,有6像素的重叠,用了Type B basis(temporal basis),dimension 16x5=80??<br><strong>方法:</strong><br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/42528664.jpg" alt="图片"><br>绿色或红色的点是正常或异常的测试样本。representatives(深蓝色的点)的optimal subset通过redundant training features(浅蓝色的点)作为basis来构成正常的字典。深蓝色点的半径表示权重,越大表示越正常。异常检测就是measure 测试样本(绿点或红点)在深蓝色点上的稀疏重建成本。</p><h3 id="Online-Detection-of-Unusual-Events-in-Videos-via-Dynamic-Sparse-Coding"><a href="#Online-Detection-of-Unusual-Events-in-Videos-via-Dynamic-Sparse-Coding" class="headerlink" title="Online Detection of Unusual Events in Videos via Dynamic Sparse Coding"></a>Online Detection of Unusual Events in Videos via Dynamic Sparse Coding</h3><p><strong>来源</strong>:CVPR2011</p><p><strong>创新点:</strong>We propose a fully <strong>unsupervised dynamic sparse coding approach</strong> for detecting unusual events in videos based on <strong>online</strong> sparse reconstructibility of query signals from an atomically learned event dictionary, which forms a sparse coding
bases.<br><strong>误检的情况:</strong>Subway Exit数据集里面,出现了小孩误检为异常,一个人停在出口并且回头看也误检。<br><strong>相比前人来说成功的地方:</strong>our method not only detects abnormalities in a fine scale, but also unusual events caused by irregular interactions between people</p><h2 id="2010"><a href="#2010" class="headerlink" title="2010"></a>2010</h2><h3 id="Anomaly-Detection-in-Crowded-Scenes"><a href="#Anomaly-Detection-in-Crowded-Scenes" class="headerlink" title="Anomaly Detection in Crowded Scenes"></a>Anomaly Detection in Crowded Scenes</h3><p><strong>来源</strong>:CVPR2010</p><p>MDT模型<br>时间异常检测:[23]背景帧差法。GMM MDT<br>空间异常检测:center surround saliency with the MDT</p><h3 id="Chaotic-invariants-of-lagrangian-particle-trajectories-for-anomaly-detection-in-crowded-scenes"><a href="#Chaotic-invariants-of-lagrangian-particle-trajectories-for-anomaly-detection-in-crowded-scenes" class="headerlink" title="Chaotic invariants of lagrangian particle trajectories for anomaly detection in crowded scenes"></a>Chaotic invariants of lagrangian particle trajectories for anomaly detection in crowded scenes</h3><p><strong>来源</strong>:CVPR2010</p><p>特殊的粒子轨迹的应用<br>引入了chaotic dynamics </p><h2 id="2009"><a href="#2009" class="headerlink" title="2009"></a>2009</h2><h3 id="Abnormal-crowd-behavior-detection-using-social-force-model"><a href="#Abnormal-crowd-behavior-detection-using-social-force-model" class="headerlink" title="Abnormal crowd behavior detection using social force model"></a>Abnormal crowd behavior detection using social force model</h3><p><strong>来源</strong>:CVPR2009</p><p>社会力模型<br>Bag of words方法来分类异常和正常<br>这个方法比基于纯光流的方法好。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/12979627.jpg" alt="图片"></p><h3 id="Observe-Locally-Infer-Globally-a-Space-Time-MRF-for-Detecting-Abnormal-Activities-with-Incremental-Updates"><a href="#Observe-Locally-Infer-Globally-a-Space-Time-MRF-for-Detecting-Abnormal-Activities-with-Incremental-Updates" class="headerlink" title="Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates"></a>Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates</h3><p><strong>来源</strong>:CVPR2009</p><p><strong>创新点:</strong>提出了一个空时MRF模型。为了学习每个local node的正常行为模式,用Mixture of Probabilistic Principal Component Analyzers(MPPCA) 来capture光流的分布。另外,模型参数可以在新的观测进来的时候updated incrementally。<br><strong>方法:</strong>We extract optical flow features at each frame, use MPPCA to identify the typical patterns, and construct a space-time MRF to enable inference at each local site.<br><strong>作者说自己的优势:</strong>1.可以在local和global context检测异常活动。比单纯是local的方法好,local的方法fails to detect abnormal activities with irregular temporal orderings,并且local的方法对于光流参数很敏感导致高的false alarm rate.比单纯是global的方法好,global的方法fails to detect abnormal activity happens within a region so small,这个region在全局的场景中简单的被视为可以忽略的噪声。并且global的方法在拥挤的环境中会产生false alarm。<br><strong>对前人的方法做了什么改进:</strong>用了08年Robust Real-Time Unusual Event Detection Using Multiple…的subway数据集的gt,但是capture 更微小的异常,比如“no payment”和“loitering”.<br><strong>误检或者漏检的情况:</strong>entrance gate数据集中1.走的慢的人。2.对于far-filed area,产生了false alarm,因为光流对于far-filel area是不靠谱的。3.走的很快的人。4.没刷卡的人。exit gate数据集中”from right exit to left exit”</p><h2 id="2008"><a href="#2008" class="headerlink" title="2008"></a>2008</h2><h3 id="Robust-Real-Time-Unusual-Event-Detection-Using-Multiple-Fixed-Location-Monitors"><a href="#Robust-Real-Time-Unusual-Event-Detection-Using-Multiple-Fixed-Location-Monitors" class="headerlink" 
title="Robust Real-Time Unusual Event Detection Using Multiple Fixed-Location Monitors"></a>Robust Real-Time Unusual Event Detection Using Multiple Fixed-Location Monitors</h3><p><strong>来源</strong>:PAMI</p><p><strong>方法:</strong>local-monitors-based 。<br>通过multiple, local, low-level feature monitors来监视不寻常的事件。每个monitor是从视频流提取local low-level observation的object。这个observation可以是在monitor的位置的现在的光流方向,或者是local flow的magnitude。<br><strong>异常检测需要什么:</strong>1.对于给定的视频流的tuning 算法应该简单快速。2.算法应该adaptive,适应环境的变换。3.short learning period。4.低成本。5.predictable performance。<br><strong>局限性:</strong>不能检测loitering person或者在进入安检的时候不刷卡。总结里面说,局限性是the lack of <strong>sequential</strong> monitoring.<br><strong>术语:</strong>aperture problem孔径问题<a href="https://blog.csdn.net/hankai1024/article/details/23433157" target="_blank" rel="noopener">https://blog.csdn.net/hankai1024/article/details/23433157</a>;SSD error matrix<br><strong>评价:</strong>2009年Observe Locally, Infer Globally: a Space-Time MRF for Detecting Abnormal Activities with Incremental Updates评价:focus attention on individual local activities,where typical flow directions and speeds are measured on a grid in the video frame. While efficient and simple to implement, <strong>such an approach fails to model temporal relationships between motions.</strong></p><p><strong>思考:</strong>为什么作者用了这个方法,有什么优缺点。</p>]]></content>
<categories>
<category> paper </category>
</categories>
<tags>
<tag> anomaly detection </tag>
</tags>
</entry>
<entry>
<title>异常行为检测文献新论文跟进</title>
<link href="/2018/11/11/%E5%BC%82%E5%B8%B8%E8%A1%8C%E4%B8%BA%E6%A3%80%E6%B5%8B%E6%96%87%E7%8C%AE%E6%96%B0%E8%AE%BA%E6%96%87%E8%B7%9F%E8%BF%9B/"/>
<url>/2018/11/11/%E5%BC%82%E5%B8%B8%E8%A1%8C%E4%B8%BA%E6%A3%80%E6%B5%8B%E6%96%87%E7%8C%AE%E6%96%B0%E8%AE%BA%E6%96%87%E8%B7%9F%E8%BF%9B/</url>
<content type="html"><![CDATA[<h1 id="PredGAN-a-deep-multi-scale-video-prediction-framework-for-detecting-anomalies-in-videos"><a href="#PredGAN-a-deep-multi-scale-video-prediction-framework-for-detecting-anomalies-in-videos" class="headerlink" title="PredGAN - a deep multi-scale video prediction framework for detecting anomalies in videos"></a>PredGAN - a deep multi-scale video prediction framework for detecting anomalies in videos</h1><p><strong>来源:</strong>引用了cvpr2018future frame prediction for anomaly detection<br><strong>创新点:</strong>引入了EMD评价指标。Earth Mover’s Distance(EMD)。来判断帧是否是异常。是第一次将EMD作为评价video prediction framework的结果的评价指标。<br><strong>贡献:</strong>一、提出了一个video prediction framework来检测视频中的异常。用正常事件进行训练,能够准确预测视频帧的evolution。二、引入了EMD作为评估生成帧的质量的指标,用这个标准将帧标为异常或正常。三、证明在UCSD 行人数据集和Avenue数据集上的结果达到了state-of-the-art。</p><a id="more"></a><h1 id="Detecting-Abnormality-without-Knowing-Normality-A-Two-stage-Approach-for-Unsupervised-Video-Abnormal-Event-Detection"><a href="#Detecting-Abnormality-without-Knowing-Normality-A-Two-stage-Approach-for-Unsupervised-Video-Abnormal-Event-Detection" class="headerlink" title="Detecting Abnormality without Knowing Normality: A Two-stage Approach for Unsupervised Video Abnormal Event Detection"></a>Detecting Abnormality without Knowing Normality: A Two-stage Approach for Unsupervised Video Abnormal Event Detection</h1><p><strong>来源:</strong>引用了cvpr2018future frame prediction for anomaly detection<br><strong>摘要:</strong>很多方法都采用了supervised setting,即需要收集正常的事件来训练,但是很少的能在不事先知道正常事件的前提下检测异常。现在的无监督方法检测剧烈局部变化的为异常,忽略了全局的时空上下文。<br><strong>创新点:</strong>提出了一个新的无监督方法,包括两个阶段:首先是normality estimation stage,训练了一个自编码器,并通过自适应重建误差阈值从整个未标记的视频中全局地估计正常事件。第二,normality modeling stage,将从上个阶段估计的正常事件喂给one-class svm来建立一个refined normality model,后续可以排除异常事件并且提高异常检测的性能。<br><strong>引言:</strong>现有的方法分成两类:对正常异常进行建模和仅对正常进行建模。第一类的泛化能力差,不能处理没见过的异常行为。现有的方法采用了一种supervised setting,需要人工来确定训练集,(我觉得这点考虑很好),就是说需要人来把视频分成仅包含正常的和其他的包含异常的。作者想通过unsupervised setting,实现不需要事先知道正常事件来训练一个正常模型。【什么意思,没太看懂】[8,28]两篇文章也用了这样的思路但是没有考虑全局的时空上下文。<br><strong>结果:</strong>在ucsd ped1上面的结果还挺差的。<br><strong>有意思的地方:</strong>在图6中,c图当一个人扔包的时候,另外一个人受到了惊吓,也被检测出来了。有意思有意思。<br><strong>细节:</strong>在Normality Estimation Stage,以self-adaptive 的方式选择合适的阈值T。 通过让重建误差损失函数的inter-class variance。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/29287285.jpg" alt="图片"></p><h1 id="Spatio-Temporal-AutoEncoder-for-Video-Anormaly-Detection"><a href="#Spatio-Temporal-AutoEncoder-for-Video-Anormaly-Detection" class="headerlink" title="Spatio-Temporal AutoEncoder for Video Anormaly Detection"></a>Spatio-Temporal AutoEncoder for Video Anormaly Detection</h1><p><strong>来源:</strong>ACM mm2017 作者是alibaba的【感觉是个水会】<br><strong>代码地址:</strong><a href="https://github.com/yshean/abnormal-spatiotemporal-ae" target="_blank" rel="noopener">https://github.com/yshean/abnormal-spatiotemporal-ae</a><br><strong>创新点:</strong>STAE 时空编码器来自动学习视频的表示,提取时间和空间特征。引入了一个weight-decreasing prediction loss来产生未来的帧。(这个loss guides 编码器更好的提取时间特征)。<br>引入这个weight-decreasing prediction loss是因为模型训练更容易被后续帧里面出现的新的目标影响。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/97828510.jpg" alt="图片"></p><h1 id="Generative-Neural-Networks-for-Anomaly-Detection-in-Crowded-Scenes"><a href="#Generative-Neural-Networks-for-Anomaly-Detection-in-Crowded-Scenes" class="headerlink" title="Generative Neural Networks for Anomaly Detection in Crowded Scenes"></a>Generative Neural Networks for Anomaly Detection in Crowded Scenes</h1><p><strong>来源:</strong>期刊IEEE Transactions on 
Information Forensics and Security二区<br><strong>代码地址:</strong><a href="https://github.com/tianwangbuaa/VAE-for-abnormal-event-detection" target="_blank" rel="noopener">https://github.com/tianwangbuaa/VAE-for-abnormal-event-detection</a><br><strong>创新点:</strong>S2-VAE(2是上标)。SF-VAE(F是下标)是一个<strong>浅层</strong>的生成网络生成来得到一个像Gaussian mixture 的模型来适合真实数据的分布。SC-VAE(C是下标),是个<strong>深度</strong>生成网络,利用了CNN和skip connection的优点。<strong>方法细节:</strong>SF-VAE用来从原始的样本中过滤中一些明显正常的样本,可以减少在下一个阶段的训练和测试时间。在第二个阶段,剩下的样本是先enlarged,然后进入到SC-VAE中。SC-VAE的卷积操作可以从输入中学习到hierarchical 特征和local relationship。SC-VAE比SF-VAE有更强的学习能力。<br><strong>实验细节:</strong>先做预处理,先用FCN提取前景。<br><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-11-11/78253764.jpg" alt="图片"><br><strong>作者的一些思考?:</strong>根据模式识别和机器学习,simple Gaussian 分布没有能力来描述复杂的结构,然而,the mixture of Gaussian distribution在适应真实数据的分布上更有能力。</p><h1 id="Detection-of-Unknown-Anomalies-in-Streaming-Videos-with-Generative-Energy-based-Boltzmann-Models"><a href="#Detection-of-Unknown-Anomalies-in-Streaming-Videos-with-Generative-Energy-based-Boltzmann-Models" class="headerlink" title="Detection of Unknown Anomalies in Streaming Videos with Generative Energy-based Boltzmann Models"></a>Detection of Unknown Anomalies in Streaming Videos with Generative Energy-based Boltzmann Models</h1><p><strong>来源:</strong>pattern recognition letters 大类3区小类4区<br><strong>创新点:</strong>用了energy-based models。【这篇文章的切入点是深度置信网络】<br><strong>异常检测有挑战的地方:</strong>标注数据耗费劳动力,能够利用未标注的数据就好了。第二个是定义不明确。<br><strong>作者称自己的优势:</strong>大多数存在的系统能高性能的检测异常,但是不能解释为什么得到了这些检测。我们的模型可以理解场景,解释为什么产生了fault alarms,因此我们的检测结果是可解释的。我们是第一次将DBM用在视频数据中的异常检测的,也是第一次在DBM的文献中用a single model来同时聚类和重建数据的。<br><strong>实验部分: </strong> <strong>A.Scene clustering</strong> 这个部分的结果和k-means 聚类进行了比较<strong>B.Scene reconstructing</strong> <strong>C.Anomaly detection </strong>和一些无监督的异常检测系统进行比较。无监督的异常检测系统可以分为(a)无监督学习方法,包括PCA,OC-SVM和GMM(高斯混合模型).(b)CAE和ConvAE。<strong>D.Video analysis and model explanation </strong>从图10可以看出,帧90和帧110中都有一个骑自行车的人,这个人渐行渐远,当这个人变得特别小的时候,就和别的行人颜色什么一样了,因此就解释了为什么产生了误检。<br><strong>题外话:</strong>这个paper的图都好好看好有意思啊……<br>另外作者一直在说为什么没用RBM而用了DBM。因为在RBM中聚类模块和重建模块是分开的,所以不能保证在abstract representation和detection decision中得到一个校准(对齐?),因此我们看到的pattern maps不能反应模型真正的做的东西。而DBM可以同时训练聚类层和重建层。</p>]]></content>
<categories>
<category> paper </category>
</categories>
<tags>
<tag> anomaly detection </tag>
</tags>
</entry>
<entry>
<title>学习使用TensorFlow来识别交通标志</title>
<link href="/2018/09/13/%E5%AD%A6%E4%B9%A0%E4%BD%BF%E7%94%A8TensorFlow%E6%9D%A5%E8%AF%86%E5%88%AB%E4%BA%A4%E9%80%9A%E6%A0%87%E5%BF%97/"/>
<url>/2018/09/13/%E5%AD%A6%E4%B9%A0%E4%BD%BF%E7%94%A8TensorFlow%E6%9D%A5%E8%AF%86%E5%88%AB%E4%BA%A4%E9%80%9A%E6%A0%87%E5%BF%97/</url>
<content type="html"><![CDATA[<h1 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h1><p>本文参考<a href="https://juejin.im/entry/5a1637f2f265da432528f6ef" target="_blank" rel="noopener">https://juejin.im/entry/5a1637f2f265da432528f6ef</a> 的文章和 <a href="https://github.com/waleedka/traffic-signs-tensorflow" target="_blank" rel="noopener">https://github.com/waleedka/traffic-signs-tensorflow</a> 的源代码。 </p><p> 给定交通标志的图像,我们的模型应该能够知道它的类型。<br> 首先我们要导入需要的库。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> tensorflow <span class="keyword">as</span> tf</span><br><span class="line"><span class="keyword">import</span> os</span><br><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="keyword">import</span> matplotlib.pyplot <span class="keyword">as</span> plt</span><br><span class="line"><span class="keyword">from</span> skimage <span class="keyword">import</span> data</span><br><span class="line"><span class="keyword">from</span> skimage <span class="keyword">import</span> transform</span><br><span class="line"><span class="keyword">import</span> random</span><br></pre></td></tr></table></figure><pre><code>/home/song/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`. 
from ._conv import register_converters as _register_converters</code></pre> <a id="more"></a><h2 id="1-加载数据和分析数据"><a href="#1-加载数据和分析数据" class="headerlink" title="1 加载数据和分析数据"></a>1 加载数据和分析数据</h2><h3 id="1-1-加载数据"><a href="#1-1-加载数据" class="headerlink" title="1.1 加载数据"></a>1.1 加载数据</h3><p>我们使用的是Belgian Traffic Sign Dataset。网址为<a href="http://btsd.ethz.ch/shareddata/" target="_blank" rel="noopener">http://btsd.ethz.ch/shareddata/</a><br>在这个网站可以下载到我们需要的数据集。你只需要下载BelgiumTS for Classification (cropped images):后面的两个数据集: </p><pre><code>BelgiumTSC_Training (171.3MBytes) BelgiumTSC_Testing (76.5MBytes) </code></pre><p> 我把这两个数据集分别放在了以下的路径: </p><pre><code>/home/song/Downloads/BelgiumTSC_Training/Training /home/song/Downloads/BelgiumTSC_Testing/Testing </code></pre><p> Training目录包含具有从00000到00061的序列号的子目录。目录名称表示从0到61的标签,每个目录中的图像表示属于该标签的交通标志。 图像以不常见的.ppm格式保存,但幸运的是,这种格式在skimage库中得到了支持。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">load_data</span><span class="params">(data_dir)</span>:</span></span><br><span class="line"> <span class="comment"># Get all subdirectories of data_dir. 
Each represents a label.</span></span><br><span class="line"> directories = [d <span class="keyword">for</span> d <span class="keyword">in</span> os.listdir(data_dir)</span><br><span class="line"> <span class="keyword">if</span> os.path.isdir(os.path.join(data_dir, d))]</span><br><span class="line"></span><br><span class="line"> <span class="comment"># Loop through the label directories and collect the data in</span></span><br><span class="line"> <span class="comment"># two lists, labels and images.</span></span><br><span class="line"> labels = []</span><br><span class="line"> images = []</span><br><span class="line"> <span class="keyword">for</span> d <span class="keyword">in</span> directories:</span><br><span class="line"> label_dir = os.path.join(data_dir, d)</span><br><span class="line"> file_names = [os.path.join(label_dir, f) </span><br><span class="line"> <span class="keyword">for</span> f <span class="keyword">in</span> os.listdir(label_dir) </span><br><span class="line"> <span class="keyword">if</span> f.endswith(<span class="string">".ppm"</span>)]</span><br><span class="line"> <span class="keyword">for</span> f <span class="keyword">in</span> file_names:</span><br><span class="line"> images.append(data.imread(f))</span><br><span class="line"> labels.append(int(d))</span><br><span class="line"> <span class="keyword">return</span> images, labels</span><br><span class="line"></span><br><span class="line">ROOT_PATH = <span class="string">"/home/song/Downloads/"</span></span><br><span class="line">train_data_dir = os.path.join(ROOT_PATH, <span class="string">"BelgiumTSC_Training/Training"</span>)</span><br><span class="line">test_data_dir = os.path.join(ROOT_PATH, <span class="string">"BelgiumTSC_Testing/Testing"</span>)</span><br><span class="line"></span><br><span class="line">images, labels = load_data(train_data_dir)</span><br></pre></td></tr></table></figure><h3 id="1-2-分析数据"><a href="#1-2-分析数据" class="headerlink" title="1.2 分析数据"></a>1.2 分析数据</h3><p>我们可以看一下我们的训练集中有多少图片和标签:</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">print(<span class="string">"Unique Labels: {0}\nTotal Images: {1}"</span>.format(len(set(labels)), len(images)))</span><br></pre></td></tr></table></figure><pre><code>Unique Labels: 62Total Images: 4575</code></pre><p>这里的set很有意思,可以看一下这篇文章:<a href="http://www.voidcn.com/article/p-uekeyeby-hn.html" target="_blank" rel="noopener">http://www.voidcn.com/article/p-uekeyeby-hn.html</a><br>这里的set很有意思,可以看一下这篇文章:<a href="http://www.voidcn.com/article/p-uekeyeby-hn.html" target="_blank" rel="noopener">http://www.voidcn.com/article/p-uekeyeby-hn.html</a><br>在处理一系列数据时,如果需要剔除重复项,则通常采用set数据类型。本身labels里面是有很多重复的元素的,但set(labels)就剔除了重复项。可以通过print(labels)和print(set(labels))命令查看一下两者输出的有什么区别。<br>我们还可以通过画直方图来看一下数据的分布情况。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">plt.hist(labels,<span class="number">62</span>)</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/58270023.jpg" alt="png"></p><p>可以看出,该数据集中有的标签的分量比其它标签更重:标签 22、32、38 和 61 显然出类拔萃。这一点之后我们会更深入地了解。</p><h3 id="1-3-可视化数据"><a href="#1-3-可视化数据" class="headerlink" title="1.3 可视化数据"></a>1.3 可视化数据</h3><h4 id="1-3-1-热身"><a href="#1-3-1-热身" class="headerlink" title="1.3.1 热身"></a>1.3.1 
热身</h4><p>我们可以先随机地选取几个交通标志将其显示出来。我们还可以看一下图片的尺寸。我们还可以看一下图片的最小值和最大值,这是验证数据范围并及早发现错误的一个简单方法。其中的plt.axis(‘off’)是为了不在图片上显示坐标尺,大家可以注释掉这句话看看如果去掉有什么不一样。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">traffic_signs=[<span class="number">100</span>,<span class="number">1050</span>,<span class="number">3650</span>,<span class="number">4000</span>]</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(len(traffic_signs)):</span><br><span class="line"> plt.subplot(<span class="number">1</span>, <span class="number">4</span>, i+<span class="number">1</span>)</span><br><span class="line"> plt.axis(<span class="string">'off'</span>)</span><br><span class="line"> plt.imshow(images[traffic_signs[i]])</span><br><span class="line"> <span class="comment">#plt.subplots_adjust(wspace=0.5)</span></span><br><span class="line"> plt.show()</span><br><span class="line"> print(<span class="string">"shape: {0}, min: {1}, max: {2}"</span>.format(images[traffic_signs[i]].shape, </span><br><span class="line"> images[traffic_signs[i]].min(), </span><br><span class="line"> images[traffic_signs[i]].max()))</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/91631184.jpg" alt="png"></p><pre><code>shape: (292, 290, 3), min: 0, max: 255</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/94366993.jpg" alt="png"></p><pre><code>shape: (132, 139, 3), min: 4, max: 255</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/55563405.jpg" alt="png"></p><pre><code>shape: (146, 110, 3), min: 7, max: 255</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/66096551.jpg" alt="png"></p><pre><code>shape: (110, 105, 3), min: 3, max: 255</code></pre><p>大多数神经网络需要固定大小的输入,我们的网络也不例外。 但正如我们上面所看到的,我们的图像大小并不完全相同。 一种常见的方法是将图像裁剪并填充到选定的纵横比,但是我们必须确保在这个过程中我们不会切断部分交通标志。 这似乎需要进行手动操作! 我们其实有一个更简单的解决方案,即我们将图像大小调整为固定大小,并忽略由不同长宽比导致的失真。 这时,即使图片被压缩或拉伸了一点,我们也可以很容易地识别交通标志。我们用下面的命令将图片的尺寸调整为32<em>32。<br>大多数神经网络需要固定大小的输入,我们的网络也不例外。 但正如我们上面所看到的,我们的图像大小并不完全相同。 一种常见的方法是将图像裁剪并填充到选定的纵横比,但是我们必须确保在这个过程中我们不会切断部分交通标志。 这似乎需要进行手动操作! 我们其实有一个更简单的解决方案,即我们将图像大小调整为固定大小,并忽略由不同长宽比导致的失真。 这时,即使图片被压缩或拉伸了一点,我们也可以很容易地识别交通标志。我们用下面的命令将图片的尺寸调整为32</em>32。</p><h4 id="1-3-2-重调图片的大小"><a href="#1-3-2-重调图片的大小" class="headerlink" title="1.3.2 重调图片的大小"></a>1.3.2 重调图片的大小</h4><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">images32 = [transform.resize(image,(<span class="number">32</span>,<span class="number">32</span>)) <span class="keyword">for</span> image <span class="keyword">in</span> images]</span><br></pre></td></tr></table></figure><pre><code>/home/song/.local/lib/python3.6/site-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15. 
warn("The default mode, 'constant', will be changed to 'reflect' in "/home/song/.local/lib/python3.6/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images. warn("Anti-aliasing will be enabled by default in skimage 0.15 to "</code></pre><p>重新运行上面随机显示交通标志的代码。<br>重新运行上面随机显示交通标志的代码。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">traffic_signs=[<span class="number">100</span>,<span class="number">1050</span>,<span class="number">3650</span>,<span class="number">4000</span>]</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(len(traffic_signs)):</span><br><span class="line"> plt.subplot(<span class="number">1</span>, <span class="number">4</span>, i+<span class="number">1</span>)</span><br><span class="line"> plt.axis(<span class="string">'off'</span>)</span><br><span class="line"> plt.imshow(images32[traffic_signs[i]])</span><br><span class="line"> plt.subplots_adjust(wspace=<span class="number">0.5</span>)</span><br><span class="line"> plt.show()</span><br><span class="line"> print(<span class="string">"shape: {0}, min: {1}, max: {2}"</span>.format(images32[traffic_signs[i]].shape, </span><br><span class="line"> images32[traffic_signs[i]].min(), </span><br><span class="line"> images32[traffic_signs[i]].max()))</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/65434369.jpg" alt="png"></p><pre><code>shape: (32, 32, 3), min: 0.0, max: 1.0</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/59269206.jpg" alt="png"></p><pre><code>shape: (32, 32, 3), min: 0.038373161764705975, max: 1.0</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/62497379.jpg" alt="png"></p><pre><code>shape: (32, 32, 3), min: 0.05559895833333348, max: 1.0</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/90866008.jpg" alt="png"></p><pre><code>shape: (32, 32, 3), min: 0.048665364583333495, max: 1.0</code></pre><p>从上面的图和shape的值都能看出,图片的尺寸一样大了。最小值和最大值现在的范围在0和1.0之间,和我们未调整图片大小时的范围不同。<br>从上面的图和shape的值都能看出,图片的尺寸一样大了。最小值和最大值现在的范围在0和1.0之间,和我们未调整图片大小时的范围不同。</p><h4 id="1-3-3-显示每一个标签下的第一张图片"><a href="#1-3-3-显示每一个标签下的第一张图片" class="headerlink" title="1.3.3 显示每一个标签下的第一张图片"></a>1.3.3 显示每一个标签下的第一张图片</h4><p>之前我们在直方图中看过62个标签的分布情况。现在我们尝试将每个标签下的第一张图片显示出来,另外还可以通过列表的count()方法来统计某个标签出现的次数,也就是能统计出有多少张图片对应该标签。我们可以定义一个函数,名为display_images_and_labels,你当然可以定义成别的名字,不过定义函数是为了之后可以方便地调用。以下分别显示出了未调整尺寸和已调整尺寸的交通标志图。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">display_images_and_labels</span><span class="params">(images, labels)</span>:</span></span><br><span class="line"> <span class="string">"""Display the first image of each label."""</span></span><br><span class="line"> unique_labels = set(labels)</span><br><span class="line"> plt.figure(figsize=(<span class="number">15</span>, <span class="number">15</span>))</span><br><span class="line"> i = <span class="number">1</span></span><br><span class="line"> <span class="keyword">for</span> label <span class="keyword">in</span> unique_labels:</span><br><span class="line"> <span class="comment"># Pick the first image for each label.</span></span><br><span class="line"> image = images[labels.index(label)]</span><br><span class="line"> plt.subplot(<span class="number">8</span>, <span class="number">8</span>, i) <span class="comment"># A grid of 8 rows x 8 columns</span></span><br><span class="line"> plt.axis(<span class="string">'off'</span>)</span><br><span class="line"> plt.title(<span class="string">"Label {0} ({1})"</span>.format(label, labels.count(label)))</span><br><span class="line"> i += <span class="number">1</span></span><br><span class="line"> plt.imshow(image)</span><br><span class="line"> </span><br><span class="line"></span><br><span class="line">display_images_and_labels(images, labels)</span><br><span class="line">display_images_and_labels(images32, labels)</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/19586062.jpg" alt="png"></p><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/15441432.jpg" alt="png"></p><p>正如我们在直方图中看到的那样,具有标签 22、32、38 和 61 的交通标志要明显多得多。图中可以看到标签 22 有 375 个实例,标签 32 有 316 实例,标签 38 有 285 个实例,标签 61 有 282 个实例。</p><h4 id="1-3-4-显示某一个标签下的交通标志"><a href="#1-3-4-显示某一个标签下的交通标志" class="headerlink" title="1.3.4 显示某一个标签下的交通标志"></a>1.3.4 显示某一个标签下的交通标志</h4><p>看过每个标签下的第一张图片之后,我们可以将某一个标签下的图片展开显示出来,看看这个标签下的是否是同一类交通标志。我们不需要把该标签下的所有图片都显示出来,可以只展示24张,你可以更改为其他的数字,显示更多或者更少。我们这里选择标签为21的看一下,在之前的图片中可以看到,label 21对应于stop标志。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">display_label_images</span><span class="params">(images, label)</span>:</span></span><br><span class="line"> <span class="string">"""Display images of a specific label."""</span></span><br><span class="line"> limit = <span class="number">24</span> <span class="comment"># show a max of 24 images</span></span><br><span class="line"> plt.figure(figsize=(<span class="number">15</span>, <span class="number">5</span>))</span><br><span class="line"> i = <span class="number">1</span></span><br><span class="line"></span><br><span class="line"> start = 
labels.index(label)</span><br><span class="line"> end = start + labels.count(label)</span><br><span class="line"> <span class="keyword">for</span> image <span class="keyword">in</span> images[start:end][:limit]:</span><br><span class="line"> plt.subplot(<span class="number">3</span>, <span class="number">8</span>, i) <span class="comment"># 3 rows, 8 per row</span></span><br><span class="line"> plt.axis(<span class="string">'off'</span>)</span><br><span class="line"> i += <span class="number">1</span></span><br><span class="line"> plt.imshow(image)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">display_label_images(images32,<span class="number">21</span>)</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/52152061.jpg" alt="png"></p><p>可以看出,label 21对应的前24张图片都是stop标志。不难推测,整个label 21对应的应都是stop标志。</p><h2 id="2-构建深度网络"><a href="#2-构建深度网络" class="headerlink" title="2 构建深度网络"></a>2 构建深度网络</h2><h3 id="2-1-构建TensorFlow图并训练"><a href="#2-1-构建TensorFlow图并训练" class="headerlink" title="2.1 构建TensorFlow图并训练"></a>2.1 构建TensorFlow图并训练</h3><p>首先,我们创建一个Graph对象。TensorFlow有一个默认的全局图,但是我们不建议使用它。设置全局变量通常太容易引入错误了,因此我们自己创建一个图。之后设置占位符来放图片和标签。注意这里参数x的维度是 [None, 32, 32, 3],这四个参数分别表示 [批量大小,高度,宽度,通道] (通常缩写为 NHWC)。我们定义了一个全连接层,并使用了relu激活函数进行非线性操作。我们通过argmax()函数找到logits最大值对应的索引,也就是预测的标签了。之后定义loss函数,并选择合适的优化算法。这里选择Adam算法,因为它的收敛速度比一般的梯度下降算法更快。这个时候我们只刚刚构建图,并且描述了输入。我们定义的变量,比如,loss和predicted_labels,它们都不包含具体的数值。它们是我们接下来要执行的操作的引用。我们要创建会话才能开始训练。我这里把循环次数设置为301,并且如果i是10的倍数,就打印loss的值。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br></pre></td><td class="code"><pre><span class="line">g = tf.Graph()</span><br><span class="line"></span><br><span class="line"><span class="keyword">with</span> g.as_default():</span><br><span class="line"> <span class="comment"># Initialize placeholders </span></span><br><span class="line"> x = tf.placeholder(dtype = tf.float32, shape = [<span class="keyword">None</span>, <span class="number">32</span>, <span class="number">32</span>,<span class="number">3</span>])</span><br><span class="line"> y = tf.placeholder(dtype = tf.int32, shape = [<span class="keyword">None</span>])</span><br><span class="line"></span><br><span class="line"> <span class="comment"># Flatten the input data</span></span><br><span class="line"> images_flat = 
tf.contrib.layers.flatten(x)</span><br><span class="line"> <span class="comment">#print(images_flat)</span></span><br><span class="line"> </span><br><span class="line"> <span class="comment"># Fully connected layer </span></span><br><span class="line"> logits = tf.contrib.layers.fully_connected(images_flat, <span class="number">62</span>, tf.nn.relu)</span><br><span class="line"></span><br><span class="line"> <span class="comment"># Convert logits to label </span></span><br><span class="line"> predicted_labels = tf.argmax(logits, <span class="number">1</span>)</span><br><span class="line"> </span><br><span class="line"> <span class="comment"># Define a loss function</span></span><br><span class="line"> loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y, </span><br><span class="line"> logits = logits))</span><br><span class="line"></span><br><span class="line"> <span class="comment"># Define an optimizer </span></span><br><span class="line"> train_op = tf.train.AdamOptimizer(learning_rate=<span class="number">0.001</span>).minimize(loss)</span><br><span class="line"></span><br><span class="line"> print(<span class="string">"images_flat: "</span>, images_flat)</span><br><span class="line"> print(<span class="string">"logits: "</span>, logits)</span><br><span class="line"> print(<span class="string">"loss: "</span>, loss)</span><br><span class="line"> print(<span class="string">"predicted_labels: "</span>, predicted_labels)</span><br><span class="line"></span><br><span class="line"> sess=tf.Session(graph=g)</span><br><span class="line"> sess.run(tf.global_variables_initializer())</span><br><span class="line"> <span class="keyword">for</span> i <span class="keyword">in</span> range(<span class="number">301</span>):</span><br><span class="line"> <span class="comment">#print('EPOCH', i)</span></span><br><span class="line"> _,loss_value = sess.run([train_op, loss], feed_dict={x: images32, y: labels}) </span><br><span class="line"> <span class="keyword">if</span> i % <span class="number">10</span> == <span class="number">0</span>:</span><br><span class="line"> print(<span class="string">"Loss: "</span>, loss_value)</span><br><span class="line"> <span class="comment">#print('DONE WITH EPOCH')</span></span><br></pre></td></tr></table></figure><pre><code>images_flat: Tensor("Flatten/flatten/Reshape:0", shape=(?, 3072), dtype=float32)logits: Tensor("fully_connected/Relu:0", shape=(?, 62), dtype=float32)loss: Tensor("Mean:0", shape=(), dtype=float32)predicted_labels: Tensor("ArgMax:0", shape=(?,), dtype=int64)Loss: 4.181018Loss: 3.0714655Loss: 2.6622696Loss: 2.4586942Loss: 2.3419585Loss: 2.2633858Loss: 2.2044215Loss: 2.157206Loss: 2.1180305Loss: 2.0847433Loss: 2.0559382Loss: 2.030667Loss: 2.008251Loss: 1.9882014Loss: 1.9701369Loss: 1.9537587Loss: 1.938837Loss: 1.9251733Loss: 1.912607Loss: 1.9010073Loss: 1.8902632Loss: 1.8802778Loss: 1.8709714Loss: 1.8622767Loss: 1.8541412Loss: 1.8465083Loss: 1.8393359Loss: 1.8325756Loss: 1.8261962Loss: 1.8201678Loss: 1.8144621</code></pre><h3 id="2-2使用模型"><a href="#2-2使用模型" class="headerlink" title="2.2使用模型"></a>2.2使用模型</h3><h3 id="2-2使用模型-1"><a href="#2-2使用模型-1" class="headerlink" title="2.2使用模型"></a>2.2使用模型</h3><p>现在我们用sess.run()来使用我们训练好的模型,并随机取了训练集中的10个图片进行分类,并同时打印了真实的标签结果和预测结果。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Pick 10 random images</span></span><br><span class="line">sample_indexes = random.sample(range(len(images32)), <span class="number">10</span>)</span><br><span class="line">sample_images = [images32[i] <span class="keyword">for</span> i <span class="keyword">in</span> sample_indexes]</span><br><span class="line">sample_labels = [labels[i] <span class="keyword">for</span> i <span class="keyword">in</span> sample_indexes]</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run the "predicted_labels" op.</span></span><br><span class="line">predicted = sess.run([predicted_labels], </span><br><span class="line"> feed_dict={x: sample_images})[<span class="number">0</span>]</span><br><span class="line">print(sample_labels)</span><br><span class="line">print(predicted)</span><br></pre></td></tr></table></figure><pre><code>[41, 39, 1, 53, 21, 22, 38, 48, 7, 53][41 39 1 53 21 22 40 47 7 53]</code></pre><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">```python</span><br><span class="line">fig=plt.figure(figsize=(<span class="number">10</span>,<span class="number">10</span>))</span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(len(sample_images)):</span><br><span class="line"> truth = sample_labels[i]</span><br><span class="line"> prediction = predicted[i]</span><br><span class="line"> plt.subplot(<span class="number">5</span>,<span class="number">2</span>,<span class="number">1</span>+i)</span><br><span class="line"> plt.axis(<span class="string">"off"</span>)</span><br><span class="line"> color=<span class="string">'green'</span> <span class="keyword">if</span> truth == prediction <span class="keyword">else</span> <span class="string">'red'</span></span><br><span class="line"> plt.text(<span class="number">40</span>,<span class="number">10</span>,<span class="string">"Truth: {0}\nPrediction: {1}"</span>.format(truth, prediction), </span><br><span class="line"> fontsize=<span class="number">12</span>, color=color)</span><br><span class="line"> plt.imshow(sample_images[i])</span><br></pre></td></tr></table></figure><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/94071068.jpg" alt="png"></p><h3 id="2-3评估模型"><a href="#2-3评估模型" class="headerlink" title="2.3评估模型"></a>2.3评估模型</h3><p>以上,我们的模型只在训练集上是可以正常运行的,但是它对于其他的未知数据集的泛化能力如何呢?我们可以在测试集当中进行评估。我们还可以计算一下准确率。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span 
class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line">test_images, test_labels = load_data(test_data_dir)</span><br><span class="line">test_images32 = [transform.resize(image, (<span class="number">32</span>, <span class="number">32</span>))</span><br><span class="line"> <span class="keyword">for</span> image <span class="keyword">in</span> test_images]</span><br><span class="line">display_images_and_labels(test_images32, test_labels)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Calculate how many matches we got.</span></span><br><span class="line">predicted = sess.run([predicted_labels], </span><br><span class="line"> feed_dict={x: test_images32})[<span class="number">0</span>]</span><br><span class="line">match_count = sum([int(y == y_) </span><br><span class="line"> <span class="keyword">for</span> y, y_ <span class="keyword">in</span> zip(test_labels, predicted)])</span><br><span class="line">accuracy = match_count / len(test_labels)</span><br><span class="line">print(<span class="string">"Accuracy: {:.4f}"</span>.format(accuracy))</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="comment"># Pick 10 random images</span></span><br><span class="line">sample_test_indexes = random.sample(range(len(test_images32)), <span class="number">10</span>)</span><br><span class="line">sample_test_images = [test_images32[i] <span class="keyword">for</span> i <span class="keyword">in</span> sample_test_indexes]</span><br><span class="line">sample_test_labels = [test_labels[i] <span class="keyword">for</span> i <span class="keyword">in</span> sample_test_indexes]</span><br><span class="line"></span><br><span class="line"><span class="comment"># Run the "predicted_labels" op.</span></span><br><span class="line">test_predicted = sess.run([predicted_labels], </span><br><span class="line"> feed_dict={x: sample_test_images})[<span class="number">0</span>]</span><br><span class="line">print(sample_test_labels)</span><br><span class="line">print(test_predicted)</span><br><span class="line"></span><br><span class="line">fig=plt.figure(figsize=(<span class="number">10</span>,<span class="number">10</span>))</span><br><span class="line"><span class="keyword">for</span> i <span class="keyword">in</span> range(len(sample_test_images)):</span><br><span class="line"> truth = sample_test_labels[i]</span><br><span class="line"> prediction = test_predicted[i]</span><br><span class="line"> plt.subplot(<span class="number">5</span>,<span class="number">2</span>,<span class="number">1</span>+i)</span><br><span class="line"> plt.axis(<span class="string">"off"</span>)</span><br><span class="line"> color=<span class="string">'green'</span> <span class="keyword">if</span> truth == prediction <span class="keyword">else</span> <span 
class="string">'red'</span></span><br><span class="line"> plt.text(<span class="number">40</span>,<span class="number">10</span>,<span class="string">"Truth: {0}\nPrediction: {1}"</span>.format(truth, prediction), </span><br><span class="line"> fontsize=<span class="number">12</span>, color=color)</span><br><span class="line"> plt.imshow(sample_test_images[i])</span><br></pre></td></tr></table></figure><pre><code>/home/song/.local/lib/python3.6/site-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15. warn("The default mode, 'constant', will be changed to 'reflect' in "/home/song/.local/lib/python3.6/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images. warn("Anti-aliasing will be enabled by default in skimage 0.15 to "Accuracy: 0.5631[38, 35, 19, 32, 32, 7, 13, 38, 18, 38][39 0 19 32 32 7 13 40 17 39]</code></pre><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/67359700.jpg" alt="png"></p><p><img src="http://boketuchuang.oss-cn-beijing.aliyuncs.com/18-9-13/73349594.jpg" alt="png"></p><h3 id="2-4关闭会话"><a href="#2-4关闭会话" class="headerlink" title="2.4关闭会话"></a>2.4关闭会话</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sess.close()</span><br></pre></td></tr></table></figure><p>最后,记得关闭会话。</p>]]></content>
<categories>
<category> tensorflow </category>
</categories>
<tags>
<tag> tensorflow </tag>
</tags>
</entry>
<entry>
    <title>Learning Git: Pushing a Local Repository to GitHub</title>
<link href="/2018/05/27/git%E5%AD%A6%E4%B9%A0/"/>
<url>/2018/05/27/git%E5%AD%A6%E4%B9%A0/</url>
<content type="html"><![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>本文参考廖雪峰老师的教程,记录git将本地库的内容推送到Github远程仓库的整个流程。比如我要推送的是叫assignment1的文件夹,内容为斯坦福大学CS231n课程的第一个作业。</p><h2 id="流程"><a href="#流程" class="headerlink" title="流程"></a>流程</h2><ol><li><p>进入包含assignment1文件夹的目录,把当前目录变成Git可以管理的仓库:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git init</span><br></pre></td></tr></table></figure><a id="more"></a></li><li><p>配置用户信息:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git config --global user.name "xxx"</span><br><span class="line"><span class="meta">$</span> git config --global user.email "xxx@xxx.com"</span><br></pre></td></tr></table></figure></li><li><p>将文件添加到暂存区:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git add assignment1</span><br></pre></td></tr></table></figure></li><li><p>提交文件到分支:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git commit -m "add assignment1"</span><br></pre></td></tr></table></figure><p>双引号里面是本次提交的说明。输入说明对自己和对别人的阅读都很重要,所以建议写上。</p></li><li><p>在github上建立远程仓库,仓库的名字取为CS231n。</p></li><li><p>现在,在本地的仓库运行以下命令: </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git remote add origin git@github.com:xxx/CS231n.git</span><br></pre></td></tr></table></figure><p>这里的xxx替换成自己的Github用户名。</p></li><li><p>将本地库的所有内容推送到远程库上:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git push -u origin master</span><br></pre></td></tr></table></figure></li><li><p>由于远程库是空的,我们第一次推送master分支时,加上了-u参数,Git不但会把本地的master分支内容推送的远程新的master分支,还会把本地的master分支和远程的master分支关联起来,在以后的推送或者拉取时就可以简化命令:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">$</span> git push origin master</span><br></pre></td></tr></table></figure></li><li><p>如果我们对本地库的文件进行了修改,提交到远程仓库只需要进行第3,4以及8步即可。</p></li></ol>]]></content>
<categories>
      <category> Explorations </category>
</categories>
<tags>
<tag> git </tag>
</tags>
</entry>
</search>