Skip to content

Latest commit

 

History

History
78 lines (44 loc) · 2.3 KB

README.md

File metadata and controls

78 lines (44 loc) · 2.3 KB

crawl-mf2

Build Status Coverage Status Greenkeeper badge

Crawl a microformats2 site to find things like canonical URLs for h-entrys

Note: this module does not really handle pages with more than one top-level microformats2 nodes.

Installation

npm install crawl-mf2

Example

Start a crawl and log canonical h-entry URLs found on https://strugee.net/blog/:

var crawl = require('crawl-mf2');

var crawler = crawl('https://strugee.net/blog/');

crawler.on('h-entry', function(url, mf2node) {
	console.log(url);
});

API

The module exports a single function, crawlMf2, which takes a single argument, the base URL to crawl from.

It returns an EventEmitter.

Events

'error'

Emitted when an error occurs. Currently this means either the microformats2 parser failed or an HTTP error occurred.

Note: treated specially by Node.js.

'urlDisco'

  • String The URL being discovered

Emitted when a new URL is discovered, including the initial base URL.

'mf2Parse'

Emitted when a URL is parsed for microformats2 markup.

'h-feed'

Emitted when an h-feed page is discovered.

'h-entry'

Emitted when an h-entry page is discovered.

License

LGPL 3.0+

Author

AJ Jordan alex@strugee.net