# plagg, an RSS aggregator

## 0. What is this?

plagg is a weblog/news aggregator that works in conjunction with
[Rael Dornfest's](http://www.raelity.org) [blosxom](http://www.blosxom.com).
It can be easily extended to support other blogging tools.

plagg reads an OPML file containing a list of RSS or Atom feeds, and
generates blosxom blog entries from these feeds. The items of each feed
are written to their own directory (a blosxom category), so you can
read the news all at once or per feed.
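
To make the per-feed layout concrete, here is a minimal Python sketch of writing one feed item as a blosxom entry. It is not plagg's actual code: the function name `write_entry` and the slug-based file naming are assumptions made for illustration. It relies only on blosxom's default entry format, a `.txt` file whose first line is the title and whose remaining lines are the body.

```python
import re
import tempfile
from pathlib import Path


def write_entry(newsdir, feed_name, title, body):
    """Write one feed item as a blosxom entry under newsdir/feed_name."""
    feed_dir = Path(newsdir) / feed_name
    feed_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: a slug derived from the item title.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", title).strip("-").lower() or "item"
    path = feed_dir / f"{slug}.txt"
    # blosxom entry format: first line is the title, the rest is the body.
    path.write_text(f"{title}\n{body}\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as newsdir:
        entry = write_entry(newsdir, "ExampleFeed", "Hello, world",
                            "<p>First item.</p>")
        print(entry.relative_to(newsdir))
```

Because every feed gets its own subdirectory, blosxom can show either the whole news directory or a single feed's category.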

You can see examples of plagg's output [on my news page](http://drbeat.li/news).


## 1. Installation

1. Download [plagg](http://drbeat.li/py/plagg/plagg.tar.gz)
2. Untar the distribution file to a directory of your choice
3. Run `python setup.py install` as root
4. Set up an [OPML](#opml) file containing the feeds you'd like to read
5. Run `plagg -d` _newsdir_ _opmlfile_ as often as you like from a cron job, where _newsdir_ is somewhere within your blosxom data directory
6. Enjoy your personalized news feed!


## 2. Usage

### 2.1. Synopsis

    plagg [-fFnovVh] [-d newsdir] [opmlfile [nickname ...]]

### 2.2. Options

* -f: Don't write the entry footers. Use this option if your blosxom template
      includes a footer.
* -F: Run `plagg` for a single feed whose URL is _opmlfile_. One _nickname_ is
      mandatory and indicates the name of the folder within _newsdir_ where
      the entries get written.
* -n: Write a file _newsdir_/`Latest.txt` that contains the new entries.
* -o: Also generate entries older than one week. These are normally suppressed.
* -v: Be verbose. May be repeated for additional effect.
* -V: Display version information and exit.
* -h: Display usage information and exit.
* -d _newsdir_: The destination directory. The news items are stored in
      subdirectories of it. This should be inside your blosxom data directory
      so that blosxom can find and display the items.

### 2.3. Arguments

* _opmlfile_: The OPML file containing the feeds to read and generate news
  items from, or the feed URL if the `-F` option was given.
* _nickname_: If given, updates only the feeds with the given nicknames
  (ignoring their `hours` attribute), otherwise updates all feeds. If `-F`
  was given, the name of the feed.

The default values for _opmlfile_ and _newsdir_ can be set in the `plagg` script.
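
The option and argument rules above can be sketched with Python's standard `getopt` module. This is only an illustration of the command-line shape, not plagg's own parser: the function `parse_args` and the keys of the returned dictionary are hypothetical names.

```python
import getopt


def parse_args(argv):
    """Parse an argument list shaped like the synopsis in section 2.1."""
    opts, args = getopt.getopt(argv, "fFnovVhd:")
    settings = {
        "footer": True, "single_feed": False, "latest": False,
        "old_entries": False, "verbose": 0, "newsdir": None,
        "action": None,
    }
    for flag, value in opts:
        if flag == "-f":
            settings["footer"] = False       # template supplies the footer
        elif flag == "-F":
            settings["single_feed"] = True   # opmlfile is a feed URL
        elif flag == "-n":
            settings["latest"] = True
        elif flag == "-o":
            settings["old_entries"] = True
        elif flag == "-v":
            settings["verbose"] += 1         # repeatable
        elif flag == "-d":
            settings["newsdir"] = value
        elif flag in ("-V", "-h"):
            settings["action"] = "version" if flag == "-V" else "usage"
    settings["opmlfile"] = args[0] if args else None
    settings["nicknames"] = args[1:]
    # With -F, a nickname is mandatory (it names the target folder).
    if settings["single_feed"] and not settings["nicknames"]:
        raise SystemExit("-F requires a feed URL and a nickname")
    return settings


if __name__ == "__main__":
    print(parse_args(["-v", "-v", "-n", "-d", "news", "feeds.opml", "somefeed"]))
```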


## 3. The OPML file

The distribution contains my OPML file as an example.

The basic OPML syntax is defined in the [OPML specification](http://www.opml.org/spec).

### 3.1. RSS/Atom feeds

Set the `type` attribute to `"rss"`. This is the default feed type.
Plagg reads the feed given by the `xmlUrl` attribute and generates news items
from its content.

Example:

    

The `htmlUrl` attribute is not used by `plagg` itself, but by `opml.xsl`, which
I use to generate my [blogroll](http://drbeat.li/news/news.opml).
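
As a rough illustration of the attributes described in this section, the Python sketch below parses a made-up OPML snippet and lists its feeds. The sample document, its URLs and the function `list_feeds` are assumptions for illustration; this is not plagg's parser and not the OPML file shipped with the distribution. Only the standard `text`, `type`, `xmlUrl` and `htmlUrl` attributes are used.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the feed names and URLs are placeholders.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.1">
  <head><title>Example feeds</title></head>
  <body>
    <outline text="Example Feed" type="rss"
             xmlUrl="http://example.com/index.rss"
             htmlUrl="http://example.com/"/>
    <outline text="Untyped Feed" xmlUrl="http://example.org/atom.xml"/>
  </body>
</opml>
"""


def list_feeds(opml_text):
    """Return (text, type, xmlUrl) for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url is None:
            continue  # outlines without xmlUrl carry no feed
        # Mirror the rule stated above: a missing type means "rss".
        feeds.append((outline.get("text"), outline.get("type", "rss"), url))
    return feeds


if __name__ == "__main__":
    for text, feed_type, url in list_feeds(SAMPLE_OPML):
        print(text, feed_type, url)
```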

### 3.2. HTML scraping

Set the `type` attribute to `"x-plagg-html"`. In this case, plagg reads the HTML
page whose URL is in the `htmlUrl` attribute. There are two ways of specifying
how to scrape: using a regex or using [XPath][XPATH] expressions.
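
The regex approach can be sketched in Python as follows. The sample page, the pattern `COMIC_RE` and the function `scrape_image` are hypothetical; how plagg itself takes the regex from the OPML file is not shown here. The sketch only demonstrates the general idea of pulling an image link out of a page with a capturing group.

```python
import re
from urllib.parse import urljoin

# Made-up page content used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <h1>Comic of the day</h1>
  <img class="comic" src="/strips/2005-01-01.png" alt="Today's strip">
</body></html>
"""

# Hypothetical pattern: capture the src of the image with class "comic".
COMIC_RE = re.compile(r'<img\s+class="comic"\s+src="([^"]+)"', re.IGNORECASE)


def scrape_image(html, base_url):
    """Return the absolute URL of the first matching image, or None."""
    match = COMIC_RE.search(html)
    if match is None:
        return None
    # Resolve a relative src against the URL the page was fetched from.
    return urljoin(base_url, match.group(1))


if __name__ == "__main__":
    print(scrape_image(SAMPLE_HTML, "http://example.com/comics/"))
```

A regex like this is brittle against markup changes such as reordered attributes, which is one reason an XPath expression can be the more robust choice.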

The result of the scraping is either an image link or an `