sfLucenePlugin
文件大小: unknow
源码售价: 5 个金币 积分规则     积分充值
资源说明:Fork of sfLucenePlugin from Symfony's SVN. Updated to work with Symfony 1.4.x
Please note this is a fork of Carl Vondrick's work, specifically the Doctrine-1.2 branch hosted here: 

http://svn.symfony-project.com/plugins/sfLucenePlugin/branches/1.2-Doctrine/


Warning : This is a work in progress and it has been tested only with Doctrine

= Introduction =
sfLucenePlugin integrates symfony and Zend Search Lucene to instantly add a search engine to your application.  The plugin will auto-detect your ORM layer : Doctrine and Propel.

= Requirements =
  * symfony 1.4
  * Doctrine 1.2
  * PHP 5.2.x+

sfLucene ships with all external dependencies satisfied and will work on both private and shared hosting environments.

= Main Features =
  * Configured all by YAML files
  * Complete integration with symfony 1.1
  * i18n ready
  * Keyword highlighting
  * Management of Lucene indexes
  * 500+ unit tests and 98% API code coverage

= Development Status =
The 1.1 branch for sfLucene will work, however it is not rock-solid stable.  Please use it, but be ready to report a few bugs to Trac.

= Installation =
  * Install the plugin (from project directory):
{{{
cd plugins
git clone git://github.com/mbrung/sfLucenePlugin.git
}}}

  * Initialize configuration files (ignore this if you are upgrading):
{{{
symfony lucene:initialize myapp
}}}
  * Clear the cache
{{{
symfony cc
}}}
  * Configure sfLucene per the instructions below.

= Configuring Lucene =
The entire plugin is configured by search.yml files placed throughout your application.  You must be careful that you are aware of what search.yml file you are working in because each one has a different purpose.  As you will later learn, the project level search.yml file controls the entire engine while a module's search.yml defines indexing parameters.

Open your project's search.yml file, located in {{{ myproject/config/search.yml }}}.  If you followed the installation instructions above, you will see:

{{{
MyIndex:
  models:

  index:
    encoding: UTF-8
}}}

Similar to your schema.yml file, you can have multiple indexes.  The only requirement is that you must name them!  So, enter a name for the first index, where "MyIndex" goes.  This is used internally only by the plugin. If you require a different encoding to be used, enter it.  Note, however, that UTF-8 is generally the best charset to store your indexes in.

If you require i18n support, you must define the cultures that you support under index.  Use the following syntax:
{{{
index:
  cultures: [en, fr]
}}}

(If you receive an exception saying "Culture XXX is not enabled" then define the culture even if you do not use i18n.)

By default, the plugin will not index or search on common words, such as "the" and "a".  Further, it ignores single characters.  If you require different behavior, you can define them like so:

{{{
index:
  stop_words: [the, an, it]
  short_words: 2
}}}

The plugin is also capable of different indexing filters, which determine what content is indexed.  For instance, if you need your index to be case-sensitive, you need to use a different indexing filter.  The same goes for utf-8 support and numbers.  To change these, open the project's search.yml file and add:

{{{
index:
  analyzer: utf8num
  case_sensitive: off
  mb_string: on
}}}

"analyzer" can either by text, textnum, utf8, or utf8num.  If you choose text, all numbers will be ignored.  If you require the index to be case sensitive, set "case_sensitive" to "on".    mb_string determines whether to use the mb_* string functions instead of the standard PHP string filters.  This adds a huge performance bottleneck to indexing when turned on, so use with care.  You only need to turn mb_string on if you are working with a case-insensitive utf8 index.

By default, in the command line sfLucene operates in the "search" environment.  This has the advantage to that if you require a different configuration setting for the search environment, you can easily set it up.  But, if your database is only selectively configured per environment (ie, not for "all"), then you will quickly run into trouble.  To get around this, define a database for either "all" or the "search" environment.

= Indexing =
sfLucene currently supports two ways to add information to the index:
  1. Through the ORM layer
  2. Through symfony actions

Through the ORM layer is the recommended method to add information to the index.  The plugin can keep the index synchronized if you use the ORM layer.  Through symfony actions is intended only for static content, such as the privacy policy.

== ORM layer method ==
Open your project's search.yml file and you will find a model declaration towards the top.  This is where you put the models you wish to index.  For each model, you define the fields you want to index and other parameters. The syntax is:

{{{
MyIndex:
  models:
    BlogPost:
      fields:
        id: unindexed
        title:
          boost: 1.5
          type: text
        content: unstored
        description: text
    BlogComment:
      fields:
        id: unindexed
        summary: text
        message: text
      description: message
      title: summary
}}}

In the above example, two models are set to index: BlogPost and BlogComment.  In BlogPost, the fields title, content, and description are stored, but the title fields holds the most weight with a boost factor of 1.5.

When search results are displayed, the system intelligently guesses which field should be displayed as the result "title" and which field is the result "description." However, to be explicit, you can specify a description and title field, as in BlogComment.

Note that the fields do not have to exist as fields in your database.  As long as it has a getter on the model, you can use it in your index.  The fields are automatically camelized, so if you wish to call "->getSuperDuperMan()" as one of your fieds, you must write it in the YAML file as "super_duper_man".

See [http://framework.zend.com/manual/en/zend.search.lucene.html the Zend_Search_Lucene documentation] for more about the field types.

Further, you can specify a transformation function to put the value through before it is indexed.  This is useful if you have HTML code being returned and you need strip it out.  Define this like so:
{{{
MyIndex:
  models:
    BlogPost:
      fields:
        title: text
        content:
          type: text
          transform: strip_tags
          boost: 1.5
}}}

When this model is indexed, the {{{ content }}} field will be automatically routed through strip_tags() before being stored in the index.

Next, you must tell your application where to route the model when it is returned.  You do this by opening your application's config/search.yml file and defining a route:

{{{
MyIndex:
  models:
    BlogPost:
      route: blog/showPost?id=%id%
    BlogComment:
      route: blog/showComment?id=%id%
}}}

In routes, %xxx% is a token and will be replaced by the appropriate field value.  So, %id% will be the value returned by the ->getId() method.  Warning: You must also define the field in the project's search.yml to be indexed or unexpected results will occur!

Finally, you must register the model with the system.  If you are using Propel, you must use Propel's behaviors.

=== Propel ===
You can do this by opening up the model's file and putting
{{{
sfLucenePropelBehavior::getInitializer()->setup('MyModel');
}}}
after the class declaration.  So, for a blog, you would open project/lib/model/BlogPost.php and append the above, replacing "!MyModel" with "!BlogPost".

If you wish to disable the Propel behavior so that no indexing can occur, you can simply do:
{{{
sfLucenePropelBehavior::setLock(true); // disables Propel behavior, no indexing
sfLucenePropelBehavior::setLock(false); // enables Propel behavior, does index
}}}

By default, the behavior is not locked so indexing does occur.

== Advanced Model Settings ==

sfLucene optimizes memory usage when rebuilding the index from the database by using both an internal pager and hydrating objects on demand.  By default, rows are selected in batches of 250, but if you require this to be different, you can customize it like so:

{{{
MyIndex:
  models:
    BlogPost:
      rebuild_limit: 250
}}}

If only some of your objects should be stored in the index, you can define an validating method on the model that can return a boolean indicating whether the model should be indexed.  If this method returns true, the indexer proceeds with indexing.  If the method returns false, the indexer ignores that particular instance.

You need to specify the method to call like so ("isIndexable" is a good name):

{{{
MyIndex:
  models:
    BlogPost:
      validator: should_index
}}}

== symfony actions method ==
To setup an action to be indexed, you must create a file in the module's config directory named search.yml.  Inside this file, you define the actions you want indexed:

{{{
MyIndex:
  privacy:
  tos:
    security:
      authenticated: true
      credentials: [admin]
  disclaimer:
    params:
      advanced: true
    layout: true
}}}

Remember to prefix each one with the name of the index.

As you can see, it is possible to define request parameters, manipulate authentication, and toggle decorating the response.  By default, the response is not decorated, the user is not authenticated without any credentials, and there aren't any request parameters.

= Building the Index =
After you have defined the indexing parameters, you must build the initial index.  You do this on the command line:

{{{
$ symfony lucene:rebuild myapp
}}}

replacing myapp with the name of your application you want to rebuild.  This will build the index for all cultures.

= Searching =
sfLucene ships with a basic search interface that you can use in your application.  Like the rest of the plugin, it is i18n ready and all you must do is define the translation phrases.

To enable the interface, open your application's settings.yml file and add "sfLucene" to the enabled_modules section:

{{{
all:
  .settings:
    enabled_modules: [default, sfLucene]
}}}

If you have specified multiple indexes in your search.yml files, you need to configure which index that you want to search.  You do this by opening the app.yml file and adding the configuration setting:

{{{
all:
  lucene:
    index: MyIndex
}}}

If you need to configure which index to use on the fly, you can use sfConfig:

{{{
sfConfig::set('app_lucene_index', 'MyIndex');
}}}

== Customizing the Interface ==
As every application is different, it is easy to customize the search interface to fit the look and feel of your site. Doing this is easy as all you must do is overload the templates and actions.

=== Creating a Skeleton Module ===

To get started, simply run the following on the command line:

{{{
$ symfony lucene:init-module myApp
}}}

If you look in myapp's module folder, you will see a new sfLucene module.  Use this to customize your interface.

The lucene:init-module task is capable of custom module names and linking each module to a certain index.  This makes it possible to have multiple search interfaces in the same application.  To do this, simply run the above command with two extra parameters:

{{{
$ symfony lucene:init-module myApp myLucene myIndex
}}}

The above will create a skelelton module called "myLucene" in the application "myApp" and configure this module to search off the index "myIndex".

=== Customizing Results ===

Often, when writing a search engine, you need to display a different result template for each model.  For instance, a blog post should show differently than a forum post.  You can easily customize your results by changing the "partial" value in your application's search.yml file.   For example:
{{{
models:
  BlogPost:
    route: blog/showPost?slug=%slug%
    partial: blog/searchResult
  ForumPost:
    route: forum/showThread?id=%id%
    partial: forum/searchResult
}}}

Alternatively you may use a named route, like so:

{{{
models:
  BlogPost:
    route: @show_post?slug=%slug%
}}}

For ForumPost, the partial apps/myapp/modules/forum/templates/_searchResult.php is loaded.  This partial is given a $result object that you can use to build that result.  The API for this object is pretty simple:

  * {{{ $result->getInternalTitle() }}} returns the title of the search result.
  * {{{ $result->getInternalRoute() }}} returns the route to the search result.
  * {{{ $result->getScore() }}} returns the score / ranking of the search result.
  * {{{ $result->getXXX() }}} returns the XXX field.

In addition to the $result object, it is also given a $query string, which was what the user searched for.  This is useful for highlighting the results.

=== Advanced Search ===

If you wish to disable the advanced search interface, open the application's app.yml file and add the following:

{{{
all:
  lucene:
    advanced: off
}}}

This will prevent sfLucene from giving the user the option to use the advanced mode.

== Routing ==

sfLucene will automatically register friendly routes with symfony.  For example, surfing to {{{ http://example.org/search }}} will route to sfLucene.  If you would like to customize these routes, you can disable them in the app.yml file with:

{{{
all:
  lucene:
    routes: off
}}}

It will then be up to you configure the routing.

== Pagination ==
You can customize pages by using the same logic as above (defaults to 10):
{{{
all:
  lucene:
    per_page: 10
}}}

To customize the pager widget that is displayed, change the pageradius key (defaults 5):
{{{
all:
  lucene:
    pager_radius: 5
}}}

== Results ==
You can configure the presentations of the search results. If you require more fine-tuned customizations, you are encouraged to create your own templates.

To change the number of characters displayed in search results, edit the "resultsize" key:
{{{
all:
  lucene:
    result_size: 200
}}}

To change the highlighter used to highlight results, edit the "resulthighlighter" key:
{{{
all:
  lucene:
    result_highlighter: |
      %s
}}}

= Highlighting Pages =
The plugin has an optional highlighter than will attempt to highlight keywords from searches.  The highlighter will hook into this search engine and also attempts to hook into external search engines, such as Google and Yahoo!.

To enable this feature, open the application's config/filters.yml file and add the highlight filter before the cache filter:
{{{
rendering: ~
web_debug: ~
security:  ~

# generally, you will want to insert your own filters here

highlight:
  class: sfLuceneHighlightFilter

cache:     ~
common:    ~
flash:     ~
execution: ~
}}}

By default, the highlighter will also attempt to display a notice to the user that automatic highlighting occured.  The filter will search the result document for {{{  }}} and replace it with an i18n-ready notice (note: this is case sensitive).

To highlight a keyword, it must meet the following criteria:
  * must be X/HTML response content type
  * response must not be headers only
  * must not be an ajax request
  * be inside the  tag
  * be outside of