imexport
Simple import from a text file generated by mysql -E ...
== Description

This library imports data from a text file into your Rails app. For the moment
the only supported input is a file created by 
mysql -E -e "SELECT something FROM somewhere, somewhere_else ..." > data.dump
Records are loaded through your ActiveRecord models, so it only works with those.

I had this little problem. There was an old web app written in Java and a new 
Rails app. Although both were using MySQL, the "old" and "new" DBs were running
on separate servers and had quite different data structures. Still, 
I wanted to keep some pieces of data synchronized quite frequently, at least 
for a not-so-short transition period.

I had a couple of options to consider:

* A simple "INSERT INTO new_DB (...) SELECT original_data FROM old_DB" or
  something similar. Cons: I couldn't do it in one SQL statement as the data
  structures were too different; I'd have to take care of attributes such as
  +updated_at+ and +created_at+ manually, let alone model validations.
  
* Create models for the old data structures in the new Rails app and do
  something like this in a rake task: 
  
     OldModel.all.each do |old|
       NewModel.create :attr1 => old.attr1, :attr2 => old...
     end
     
  Cons: I didn't want to clutter the new Rails app with a bunch of models 
  I'd never use except for synchronization.
  
* Dump +original_data+ into CSV format and then use +CSV+ or +FasterCSV+.
  Actually this was the option I chose at first. The problem here was that
  my data was sometimes really messy: lots of copy&paste from software like
  MS Word, etc. +FasterCSV+ was throwing malformed-CSV exceptions too often
  and +CSV+ sometimes wasn't able to recognize the end of one row and the
  beginning of the next. It wasn't their fault, it was the poor quality of my
  data. So I decided to write this little gem.
   
== How it's different from CSV and FasterCSV

First off, this library isn't meant to replace either of them. It works with
a different text format (not CSV) and does more than just parse a file.

Consider this snippet created by 
mysql -E -e "SELECT title AS COLUMN_title, speaker AS COLUMN_speaker, abstract AS COLUMN_abstract FROM Seminars" > seminars.txt:

  *************************** 7. row ***************************
  COLUMN_title: Conditional XPath = Codd Complete XPath
  COLUMN_speaker: John Smith
  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
  expansion of XPath 1.0 in which  every first order query over
  XML document tree models is expressible?
  We give two necessary and sufficient conditions on XPath like

This library creates a new model object, recognizes each +COLUMN_attr+ and
tries to set the matching attribute of that object, like
model.title = COLUMN_title, model.speaker = COLUMN_speaker and
model.abstract = COLUMN_abstract.
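
To give a feel for the parsing side, here is a minimal sketch in plain Ruby of
how such a dump could be split into one hash per record. It is not the code used
by this library: +parse_vertical_dump+ and +ROW_SEPARATOR+ are hypothetical
names, and the sketch assumes every column was aliased with the +COLUMN_+ prefix,
as in the SELECT above.

```ruby
# Hypothetical helper, not part of imexport: turns `mysql -E` output into
# an array of { 'attr' => 'value' } hashes, one hash per record.
ROW_SEPARATOR = /\A\s*\*+ \d+\. row \*+\s*\z/

def parse_vertical_dump(text, prefix = 'COLUMN_')
  # A column line looks like "COLUMN_title: some value"
  # (mysql may pad the column name with leading spaces).
  column_start = /\A\s*#{Regexp.escape(prefix)}(\w+): ?(.*)\z/
  rows = []
  current_key = nil

  text.each_line do |line|
    line = line.chomp
    if line =~ ROW_SEPARATOR
      rows << {}              # a separator line starts a new record
      current_key = nil
    elsif rows.empty?
      next                    # ignore anything before the first record
    elsif (match = column_start.match(line))
      current_key = match[1]  # column name with the prefix stripped
      rows.last[current_key] = match[2]
    elsif current_key
      # No prefix on this line: it continues a multi-line value.
      rows.last[current_key] += "\n#{line}"
    end
  end
  rows
end
```

The prefix is what makes multi-line values manageable in this sketch: a line
counts as a new column only if it starts with the prefix, so text pasted into
+abstract+ is simply appended to the current value.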

It then runs model validations (model.valid?) and does either model.save or
model.update_attributes(attrs_hash).
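
The save-or-update decision can be pictured roughly as follows. This is only a
sketch under assumptions: +FakeSeminar+ is an in-memory stand-in for an
ActiveRecord model and +save_or_update+ is a hypothetical helper, neither of
which exists in this library. The lookup attribute plays the role of the
+find_by+ option described under Options.

```ruby
# In-memory stand-in for an ActiveRecord model (illustration only).
class FakeSeminar
  STORE = []
  attr_accessor :title, :abstract

  def self.find_by_title(title)
    STORE.find { |record| record.title == title }
  end

  def initialize(attrs = {})
    assign(attrs)
  end

  # Stand-in validation: a seminar needs a non-blank title.
  def valid?
    !title.to_s.strip.empty?
  end

  def save
    STORE << self unless STORE.include?(self)
    true
  end

  def update_attributes(attrs)
    assign(attrs)
    valid?
  end

  private

  def assign(attrs)
    attrs.each { |name, value| send("#{name}=", value) }
  end
end

# Hypothetical helper: update the record found through find_by,
# or validate and save a new one if nothing was found.
def save_or_update(model_class, attrs, find_by)
  existing = model_class.send("find_by_#{find_by}", attrs[find_by.to_s])
  return existing.update_attributes(attrs) if existing

  record = model_class.new(attrs)
  record.valid? && record.save
end
```

Importing the same title twice with this helper leaves a single record holding
the most recent values, and a record that fails validation is never stored.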

== Usage

Say, you have a model called +Seminar+ with the following attributes:

  create_table "seminars", :force => true do |t|
    t.string   "title"
    t.text     "abstract"
    t.datetime "date"
    t.text     "notes"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.boolean  "published"
  end
  
Consider a snippet of a DB text dump similar to the previous example. 
Let's just add a few more columns:

  *************************** 7. row ***************************
  COLUMN_title: Conditional XPath = Codd Complete XPath
  COLUMN_date_time: 2004-11-30T15:30:00
  COLUMN_publish: 1
  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
  expansion of XPath 1.0 in which  every first order query over
  XML document tree models is expressible?

Define a rake task in your Rails app and require +imexport+, e.g.

  namespace :db do
    namespace :import do
      task :seminars => :environment do
        require 'imexport'
      end
    end
  end
  
Now define the mapping from columns to model attributes:

  COLUMNS_TO_MODEL_MAP = {
    'date_time' => { :date => Proc.new do |datetime| 
                                # YYYY-MM-DDTHH:MM:SS
                                DateTime.strptime(datetime, '%FT%T')
                              end },
    'publish' => { :published => Proc.new do |val|
                                         val.to_i == 1
                                 end }
  }
  
As you may have noticed, we didn't define a mapping for +title+ and +abstract+:
they are simple strings that don't need any conversion, and their column names
are the same as the model attribute names.
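
To check what the two Procs in the mapping return, they can be called directly
on the sample values, outside of Rails. The +CONVERTERS+ constant below is just
an illustrative name for this sketch; the Procs themselves are the ones from the
mapping.

```ruby
require 'date'

# Illustrative constant; the Procs are copied from the mapping.
CONVERTERS = {
  'date_time' => Proc.new { |datetime| DateTime.strptime(datetime, '%FT%T') },
  'publish'   => Proc.new { |val| val.to_i == 1 }
}

date = CONVERTERS['date_time'].call('2004-11-30T15:30:00')
puts date.strftime('%Y-%m-%d %H:%M')   # prints "2004-11-30 15:30"
puts CONVERTERS['publish'].call('1')   # prints "true"
puts CONVERTERS['publish'].call('0')   # prints "false"
```

Every value in the dump arrives as a string, which is why +publish+ goes through
+to_i+ before the comparison.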

Lastly, let's do the sync:

  ImExport::import(ENV['FROM_FILE'], { 
                   :class_name        => 'Seminar', 
                   :find_by           => 'title',
                   :db_columns_prefix => 'COLUMN_',
                   :map               => COLUMNS_TO_MODEL_MAP})

You would run the task in this way:

  rake db:import:seminars FROM_FILE=/path/to/seminars.txt

and your +seminars+ table is synchronized.

So, the complete rake task would look like this:

  namespace :db do
    namespace :import do
      task :seminars => :environment do
        require 'imexport'
        
        COLUMNS_TO_MODEL_MAP = {
          'date_time' => { :date => Proc.new do |datetime| 
                                      # YYYY-MM-DDTHH:MM:SS
                                      DateTime.strptime(datetime, '%FT%T')
                                    end },
          'publish' => { :published => Proc.new do |val|
                                               val.to_i == 1
                                       end }
        }

        ImExport::import(ENV['FROM_FILE'], { 
                         :class_name        => 'Seminar', 
                         :find_by           => 'title',
                         :db_columns_prefix => 'COLUMN_',
                         :map               => COLUMNS_TO_MODEL_MAP})
      end
    end
  end

Also, you can pass a block to ImExport::import. In that case you'll have to 
call model.save or model.update_attributes(...) yourself:

  ImExport::Import.from_file(ENV['FROM_FILE'], { 
         :class_name        => 'Seminar', 
         :find_by           => 'title',
         :db_columns_prefix => 'COLUMN_',
         :verbose           => false,
         :map               => COLUMNS_TO_MODEL_MAP}) do |seminar|
                   
    # do something with seminar object here, e.g. 
    # seminar.save
    puts "---> #{seminar.inspect}"
  end

=== Options for ImExport::import


+class_name+:: 
  "String" or :symbol. ActiveRecord model defined in your rails app.

+find_by+::
  "String" or :symbol.
  This is how ImExport decides whether it should do model.save or 
  model.update_attributes(...). In the previous example it would do 
  seminar.save if Seminar.find_by_title(...) returns nil, or 
  seminar.update_attributes(...) otherwise.

+db_columns_prefix+::
  "String".
  Column name prefix that should be skipped while looking for the corresponding
  model attribute name. Again, in the previous example, +COLUMN_title+ 
  actually means the +title+ attribute of the +Seminar+ model.
  
+verbose+::
  +true+, +false+ or +Proc+.new { |model| }.
  When model.valid? returns false, ImExport can output an error (WARNING)
  message. The +verbose+ option tells it whether to do so. If +verbose+ is
  a Proc, it is called with the model in question and should return +true+
  or +false+. 
  Default is +true+.

+map+::
  Hash.
  Tells ImExport how to map columns to their corresponding model 
  attributes. Don't add +db_columns_prefix+ to the column names here; 
  the prefix has already been stripped.
    
  Also, you don't have to define a mapping for attributes that have the
  same names as columns in the text file being parsed; they will be recognized
  and set automatically.
  
  Every item in this Hash can be defined in one of the following ways:
    
    'column_name' => :symbol
  _Behavior_: model.symbol = value_of_column_name
      
    'column_name' => { :symbol => Proc.new { |column_value| ... } }
  _Behavior_: model.symbol = result_of_Proc_call where Proc's only argument is 
  the column value.
      
    'column_name' => Proc.new { |column_value, model_object| ... }
  _Behavior_: the Proc is called with two arguments: the column value and the
  object to be saved. This is the only case where your code should take care
  of updating the model's attribute(s), since ImExport can't guess the
  attribute name.
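
The three forms of a +map+ entry, plus the default for unmapped columns, could
be dispatched along these lines. Again, this is a sketch of the behavior
described above and not ImExport's actual code: +apply_row+ and +SeminarStub+
are hypothetical names, and a +Struct+ stands in for the ActiveRecord model.

```ruby
# Hypothetical helper: applies one parsed row (column name => raw string,
# prefix already stripped) to a model according to the map.
def apply_row(model, row, map = {})
  row.each do |column, value|
    rule = map[column]
    case rule
    when nil     # no mapping: the attribute has the same name as the column
      model.send("#{column}=", value)
    when Symbol  # rename only
      model.send("#{rule}=", value)
    when Hash    # rename and convert through the Proc
      rule.each { |attr, converter| model.send("#{attr}=", converter.call(value)) }
    when Proc    # the Proc is responsible for updating the model itself
      rule.call(value, model)
    end
  end
  model
end

# Stand-in for an ActiveRecord model (illustration only).
SeminarStub = Struct.new(:title, :headline, :published, :notes)

map = {
  'name'    => :headline,
  'publish' => { :published => Proc.new { |val| val.to_i == 1 } },
  'remark'  => Proc.new { |val, model| model.notes = val.strip }
}
row = { 'title' => 'Conditional XPath', 'name' => 'XPath talk',
        'publish' => '1', 'remark' => ' bring slides ' }

seminar = apply_row(SeminarStub.new, row, map)
puts seminar.inspect
```

In the last form the return value of the Proc is ignored in this sketch; only
the assignment made inside the Proc has an effect on the model.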

== How to install

  sudo gem install crhym3-imexport
  
If that fails, execute the following and try again.

  gem sources -a http://gems.github.com/

== License

Copyright (c) 2009 Alex Vagin, released under the MIT license.

mailto:alex@digns.com

