Pynits
文件大小: unknow
源码售价: 5 个金币 积分规则     积分充值
资源说明:Python interactive nit-picking and repairing tool
#+TITLE: Pynits — Notes
#+OPTIONS: H:3

#+BEGIN_QUOTE
  [2012-05-31 jeu] I recently began to foresee that Pynits could be of
  some use in a project I'll be working on.  I'm not sure yet it is
  worth undusting.  To ponder this, I first extracted Pynits from FP
  etc. into its own GitHub project.  The next main steps would likely
  be the adaptation to the newer AST module, the design of some Pymacs
  interface so Pynits could be used from Emacs as well as from Vim,
  and of course, lots of testing and debugging.  I suspect that Python
  2 versus Python 3 would create some more difficulties.  In any case,
  if this cannot be done reasonably quickly, which I'll soon began to
  evaluate, Pynits might return to hibernation for some while…
#+END_QUOTE

* README contents

Pynits is a tool which I find useful while editing Python source code
with Vim, for tidying up individual sources lines.  It is
particularily helpful when formatting lines holding long or complex
expressions.

The documentation is kept in file [[http://fp-etc.progiciels-bpi.ca/pynits-doc.html][pynits.txt]], which you may browse or
print, yet it is meant for the Vim help system.  Once the installation
is completed as per the steps described below, and from within Vim,
you may quickly get back to the documentation through the command::

#+BEGIN_EXAMPLE
  :help pynits
#+END_EXAMPLE

Pynits messages are also available in French, given the *LANG*
environment variable is =fr= or a variant of =fr=, before you call Vim.

** On Windows or other non-Unix systems

You likeley have fetched a [[https://github.com/pinard/Pynits/zipball/master][recent *.zip* archive]].  Choose the directory
where you want the files installed.  This is normally one of the
directories listed by VIM command:

#+BEGIN_EXAMPLE
  :set runtimepath
#+END_EXAMPLE

Let's call it /DIRECTORY/.  Create /DIRECTORY/ if it does not exist and
unzip the distribution you got into /DIRECTORY/.  For making the
documentation available to VIM, execute VIM commands:

#+BEGIN_EXAMPLE
  :helptags DIRECTORY/doc
#+END_EXAMPLE

replacing /DIRECTORY/ in the command above, of course.

You have to manage in such a way that Python will successfully import
file =python/pynits.py=.  One way is changing your Python installation
so it will look in that directory.  See your Python documentation for
knowing where the directory =site-packages/= is, then add these lines to
file =.../site-packages/sitecustomize.py=:

#+BEGIN_SRC python
  import sys
  sys.path.append('DIRECTORY/python')
#+END_SRC

once again replacing /DIRECTORY/ as appropriate, above.  Another way is
to move the just installed file =python/pynits.py= into another
directory on the Python import search path.  Pick a directory among
those Python lists when you interactively execute:

#+BEGIN_SRC python
  import sys
  print sys.path
#+END_SRC

A third way might be to preset *PYTHONPATH* in the environment so this
variable points to =DIRECTORY/python=.

Finally, you should edit file =DIRECTORY/ftplugin/python.vim= — create
such a file if it does not exist already — and make sure it holds at
least the following line:

#+BEGIN_EXAMPLE
  python import pynits
#+END_EXAMPLE

** On Unix or Linux systems

On Linux or other Unix systems, you might follow the instructions
above, meant for non-Unix systems, they should work for Unix systems
as well.  Or else, you may fetch a [[https://github.com/pinard/FP-etc/tarball/master][recent *.tar.gz* archive]].  You may
either install this tool for yourself only, or for all users in your
system.

*** Personal installation

This tool may be installed for yourself only, and this might be your
only choice if you do not have super-user privileges.  Merely type
*make install-user* from the unpacked Pynits distribution.  As a
separate step, your =~/.vimrc= file should be modified, or created if
necessary, so it holds the following lines:

#+BEGIN_EXAMPLE
  if has('python')
    python <=, which is =\= if the
user did not change it.  All mapping equivalence tables, below, assume
that the local leader has not been set to some other value.  You
should change your reading of these maps if the leader happens to be
different.

This tool complies with the '=textwidth=' Vim option for the line
length, but uses 80 if that option is set to zero.  The '=shiftwidth='
option also gives the number of columns per indentation step.

** Reformatting Python code

The following commands distinguish between a set of white lines, a
block of comments, or a group of physical lines representing a logical
line of Python code.  They all have the purpose of producing a
possibly different formatting of the same while preserving the
execution semantics.

If the cursor is on a white line, surrounding white lines are
considered extraneous and deleted.  If the cursor is on a comment
line, the entire comment block of which the line is a part is
reformatted, see [[Handling comments]] below for more details on this.
Otherwise, the cursor is on a line of executable Python code, and the
following commands may yield different results whenever continuation
lines are needed.  A few peculiarities of string formatting are listed
separately, see [[Handling strings]] below.

Mapping equivalences:

| \b | Pynits\_column\_layout        |
| \c | Pynits\_column\_fill\_layout  |
| \l | Pynits\_line\_layout          |
| \p | Pynits\_retract\_layout       |
| \q | Pynits\_retract\_fill\_layout |
| Q  | Pynits\_retract\_fill\_layout |

When any of the above command is given, the tool first tries to find
the Python line containing the cursor.  A Python line, here, means a
set of physical lines in the buffer representing a single logical
line, but possibly continued (either through an escaped-newline or an
unbalanced bracket of any kind).  The Python line may start as far as
a dozen physical lines before the cursor, and the Python line may hold
as many as two hundred physical lines.  Beware that a physical line
may look, on its own, like a valid Python statement, while it is
really a continuation line.  This tool may be fooled by such cases, so
it may happen that you ought to explicitly reposition the cursor on
the first physical line of the whole Python line.  As the cursor is
left on the physical line following the Python line after the
operation, this gives you a good indication of what the tool
determined to be a Python line.

Command =\l= reformats the Python line into a single physical line, and
for this command only, with no limit for the line length.  This
command is meant as a first step when the programmer wants to fully
hand-craft the formatting.

Command =\b= reformats the Python line so to vertically align
continuation lines into a column, the position of which is selected to
best reflect the syntactic structure of a statement or expression.  In
contrast to this, command =\p= will align continuation lines by
indenting the margin by a fixed amount.  This is a bit less legible,
yet still acceptable as it may save a few continuation lines.
However, if =\p= and =\b= would result in exactly the same number of
continuation lines in the processed file, the =\p= command is
(recursively) over-ridden by the =\b= command.  (This is because the
cost of reduced legibility that results from the use of =\p= is only
worth paying if fewer continuation lines result.)

Command =\c= is the same as =\b= with refilling, while command =\q= is the
same as =\p= with refilling.  Refilling is an operation by which
continuation lines are combined after having been produced, provided
this is natural and meaningful enough.  Here as well, refilling may
represent some loss in legibility, but so might the use of too much
vertical space, so refilling may often be a worthwhile compromise.
Heuristics may sometimes fail to find a formatting solution with =\c= or
=\q=, and when this happens, reformatting is automatically retried with
filling disabled, a bit as if =\b= or =\p= have been used instead.

How one remembers all these letters? =\q= has been chosen after =gq=,
which is the standard Vim command for reformatting text. =\q= is the
most aggressive variant for reformatting Python lines, useful enough
to warrant =Q= as an alternate, easier key binding.  Our suggestion is
that you start with =Q=, and only attempt one of the others if you are
not satisfied by the results.  Some bindings are named after their
effect: =\l= requests a single line and =\c= favors columnar layout.  To
avoid refilling, go backwards one position in the alphabet: =\b= is like
=\c= without refilling, and =\p= is like =\q= without refilling.  One may
also notice that =\b=, =\c=, =\p= and =\q=, listed in alphabetical order, use
from least to most complex reformatting heuristics.

** Handling strings

This tool pays special attention to string formatting.  Here are a few
of the principles it tries to follow:

- Strings which represent natural language use double-quotes (") as
  external delimiters, others use single-quotes ('). The heuristic
  used to detect natural language strings is simplistic: there should
  be a word of four letters or more, at least one space is required,
  and there should be at least three times more letters than
  non-letters.

- Isolated strings occur when a Python line contains a string and
  nothing else.  Such isolated strings are blindly taken as
  doc-strings, and so, are rendered with a triple-quoted string.  If
  this annoys you, add a comma after the string before formatting the
  line (this transforms the string into a 1-tuple), and remove that
  comma afterwards.

- Whenever a triple-quoted string is the most economical avenue for
  vertical space, all lines share the same left margin.  This means in
  particular that the opening triple-quote is always followed by a
  backslash, and the string contents may never start on the same line.

- Raw strings are produced whenever this shortens string contents.

- String may be broken into a concatenated set of strings, one per
  line.  When this occurs, all strings use the same delimiters and
  /rawness/.  The split occurs between words, any white space sticks
  with the following word at such split points.  A split point is
  always forced after an embedded =\n=.

- Escape sequences are avoided for non-ASCII letters.  It is assumed
  that the *coding* pragma of the whole file is compatible with such
  letters.  (This should likely be cross-checked with Vim current
  encoding of buffer.)

** Handling comments

This tool does not like in-line comments.  All comments are extracted
out of a Python line, and put either before it if the line does not
end with a colon, or after it whenever the Python line ends with a
colon.

Mapping equivalences:

| \f | Pynits\_choose\_filling\_tool |

Command =\f= cycles through a few refilling algorithms for comments.  By
default, comment refilling is effected through the GNU *fmt* program,
which needs to be installed in the system for this procedure to work
(most Linux systems have it bundled already): it seems to me that
wherever it fits, this is the tool producing the nicest output.  The
second algorithm uses the *par* program, which despite yielding results
a bit less nice than *fmt*, is more clever in some circumstances of
complex quoting or so-called boxed comments.  The third algorithm uses
the Vim built-in paragraph reformatter, which seems a bit less
interesting, but it may only be that I have not studied it very much
yet.  The fourth and last algorithm is the *textwrap* module from the
Python 2.3 distribution; in practice so far, it does not seem fully up
to the task.

** Finding nits

The exact nature of formatting nits is difficult to formalize, so I
will evade the issue and rest content with the /definition/ of a nit as
something I do not like! ☺

Mapping equivalences:

| \\ | Pynits\_find\_nit    |
| \. | Pynits\_correct\_nit |

This tool knows about many nits, and when asked to find one with =\\=,
it starts at the current line and move forward, looking for any
instance of such a nit.  Once a nit is found by =\\=, the user may
choose to edit the code or to otherwise move the cursor before issuing
another =\\= command.  When this is the case, the seek process for a nit
resumes at the beginning of the current line.  This guards against the
possibility that the correction of one nit might introduce another nit
that goes undetected.  Otherwise, if the user immediately issues
another =\\= command without changing anything, the tool assumes that
the user decided to ignore the recently found nit, and so, the seek
process resumes after the text of that nit.

When a nit is found, it is highlighted, and explained in the status
line.  However, merely because I do not know how to do it any better,
all textual contents similar to the nit are also highlighted.  But the
nit really resides at the cursor location.  Simply ignore the other
instances of highlighted text.

Command =\.= asks for the most recently found nit to be corrected.  When
the nit processor does not know how to automatically correct a nit, it
merely asks for human intervention.  Else, a change is made to get rid
of the nit.  If that change is semantically-safe (that is, when it
just cannot alter the meaning of the program), the next nit is
automatically sought exactly as if the user typed =\\=. Otherwise, the
cursor is left after the change for the user to check, and undo if not
accepted.

Here is a quick description of known nits and how they are corrected.
A first set of nits is related to white space.  An empty file is
replaced by the prototype of a program.  A sequence of white lines is
reduce to a single one.  TABs are not allowed in a source: they are
replaced by eight spaces in the left margin, or the escape sequence =\t=
elsewhere.  A sequence of spaces between tokens is replaced by a
single space.  Trailing spaces on a line are removed.  Escaped
newlines are unescaped.  Spaces are not allowed after an opening
parenthesis (or similar character) or before a closing one.  Spaces
are forbidden before a comma (or similar character), but required
after one.  Spaces are expected before and after an assignment symbol
or comparison operator.

A second set of nits is related to various lexical issues.  In-lined
comments are collected and all reported either after or before the
Python line, depending if the Python line ends with a colon, or not; a
mild attempt is also made at turning them into proper sentences.
Operators are much easier to grasp when read at the beginning of
Python lines than at their end (strangely, this is not known enough!).
Double-quoted strings are ideally be reserved to natural language
text.  Triple-quoted strings should have their delimiters on separate
lines whenever possible.  A peculiar (and uncommon) date French format
is especially recognised and such dates are turned to ISO 8601.

A third set of nits is related to syntactic issues, which imply full
reformatting, as if =\q= command was given.  As a special case, lines
going over text witdth are reformatted with no other change.  The
*apply* function is replaced with the =*arguments= notation.  The *find*
function used to check string inclusion is replaced with *in* operator.
The *has\_key* function is replaced with *in* operator.  The *print*
statement is replaced by a call to *sys.stdout.write* or such (this is
under the assumption that *print* should be reserved for debugging).
Functions of the *string* module are replaced by string methods.

A final set of nits is related to syntactic issues, while not implying
full reformatting.  Moreover, many of these nits may only be corrected
by humans, and are better revised by them.  The *close*, *eval*, *execfile*,
*input*, *iterkeys*, *keys*, *readlines* and *xreadlines* functions, as well as
the *exec*, *global* and *import* =*= statements, are often better avoided.
So are comparisons with the result of the *type* function.  Function
*items* is often better written *iteritems*, *open* written *file*, *values*
written *itervalues*.

The Python line reformatter may sometimes produce output which may be
nit-picked, for the sake of strictly protecting the semantics.  For
example, if a doc-string does not end with a newline, the reformatter
will not allow itself to add one.  But then, the nit-picker will
likely complain on the reformatted result.

** Other commands

Mapping equivalences:

| \" | Pynits\_force\_double\_quotes |
| \' | Pynits\_force\_single\_quotes |
| \( | Pynits\_add\_parentheses     |
| \) | Pynits\_remove\_parentheses  |

Command \" seeks for the next string delimited by single quotes after
the cursor on the current line, and changes the delimiters to double
quotes.  Command \' similarly changes a double-quoted string into a
single-quoted string.  The /raw/ attribute is decided automatically.

Command =\(= adds an opening round parenthesis under the cursor, and a
closing round parenthesis at the end of the physical line.  If the
line ends with a colon, the closing parenthesis is added before the
colon instead of after it.  Command =\)= only works if the cursor sits
over an opening or closing bracket of any kind, it deletes both the
bracket under the cursor and the matching bracket, even if the
matching bracket was on a different line.

Mapping equivalences:

| \d | Pynits\_choose\_debug |
| \y | Pynits\_show\_syntax  |

These two commands are meant for the use of the maintainer of this
tool in debugging.  Command =\d= cycles toggles debugging, which
triggers quite verbose output.  Command =\y= does not do any formatting,
but merely dumps the syntactic tree of the selected Python line.

** Caveats and future

Reformatting a Python line might not always yield the result you would
have hand-crafted.  One reason is that taste varies widely when it
comes to formatting matters.  Best is to see if you can grok the
coding style implemented by this tool, edit the results you do not
like, or learn to visually recognise cases for which this tool does
not satisfy you and avoid using it for them.  I could have addressed
many coding styles, but this would require lots of programmtic
switches and knobs, and make the internal code more complex than it
is, with likely not much practical gain.  But I'm a good guy and users
might convince me otherwise ☺.

Another reason is that some internal heuristics are used to guarantee
reasonable reformatting speed by pruning the overall tree of
formatting possibilities.  I may likely adapt these heuristics
according to my own needs, if users report cases of severe
mis-formatting, or even submit (nice and tractable!) patches
implementing better heuristics.  Such things are not easy: this tool
already consumes a noticeable amount of CPU and memory, and is better
avoided on smallish systems.

Two caveats are in order regarding the reformatting of numbers:

- Integer numbers are always produced in decimal notation, even if the
  original number was using octal or hexadecimal notation.  The value
  is always correct, however.

- Floating numbers are mangled in two ways.  First is that the exact
  representation used by the user (exponents, zeroes after point meant
  to indicate precision, etc.) is lost through the process.  Second,
  and more serious, is that the floating number value is /sometimes/
  lost through the process, due to a bug (either in Python *compiler*
  module, or in Vim itself, I do not have a clue yet).  For this
  second problem, whenever a reformatted line contained a floating
  number, due warnings are produced.

The original representation of a string (raw, single or double quotes,
triple-quotes, etc.) is lost through reformatting, yet the contents of
the produced string is guaranteed equivalent to the original.

本源码包内暂不包含可直接显示的源代码文件,请下载源码包。