keys.sgml
上传用户:blenddy
上传日期:2007-01-07
资源大小:6495k
文件大小:5k
- <!--
- $Header: /usr/local/cvsroot/pgsql/doc/src/sgml/keys.sgml,v 1.3 1998/12/29 02:24:16 thomas Exp $
- Indices and Keys
- $Log: keys.sgml,v $
- Revision 1.3 1998/12/29 02:24:16 thomas
- Clean up to ensure tag completion as required by the newest versions
- of Norm's Modular Style Sheets and jade/docbook.
- From Vince Vielhaber <vev@michvhf.com>.
- Revision 1.2 1998/08/17 16:18:13 thomas
- Small sentence cleanups. Add tags for acronyms and products.
- Revision 1.1 1998/08/15 06:52:03 thomas
- Nice exposition on indices and keys from Herouth Maoz which appeared
- on the mailing lists a while ago. Maybe slightly changed to fit docs.
- Will go into the User's Guide.
- -->
- <chapter id="keys">
- <docinfo>
- <authorgroup>
- <author>
- <firstname>Herouth</firstname>
- <surname>Maoz</surname>
- </author>
- </authorgroup>
- <date>1998-03-02</date>
- </docinfo>
- <Title>Indices and Keys</Title>
- <Note>
- <Title>Author</Title>
- <Para>
- Written by
- <ULink url="herouth@oumail.openu.ac.il">Herouth Maoz</ULink>
- </Para>
- </Note>
- <Note>
- <Title>Editor's Note</Title>
- <Para>
- This originally appeared on the mailing list
- in response to the question:
- "What is the difference between PRIMARY KEY and UNIQUE constraints?".
- </Para>
- </Note>
- <ProgramListing>
- Subject: Re: [QUESTIONS] PRIMARY KEY | UNIQUE
- What's the difference between:
- PRIMARY KEY(fields,...) and
- UNIQUE (fields,...)
- - Is this an alias?
- - If PRIMARY KEY is already unique, then why
- is there another kind of key named UNIQUE?
- </ProgramListing>
- <Para>
- A primary key is the field(s) used to identify a specific row. For example,
- Social Security numbers identifying a person.
- </Para>
- <Para>
- A simply UNIQUE combination of fields has nothing to do with identifying
- the row. It's simply an integrity constraint. For example, I have
- collections of links. Each collection is identified by a unique number,
- which is the primary key. This key is used in relations.
- </Para>
- <Para>
- However, my application requires that each collection will also have a
- unique name. Why? So that a human being who wants to modify a collection
- will be able to identify it. It's much harder to know, if you have two
- collections named "Life Science", the the one tagged 24433 is the one you
- need, and the one tagged 29882 is not.
- </Para>
- <Para>
- So, the user selects the collection by its name. We therefore make sure,
- withing the database, that names are unique. However, no other table in the
- database relates to the collections table by the collection Name. That
- would be very inefficient.
- </Para>
- <Para>
- Moreover, despite being unique, the collection name does not actually
- define the collection! For example, if somebody decided to change the name
- of the collection from "Life Science" to "Biology", it will still be the
- same collection, only with a different name. As long as the name is unique,
- that's OK.
- </Para>
- <Para>
- So:
- <itemizedlist>
- <ListItem>
- <Para>
- Primary key:
- <itemizedList Mark="bullet" Spacing="compact">
- <ListItem>
- <Para>
- Is used for identifying the row and relating to it.
- </Para>
- </ListItem>
- <ListItem>
- <Para>
- Is impossible (or hard) to update.
- </Para>
- </ListItem>
- <ListItem>
- <Para>
- Should not allow NULLs.
- </Para>
- </ListItem>
- </itemizedlist>
- </para>
- </listitem>
- <ListItem>
- <Para>
- Unique field(s):
- <itemizedlist Mark="bullet" Spacing="compact">
- <ListItem>
- <Para>
- Are used as an alternative access to the row.
- </Para>
- </ListItem>
- <ListItem>
- <Para>
- Are updateable, so long as they are kept unique.
- </Para>
- </ListItem>
- <ListItem>
- <Para>
- NULLs are acceptable.
- </Para>
- </ListItem>
- </itemizedlist>
- </para>
- </listitem>
- </itemizedlist>
- </para>
- <Para>
- As for why no non-unique keys are defined explicitly in standard <acronym>SQL</acronym> syntax?
- Well, you
- must understand that indices are implementation-dependent. <acronym>SQL</acronym> does not
- define the implementation, merely the relations between data in the
- database. <productname>Postgres</productname> does allow non-unique indices, but indices
- used to enforce <acronym>SQL</acronym> keys are always unique.
- </Para>
- <Para>
- Thus, you may query a table by any combination of its columns, despite the
- fact that you don't have an index on these columns. The indexes are merely
- an implementational aid which each <acronym>RDBMS</acronym> offers you, in order to cause
- commonly used queries to be done more efficiently. Some <acronym>RDBMS</acronym> may give you
- additional measures, such as keeping a key stored in main memory. They will
- have a special command, for example
- <programlisting>
- CREATE MEMSTORE ON <table> COLUMNS <cols>
- </programlisting>
- (this is not an existing command, just an example).
- </Para>
- <Para>
- In fact, when you create a primary key or a unique combination of fields,
- nowhere in the <acronym>SQL</acronym> specification does it say that an index is created, nor that
- the retrieval of data by the key is going to be more efficient than a
- sequential scan!
- </Para>
- <Para>
- So, if you want to use a combination of fields which is not unique as a
- secondary key, you really don't have to specify anything - just start
- retrieving by that combination! However, if you want to make the retrieval
- efficient, you'll have to resort to the means your <acronym>RDBMS</acronym> provider gives you
- - be it an index, my imaginary MEMSTORE command, or an intelligent <acronym>RDBMS</acronym>
- which creates indices without your knowledge based on the fact that you have
- sent it many queries based on a specific combination of keys... (It learns
- from experience).
- </Para>
- </chapter>