arch-dev.sgml
- <chapter id="overview">
- <title>Overview of PostgreSQL Internals</title>
- <note>
- <title>Author</title>
- <para>
- This chapter originally appeared as a part of
- <xref linkend="SIM98" endterm="SIM98">, Stefan Simkovics'
- Master's Thesis prepared at Vienna University of Technology under the direction
- of O.Univ.Prof.Dr. Georg Gottlob and Univ.Ass. Mag. Katrin Seyr.
- </para>
- </note>
- <para>
- This chapter gives an overview of the internal structure of the
- backend of <productname>Postgres</productname>.
- After having read the following sections you
- should have an idea of how a query is processed. Don't expect a
- detailed description here (I think such a description dealing with
- all data structures and functions used within <productname>Postgres</productname>
- would exceed 1000
- pages!). This chapter is intended to help you understand the general
- control and data flow within the backend, from receiving a query to
- sending the results.
- </para>
- <sect1>
- <title>The Path of a Query</title>
- <para>
- Here we give a short overview of the stages a query has to pass in
- order to obtain a result.
- </para>
- <procedure>
- <step>
- <para>
- A connection from an application program to the <productname>Postgres</productname>
- server has to be established. The application program transmits a
- query to the server and receives the results sent back by the server.
- </para>
- </step>
- <step>
- <para>
- The <firstterm>parser stage</firstterm> checks the query
- transmitted by the application
- program (client) for correct syntax and creates
- a <firstterm>query tree</firstterm>.
- </para>
- </step>
- <step>
- <para>
- The <firstterm>rewrite system</firstterm> takes
- the query tree created by the parser stage and looks for
- any <firstterm>rules</firstterm> (stored in the
- <firstterm>system catalogs</firstterm>) to apply to
- the <firstterm>query tree</firstterm> and performs the
- transformations given in the <firstterm>rule bodies</firstterm>.
- One application of the rewrite system is the realization of
- <firstterm>views</firstterm>.
- </para>
- <para>
- Whenever a query against a view
- (i.e. a <firstterm>virtual table</firstterm>) is made,
- the rewrite system rewrites the user's query to
- a query that accesses the <firstterm>base tables</firstterm> given in
- the <firstterm>view definition</firstterm> instead.
- </para>
- </step>
- <step>
- <para>
- The <firstterm>planner/optimizer</firstterm> takes
- the (rewritten) query tree and creates a
- <firstterm>query plan</firstterm> that will be the input to the
- <firstterm>executor</firstterm>.
- </para>
- <para>
- It does so by first creating all possible <firstterm>paths</firstterm>
- leading to the same result. For example, if there is an index on a
- relation to be scanned, there are two paths for the
- scan. One possibility is a simple sequential scan and the other
- possibility is to use the index. Next the cost for the execution of
- each plan is estimated and the
- cheapest plan is chosen and handed back.
- </para>
- </step>
- <step>
- <para>
- The executor recursively steps through
- the <firstterm>plan tree</firstterm> and
- retrieves tuples in the way represented by the plan.
- The executor makes use of the
- <firstterm>storage system</firstterm> while scanning
- relations, performs <firstterm>sorts</firstterm> and <firstterm>joins</firstterm>,
- evaluates <firstterm>qualifications</firstterm> and finally hands back the tuples derived.
- </para>
- </step>
- </procedure>
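- <para>
- The control flow above can be sketched, purely for illustration (the
- real backend is written in C and its data structures are far richer),
- as a chain of toy functions, one per stage:
- </para>

```python
# Illustrative sketch only: these toy functions merely mirror the order
# of the stages named above; the node shapes (dicts) are invented.

def parse(query_string):
    # parser stage: syntax check, build a query tree
    return {"kind": "query_tree", "text": query_string}

def rewrite(query_tree):
    # rewrite system: apply any matching rules (none in this sketch)
    return query_tree

def plan(query_tree):
    # planner/optimizer: pick the cheapest plan for the tree
    return {"kind": "plan", "source": query_tree}

def execute(query_plan):
    # executor: walk the plan and produce tuples
    return [("tuple-1",), ("tuple-2",)]

def process_query(query_string):
    return execute(plan(rewrite(parse(query_string))))
```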
- <para>
- In the following sections we will cover each of the items listed above
- in more detail to give a better understanding of <productname>Postgres</productname>'s internal
- control and data structures.
- </para>
- </sect1>
- <sect1>
- <title>How Connections are Established</title>
- <para>
- <productname>Postgres</productname> is implemented using a simple "process per user"
- client/server model. In this model there is one <firstterm>client process</firstterm>
- connected to exactly one <firstterm>server process</firstterm>.
- As we don't know beforehand
- how many connections will be made, we have to use a <firstterm>master process</firstterm>
- that spawns a new server process every time a connection is
- requested. This master process is called <literal>postmaster</literal> and
- listens at a specified TCP/IP port for incoming connections. Whenever
- a request for a connection is detected the <literal>postmaster</literal> process
- spawns a new server process called <literal>postgres</literal>. The server
- tasks (<literal>postgres</literal> processes) communicate with each other using
- <firstterm>semaphores</firstterm> and <firstterm>shared memory</firstterm>
- to ensure data integrity
- throughout concurrent data access. Figure
- ref{connection} illustrates the interaction of the master process
- <literal>postmaster</literal>, the server process <literal>postgres</literal>, and a client
- application.
- </para>
- <para>
- The client process can either be the <application>psql</application> frontend (for
- interactive SQL queries) or any user application implemented using
- the <filename>libpq</filename> library. Note that applications implemented using
- <application>ecpg</application>
- (the <productname>Postgres</productname> embedded SQL preprocessor for C)
- also use this library.
- </para>
- <para>
- Once a connection is established the client process can send a query
- to the <firstterm>backend</firstterm> (server). The query is transmitted using plain text,
- i.e. there is no parsing done in the <firstterm>frontend</firstterm> (client). The
- server parses the query, creates an <firstterm>execution plan</firstterm>,
- executes the plan and returns the retrieved tuples to the client
- by transmitting them over the established connection.
- </para>
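- <para>
- The process-per-user scheme can be modelled with a toy sketch (real
- <literal>postmaster</literal> uses <literal>fork()</literal> and TCP
- sockets, which this sketch replaces with plain function calls so the
- control flow is easy to follow; the "queries" and results are
- invented):
- </para>

```python
# Toy model of the process-per-user client/server scheme: one new
# backend per incoming connection request.

def backend(client_query):
    # stands in for one spawned "postgres" server process
    return f"result of: {client_query}"

def postmaster(connection_requests):
    results = []
    for query in connection_requests:
        # in reality: fork a new server process for this connection
        results.append(backend(query))
    return results
```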
- <!--
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/connection.ps}
- caption{How a connection is established}
- label{connection}
- end{center}
- end{figure}
- -->
- </sect1>
- <sect1>
- <title>The Parser Stage</title>
- <para>
- The <firstterm>parser stage</firstterm> consists of two parts:
- <itemizedlist>
- <listitem>
- <para>
- The <firstterm>parser</firstterm> defined in
- <filename>gram.y</filename> and <filename>scan.l</filename> is
- built using the UNIX tools <application>yacc</application>
- and <application>lex</application>.
- </para>
- </listitem>
- <listitem>
- <para>
- The <firstterm>transformation process</firstterm> modifies
- and augments the data structures returned by the parser.
- </para>
- </listitem>
- </itemizedlist>
- </para>
- <sect2>
- <title>Parser</title>
- <para>
- The parser has to check the query string (which arrives as
- plain ASCII text) for valid syntax. If the syntax is correct a
- <firstterm>parse tree</firstterm> is built up and handed back; otherwise an error is
- returned. For the implementation the well-known UNIX
- tools <application>lex</application> and <application>yacc</application>
- are used.
- </para>
- <para>
- The <firstterm>lexer</firstterm> is defined in the file
- <filename>scan.l</filename> and is responsible
- for recognizing <firstterm>identifiers</firstterm>,
- the <firstterm>SQL keywords</firstterm> etc. For
- every keyword or identifier that is found, a <firstterm>token</firstterm>
- is generated and handed to the parser.
- </para>
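- <para>
- As a rough sketch of what such a lexer does (the real lexer in
- <filename>scan.l</filename> is generated by <application>lex</application>;
- the token categories and the tiny keyword set here are invented for
- illustration):
- </para>

```python
import re

# Hypothetical mini-lexer: classify each word of the input as an SQL
# keyword, a constant, an operator, or an identifier token.

KEYWORDS = {"select", "from", "where", "and"}

def tokenize(query):
    tokens = []
    for word in re.findall(r"[A-Za-z_][A-Za-z0-9_.]*|[<>=*,;]+|\d+", query):
        if word.lower() in KEYWORDS:
            tokens.append(("KEYWORD", word.lower()))
        elif word[0].isdigit():
            tokens.append(("CONST", word))
        elif re.fullmatch(r"[<>=*,;]+", word):
            tokens.append(("OP", word))
        else:
            tokens.append(("IDENT", word))
    return tokens
```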
- <para>
- The parser is defined in the file <filename>gram.y</filename> and consists of a
- set of <firstterm>grammar rules</firstterm> and <firstterm>actions</firstterm>
- that are executed
- whenever a rule is fired. The code of the actions (which
- is actually C-code) is used to build up the parse tree.
- </para>
- <para>
- The file <filename>scan.l</filename> is transformed to
- the C-source file <filename>scan.c</filename>
- using the program <application>lex</application>
- and <filename>gram.y</filename> is transformed to
- <filename>gram.c</filename> using <application>yacc</application>.
- After these transformations have taken
- place a normal C-compiler can be used to create the
- parser. Never make any changes to the generated C-files as they will
- be overwritten the next time <application>lex</application>
- or <application>yacc</application> is called.
- <note>
- <para>
- The mentioned transformations and compilations are normally done
- automatically using the <firstterm>makefiles</firstterm>
- shipped with the <productname>Postgres</productname>
- source distribution.
- </para>
- </note>
- </para>
- <para>
- A detailed description of <application>yacc</application> or
- the grammar rules given in <filename>gram.y</filename> would be
- beyond the scope of this paper. There are many books and
- documents dealing with <application>lex</application> and
- <application>yacc</application>. You should be familiar with
- <application>yacc</application> before you start to study the
- grammar given in <filename>gram.y</filename> otherwise you won't
- understand what happens there.
- </para>
- <para>
- For a better understanding of the data structures used in
- <productname>Postgres</productname>
- for the processing of a query we use an example to illustrate the
- changes made to these data structures in every stage.
- </para>
- <example id="simple-select">
- <title>A Simple Select</title>
- <para>
- This example contains the following simple query that will be used in
- various descriptions and figures throughout the following
- sections. The query assumes that the tables given in
- <citetitle>The Supplier Database</citetitle>
- <!--
- XXX The above citetitle should really be an xref,
- but that part has not yet been converted from Stefan's original document.
- - thomas 1999-02-11
- <xref linkend="supplier" endterm="supplier">
- -->
- have already been defined.
- <programlisting>
- select s.sname, se.pno
- from supplier s, sells se
- where s.sno > 2 and s.sno = se.sno;
- </programlisting>
- </para>
- </example>
- <para>
- Figure ref{parsetree} shows the <firstterm>parse tree</firstterm> built by the
- grammar rules and actions given in <filename>gram.y</filename> for the query
- given in <xref linkend="simple-select" endterm="simple-select">
- (without the <firstterm>operator tree</firstterm> for
- the <firstterm>where clause</firstterm> which is shown in figure ref{where_clause}
- because there was not enough space to show both data structures in one
- figure).
- </para>
- <para>
- The top node of the tree is a <literal>SelectStmt</literal> node. For every entry
- appearing in the <firstterm>from clause</firstterm> of the SQL query a <literal>RangeVar</literal>
- node is created holding the name of the <firstterm>alias</firstterm> and a pointer to a
- <literal>RelExpr</literal> node holding the name of the <firstterm>relation</firstterm>. All
- <literal>RangeVar</literal> nodes are collected in a list which is attached to the field
- <literal>fromClause</literal> of the <literal>SelectStmt</literal> node.
- </para>
- <para>
- For every entry appearing in the <firstterm>select list</firstterm> of the SQL query a
- <literal>ResTarget</literal> node is created holding a pointer to an <literal>Attr</literal>
- node. The <literal>Attr</literal> node holds the <firstterm>relation name</firstterm> of the entry and
- a pointer to a <literal>Value</literal> node holding the name of the
- <firstterm>attribute</firstterm>.
- All <literal>ResTarget</literal> nodes are collected to a list which is
- connected to the field <literal>targetList</literal> of the <literal>SelectStmt</literal> node.
- </para>
- <para>
- Figure ref{where_clause} shows the operator tree built for the
- where clause of the SQL query given in example
- <xref linkend="simple-select" endterm="simple-select">
- which is attached to the field
- <literal>qual</literal> of the <literal>SelectStmt</literal> node. The top node of the
- operator tree is an <literal>A_Expr</literal> node representing an <literal>AND</literal>
- operation. This node has two successors called <literal>lexpr</literal> and
- <literal>rexpr</literal> pointing to two <firstterm>subtrees</firstterm>. The subtree attached to
- <literal>lexpr</literal> represents the qualification <literal>s.sno > 2</literal> and the one
- attached to <literal>rexpr</literal> represents <literal>s.sno = se.sno</literal>. For every
- attribute an <literal>Attr</literal> node is created holding the name of the
- relation and a pointer to a <literal>Value</literal> node holding the name of the
- attribute. For the constant term appearing in the query a
- <literal>Const</literal> node is created holding the value.
- </para>
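- <para>
- The shape of that operator tree can be sketched with toy node classes
- (the names mirror the nodes described above, but the field layout is
- simplified for illustration and is not the real C structure):
- </para>

```python
from dataclasses import dataclass

# Assumed toy node types, loosely mirroring A_Expr, Attr, and Const.

@dataclass
class Const:
    val: object           # the constant's value

@dataclass
class Attr:
    relname: str          # relation (alias) name
    attname: str          # attribute name (a Value node in the real tree)

@dataclass
class A_Expr:
    opname: str           # e.g. 'AND', '>', '='
    lexpr: object         # left subtree
    rexpr: object         # right subtree

# operator tree for: s.sno > 2 and s.sno = se.sno
where_clause = A_Expr(
    "AND",
    A_Expr(">", Attr("s", "sno"), Const(2)),
    A_Expr("=", Attr("s", "sno"), Attr("se", "sno")),
)
```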
- <!--
- XXX merge in the figures later... - thomas 1999-01-29
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/parsetree.ps}
- caption{{it TargetList} and {it FromList} for query of example ref{simple_select}}
- label{parsetree}
- end{center}
- end{figure}
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/where_clause.ps}
- caption{{it WhereClause} for query of example ref{simple_select}}
- label{where_clause}
- end{center}
- end{figure}
- -->
- </sect2>
- <sect2>
- <title>Transformation Process</title>
- <para>
- The <firstterm>transformation process</firstterm> takes the tree handed back by
- the parser as input and steps recursively through it. If
- a <literal>SelectStmt</literal> node is found, it is transformed
- to a <literal>Query</literal>
- node which will be the top most node of the new data structure. Figure
- ref{transformed} shows the transformed data structure (the part
- for the transformed <firstterm>where clause</firstterm> is given in figure
- ref{transformed_where} because there was not enough space to show all
- parts in one figure).
- </para>
- <para>
- Now a check is made whether the <firstterm>relation names</firstterm> in the
- <firstterm>FROM clause</firstterm> are known to the system. For every relation name
- that is present in the <firstterm>system catalogs</firstterm> a <abbrev>RTE</abbrev> node is
- created containing the relation name, the <firstterm>alias name</firstterm> and
- the <firstterm>relation id</firstterm>. From now on the relation ids are used to
- refer to the <firstterm>relations</firstterm> given in the query. All <abbrev>RTE</abbrev> nodes
- are collected in the <firstterm>range table entry list</firstterm> which is connected
- to the field <literal>rtable</literal> of the <literal>Query</literal> node. If a name of a
- relation that is not known to the system is detected in the query an
- error will be returned and the query processing will be aborted.
- </para>
- <para>
- Next it is checked if the <firstterm>attribute names</firstterm> used are
- contained in the relations given in the query. For every
- attribute that is found a <abbrev>TLE</abbrev> node is created holding a pointer
- to a <literal>Resdom</literal> node (which holds the name of the column) and a
- pointer to a <literal>VAR</literal> node. There are two important numbers in the
- <literal>VAR</literal> node. The field <literal>varno</literal> gives the position of the
- relation containing the current attribute in the range
- table entry list created above. The field <literal>varattno</literal> gives the
- position of the attribute within the relation. If the name
- of an attribute cannot be found an error will be returned and
- the query processing will be aborted.
- </para>
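- <para>
- The name-resolution step just described can be sketched as a lookup in
- the range table entry list (the table contents below are invented for
- the running example; positions are 1-based, as in the real
- <literal>varno</literal> and <literal>varattno</literal> fields):
- </para>

```python
# Sketch: resolve (alias, attribute) to the (varno, varattno) pair
# stored in a VAR node; raise an error for unknown names.

range_table = [                      # RTE list built from the from clause
    {"alias": "s",  "relname": "supplier", "attrs": ["sno", "sname", "city"]},
    {"alias": "se", "relname": "sells",    "attrs": ["sno", "pno"]},
]

def make_var(alias, attname):
    for varno, rte in enumerate(range_table, start=1):
        if rte["alias"] == alias:
            if attname not in rte["attrs"]:
                raise LookupError(f"attribute {attname} not found")
            return {"varno": varno,
                    "varattno": rte["attrs"].index(attname) + 1}
    raise LookupError(f"relation {alias} not found")
```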
- <!--
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/transform.ps}
- caption{Transformed {it TargetList} and {it FromList} for query of example ref{simple_select}}
- label{transformed}
- end{center}
- end{figure}
- noindent Figure ref{transformed_where} shows the transformed {it where
- clause}. Every {tt A_Expr} node is transformed to an {tt Expr}
- node. The {tt Attr} nodes representing the attributes are replaced by
- {tt VAR} nodes as it has been done for the {it targetlist}
- above. Checks if the appearing {it attributes} are valid and known to the
- system are made. If there is an {it attribute} used which
- is not known an error will be returned and the {it query
- processing} will be aborted. \
- \
- The whole {it transformation process} performs a transformation of
- the data structure handed back by the {it parser} to a more
- comfortable form. The character strings representing the {it
- relations} and {it attributes} in the original tree are replaced by
- {it relation ids} and {tt VAR} nodes whose fields are referring to
- the entries of the {it range table entry list}. In addition to the
- transformation, also various checks if the used {it relation}
- and {it attribute} names (appearing in the query) are valid in the
- current context are performed.
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/transform_where.ps}
- caption{Transformed {it where clause} for query of example ref{simple_select}}
- label{transformed_where}
- end{center}
- end{figure}
- pagebreak
- clearpage
- begin{figure}[ht]
- begin{center}
- epsfig{figure=figures/plan.ps}
- caption{{it Plan} for query of example ref{simple_select}}
- label{plan}
- end{center}
- end{figure}
- -->
- </sect2>
- </sect1>
- <sect1>
- <title>The <productname>Postgres</productname> Rule System</title>
- <para>
- <productname>Postgres</productname> supports a powerful
- <firstterm>rule system</firstterm> for the specification
- of <firstterm>views</firstterm> and ambiguous <firstterm>view updates</firstterm>.
- Originally the <productname>Postgres</productname>
- rule system consisted of two implementations:
- <itemizedlist>
- <listitem>
- <para>
- The first one worked using <firstterm>tuple level</firstterm> processing and was
- implemented deep in the <firstterm>executor</firstterm>. The rule system was
- called whenever an individual tuple had been accessed. This
- implementation was removed in 1995 when the last official release
- of the <productname>Postgres</productname> project was transformed into
- <productname>Postgres95</productname>.
- </para>
- </listitem>
- <listitem>
- <para>
- The second implementation of the rule system is a technique
- called <firstterm>query rewriting</firstterm>.
- The <firstterm>rewrite system</firstterm> is a module
- that exists between the <firstterm>parser stage</firstterm> and the
- <firstterm>planner/optimizer</firstterm>. This technique is still implemented.
- </para>
- </listitem>
- </itemizedlist>
- </para>
- <para>
- For information on the syntax and creation of rules in the
- <productname>Postgres</productname> system refer to
- <citetitle>The PostgreSQL User's Guide</citetitle>.
- </para>
- <sect2>
- <title>The Rewrite System</title>
- <para>
- The <firstterm>query rewrite system</firstterm> is a module between
- the parser stage and the planner/optimizer. It processes the tree handed
- back by the parser stage (which represents a user query) and if
- there is a rule present that has to be applied to the query it
- rewrites the tree to an alternate form.
- </para>
- <sect3 id="view-impl">
- <title>Techniques To Implement Views</title>
- <para>
- Now we will sketch the algorithm of the query rewrite system. For
- better illustration we show how to implement views using rules
- as an example.
- </para>
- <para>
- Let the following rule be given:
- <programlisting>
- create rule view_rule
- as on select
- to test_view
- do instead
- select s.sname, p.pname
- from supplier s, sells se, part p
- where s.sno = se.sno and
- p.pno = se.pno;
- </programlisting>
- </para>
- <para>
- The given rule will be <firstterm>fired</firstterm> whenever a select
- against the relation <literal>test_view</literal> is detected. Instead of
- selecting the tuples from <literal>test_view</literal> the select statement
- given in the <firstterm>action part</firstterm> of the rule is executed.
- </para>
- <para>
- Let the following user-query against <literal>test_view</literal> be given:
- <programlisting>
- select sname
- from test_view
- where sname &lt;&gt; 'Smith';
- </programlisting>
- </para>
- <para>
- Here is a list of the steps performed by the query rewrite
- system whenever a user-query against <literal>test_view</literal> appears. (The
- following listing is a very informal description of the algorithm just
- intended for basic understanding. For a detailed description refer
- to <xref linkend="STON89-full" endterm="STON89">).
- </para>
- <procedure>
- <title><literal>test_view</literal> Rewrite</title>
- <step>
- <para>
- Take the query given in the action part of the rule.
- </para>
- </step>
- <step>
- <para>
- Adapt the targetlist to meet the number and order of
- attributes given in the user-query.
- </para>
- </step>
- <step>
- <para>
- Add the qualification given in the where clause of the
- user-query to the qualification of the query given in the
- action part of the rule.
- </para>
- </step>
- </procedure>
- <para>
- Given the rule definition above, the user-query will be
- rewritten to the following form (Note that the rewriting is done on
- the internal representation of the user-query handed back by the
- parser stage but the derived new data structure will represent the following
- query):
- <programlisting>
- select s.sname
- from supplier s, sells se, part p
- where s.sno = se.sno and
- p.pno = se.pno and
- s.sname &lt;&gt; 'Smith';
- </programlisting>
- </para>
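- <para>
- The three rewrite steps can be sketched on dict-shaped stand-ins for
- query trees (these are not real <productname>Postgres</productname>
- nodes, and the string-based targetlist matching below is a gross
- simplification of what the rewriter really does):
- </para>

```python
# Very informal sketch of the view rewrite listed above.

view_rule_action = {
    "targetlist": ["s.sname", "p.pname"],
    "fromClause": ["supplier s", "sells se", "part p"],
    "qual": ["s.sno = se.sno", "p.pno = se.pno"],
}

def rewrite_view_query(user_query, rule_action):
    return {
        # step 1: start from the query in the rule's action part
        "fromClause": list(rule_action["fromClause"]),
        # step 2: adapt the targetlist to the user query's columns
        "targetlist": [t for t in rule_action["targetlist"]
                       if t.split(".")[1] in user_query["targetlist"]],
        # step 3: append the user query's qualification
        "qual": rule_action["qual"] + user_query["qual"],
    }

user_query = {"targetlist": ["sname"], "qual": ["s.sname <> 'Smith'"]}
```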
- </sect3>
- </sect2>
- </sect1>
- <sect1>
- <title>Planner/Optimizer</title>
- <para>
- The task of the <firstterm>planner/optimizer</firstterm> is to create an optimal
- execution plan. It first combines all possible ways of
- <firstterm>scanning</firstterm> and <firstterm>joining</firstterm>
- the relations that appear in a
- query. All the created paths lead to the same result and it's the
- task of the optimizer to estimate the cost of executing each path and
- find out which one is the cheapest.
- </para>
- <sect2>
- <title>Generating Possible Plans</title>
- <para>
- The planner/optimizer decides which plans should be generated
- based upon the types of indices defined on the relations appearing in
- a query. There is always the possibility of performing a
- sequential scan on a relation, so a plan using only
- sequential scans is always created. Assume an index is defined on a
- relation (for example a B-tree index) and a query contains the
- restriction
- <literal>relation.attribute OPR constant</literal>. If
- <literal>relation.attribute</literal> happens to match the key of the B-tree
- index and <literal>OPR</literal> is anything but '&lt;&gt;', another plan is created using
- the B-tree index to scan the relation. If there are further indices
- present and the restrictions in the query happen to match a key of an
- index further plans will be considered.
- </para>
- <para>
- After all feasible plans have been found for scanning single
- relations, plans for joining relations are created. The
- planner/optimizer considers joins only between pairs of relations for
- which there exists a corresponding join clause (i.e. for which a
- restriction like <literal>where rel1.attr1=rel2.attr2</literal> exists) in the
- where qualification. All possible plans are generated for every
- join pair considered by the planner/optimizer. The three possible join
- strategies are:
- <itemizedlist>
- <listitem>
- <para>
- <firstterm>nested iteration join</firstterm>: The right relation is scanned
- once for every tuple found in the left relation. This strategy
- is easy to implement but can be very time consuming.
- </para>
- </listitem>
- <listitem>
- <para>
- <firstterm>merge sort join</firstterm>: Each relation is sorted on the join
- attributes before the join starts. Then the two relations are
- merged together taking into account that both relations are
- ordered on the join attributes. This kind of join is more
- attractive because every relation has to be scanned only once.
- </para>
- </listitem>
- <listitem>
- <para>
- <firstterm>hash join</firstterm>: the right relation is first hashed on its
- join attributes. Next the left relation is scanned and the
- appropriate values of every tuple found are used as hash keys to
- locate the tuples in the right relation.
- </para>
- </listitem>
- </itemizedlist>
- </para>
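- <para>
- Two of these strategies are easy to sketch on lists of dict-shaped
- tuples joined on equality of a single attribute (the data and helper
- names are invented; the real executor nodes are far more general):
- </para>

```python
# Toy equality joins illustrating nested iteration vs. hash join.

def nested_loop_join(left, right, attr):
    # the right relation is rescanned once per tuple of the left relation
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]

def hash_join(left, right, attr):
    # hash the right relation once, then probe while scanning the left
    buckets = {}
    for r in right:
        buckets.setdefault(r[attr], []).append(r)
    return [{**l, **r} for l in left for r in buckets.get(l[attr], [])]
```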
- </sect2>
- <sect2>
- <title>Data Structure of the Plan</title>
- <para>
- Here we will give a brief description of the nodes appearing in the
- plan. Figure ref{plan} shows the plan produced for the query in
- example ref{simple_select}.
- </para>
- <para>
- The top node of the plan is a <literal>MergeJoin</literal> node which has two
- successors, one attached to the field <literal>lefttree</literal> and the second
- attached to the field <literal>righttree</literal>. Each of the subnodes represents
- one relation of the join. As mentioned above a merge sort
- join requires each relation to be sorted. That's why we find
- a <literal>Sort</literal> node in each subplan. The additional qualification given
- in the query (<literal>s.sno > 2</literal>) is pushed down as far as possible and is
- attached to the <literal>qpqual</literal> field of the leaf <literal>SeqScan</literal> node of
- the corresponding subplan.
- </para>
- <para>
- The list attached to the field <literal>mergeclauses</literal> of the
- <literal>MergeJoin</literal> node contains information about the join attributes.
- The values <literal>65000</literal> and <literal>65001</literal>
- for the <literal>varno</literal> fields in the
- <literal>VAR</literal> nodes appearing in the <literal>mergeclauses</literal> list (and also in the
- <literal>targetlist</literal>) mean that the tuples of the current node should not be
- used; instead, the tuples of the next "deeper" nodes (i.e. the top
- nodes of the subplans) should be used.
- </para>
- <para>
- Note that every <literal>Sort</literal> and <literal>SeqScan</literal> node appearing in figure
- ref{plan} has a <literal>targetlist</literal>, but for lack of space
- only the one for the <literal>MergeJoin</literal> node is drawn.
- </para>
- <para>
- Another task performed by the planner/optimizer is fixing the
- <firstterm>operator ids</firstterm> in the <literal>Expr</literal>
- and <literal>Oper</literal> nodes. As
- mentioned earlier, <productname>Postgres</productname> supports a variety of different data
- types and even user defined types can be used. To be able to maintain
- the huge amount of functions and operators it is necessary to store
- them in a system table. Each function and operator gets a unique
- operator id. According to the types of the attributes used
- within the qualifications etc., the appropriate operator ids
- have to be used.
- </para>
- </sect2>
- </sect1>
- <sect1>
- <title>Executor</title>
- <para>
- The <firstterm>executor</firstterm> takes the plan handed back by the
- planner/optimizer and starts processing the top node. In the case of
- our example (the query given in example ref{simple_select}) the top
- node is a <literal>MergeJoin</literal> node.
- </para>
- <para>
- Before any merge can be done two tuples have to be fetched (one from
- each subplan). So the executor recursively calls itself to
- process the subplans (it starts with the subplan attached to
- <literal>lefttree</literal>). The new top node (the top node of the left subplan) is a
- <literal>SeqScan</literal> node and again a tuple has to be fetched before the node
- itself can be processed. The executor calls itself recursively
- another time for the subplan attached to <literal>lefttree</literal> of the
- <literal>SeqScan</literal> node.
- </para>
- <para>
- Now the new top node is a <literal>Sort</literal> node. As a sort has to be done on
- the whole relation, the executor starts fetching tuples
- from the <literal>Sort</literal> node's subplan and sorts them into a temporary
- relation (in memory or a file) when the <literal>Sort</literal> node is visited for
- the first time. (Further examinations of the <literal>Sort</literal> node will
- always return just one tuple from the sorted temporary
- relation.)
- </para>
- <para>
- Every time the processing of the <literal>Sort</literal> node needs a new tuple the
- executor is recursively called for the <literal>SeqScan</literal> node
- attached as subplan. The relation (internally referenced by the
- value given in the <literal>scanrelid</literal> field) is scanned for the next
- tuple. If the tuple satisfies the qualification given by the tree
- attached to <literal>qpqual</literal> it is handed back, otherwise the next tuple
- is fetched until the qualification is satisfied. If the last tuple of
- the relation has been processed a <literal>NULL</literal> pointer is
- returned.
- </para>
- <para>
- After a tuple has been handed back by the <literal>lefttree</literal> of the
- <literal>MergeJoin</literal> the <literal>righttree</literal> is processed in the same way. If both
- tuples are present the executor processes the <literal>MergeJoin</literal>
- node. Whenever a new tuple from one of the subplans is needed a
- recursive call to the executor is performed to obtain it. If a
- joined tuple could be created it is handed back and one complete
- processing of the plan tree has finished.
- </para>
- <para>
- Now the described steps are performed once for every tuple, until a
- <literal>NULL</literal> pointer is returned for the processing of the
- <literal>MergeJoin</literal> node, indicating that we are finished.
- </para>
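- <para>
- This demand-driven, recursive processing can be sketched with
- generator-based toy nodes (the node shapes are invented stand-ins;
- only the <literal>Sort</literal>-over-<literal>SeqScan</literal> part
- of the example plan is modelled):
- </para>

```python
# Sketch: each plan node pulls tuples from its subplan on demand.

def exec_seqscan(node):
    # yield tuples of the scanned relation that satisfy qpqual
    for tup in node["relation"]:
        if node["qpqual"](tup):
            yield tup

def exec_sort(node):
    # first visit sorts the whole subplan output; afterwards tuples are
    # returned one at a time from the sorted result
    yield from sorted(exec_node(node["subplan"]), key=node["key"])

def exec_node(node):
    return {"SeqScan": exec_seqscan, "Sort": exec_sort}[node["type"]](node)

plan = {
    "type": "Sort",
    "key": lambda t: t[0],
    "subplan": {
        "type": "SeqScan",
        "relation": [(3, "b"), (1, "a"), (2, "c")],
        "qpqual": lambda t: t[0] > 1,     # an "s.sno > 2"-style filter
    },
}
```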
- </sect1>
- <!--
- **********************************************************
- **********************************************************
- **********************************************************
- **********************************************************
- **********************************************************
- pagebreak
- clearpage
- section{The Realization of the Having Clause}
- label{having_impl}
- The {it having clause} has been designed in SQL to be able to use the
- results of {it aggregate functions} within a query qualification. The
- handling of the {it having clause} is very similar to the handling of
- the {it where clause}. Both are formulas in first order logic (FOL)
- that have to evaluate to true for a certain object to be handed back:
- %
- begin{itemize}
- <step> The formula given in the {it where clause} is evaluated for
- every tuple. If the evaluation returns {tt true} the tuple is
- returned, every tuple not satisfying the qualification is ignored.
- %
- <step> In the case of {it groups} the {it having clause} is evaluated
- for every group. If the evaluation returns {tt true} the group is
- taken into account otherwise it is ignored.
- end{itemize}
- %
- subsection{How Aggregate Functions are Implemented}
- label{aggregates}
- Before we can describe how the {it having clause} is implemented we
- will have a look at the implementation of {it aggregate functions} as
- long as they just appear in the {it targetlist}. Note that {it aggregate
- functions} are applied to groups so the query must contain a {it
- group clause}.
- %
- begin{example}
- label{having}
- Here is an example of the usage of the {it aggregate
- function} {tt count} which counts the number of part numbers ({tt
- pno}) of every group. (The table {tt sells} is defined in example
- ref{supplier}.)
- %
- begin{verbatim}
- select sno, count(pno)
- from sells
- group by sno;
- end{verbatim}
- %
- end{example}
- %
- A query like the one in example ref{having} is processed by the usual
- stages:
- %
- begin{itemize}
- <step> the parser stage
- <step> the rewrite system
- <step> the planner/optimizer
- <step> the executor
- end{itemize}
- %
- and in the following sections we will describe what every stage does
- to the query in order to obtain the appropriate result.
- \subsubsection{The Parser Stage}
- \begin{figure}[ht]
- \begin{center}
- \epsfig{figure=figures/parse_having.ps}
- \caption{{\it Querytree} built up for the query of example \ref{having}}
- \label{parse_having}
- \end{center}
- \end{figure}
- The parser stage builds up a {\it querytree} containing the {\it
- where} qualification and information about the {\it grouping} that has
- to be done (i.e. a list of all attributes to group by is attached to
- the field {\tt groupClause}). The main difference to {\it querytrees}
- built up for queries without {\it aggregate functions} is given in the
- field {\tt hasAggs}, which is set to {\tt true}, and in the {\it
- targetlist}. The {\tt expr} field of the second {\tt TLE} node of the
- {\it targetlist} shown in figure \ref{parse_having} does not point
- directly to a {\tt VAR} node but to an {\tt Aggreg} node representing
- the {\it aggregate function} used in the query.
- A check is made that every attribute grouped by appears only without
- an {\it aggregate function} in the {\it targetlist} and that every
- attribute which appears without an {\it aggregate function} in the
- {\it targetlist} is grouped by.
- %
- \pagebreak
- \subsubsection{The Rewrite System}
- The rewrite system does not make any changes to the {\it querytree}
- as long as the query involves just {\it base tables}. If any {\it
- views} are present the query is rewritten to access the tables
- specified in the {\it view definition}.
- %
- \subsubsection{Planner/Optimizer}
- Whenever an {\it aggregate function} is involved in a query (which is
- indicated by the {\tt hasAggs} flag being set to {\tt true}) the planner
- creates a {\it plantree} whose top node is an {\tt AGG} node. The {\it
- targetlist} is searched for {\it aggregate functions} and for every
- function that is found, a pointer to the corresponding {\tt Aggreg}
- node is added to a list which is finally attached to the field {\tt aggs} of
- the {\tt AGG} node. This list is needed by the {\it executor} to know which
- {\it aggregate functions} are present and have to be
- handled.
- The {\tt AGG} node is followed by a {\tt GRP} node. The implementation
- of the {\it grouping} logic needs a sorted table for its operation so
- the {\tt GRP} node is followed by a {\tt SORT} node. The {\tt SORT}
- operation gets its tuples from a kind of {\tt Scan} node (if no
- indices are present this will be a simple {\tt SeqScan} node). Any
- qualifications present are attached to the {\tt Scan} node. Figure
- \ref{plan_having} shows the {\it plan} created for the query given in
- example \ref{having}.
- \begin{figure}[ht]
- \begin{center}
- \epsfig{figure=figures/plan_having.ps}
- \caption{{\it Plantree} for the query of example \ref{having}}
- \label{plan_having}
- \end{center}
- \end{figure}
- Note that every node has its own {\it targetlist} which may differ from the
- one of the node above or below. The field {\tt varattno} of every
- {\tt VAR} node included in a {\it targetlist} contains a number representing
- the position of the attribute's value in the tuple of the current node.
- \pagebreak
- \clearpage
- \subsubsection{Executor}
- The {\it executor} uses the function {\tt execAgg()} to execute {\tt AGG}
- nodes. As described earlier it uses one main function {\tt
- ExecProcNode} which is called recursively to execute subtrees. The
- following steps are performed by {\tt execAgg()}:
- %
- \begin{itemize}
- %
- \item The list attached to the field {\tt aggs} of the {\tt AGG} node is
- examined and for every {\it aggregate function} included the {\it transition
- functions} are fetched from a {\it function table}. Calculating the
- value of an {\it aggregate function} is done using three functions:
- %
- \begin{itemize}
- \item The {\it first transition function} {\tt xfn1} is called with the
- current value of the attribute the {\it aggregate function} is applied
- to and changes its internal state using the attribute's value given as
- an argument.
- %
- \item The {\it second transition function} {\tt xfn2} is called without any
- arguments and changes its internal state only according to internal rules.
- %
- \item The {\it final function} {\tt finalfn} takes the final states of {\tt
- xfn1} and {\tt xfn2} as arguments and finishes the {\it aggregation}.
- \end{itemize}
- %
- \begin{example} Recall the functions necessary to implement the
- {\it aggregate function} {\tt avg}, which builds the average over all
- values of an attribute in a group (see section \ref{ext_agg} {\it
- Extending Aggregates}):
- %
- \begin{itemize}
- %
- \item The first transition function {\tt xfn1} has to be a function that
- takes the actual value of the attribute {\tt avg} is applied to as an
- argument and adds it to the internally stored sum of previous
- calls.
- %
- \item The second transition function {\tt xfn2} only increases an internal
- counter every time it is called.
- %
- \item The final function {\tt finalfn} divides the result of {\tt xfn1} by
- the counter of {\tt xfn2} to calculate the average.
- %
- \end{itemize}
- %
- \end{example}
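The three functions from the example can be sketched in plain C. This is a minimal sketch for a single {\tt avg} over {\tt double} values; the structure {\tt AvgState} and the function names are illustrative, not the actual backend definitions (the real executor keeps comparable per-aggregate state in the array built by {\tt execAgg()}):

```c
#include <assert.h>

/* Illustrative per-aggregate state for "avg"; names are ours,
 * not the backend's. */
typedef struct AvgState {
    double sum;   /* accumulated by xfn1 */
    long   count; /* accumulated by xfn2 */
} AvgState;

/* xfn1: fold the current attribute value into the running sum. */
static void avg_xfn1(AvgState *state, double value)
{
    state->sum += value;
}

/* xfn2: called without any data argument; only advances its
 * internal state (here: a counter). */
static void avg_xfn2(AvgState *state)
{
    state->count++;
}

/* finalfn: combine the final states of xfn1 and xfn2 into the
 * aggregate result. */
static double avg_finalfn(const AvgState *state)
{
    return state->sum / (double) state->count;
}
```

Calling {\tt avg\_xfn1} and {\tt avg\_xfn2} once per tuple of a group and then {\tt avg\_finalfn} once per group reproduces the division of labour described above.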
- %
- Note that {\tt xfn2} and {\tt finalfn} may be absent (e.g. for the
- {\it aggregate function} {\tt sum}, which simply sums up all values of
- the given attribute within a group). \\
- \\
- {\tt execAgg()} creates an array containing one entry for every {\it
- aggregate function} found in the list attached to the field {\tt
- aggs}. The array will hold information needed for the execution of
- every {\it aggregate function} (including the {\it transition functions}
- described above).
- %
- \item The following steps are executed in a loop as long as there are
- still tuples returned by the subplan (i.e. as long as there are still
- tuples left in the current group). When there are no tuples left
- in the group a {\tt NULL} pointer is returned indicating the end of the
- group.
- %
- \begin{itemize}
- \item A new tuple from the subplan (i.e. the {\it plan} attached to the
- field {\tt lefttree}) is fetched by recursively calling {\tt
- ExecProcNode()} with the subplan as argument.
- %
- \item For every {\it aggregate function} (contained in the array created
- before) apply the transition functions {\tt xfn1} and {\tt xfn2} to
- the values of the appropriate attributes of the current tuple.
- \end{itemize}
- %
- \item When we get here, all tuples of the current group have been
- processed and the {\it transition functions} of all {\it aggregate
- functions} have been applied to the values of the attributes. We are
- now ready to complete the {\it aggregation} by applying the {\it final
- function} ({\tt finalfn}) for every {\it aggregate function}.
- %
- \item Store the tuple containing the new values (the results of the
- {\it aggregate functions}) and hand it back.
- \end{itemize}
- %
- Note that the procedure described above only returns one tuple
- (i.e. it processes just one group and when the end of the group is
- detected it processes the {\it aggregate functions} and hands back one
- tuple). To retrieve all tuples (i.e. to process all groups) the
- function {\tt execAgg()} has to be called (returning a new tuple every
- time) until it returns a {\tt NULL} pointer indicating that there are
- no groups left to process.
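This group-at-a-time protocol can be sketched as a self-contained C fragment for {\tt count(pno)} over input already sorted by {\tt sno}. All names here are illustrative; the real executor pulls its tuples via {\tt ExecProcNode()} rather than from an array:

```c
#include <assert.h>
#include <stddef.h>

/* Input tuples, as delivered by the SORT/GRP nodes below AGG. */
typedef struct { int sno; int pno; } Tuple;
typedef struct { int sno; int count; } AggResult;

/* Each call consumes exactly one group and fills *out; it returns 0
 * when no groups are left (the NULL-pointer case in the text). */
static int exec_agg_count(const Tuple *input, int ntuples,
                          int *pos, AggResult *out)
{
    if (*pos >= ntuples)
        return 0;                 /* no groups left to process */
    out->sno = input[*pos].sno;
    out->count = 0;
    while (*pos < ntuples && input[*pos].sno == out->sno) {
        out->count++;             /* per-tuple transition work */
        (*pos)++;
    }
    return 1;                     /* hand back one result tuple */
}
```

Driving {\tt exec\_agg\_count()} in a loop until it returns 0 corresponds to calling {\tt execAgg()} once per group until the {\tt NULL} pointer is returned.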
- \subsection{How the Having Clause is Implemented}
- The basic idea of the implementation is to attach the {\it operator tree}
- built for the {\it having clause} to the field {\tt qpqual} of the
- {\tt AGG} node (which is the top node of the query plan). Now the executor
- has to evaluate the new {\it operator tree} attached to {\tt qpqual} for
- every group processed. If the evaluation returns {\tt true} the group
- is taken into account, otherwise it is ignored and the next group will
- be examined. \\
- \\
- In order to implement the {\it having clause} a variety of changes
- have been made to the following stages:
- %
- \begin{itemize}
- \item The {\it parser stage} has been modified slightly to build up and
- transform an {\it operator tree} for the {\it having clause}.
- %
- \item The {\it rewrite system} has been adapted to be able to use {\it
- views} with the {\it having logic}.
- %
- \item The {\it planner/optimizer} now takes the {\it operator tree} of
- the {\it having clause} and attaches it to the {\tt AGG} node (which
- is the top node of the {\it queryplan}).
- %
- \item The {\it executor} has been modified to evaluate the {\it operator tree}
- (i.e. the internal representation of the {\it having
- qualification}) attached to the {\tt AGG} node; the results of the
- {\it aggregation} are only considered if the evaluation returns {\tt true}.
- \end{itemize}
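The executor-side part of this idea can be illustrated with a small sketch: after each group has been aggregated, a predicate standing in for the {\tt qpqual} operator tree decides whether the group's result is kept. The structures and the hard-wired predicate are ours, not the backend's:

```c
#include <assert.h>

/* One aggregated group, e.g. the output of the AGG node. */
typedef struct { int sno; int count; } GroupResult;

/* Stand-in for evaluating the operator tree attached to qpqual,
 * here hard-wired to "count(pno) > 1". */
static int having_qual(const GroupResult *group)
{
    return group->count > 1;
}

/* Keep only groups satisfying the having qualification. */
static int apply_having(const GroupResult *in, int ngroups,
                        GroupResult *out)
{
    int kept = 0;
    for (int i = 0; i < ngroups; i++) {
        if (having_qual(&in[i]))
            out[kept++] = in[i];  /* group is taken into account */
        /* else: group is ignored, the next group is examined */
    }
    return kept;
}
```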
- %
- In the following sections we will describe the changes made to every
- single stage in detail.
- \subsubsection{The Parser Stage}
- The grammar rules of the {\it parser} defined in {\tt gram.y} did not
- require any changes (i.e. the rules had already been prepared for the
- {\it having clause}). The {\it operator tree} built up for the {\it having
- clause} is attached to the field {\tt havingClause} of the {\tt
- SelectStmt} node handed back by the {\it parser}. \\
- \\
- The {\it transformation} procedures applied to the tree handed back by
- the {\it parser} transform the {\it operator tree} attached to the field
- {\tt havingClause} using exactly the same functions used for the
- transformation of the {\it operator tree} for the {\it where clause}. This
- is possible because both trees are built up by the same grammar rules
- of the {\it parser} and are therefore compatible. Additional checks
- which make sure that the {\it having clause} involves at least one
- {\it aggregate function} etc. are performed later, in
- the {\it planner/optimizer} stage. \\
- \\
- The necessary changes have been applied to the following functions
- included in the file {\tt
- $\ldots$/src/backend/parser/analyze.c}. Note that only the relevant
- parts of the affected code are presented instead of the whole
- functions. Every added source line is marked by a {\tt '+'} at the
- beginning of the line and every changed source line is marked by
- a {\tt '!'} throughout the following code listings. Whenever a part of
- the code which is not relevant at the moment is skipped, three
- vertical dots are inserted instead.
- %
- \pagebreak
- %
- \begin{itemize}
- \item {\tt transformInsertStmt()} \\
- This function is invoked every time an SQL {\tt insert}
- statement involving a {\tt select} is used, as the following example
- illustrates:
- %
- \begin{verbatim}
- insert into t2
- select x, y
- from t1;
- \end{verbatim}
- %
- Two statements have been added to this function. The first one
- performs the transformation of the {\it operator tree} attached to the
- field {\tt havingClause} using the function {\tt
- transformWhereClause()}, as done for the {\it where clause}. It is
- possible to use the same function for both clauses because they are
- both built up by the same {\it grammar rules} given in {\tt gram.y}
- and are therefore compatible.
- The second statement makes sure that {\it aggregate functions} are
- involved in the query whenever a {\it having clause} is used;
- otherwise the query could have been formulated using only a {\it where
- clause}.
- static Query *
- transformInsertStmt(ParseState *pstate,
- InsertStmt *stmt)
- {
- /* make a new query tree */
- Query *qry = makeNode(Query);
- .
- .
- .
- /* fix where clause */
- qry->qual = transformWhereClause(pstate,
- stmt->whereClause);
- + /* The havingQual has a similar meaning as "qual" in
- + * the where statement. So we can easily use the
- + * code from the "where clause" with some additional
- + * traversals done in .../optimizer/plan/planner.c
- + */
- + qry->havingQual = transformWhereClause(pstate,
- + stmt->havingClause);
- .
- .
- .
- + /* If there is a havingQual but there are no
- + * aggregates, then there is something wrong with
- + * the query because having must contain aggregates
- + * in its expressions! Otherwise the query could
- + * have been formulated using the where clause.
- + */
- + if((qry->hasAggs == false) &&
- + (qry->havingQual != NULL))
- + {
- + elog(ERROR,"This is not a valid having query!");
- + return (Query *)NIL;
- + }
- return (Query *) qry;
- }
- \end{verbatim}
- %
- \item {\tt transformSelectStmt()} \\
- Exactly the same statements that were added to the function {\tt
- transformInsertStmt()} above have been added here as well.
- %
- \begin{verbatim}
- static Query *
- transformSelectStmt(ParseState *pstate,
- SelectStmt *stmt)
- {
- Query *qry = makeNode(Query);
- qry->commandType = CMD_SELECT;
- .
- .
- .
- qry->qual = transformWhereClause(pstate,
- stmt->whereClause);
- + /* The havingQual has a similar meaning as "qual" in
- + * the where statement. So we can easily use the
- + * code from the "where clause" with some additional
- + * traversals done in .../optimizer/plan/planner.c
- + */
- + qry->havingQual = transformWhereClause(pstate,
- + stmt->havingClause);
- .
- .
- .
- + /* If there is a havingQual but there are no
- + * aggregates, then there is something wrong with
- + * the query because having must contain aggregates
- + * in its expressions! Otherwise the query could
- + * have been formulated using the where clause.
- + */
- + if((qry->hasAggs == false) &&
- + (qry->havingQual != NULL))
- + {
- + elog(ERROR,"This is not a valid having query!");
- + return (Query *)NIL;
- + }
- return (Query *) qry;
- }
- \end{verbatim}
- %
- \end{itemize}
- \subsubsection{The Rewrite System}
- This section describes the changes to the {\it rewrite system} of
- <productname>Postgres</productname> that have been necessary to support the use of {\it views}
- within queries using a {\it having clause} and to support the
- definition of {\it views} by queries using a {\it having clause}.
- As described in section \ref{view_impl} {\it Techniques To Implement
- Views} a query rewrite technique is used to implement {\it
- views}. There are two cases to be handled within the {\it rewrite
- system} as far as the {\it having clause} is concerned:
- %
- \pagebreak
- %
- \begin{itemize}
- \item The {\it view definition} does not contain a {\it having clause}
- but the queries evaluated against this view may contain {\it having
- clauses}.
- \item The {\it view definition} contains a {\it having clause}. In this
- case queries evaluated against this view must meet some
- restrictions, as we will describe later.
- \end{itemize}
- %
- \paragraph{No having clause in the view definition:} First we will
- look at the changes necessary to support queries using a
- {\it having clause} against a {\it view} defined without
- a {\it having clause}. \\
- \\
- Let the following view definition be given:
- %
- \begin{verbatim}
- create view test_view
- as select sno, pno
- from sells
- where sno > 2;
- \end{verbatim}
- %
- and the following query made against <literal>test_view</literal>:
- %
- \begin{verbatim}
- select *
- from test_view
- where sno <> 5;
- \end{verbatim}
- %
- The query will be rewritten to:
- %
- \begin{verbatim}
- select sno, pno
- from sells
- where sno > 2 and
- sno <> 5;
- \end{verbatim}
- %
- The query given in the definition of the {\it view} <literal>test_view</literal>
- is the {\it backbone} of the rewritten query. The {\it targetlist} is taken
- from the user's query, and the {\it where qualification} of the
- user's query is added to the {\it where qualification} of the new
- query using an {\tt AND} operation. \\
- \\
- Now consider the following query:
- %
- \begin{verbatim}
- select sno, count(pno)
- from test_view
- where sno <> 5
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- Now it is no longer sufficient to add just the {\it where
- clause} and the {\it targetlist} of the user's query to the new query. The
- {\it group clause} and the {\it having qualification} also have to be
- added to the rewritten query:
- %
- \begin{verbatim}
- select sno, count(pno)
- from sells
- where sno > 2 and
- sno <> 5
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- \pagebreak
- \noindent Several changes that have already been applied to the {\it
- targetlist} and the {\it where clause} also have to be applied to the
- {\it having clause}. Here is a collection of all {\it additional} steps that
- have to be performed in order to rewrite a query using a {\it having
- clause} against a simple {\it view} (i.e. a {\it view} whose
- definition does not use any {\it group} and {\it having clauses}):
- %
- \begin{itemize}
- %
- \item Rewrite the subselects contained in the {\it having clause} if any are
- present.
- %
- \item Adapt the {\tt varno} and {\tt varattno} fields of all {\tt
- VAR} nodes contained in the {\it operator tree} representing the {\it
- having clause} in the same way as has been done for the tree
- representing the {\it where clause}. The {\tt varno} fields are changed
- to use the {\it base tables} given in the {\it view definition} (which
- have been inserted into the {\it range table entry list} in the
- meantime) instead of the {\it virtual tables}. The positions of
- the attributes used in the {\it view} may differ from the positions of
- the corresponding attributes in the {\it base tables}. That's why the
- {\tt varattno} fields also have to be adapted.
- %
- \item Adapt the {\tt varno} and {\tt varattno} fields of all {\tt
- VAR} nodes contained in the {\tt groupClause} of the user's query in
- the same way and for the reasons described above.
- %
- \item Attach the tree representing the {\it having qualification} (which is
- currently attached to the field {\tt havingClause} of the {\tt Query}
- node for the user's query) to the field {\tt havingClause} of the {\tt
- Query} node for the new (rewritten) query.
- %
- \item Attach the list representing the {\it group clause} (currently
- attached to the field {\tt groupClause} of the {\tt Query} node for
- the user's query) to the field {\tt groupClause} of the node for the
- new (rewritten) query.
- %
- \end{itemize}
- \paragraph{The view definition contains a having clause:} Now we will
- look at the problems that can arise using {\it views} that are
- defined by a query involving a {\it having clause}. \\
- \\
- Let the following {\it view definition} be given:
- %
- \begin{verbatim}
- create view test_view
- as select sno, count(pno) as number
- from sells
- where sno > 2
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- Simple queries against this {\it view} will not cause any trouble:
- %
- \begin{verbatim}
- select *
- from test_view
- where sno <> 5;
- \end{verbatim}
- %
- This query can easily be rewritten by adding the {\it where
- qualification} of the user's query ({\tt sno $<>$ 5}) to the {\it
- where qualification} of the {\it view definition's} query. \\
- \\
- The next query is also simple, but it will cause trouble when
- it is evaluated against the above {\it view definition}:
- %
- \begin{verbatim}
- select *
- from test_view
- where number > 1; /* count(pno) in the view def.
- * is called number here */
- \end{verbatim}
- %
- \pagebreak
- The currently implemented techniques for query rewriting will rewrite
- the query to:
- %
- \begin{verbatim}
- select *
- from sells
- where sno > 2 and
- count(pno) > 1
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- which is an invalid query because an {\it aggregate function} appears
- in the {\it where clause}. \\
- \\
- The next query will also cause trouble:
- %
- \begin{verbatim}
- select pno, count(sno)
- from test_view
- group by pno;
- \end{verbatim}
- %
- As you can see, this query involves neither a {\it where clause}
- nor a {\it having clause}, but it contains a {\it group clause} which
- groups by the attribute {\tt pno}. The query in the definition of the
- {\it view} also contains a {\it group clause}, one that groups by the
- attribute {\tt sno}. The two {\it group clauses} are in conflict with
- each other and therefore the query cannot be rewritten to a form that
- would make sense. \\
- \\
- {\bf Note:} There is no solution to the above-mentioned problems at the
- moment, and it does not make sense to put much effort into one, because
- the implementation of support for queries like:
- %
- \begin{verbatim}
- select pno_count, count(sno)
- from ( select sno, count(pno) as pno_count
- from sells
- where sno > 2
- group by sno
- having count(pno) > 1)
- group by pno_count;
- \end{verbatim}
- %
- (which is part of the SQL92 standard) will automatically also solve
- these problems. \\
- \\
- In the next part of the current section we will present the changes
- applied to the source code in order to realize the above described
- items. Note that it is not necessary to understand the meaning of
- every single source line here, and therefore we will not discuss
- detailed questions like "Why does the variable {\tt varno} have to be
- increased by 3?". Questions like that belong to a chapter dealing
- with the implementation of {\it views} in <productname>Postgres</productname>, and to be able to
- answer them it would be necessary to know all the functions and not
- only those described here. The important point for us is to make sure
- that whatever is applied to the {\it targetlist} and the data
- structures representing the {\it where clause} is also applied to the
- data structures for the {\it having clause}. There are three files
- affected: \\
- \\
- \indent {\tt $\ldots$/src/backend/rewrite/rewriteHandler.c} \\
- \indent {\tt $\ldots$/src/backend/rewrite/rewriteManip.c} \\
- \indent {\tt $\ldots$/src/backend/commands/view.c} \\
- \\
- Here is a description of the changes made to the functions contained
- in the file {\tt $\ldots$/src/backend/rewrite/rewriteHandler.c}:
- %
- \pagebreak
- %
- \begin{itemize}
- \item {\tt ApplyRetrieveRule()} \\
- This function is invoked whenever a {\tt select} statement
- against a {\it view} is recognized and applies the {\it rewrite rule}
- stored in the {\it system catalogs}. The additional source lines given
- in the listing below make sure that the functions {\tt
- OffsetVarNodes()} and {\tt ChangeVarNodes()} that are invoked for the
- {\it where clause} and the {\it targetlist} of the query given in the
- {\it view definition} are also called for the {\it having clause} and
- the {\it group clause} of the query in the {\it view
- definition}. These functions adapt the {\tt varno} and {\tt varattno}
- fields of the {\tt VAR} nodes involved.
- The additional source lines at the end of {\tt ApplyRetrieveRule()}
- attach the data structures representing the {\it having clause} and
- the {\it group clause} of the query in the {\it view definition} to
- the rewritten {\it parsetree}. As mentioned earlier, a {\it view
- definition} involving a {\it group clause} will cause trouble
- whenever a query using a different {\it group clause} against this
- {\it view} is executed. There is no mechanism preventing this
- included at the moment.
- Note that the functions {\tt OffsetVarNodes()}, {\tt ChangeVarNodes()}
- and {\tt AddHavingQual()} appearing in {\tt ApplyRetrieveRule()} are
- described later.
- %
- \begin{verbatim}
- static void
- ApplyRetrieveRule(Query *parsetree, RewriteRule *rule,
- int rt_index, int relation_level,
- Relation relation, int *modified)
- {
- Query *rule_action = NULL;
- Node *rule_qual;
- List *rtable,
- .
- .
- .
- OffsetVarNodes((Node *) rule_action->targetList,
- rt_length);
- OffsetVarNodes(rule_qual, rt_length);
-
- + OffsetVarNodes((Node *) rule_action->groupClause,
- + rt_length);
- + OffsetVarNodes((Node *) rule_action->havingQual,
- + rt_length);
- .
- .
- .
- ChangeVarNodes(rule_qual,
- PRS2_CURRENT_VARNO + rt_length,
- rt_index, 0);
- + ChangeVarNodes((Node *) rule_action->groupClause,
- + PRS2_CURRENT_VARNO + rt_length,
- + rt_index, 0);
- + ChangeVarNodes((Node *) rule_action->havingQual,
- + PRS2_CURRENT_VARNO + rt_length,
- + rt_index, 0);
- .
- .
- .
- \end{verbatim}
- \pagebreak
- \begin{verbatim}
- if (*modified && !badsql)
- {
- AddQual(parsetree, rule_action->qual);
- + /* This will only work if the query made to the
- + * view defined by the following groupClause
- + * groups by the same attributes or does not use
- + * groups at all!
- + */
- + if (parsetree->groupClause == NULL)
- + parsetree->groupClause =
- + rule_action->groupClause;
- + AddHavingQual(parsetree,
- + rule_action->havingQual);
- + parsetree->hasAggs =
- + (rule_action->hasAggs || parsetree->hasAggs);
- + parsetree->hasSubLinks =
- + (rule_action->hasSubLinks ||
- + parsetree->hasSubLinks);
- }
- }
- \end{verbatim}
- %
- \item {\tt QueryRewriteSubLink()} \\
- This function is called by {\tt QueryRewrite()} to process any
- contained subqueries first. It searches for nested queries by
- recursively tracing through the {\it parsetree} given as argument. The
- additional statement makes sure that the {\it having clause} is also
- examined.
- %
- \begin{verbatim}
- static void
- QueryRewriteSubLink(Node *node)
- {
- if (node == NULL)
- return;
- switch (nodeTag(node))
- {
- case T_SubLink:
- {
- .
- .
- .
- QueryRewriteSubLink((Node *) query->qual);
- + QueryRewriteSubLink((Node *)
- + query->havingQual);
- .
- .
- .
- }
- .
- .
- .
- }
- return;
- }
- \end{verbatim}
- %
- \pagebreak
- %
- \item {\tt QueryRewrite()} \\
- This function takes the {\it parsetree} of a query and rewrites it
- using <productname>Postgres</productname>'s {\it rewrite system}. Before the query itself can be
- rewritten, subqueries that are possibly part of the query have to be
- processed. Therefore the function {\tt QueryRewriteSubLink()} is
- called for the {\it where clause} and for the {\it having clause}.
- %
- \begin{verbatim}
- List *
- QueryRewrite(Query *parsetree)
- {
- QueryRewriteSubLink(parsetree->qual);
- + QueryRewriteSubLink(parsetree->havingQual);
- return QueryRewriteOne(parsetree);
- }
- \end{verbatim}
- %
- \end{itemize}
- %
- Here we present the changes applied to the functions that are contained
- in the file {\tt $\ldots$/src/backend/rewrite/rewriteManip.c}:
- %
- \begin{itemize}
- %
- \item {\tt OffsetVarNodes()} \\
- This function recursively steps through the {\it parsetree} given as the first
- argument and increments the {\tt varno} and {\tt varnoold} fields of
- every {\tt VAR} node found by the {\it offset} given as the second
- argument. The additional statements are necessary to be able to handle
- {\tt GroupClause} nodes and {\tt SubLink} nodes that may appear in the {\it
- parsetree} from now on.
- %
- \begin{verbatim}
- void
- OffsetVarNodes(Node *node, int offset)
- {
- if (node == NULL)
- return;
- switch (nodeTag(node))
- {
- .
- .
- .
- + /* This has to be done to make queries using
- + * groupclauses work on views
- + */
- + case T_GroupClause:
- + {
- + GroupClause *group = (GroupClause *) node;
- +
- + OffsetVarNodes((Node *)(group->entry),
- + offset);
- + }
- + break;
- .
- .
- .
- + case T_SubLink:
- + {
- + SubLink *sublink = (SubLink *) node;
- + List *tmp_oper, *tmp_lefthand;
- +
- \end{verbatim}
- \pagebreak
- \begin{verbatim}
- + /* We also have to adapt the variables used
- + * in sublink->lefthand and sublink->oper
- + */
- + OffsetVarNodes((Node *)(sublink->lefthand),
- + offset);
- +
- + /* Make sure the first argument of
- + * sublink->oper points to the same var as
- + * sublink->lefthand does otherwise we will
- + * run into troubles using aggregates (aggno
- + * will not be set correctly)
- + */
- + tmp_lefthand = sublink->lefthand;
- + foreach(tmp_oper, sublink->oper)
- + {
- + lfirst(((Expr *)lfirst(tmp_oper))->args) =
- + lfirst(tmp_lefthand);
- + tmp_lefthand = lnext(tmp_lefthand);
- + }
- + }
- + break;
- .
- .
- .
- }
- }
- \end{verbatim}
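The recursive walk performed by {\tt OffsetVarNodes()} can be illustrated on a toy tree. The node kinds and field names below are ours for illustration; the real function dispatches on many more node tags ({\tt TargetEntry}, {\tt Expr}, {\tt GroupClause}, {\tt SubLink}, ...):

```c
#include <assert.h>
#include <stddef.h>

/* Toy parsetree node: either a VAR leaf or an operator with two
 * operands. Names are illustrative, not the backend's. */
typedef enum { TOY_VAR, TOY_OP } ToyTag;

typedef struct ToyNode {
    ToyTag tag;
    int varno;                 /* range table index, TOY_VAR only */
    struct ToyNode *left;      /* operands, TOY_OP only */
    struct ToyNode *right;
} ToyNode;

/* Shift the range table index of every VAR found anywhere in the
 * tree, as done when the view's base tables are appended to the
 * rewritten query's range table. */
static void toy_offset_var_nodes(ToyNode *node, int offset)
{
    if (node == NULL)
        return;
    if (node->tag == TOY_VAR) {
        node->varno += offset;
    } else {
        toy_offset_var_nodes(node->left, offset);
        toy_offset_var_nodes(node->right, offset);
    }
}
```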
- %
- \item {\tt ChangeVarNodes()} \\
- This function is similar to the function {\tt
- OffsetVarNodes()} described above, but instead of incrementing the fields {\tt varno}
- and {\tt varnoold} of {\it all} {\tt VAR} nodes found, it processes
- only those {\tt VAR} nodes whose {\tt varno} value matches the
- parameter {\tt old\_varno} given as argument and whose {\tt
- varlevelsup} value matches the parameter {\tt sublevels\_up}. Whenever
- such a node is found, the {\tt varno} and {\tt varnoold} fields are
- set to the value given in the parameter {\tt new\_varno}. The
- additional statements are necessary to be able to handle {\tt GroupClause}
- and {\tt SubLink} nodes.
- %
- \begin{verbatim}
- void
- ChangeVarNodes(Node *node, int old_varno,
- int new_varno, int sublevels_up)
- {
- if (node == NULL)
- return;
- switch (nodeTag(node))
- {
- .
- .
- .
- + /* This has to be done to make queries using
- + * groupclauses work on views */
- + case T_GroupClause:
- + {
- + GroupClause *group = (GroupClause *) node;
- +
- \end{verbatim}
- \pagebreak
- \begin{verbatim}
- + ChangeVarNodes((Node *)(group->entry),
- + old_varno, new_varno,
- + sublevels_up);
- + }
- + break;
- .
- .
- .
- case T_Var:
- {
- .
- .
- .
- /* This is a hack: Whenever an attribute
- * from the "outside" query is used within
- * a nested subquery, the varlevelsup will
- * be >0. Nodes having varlevelsup > 0 are
- * forgotten to be processed. The call to
- * OffsetVarNodes() should really be done at
- * another place but this hack makes sure
- * that also those VAR nodes are processed.
- */
- + if (var->varlevelsup > 0)
- + OffsetVarNodes((Node *)var,3);
- }
- break;
- .
- .
- .
- case T_SubLink:
- {
- .
- .
- .
- + ChangeVarNodes((Node *) query->havingQual,
- + old_varno, new_varno,
- + sublevels_up);
- + ChangeVarNodes((Node *) query->targetList,
- + old_varno, new_varno,
- + sublevels_up);
- +
- + /* We also have to adapt the variables used in
- + * sublink->lefthand and sublink->oper
- + */
- + ChangeVarNodes((Node *) (sublink->lefthand),
- + old_varno, new_varno,
- + sublevels_up);
- }
- break;
- .
- .
- .
- }
- }
- \end{verbatim}
- %
- \item {\tt AddHavingQual()} \\
- This function adds the {\it operator tree} given by the parameter {\tt
- havingQual} to the one attached to the field {\tt havingQual} of the
- {\it parsetree} given by the parameter {\tt parsetree}. This is done
- by adding a new {\tt AND} node and attaching the old and the new {\it
- operator tree} as arguments to it. {\tt AddHavingQual()} did not
- exist until v6.3.2; it was created for the {\it having logic}.
- %
- \begin{verbatim}
- void
- AddHavingQual(Query *parsetree, Node *havingQual)
- {
- Node *copy, *old;
- if (havingQual == NULL)
- return;
- copy = havingQual;
- old = parsetree->havingQual;
- if (old == NULL)
- parsetree->havingQual = copy;
- else
- parsetree->havingQual =
- (Node *) make_andclause(
- makeList(parsetree->havingQual,
- copy, -1));
- }
- \end{verbatim}
- %
- \item {\tt AddNotHavingQual()} \\
- This function is similar to the function {\tt
- AddHavingQual()} described above. It also adds the {\it operator tree} given
- by the parameter {\tt havingQual}, but prefixes it with a {\tt NOT}
- node. {\tt AddNotHavingQual()} also did not exist until v6.3.2 and
- was created for the {\it having logic}.
- %
- \begin{verbatim}
- void
- AddNotHavingQual(Query *parsetree,
- Node *havingQual)
- {
- Node *copy;
- if (havingQual == NULL)
- return;
- copy = (Node *) make_notclause((Expr *)havingQual);
- AddHavingQual(parsetree, copy);
- }
- \end{verbatim}
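The combined effect of the two helpers can be illustrated with a toy qual tree: {\tt AddHavingQual()} ANDs a new tree onto an existing one (or adopts it when none exists), and {\tt AddNotHavingQual()} negates the new tree first. The node layout and the evaluator are illustrative stand-ins for {\tt make\_andclause()}/{\tt make\_notclause()} and the executor's qual evaluation:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy qual tree; names are ours, not the backend's. */
typedef enum { Q_LEAF, Q_AND, Q_NOT } QualTag;

typedef struct Qual {
    QualTag tag;
    int truth;                  /* Q_LEAF only */
    struct Qual *left, *right;  /* Q_AND / Q_NOT children */
} Qual;

static Qual *qual_node(QualTag tag, int truth, Qual *l, Qual *r)
{
    Qual *q = malloc(sizeof(Qual));
    q->tag = tag; q->truth = truth; q->left = l; q->right = r;
    return q;
}

/* AND the added qual onto the existing one, or adopt it if none. */
static Qual *add_having_qual(Qual *existing, Qual *added)
{
    if (added == NULL) return existing;
    if (existing == NULL) return added;
    return qual_node(Q_AND, 0, existing, added);
}

/* Same, but the added qual is wrapped in a NOT node first. */
static Qual *add_not_having_qual(Qual *existing, Qual *added)
{
    if (added == NULL) return existing;
    return add_having_qual(existing, qual_node(Q_NOT, 0, added, NULL));
}

/* Evaluate a toy qual tree to true (1) or false (0). */
static int qual_eval(const Qual *q)
{
    switch (q->tag) {
        case Q_LEAF: return q->truth;
        case Q_AND:  return qual_eval(q->left) && qual_eval(q->right);
        case Q_NOT:  return !qual_eval(q->left);
    }
    return 0;
}
```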
- %
- \item {\tt nodeHandleViewRule()} \\
- This function is called by {\tt HandleViewRule()}. It replaces all
- {\tt VAR} nodes of the {\it user query} evaluated against the {\it
- view} (the fields of these {\tt VAR} nodes represent the positions of
- the attributes in the {\it virtual} table) by {\tt VAR} nodes that
- have already been prepared to represent the positions of the
- corresponding attributes in the {\it physical} tables (given in the
- {\it view definition}). The additional statements make sure that {\tt
- GroupClause} nodes and {\tt SubLink} nodes are handled correctly.
- \begin{verbatim}
- static void
- nodeHandleViewRule(Node **nodePtr, List *rtable,
- List *targetlist, int rt_index,
- int *modified, int sublevels_up)
- {
- Node *node = *nodePtr;
- if (node == NULL)
- return;
- switch (nodeTag(node))
- {
- .
- .
- .
- + /* This has to be done to make queries using
- + * groupclauses work on views
- + */
- + case T_GroupClause:
- + {
- + GroupClause *group = (GroupClause *) node;
- + nodeHandleViewRule((Node **) (&(group->entry)),
- + rtable, targetlist, rt_index,
- + modified, sublevels_up);
- + }
- + break;
- .
- .
- .
- case T_Var:
- {
- .
- .
- .
- if (n == NULL)
- {
- *nodePtr = make_null(((Var *)node)->vartype);
- }
- else
- + /* This is a hack: The varlevelsup of the
- + * original variable and the new one should
- + * be the same. Normally we adapt the node
- + * by changing a pointer to point to a var
- + * contained in 'targetlist'. In the
- + * targetlist all varlevelsups are 0 so if
- + * we want to change it to the original
- + * value we have to copy the node before!
- + * (Maybe this will cause troubles with some
- + * sophisticated queries on views?)
- + */
- + {
- + if(this_varlevelsup>0)
- + {
- + *nodePtr = copyObject(n);
- + }
- + else
- + {
- + *nodePtr = n;
- + }
- + ((Var *)*nodePtr)->varlevelsup =
- + this_varlevelsup;
- + }
- *modified = TRUE;
- }
- break;
- .
- .
- .
- case T_SubLink:
- {
- .
- .
- .
- + nodeHandleViewRule(
- + (Node **) &(query->havingQual),
- + rtable, targetlist, rt_index,
- + modified, sublevels_up);
- + nodeHandleViewRule(
- + (Node **) &(query->targetList),
- + rtable, targetlist, rt_index,
- + modified, sublevels_up);
- + /* We also have to adapt the variables used
- + * in sublink->lefthand and sublink->oper
- + */
- + nodeHandleViewRule(
- + (Node **) &(sublink->lefthand),
- + rtable, targetlist, rt_index,
- + modified, sublevels_up);
- + /* Make sure the first argument of
- + * sublink->oper points to the same var as
- + * sublink->lefthand does otherwise we will
- + * run into troubles using aggregates
- + * (aggno will not be set correctly!)
- + */
- + pfree(lfirst(((Expr *)
- + lfirst(sublink->oper))->args));
- + tmp_lefthand = sublink->lefthand;
- + foreach(tmp_oper, sublink->oper)
- + {
- + lfirst(((Expr *) lfirst(tmp_oper))->args) =
- + lfirst(tmp_lefthand);
- + tmp_lefthand = lnext(tmp_lefthand);
- + }
- }
- break;
- .
- .
- .
- }
- }
- \end{verbatim}
- %
- \item {\tt HandleViewRule()} \\
- This function calls {\tt nodeHandleViewRule()} for the {\it where
- clause}, the {\it targetlist}, the {\it group clause} and the {\it
- having clause} of the {\it user query} evaluated against the given
- {\it view}.
- %
- \begin{verbatim}
- void
- HandleViewRule(Query *parsetree, List *rtable,
- List *targetlist, int rt_index,
- int *modified)
- {
- .
- .
- .
- + /* The variables in the havingQual and
- + * groupClause also have to be adapted
- + */
- + nodeHandleViewRule(&parsetree->havingQual, rtable,
- + targetlist, rt_index,
- + modified, 0);
- + nodeHandleViewRule(
- + (Node **)(&(parsetree->groupClause)),
- + rtable, targetlist, rt_index, modified, 0);
- }
- \end{verbatim}
- %
- \end{itemize}
- %
- The following function is contained in {\tt
- $\ldots$/src/backend/commands/view.c}:
- \begin{itemize}
- %
- \item {\tt UpdateRangeTableOfViewParse()} \\
- This function updates the {\it range table} of the {\it parsetree}
- given by the parameter {\tt viewParse}. The additional statement makes
- sure that the {\tt VAR} nodes of the {\it having clause} are modified
- in the same way as the {\tt VAR} nodes of the {\it where clause} are.
- %
- \begin{verbatim}
- static void
- UpdateRangeTableOfViewParse(char *viewName,
- Query *viewParse)
- {
- .
- .
- .
- OffsetVarNodes(viewParse->qual, 2);
- + OffsetVarNodes(viewParse->havingQual, 2);
- .
- .
- .
- }
- \end{verbatim}
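The effect of {\tt OffsetVarNodes()}, shifting every {\tt VAR} node's
range table index by a fixed offset, can be sketched like this. The
{\tt Var} type below is a hypothetical stand-in, and the parsetree is
flattened into a simple list of its {\tt VAR} nodes rather than walked
recursively:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend's Var node: only the range
 * table index matters for this sketch. */
typedef struct Var {
    int         varno;          /* index into the range table */
    struct Var *next;           /* next VAR node of the parsetree */
} Var;

/* Toy analogue of OffsetVarNodes(): shift each VAR's range table
 * index, as done above for both the qual and the havingQual of the
 * view parsetree. */
static void offset_var_nodes(Var *v, int offset)
{
    for (; v != NULL; v = v->next)
        v->varno += offset;
}
```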
- %
- \end{itemize}
- \subsubsection{Planner/Optimizer}
- The {\it planner} builds a {\it queryplan} like the one shown in
- figure \ref{plan_having} and, in addition, takes the {\it operator
- tree} attached to the field {\tt havingQual} of the {\tt Query} node
- and attaches it to the {\tt qpqual} field of the {\tt AGG} node.
- Unfortunately this is not the only thing to do. Recall from section
- \ref{aggregates} {\it How Aggregate Functions are Implemented} that
- the {\it targetlist} is searched for {\it aggregate functions}, which
- are appended to a list that will be attached to the field {\tt aggs}
- of the {\tt AGG} node. This was sufficient as long as {\it aggregate
- functions} were only allowed to appear within the {\it
- targetlist}. Now the {\it having clause} is another source of {\it
- aggregate functions}. Consider the following example:
- %
- \begin{verbatim}
- select sno, max(pno)
- from sells
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- Here the {\it aggregate functions} {\tt max} and {\tt count} are in
- use. If only the {\it targetlist} is scanned (as was the case before
- the {\it having clause} was implemented) we will only find and
- process the {\it aggregate function} {\tt max}. The second function
- {\tt count} is not processed and therefore any reference to the
- result of {\tt count} from within the {\it having clause} will fail.
- The solution to this problem is to scan the whole {\it operator tree}
- representing the {\it having clause} for {\it aggregate functions}
- not yet contained in the {\it targetlist} and add them to the list of
- {\it aggregate functions} attached to the field {\tt aggs} of the
- {\tt AGG} node. The scanning is done by the function
- \mbox{{\tt check\_having\_qual\_for\_aggs()}}, which steps
- recursively through the tree.\\
- \\
- While scanning the {\it having clause} for {\it aggregate functions}
- not yet contained in the {\it targetlist}, an additional check is
- made to ensure that {\it aggregate functions} are used within the
- {\it having clause} at all (otherwise the query could have been
- formulated using the {\it where clause}). Consider the following
- query, which is not a valid SQL92 query:
- %
- \begin{verbatim}
- testdb=> select sno, max(pno)
- testdb-> from sells
- testdb-> group by sno
- testdb-> having sno > 1;
- ERROR: This could have been done in a where clause!!
- testdb=>
- \end{verbatim}
- %
- There is no need to express this query using a {\it having clause};
- this kind of qualification belongs to the {\it where clause}:
- %
- \begin{verbatim}
- select sno, max(pno)
- from sells
- where sno > 1
- group by sno;
- \end{verbatim}
- %
- There is still one unsolved problem left. Consider the following
- query, where we want to know just the supplier numbers ({\tt sno}) of
- all suppliers selling more than one part:
- %
- \begin{verbatim}
- select sno
- from sells
- group by sno
- having count(pno) > 1;
- \end{verbatim}
- %
- The {\it planner} creates a {\it queryplan} (like the one shown in
- figure \ref{plan_having}) where the {\it targetlists} of all nodes
- involved contain only entries for the attributes listed after the
- {\tt select} keyword of the query. For the example above this means
- that the {\it targetlists} of the {\tt AGG} node, the {\tt GRP} node,
- the {\tt SORT} node and the {\tt SeqScan} node contain only the entry
- for the attribute {\tt sno}. As described earlier, the {\it
- aggregation logic} operates on attributes of the tuples returned by
- the subplan of the {\tt AGG} node (i.e. the result of the {\tt GRP}
- node). Which attributes are contained in the tuples handed back by a
- subplan is determined by the {\it targetlist}. In our example the
- attribute {\tt pno} needed for the {\it aggregate function} {\tt
- count} is not included and therefore the {\it aggregation} will fail.
- \pagebreak
- \noindent The solution to this problem is given in the following
- steps:
- \begin{itemize}
- \item Make a copy of the actual {\it targetlist} of the {\tt AGG} node.
- %
- \item Search the {\it operator tree} representing the {\it having clause}