README
上传用户:blenddy
上传日期:2007-01-07
资源大小:6495k
文件大小:5k
- Summary
- -------
- The optimizer generates optimial query plans by doing several steps:
- 1) Take each relation in a query, and make a RelOptInfo structure for
- it. Find each way of accessing the relation, called a Path, including
- sequential and index scans, and add it to RelOptInfo.pathlist. Also
- create RelOptInfo.joininfo that lists all the joins that involve this
- relation. For example, the WHERE clause "tab1.col1 = tab2.col1"
- generates a JoinInfo for tab1 listing tab2 as an unjoined relation, and
- tab2's joininfo shows tab1 as an unjoined relation.
- 2) Join each RelOptInfo to other RelOptInfo as specified in
- RelOptInfo.joininfo. At this point each RelOptInfo is a single
- relation, so you are joining every relation to the other relations as
- joined in the WHERE clause.
- Joins occur using two RelOptInfos. One is outer, the other inner.
- Outers drive lookups of values in the inner. In a nested loop, lookups
- of values in the inner occur by scanning to find each matching inner
- row. In a mergejoin, inner and outer rows are ordered, and are accessed
- in order, so only one scan of inner is required to perform the entire
- join. In a hashjoin, inner rows are hashed for lookups.
- Each unique join combination becomes a new RelOptInfo. The RelOptInfo
- is now the joining of two relations. RelOptInfo.pathlist are various
- paths to create the joined result, having different orderings depending
- on the join method used.
- 3) At this point, every RelOptInfo is joined to each other again, with
- a new relation added to each RelOptInfo. This continues until all
- relations have been joined into one RelOptInfo, and the cheapest Path is
- chosen.
- SELECT *
- FROM tab1, tab2, tab3, tab4
- WHERE tab1.col = tab2.col AND
- tab2.col = tab3.col AND
- tab3.col = tab4.col
- Tables 1, 2, 3, and 4 are joined as:
- {1 2},{2 3},{3 4}
- {1 2 3},{2 3 4}
- {1 2 3 4}
- SELECT *
- FROM tab1, tab2, tab3, tab4
- WHERE tab1.col = tab2.col AND
- tab1.col = tab3.col AND
- tab1.col = tab4.col
- Tables 1, 2, 3, and 4 are joined as:
- {1 2},{1 3},{1 4}
- {1 2 3},{1 3 4},{1,2,4}
- {1 2 3 4}
- In the default left-handed joins, each RelOptInfo adds one
- single-relation RelOptInfo in each join pass, and the added RelOptInfo
- is always the inner relation in the join. In right-handed joins, the
- added RelOptInfo is the outer relation in the join. In bushy plans,
- multi-relation RelOptInfo's can be joined to other multi-relation
- RelOptInfo's.
- Optimizer Functions
- -------------------
- These directories take the Query structure returned by the parser, and
- generate a plan used by the executor. The /plan directory generates the
- plan, the /path generates all possible ways to join the tables, and
- /prep handles special cases like inheritance. /utils is utility stuff.
- planner()
- handle inheritance by processing separately
- -init_query_planner()
- preprocess target list
- preprocess qualifications(WHERE)
- --query_planner()
- cnfify()
- Summary:
- Simple cases with all AND's are handled by removing the AND's:
- convert: a = 1 AND b = 2 AND c = 3
- to: a = 1, b = 2, c = 3
- Qualifications with OR's are handled differently. OR's inside AND
- clauses are not modified drastically:
- convert: a = 1 AND b = 2 AND (c = 3 OR d = 4)
- to: a = 1, b = 2, c = 3 OR d = 4
- OR's in the upper level are more complex to handle:
- convert: (a = 1 AND b = 2) OR c = 3
- to: (a = 1 OR c = 3) AND (b = 2 OR c = 3)
- finally: (a = 1 OR c = 3), (b = 2 OR c = 3)
- These clauses all have to be true for a result to be returned,
- so the optimizer can choose the most restrictive clauses.
- pull out constants from target list
- get a target list that only contains column names, no expressions
- if none, then return
- ---subplanner()
- make list of relations in target
- make list of relations in where clause
- split up the qual into restrictions (a=1) and joins (b=c)
- find relation clauses can do merge sort and hash joins
- ----make_one_rel()
- set_base_rel_pathlist()
- find scan and all index paths for each relation
- find selectivity of columns used in joins
- -----make_one_rel_by_joins()
- jump to geqo if needed
- again:
- make_rels_by_joins():
- for each joinrel:
- make_rels_by_clause_joins()
- for each rel's joininfo list:
- if a join from the join clause adds only one relation, do the join
- or make_rels_by_clauseless_joins()
- update_rels_pathlist_for_joins()
- generate nested,merge,hash join paths for new rel's created above
- merge_rels_with_same_relids()
- merge RelOptInfo paths that have the same relids because of joins
- rels_set_cheapest()
- set cheapest path
- if all relations in one RelOptInfo, return
- do group(GROUP)
- do aggregate
- put back constants
- re-flatten target list
- make unique(DISTINCT)
- make sort(ORDER BY)
- Optimizer Structures
- --------------------
- RelOptInfo - a relation or joined relations
- RestrictInfo - restriction clauses, like "x = 3"
- JoinInfo - join clauses, including the relids needed for the join
- Path - every way to generate a RelOptInfo(sequential,index,joins)
- IndexPath - index scans
- NestPath - nested joins
- MergePath - merge joins
- HashPath - hash joins
- PathOrder - every ordering type (sort, merge of relations)