Relational Database Index Design and the Optimizers PDF


Wednesday, September 25, 2019

Relational Database Index Design and the Optimizers: DB2, Oracle, SQL Server, et al., by Lahdenmäki and Leach. Includes bibliographical references. Improve the performance of relational databases with indexes designed for today's hardware.



Topics covered include: Introduction; Index and Table Pages; Index Rows; Index Structure; Table Rows; Buffer Pools and Disk I/Os; Reads from the DBMS Buffer.

Relational Database Index Design and the Optimizers: DB2, Oracle, SQL Server, et al. Tapio Lahdenmäki and Michael Leach. John Wiley & Sons, Inc.

In the figure, the leftmost leaf block is linked to the second leaf block. Note: Indexes on columns with character data are based on the binary values of the characters in the database character set.

Index Scans

In an index scan, the database retrieves a row by traversing the index, using the indexed column values specified by the statement. This is the basic principle behind Oracle Database indexes. If a SQL statement accesses only indexed columns, then the database reads values directly from the index rather than from the table.

If the statement accesses columns in addition to the indexed columns, then the database uses rowids to find the rows in the table. Typically, the database retrieves table data by alternately reading an index block and then a table block.

A full index scan is available if a predicate (WHERE clause) in the SQL statement references a column in the index, and in some circumstances when no predicate is specified.

A full scan can eliminate sorting because the data is ordered by index key. In the example, Oracle Database performs a full scan of the index, reading it in sorted order (ordered by department ID and last name) and filtering on the salary attribute. In this way, the database scans a set of data smaller than the employees table, which contains more columns than are included in the query, and avoids sorting the data.

For example, the full scan could read the index entries as follows:

50,Atkinson,,rowid
60,Austin,,rowid
70,Baer,,rowid
80,Abel,,rowid
80,Ande,,rowid
,Austin,,rowid

Fast Full Index Scan

A fast full index scan is a full index scan in which the database accesses the data in the index itself without accessing the table, and the database reads the index blocks in no particular order.

Fast full index scans are an alternative to a full table scan when both of the following conditions are met: The index must contain all columns needed for the query. A row containing all nulls must not appear in the query result set.

If the last name and salary are a composite key in an index, then a fast full index scan can read the index entries to obtain the requested information:

Baida,,rowid
Zlotkey,,rowid
Austin,,rowid
Baer,,rowid
Atkinson,,rowid
Austin,,rowid

Index Range Scan

An index range scan is an ordered scan of an index in which one or more leading columns of the index are specified in conditions. The database commonly uses an index range scan to access selective data.

The selectivity is the fraction of rows in the table that the query selects, with 0 meaning no rows and 1 meaning all rows. A predicate becomes more selective as the value approaches 0 and less selective (or more unselective) as the value approaches 1. For example, a user queries employees whose last names begin with A. In a range scan, one index key can be associated with several rowids: two employees are named Austin, so two rowids are associated with the key Austin.

An index range scan can be bounded on both sides, as in a query for departments with IDs between 10 and 40, or bounded on only one side, as in a query for IDs over a given value. To scan the index, the database moves backward or forward through the leaf blocks. For example, a scan for IDs between 10 and 40 locates the first index leaf block that contains the lowest key value that is 10 or greater.

The scan then proceeds horizontally through the linked list of leaf nodes until it locates a value greater than 40.

Index Unique Scan

In contrast to an index range scan, an index unique scan must have either 0 or 1 rowid associated with an index key.

The database performs a unique scan when a predicate references all of the columns in a UNIQUE index key using an equality operator. An index unique scan stops processing as soon as it finds the first record because no second record is possible.
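The range scan and unique scan described above can be sketched in a few lines. This is a minimal illustration, assuming a sorted Python list stands in for the linked leaf blocks; the entries, rowid strings, and function names are hypothetical, and this is not how any particular database implements its scans.

```python
from bisect import bisect_left

# Hypothetical sorted leaf entries: (department_id, rowid).
LEAF_ENTRIES = [
    (10, "r1"), (20, "r2"), (20, "r3"), (30, "r4"),
    (40, "r5"), (50, "r6"), (60, "r7"),
]
KEYS = [key for key, _ in LEAF_ENTRIES]


def range_scan(low, high):
    """Return rowids for low <= key <= high, in key order."""
    start = bisect_left(KEYS, low)   # first entry with key >= low
    rowids = []
    for key, rowid in LEAF_ENTRIES[start:]:
        if key > high:               # first key past the upper bound: stop
            break
        rowids.append(rowid)
    return rowids


def unique_scan(unique_entries, key):
    """Return the single rowid for key in a unique index, or None."""
    keys = [k for k, _ in unique_entries]
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return unique_entries[pos][1]  # no second match is possible
    return None
```

The range scan walks forward from the first qualifying key and may return several rowids for one key, while the unique scan returns at most one.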

In this case, the database can use an index unique scan to locate the rowid for the employee whose ID is 5.

Index Skip Scan

An index skip scan uses logical subindexes of a composite index.

The database "skips" through a single index as if it were searching separate indexes. Skip scanning is beneficial if there are few distinct values in the leading column of a composite index and many distinct values in the nonleading key of the index. The database may choose an index skip scan when the leading column of the composite index is not specified in a query predicate. For example, assume that you run the following query for a customer in the sh.

The example shows a portion of the index entries. In a skip scan, the number of logical subindexes is determined by the number of distinct values in the leading column.

In the example, the leading column has two possible values.
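A skip scan over a composite index can be sketched as one probe per distinct leading value. The sketch below assumes a hypothetical index on (gender, email) held as a sorted list of tuples; the data, names, and the way the leading values are collected are illustrative only (a real database would seek to each next leading value rather than read every entry).

```python
from bisect import bisect_left

# Hypothetical composite index entries: (gender, email, rowid), kept sorted.
INDEX = sorted([
    ("F", "wolf@example.com", "rowid1"),
    ("F", "abbey@example.com", "rowid2"),
    ("M", "abbey@example.com", "rowid3"),
    ("M", "raley@example.com", "rowid4"),
])


def skip_scan(index, email):
    """Probe each logical subindex (one per leading value) for the email."""
    rowids = []
    # Simplification: collect leading values by reading all entries.
    leading_values = sorted({entry[0] for entry in index})
    for lead in leading_values:
        # Seek to the first entry >= (lead, email) inside this subindex.
        pos = bisect_left(index, (lead, email, ""))
        while pos < len(index) and index[pos][:2] == (lead, email):
            rowids.append(index[pos][2])
            pos += 1
    return rowids
```

With two distinct leading values, the lookup costs two short probes instead of a scan of the whole index, which is why skip scans pay off only when the leading column has few distinct values.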


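The database logically splits the index into one subindex with the key F and a second subindex with the key M. When searching for the record for the customer with a given email address, the database searches the F subindex first and then the M subindex.

The index clustering factor measures row order in relation to an indexed value. The more order that exists in row storage for this value, the lower the clustering factor.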

If the clustering factor is high, the index entries point to random table blocks, so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index. If the clustering factor is low, the index keys in a range tend to point to the same data block, so the database does not have to read and reread the same blocks. The clustering factor is relevant for index scans because it can show whether the database will use an index for large range scans, the degree of table organization in relation to the index key, and whether you should consider using an index-organized table, partitioning, or table cluster if rows must be ordered by the index key.

For example, assume that the employees table fits into two data blocks.

The table depicts the rows in the two data blocks (the ellipses indicate data that is not shown). Rows are stored in the blocks in order of last name (shown in bold). For example, the bottom row in data block 1 describes Abel, the next row up describes Ande, and so on alphabetically until the top row in block 1 for Steven King.

The bottom row in block 2 describes Kochar, the next row up describes Kumar, and so on alphabetically until the last row in the block for Zlotkey. Assume that an index exists on the last name column. Each name entry corresponds to a rowid. Conceptually, the index entries would look as follows:

Abel,block1row1
Ande,block1row2
Atkinson,block1row3
Austin,block1row4
Baer,block1row5

Assume that a separate index exists on the employee ID column.

Conceptually, the index entries might look as follows, with employee IDs distributed in almost random locations throughout the two blocks:

,block1row50
,block2row1
,block1row9
,block2row19
,block2row39
,block1row4
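One common way to describe the clustering factor is as the number of times the table block changes while the index is walked in key order. The sketch below follows that simplified model; it is not Oracle's actual implementation, and the employee IDs are hypothetical because the source does not show them.

```python
def clustering_factor(index_entries):
    """Count table-block changes while walking the index in key order.

    index_entries: list of (key, block_number) pairs.
    """
    changes = 0
    previous_block = None
    for _, block in sorted(index_entries):
        if block != previous_block:
            changes += 1
            previous_block = block
    return changes


# Last-name index: neighbouring keys live in the same block.
LAST_NAME_INDEX = [
    ("Abel", 1), ("Ande", 1), ("Atkinson", 1), ("Austin", 1),
    ("Baer", 1), ("Kochar", 2), ("Zlotkey", 2),
]

# Employee-ID index: hypothetical IDs scattered across both blocks.
EMPLOYEE_ID_INDEX = [
    (100, 1), (101, 2), (102, 1), (103, 2), (104, 2), (105, 1),
]

print(clustering_factor(LAST_NAME_INDEX))    # low: close to the block count
print(clustering_factor(EMPLOYEE_ID_INDEX))  # high: close to the row count
```

A value near the number of blocks suggests well-ordered rows; a value near the number of rows suggests that a range scan would revisit blocks repeatedly.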

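A reverse key index physically reverses the bytes of each index key while keeping the column order. For example, if the index key is 20, and if the two bytes stored for this key in hexadecimal are C1,15 in a standard B-tree index, then a reverse key index stores the bytes as 15,C1.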

Reversing the key solves the problem of contention for leaf blocks in the right side of a B-tree index.

This problem can be especially acute in an Oracle Real Application Clusters (Oracle RAC) database in which multiple instances repeatedly modify the same block. For example, in an orders table the primary keys for orders are sequential.

One instance in the cluster adds order 20, while another adds 21, with each instance writing its key to the same leaf block on the right-hand side of the index. In a reverse key index, the reversal of the byte order distributes inserts across all leaf keys in the index.

For example, keys such as 20 and 21 that would have been adjacent in a standard key index are now stored far apart in separate blocks. Because the data in the index is not sorted by column key when it is stored, the reverse key arrangement eliminates the ability to run an index range scanning query in some cases.
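The effect of byte reversal on key order can be seen in a small sketch. It assumes a fixed-width big-endian integer encoding purely for illustration; this is not Oracle's internal number format (the C1,15 encoding mentioned earlier), and `reverse_key` is a hypothetical helper.

```python
def reverse_key(key: int, width: int = 4) -> bytes:
    """Return the key's big-endian bytes in reversed order."""
    return key.to_bytes(width, "big")[::-1]


# Sequential keys are adjacent in a standard index ...
standard_order = sorted([20, 21, 276])

# ... but once the bytes are reversed, an unrelated key (276) sorts
# between 20 and 21, so consecutive inserts land in different places.
reversed_order = sorted([20, 21, 276], key=reverse_key)

print(standard_order)
print(reversed_order)
```

Because the stored order no longer follows the numeric order, walking the leaf entries cannot answer a query such as "all IDs greater than 20", which is the range scan limitation described above.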

For example, if a user issues a query for order IDs greater than 20, then the database cannot start with the block containing this ID and proceed horizontally through the leaf blocks.

In an ascending index, the database stores data in ascending order. By default, character data is ordered by the binary values contained in each byte of the value, numeric data from smallest to largest number, and date from earliest to latest value.

In a descending index, the database stores data on a specified column or columns in descending order. The default search through a descending index is from highest to lowest value. Descending indexes are useful when a query sorts some columns ascending and others descending.

Key compression can greatly reduce the space consumed by the index. In general, index keys have two pieces, a grouping piece and a unique piece.

Key compression breaks the index key into a prefix entry, which is the grouping piece, and a suffix entry, which is the unique or nearly unique piece.
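The prefix and suffix split can be sketched as grouping the keys in a block so that each prefix is stored once. This is a minimal illustration with hypothetical (prefix, suffix) keys and helper names; it says nothing about the on-disk block format of any database.

```python
from itertools import groupby


def compress_block(entries):
    """Group sorted (prefix, suffix) keys so each prefix is stored once."""
    compressed = []
    for prefix, group in groupby(sorted(entries), key=lambda e: e[0]):
        compressed.append((prefix, [suffix for _, suffix in group]))
    return compressed


def decompress_block(compressed):
    """Expand the shared prefixes back into full (prefix, suffix) keys."""
    return [(prefix, suffix)
            for prefix, suffixes in compressed
            for suffix in suffixes]


# Hypothetical keys: (order_mode, order_id).
ENTRIES = [("online", 3), ("print", 7), ("online", 1), ("online", 2)]
COMPRESSED = compress_block(ENTRIES)
print(COMPRESSED)
```

The saving grows with the number of suffixes that share each prefix, so compression helps most when the grouping piece repeats often.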

The database achieves compression by sharing the prefix entries among the suffix entries in an index block.

A clustered index stores the row data in the order of the index. Therefore, only one clustered index can be created on a given database table. Clustered indices can greatly increase overall speed of retrieval, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items is selected.

Since the physical records are in this sort order on disk, the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them.

Some databases separate the data and index blocks into separate files; others put two completely different data blocks within the same physical file(s).


Cluster

When multiple databases and multiple tables are joined, it's referred to as a cluster (not to be confused with the clustered index described above).

The records for the tables sharing the value of a cluster key shall be stored together in the same or nearby data blocks.

A cluster can be keyed with a B-Tree index or a hash table. The data block where the table record is stored is defined by the value of the cluster key.

Column order

The order in which the index definition lists the columns is important.


It is possible to retrieve a set of row identifiers using only the first indexed column. However, it is not possible or efficient on most databases to retrieve the set of row identifiers using only the second or greater indexed column.

For example, imagine a phone book that is organized by city first, then by last name, and then by first name. If you are given the city, you can easily extract the list of all phone numbers for that city.

However, in this phone book it would be very tedious to find all the phone numbers for a given last name. You would have to look within each city's section for the entries with that last name.
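The phone book analogy can be sketched with a sorted list of tuples standing in for a composite index on (city, last name, first name). The names, numbers, and function names below are hypothetical; the point is only that a search on the leading column narrows to one contiguous slice, while a search on a later column has to look at every entry.

```python
from bisect import bisect_left

# Hypothetical phone book sorted by (city, last_name, first_name).
PHONE_BOOK = sorted([
    ("Austin", "Smith", "Ann", "555-0101"),
    ("Austin", "Young", "Bo", "555-0102"),
    ("Boston", "Jones", "Cy", "555-0103"),
    ("Boston", "Smith", "Di", "555-0104"),
])


def by_city(city):
    """Leading column given: binary search to one contiguous slice."""
    lo = bisect_left(PHONE_BOOK, (city,))
    # city + "\x00" is the smallest string that sorts after city.
    hi = bisect_left(PHONE_BOOK, (city + "\x00",))
    return [row[3] for row in PHONE_BOOK[lo:hi]]


def by_last_name(last_name):
    """Leading column missing: every entry has to be examined."""
    return [row[3] for row in PHONE_BOOK if row[1] == last_name]
```

Both functions return the right numbers, but `by_city` touches only the matching slice while `by_last_name` reads the entire list.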

So, to improve performance, one must ensure that the index is created in the order of the search columns.

Applications and limitations

Indexes are useful for many applications but come with some limitations.

With an index, the database simply follows the B-tree data structure until the Smith entry has been found; this is much less computationally expensive than a full table scan. A query with a leading wildcard, such as one for every customer whose email address ends with a given domain, cannot be served this way, because the index is built with the assumption that words go from left to right. With a wildcard at the beginning of the search term, the database software is unable to use the underlying B-tree data structure (in other words, the WHERE clause is not sargable).

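This problem can be addressed with an additional index on the reversed column value and a query against the reversed search term, which puts the wildcard at the right-most part of the pattern.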
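One common workaround for a leading wildcard is to index the reversed string, so that a suffix search becomes a prefix search. The sketch below assumes a sorted list of reversed email addresses as a stand-in for such an index; the addresses and the `ends_with` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical data.
EMAILS = ["ann@example.org", "bob@example.com", "cy@example.org"]

# "Index" on the reversed value: (reversed_email, original_email), sorted.
REVERSED_INDEX = sorted((email[::-1], email) for email in EMAILS)


def ends_with(suffix):
    """Find emails ending in suffix via a prefix search on reversed keys."""
    prefix = suffix[::-1]
    pos = bisect_left(REVERSED_INDEX, (prefix,))
    matches = []
    while pos < len(REVERSED_INDEX) and REVERSED_INDEX[pos][0].startswith(prefix):
        matches.append(REVERSED_INDEX[pos][1])
        pos += 1
    return matches
```

The search seeks once and then reads only the matching entries, instead of testing every address in the table.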


Rather, only a sequential search is performed, which takes O(N) time.

Types of indexes

Bitmap index

A bitmap index is a special kind of indexing that stores the bulk of its data as bit arrays (bitmaps) and answers most queries by performing bitwise logical operations on these bitmaps.

The most commonly used indexes, such as B+ trees, are most efficient if the values they index do not repeat or repeat only a small number of times. In contrast, the bitmap index is designed for cases where the values of a variable repeat very frequently. For example, the sex field in a customer database usually contains at most three distinct values: male, female or unknown (not recorded). For such variables, the bitmap index can have a significant performance advantage over the commonly used trees.
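A bitmap index can be sketched with one integer per distinct value, where bit i is set if row i holds that value. The columns and values below are hypothetical, and Python integers stand in for the bit arrays a real system would store.

```python
def build_bitmaps(values):
    """Map each distinct value to an integer whose bit i marks row i."""
    bitmaps = {}
    for row, value in enumerate(values):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def rows_of(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    return [row for row in range(bitmap.bit_length()) if bitmap >> row & 1]


# Hypothetical low-cardinality columns for five customers.
SEX = ["F", "M", "F", "U", "M"]
REGION = ["N", "N", "S", "S", "N"]

sex_bitmaps = build_bitmaps(SEX)
region_bitmaps = build_bitmaps(REGION)

# WHERE sex = 'F' AND region = 'N' becomes a single bitwise AND.
female_north = sex_bitmaps["F"] & region_bitmaps["N"]
# WHERE sex = 'F' OR sex = 'U' becomes a bitwise OR.
female_or_unknown = sex_bitmaps["F"] | sex_bitmaps["U"]
```

Combining predicates costs one bitwise operation per condition, which is where the advantage for frequently repeating values comes from.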

Dense index

A dense index in databases is a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to a record in the sorted data file.

In clustered indices with duplicate keys, the dense index points to the first record with that key.

Sparse index

A sparse index is a file with pairs of keys and pointers for every block in the data file. Every key in this file is associated with a particular pointer to the block in the sorted data file.

Hash Index Architecture

A hash index consists of an array of pointers, and each element of the array is called a hash bucket.
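The bucket array of a hash index can be sketched as follows. This is a simplified model, assuming Python lists stand in for the pointer chains that hang off each bucket; the class name, bucket count, and data are hypothetical and do not reflect any specific database engine.

```python
class HashIndex:
    """Toy hash index: an array of buckets, each holding (key, row) pairs."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash of the key selects exactly one bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, row):
        self._bucket(key).append((key, row))

    def lookup(self, key):
        # Different keys can share a bucket, so compare keys explicitly.
        return [row for k, row in self._bucket(key) if k == key]


index = HashIndex()
index.insert("Smith", 1)
index.insert("Jones", 2)
index.insert("Smith", 3)
print(index.lookup("Smith"))
```

Equality lookups touch a single bucket, but because hashing destroys key order, a structure like this cannot serve range predicates the way a B-tree can.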

Filtered indexes are useful when columns contain well-defined subsets of data that queries reference in SELECT statements.

When a clustered index has multiple partitions, each partition has a B-tree structure that contains the data for that specific partition.

In a columnstore index, compression is performed on each column segment within a rowgroup.