This document is available in: English Castellano ChineseBig5 Deutsch Francais Nederlands Portugues Russian Turkce Korean |
by Manuel Soriano <manu(at)europa3.com> About the author: Manuel is a software consultant and his passion is Linux. He did already worked with computers when one single computer filled a whole room and had 32K Ram. Translated to English by: Dallas L. Gale <unknown(at)unknown.org> Content: |
Abstract:
In part II of the tutorial we review several basic statements of the SQL language.
This is the second part of the SQL course. In this part, we will focus on various SQL commands from the creation of a table, its modification, and/or deletion.
We will focus, above all, on the SELECT command, which is, in my judgment, the most important of all.
I hope that this second part will be enjoyable and instructive for you.
Like we have seen in the first installment, to create a table, the CREATE command with the TABLE qualifier is used. In effect, the CREATE command serves to create:
The CREATE command indicates to the manager that something is going to be created. Later we will cover what and how.
What we are interested in now is the creation of a table:
CREATE TABLE name ( column type [DEFAULT value] [NOT NULL], ... [INHERITS (inherits, ...)] [CONSTRAINT constraints CHECK (test), CHECK (test)] );Where:
name: | Is the name that is given to the table and how it is referenced by any command |
Column: | Is the name of the column |
Type: | Is the type of data(varchar, char, int, date, time, timestamp), Postgres has other types of data, but they aren't compatible with ANSI SQL |
Value: | The value that it will have by default |
Inherits: | This is Postgres's own. It defines an inheritance from another table. This will create an entity that contains the columns of the table that we are creating and the ones inherited |
Nom_cons: | This defines a rule of integrity to meet each time that a row is modified |
Test: | Conditions to check |
CREATE TABLE countries ( cod_country integer NOT NULL, name varchar(30)) CONSTRAINT cod_raro CHECK (cod_country > 0 AND cod_country < 154);
With this example we have created a table of countries. Each time that we insert a new row, these conditions will be meet:
ExecAppend: Fail to add null value in not null attribute cod_country
ExecAppend: rejected due to CHECK constraint cod_raro
What does NULL mean? In SQL there exist two states, data and no data. In effect, we can be interested that a field of rows doesn't have data, like zero (0) as well as the spaces are data. SQL introduces the concept of NULL and works with it. A practical example:
I have a table of bills, with the following fields: customer, value, date_issued, date_paid
When I create the row, I will insert the data: customer, value, date_issued
I will leave the date paid null; in this way I will know which bills need to be paid with the following command:
SELECT * FROM bills WHERE date_paid IS NULL;
One can claim that a zero (0) in the field date_paid would do the same thing. It's true, except that zero (0) is not a date, and it prevents creating date_paid of date type and applying appropriate date functions.
Examples of creating with NULL:
insert into countries values (15, NULL);Or:
insert into countries (cod_country) values (27);
The absence of the "name" field implies that this will receive a NULL value.
In PostgreSQL, modification only deals with adding new columns.
ALTER TABLE table ADD name type;
Where:
Table | Name of the table to modify |
Name | Name of the column to add |
Type | Type of data (see CREATE TABLE) |
Now we will insert data into our table:
SYNTAX:INSERT INTO table [(column, column, ...)] VALUES (value-1, value-2, ...)or else:
INSERT INTO table [(column, column, ...)] SELECT ....
As we have seen, there are two forms of inserting data into a table, either line by line or the result of a sub-select that can return one or more rows.
When we insert lines into a table, we will ALWAYS put data in all the columns including those that we don't mention, these are created with NULL values.
If, in the command, we don't specify which columns we are going to fill, it is understood that we are going to give data to all, example:
INSERT INTO countries VALUES (34, 'Spain');This would be incorrect:
INSERT INTO countries VALUES (34);But, this would be correct:
INSERT INTO countries (cod_country) VALUES (34);
I recommend that a command embedded in "C" programs or in database functions ALWAYS specify the columns that we are going to touch, in effect, if we add a new column to the table (ALTER TABLE), the next insert an error will occur, Example:
This will give a parser error, given that the population data is missing
PostgreSQL, doesn't generate an error. It creates the line with the "population" field NULL. This is only a particularity of PostgreSQL, any other SQL manager would give an error.
We still have another type of INSERT, which is fed from a sub-select.
This type of insert is used very often to create temporary tables or tables to carry out a concrete task of speculative calculations.
The part replaced is that which touches the data itself, this comes from the SELECT instructions that were carried out previously and the inserting of the data. The instruction SELECT can return one or more rows, this instruction SELECT has the same restrictions of the same SELECT.
I wanted to arrive at this point! :-))
We have covered required SQL commands, the SQL language without SELECT would be like beans without sausage.
The SELECT command allows us to access data, but with the reservation that searches, unions of tables, functions with the data, and with the search rules can be carried out.
An example:
select * from countries;Another example:
SELECT a.name, SUM(population) FROM countries a, states b, counties c WHERE b.cod_country = a.cod_country AND (c.cod_country = b.cod_country AND c.state_code = b.state_code) AND population IS NOT NULL GROUP BY a.name ORDER BY sum ASC;
Let me explain, I requested the populations of all of the countries ordered by the population in an ascending order.
For this I have added a new column (population) to the counties table.
This would be like this:create table counties (cod_country int, state_code int, county_code int, county_name varchar(60), population int); insert into counties values (1, 1, 1, 'Country 1, State 1, County 1', 5435); insert into counties values (2, 1, 1, 'Country 2, State 1, County 1', 7832); insert into counties values (3, 1, 1, 'Country 3, State 1, County 1', 4129); insert into counties values (1, 2, 1, 'Country 1, State 2, County 1', 76529); insert into counties values (2, 2, 1, 'Country 2, State 2, County 1', 9782); insert into counties values (3, 2, 1, 'Country 3, State 2, County 1', 852); insert into counties values (1, 3, 1, 'Country 1, State 3, County 1', 3433); insert into counties values (2, 3, 1, 'Country 2, State 3, County 1', 7622); insert into counties values (3, 3, 1, 'Country 3, State 3, County 1', 2798); insert into counties values (1, 1, 2, 'Country 1, State 1, County 2', 7789); insert into counties values (2, 1, 2, 'Country 2, State 1, County 2', 76511); insert into counties values (3, 1, 2, 'Country 3, State 1, County 2', 98); insert into counties values (1, 2, 2, 'Country 1, State 2, County 2', 123865); insert into counties values (2, 2, 2, 'Country 2, State 2, County 2', 886633); insert into counties values (3, 2, 2, 'Country 3, State 2, County 2', 982345); insert into counties values (1, 3, 2, 'Country 1, State 3, County 2', 22344); insert into counties values (2, 3, 2, 'Country 2, State 3, County 2', 179); insert into counties values (3, 3, 2, 'Country 3, State 3, County 2', 196813); insert into counties values (1, 1, 3, 'Country 1, State 1, County 3', 491301); insert into counties values (2, 1, 3, 'Country 2, State 1, County 3', 166540); insert into counties values (3, 1, 3, 'Country 3, State 1, County 3', 165132); insert into counties values (1, 2, 3, 'Country 1, State 2, County 3', 0640); insert into counties values (2, 2, 3, 'Country 2, State 2, County 3', 65120); insert into counties values (3, 2, 3, 'Country 3, State 2, County 3', 1651462); insert into counties values (1, 3, 3, 'Country 1, State 3, County 3', 60650); insert into counties values (2, 3, 3, 'Country 2, State 3, County 3', 651986); insert into counties values (3, 3, 3, 'Country 3, State 3, County 3', NULL); commit work;
Now, we can't do it with ALTER TABLE, but should use UPDATE and I haven't explained it, so use the "cut & paste" and everyone will be happy :-))
Now we can execute the QUERY, and the results should be:
name | sum - ---------+------- country 1| 705559 country 2|1212418 country 3|2804018 (3 rows)Now we will verify this:
sum - ------ 791986 (1 row)
!!!!!! One Difference !!!!!!
Let's look at the states table, it is missing the state 3, we do:
INSERT INTO states VALUES (3, 1, 'State 3, Country 1'); INSERT INTO states VALUES (3, 2, 'State 3, Country 2'); INSERT INTO states VALUES (3, 3, 'State 3, Country 3');and repeat the command, resulting in:
name | sum - ---------+------- country 1| 791986 country 2|1872205 country 3|3003629
We are missing the state 3 from each country.
Now for those that have been lost, remember that the joins between tables were EXACT, it only extracts data if the condition are exact.
Let's look at the first part of the WHERE: b.cod_country = a.cod_country
This means that I join the country table with the states where the country code is equal, now remember the country data that we inserted:
Don't execute this, is only used as an illustration.
create table countries (cod_country integer, name varchar(30)); insert into countries values (1, 'country 1'); insert into countries values (2, 'country 2'); insert into countries values (3, 'country 3'); commit work;Now the states data:
create table states (state_code int, cod_country int, state_name varchar(30)); insert into states values (1, 1, 'State 1, Country 1'); insert into states values (2, 1, 'State 2, Country 1'); insert into states values (1, 2, 'State 1, Country 2'); insert into states values (2, 2, 'State 2, Country 2'); insert into states values (1, 3, 'State 1, Country 3'); insert into states values (2, 3, 'State 2, Country 3'); commit work;All of the states 3 from every country are missing, but in the country table the corresponding data of the states 3, so it is normal that we don't add the country data with the states with code 3 to be discarded in the second part where:
AND (c.cod_country = b.cod_country AND c.state_code = b.state_code)
The state exists in the counties table but NOT in the states table.
For those that haven't understood, take an aspirin, go walk your dog (if you don't have a dog, go walk without a dog), breath a little fresh air and comeback to begin from the first exercise.
Is is very important to understand how the joining of data is carried out, without this the development that we do can have unpredictable results.
Let's shift gears and start with the SELECT command syntax.
SELECT [DISTINCT] expression1 [AS nom-attribute] {, expression-i [as nom-attribute-i]} [INTO TABLE classname] [FROM from-list] [WHERE where-clause] [GROUP BY attr_name1 {, attr_name-i....}] [ORDER BY attr_name1 [ASC | DESC ] [USING op1 ] {, nom-attribute-i...}] [UNION {ALL} SELECT ...]Step by step:
DISTINCT: | This is to eliminate duplicate row returned |
expression1: | what we want returned, normally a column for a table from the list FROM |
AS nom-attribute: |
an alias for the name of a column, example:
manu=> select cod_country from countries; cod_country - ----------- 1 2 3 (3 rows)Now with the alias: manu=> select cod_country as countr from countries; countr - ------ 1 2 3 (3 rows) |
INTO TABLE: | allow the resulting row to be inserted directly into another table (see INSERT ... SELECT...) |
FROM: | list of input tables |
WHERE: | selection statement (union and selection criteria). |
GROUP BY: | group criteria, certain functions that are used in expressions might need grouping |
ORDER BY: | Order criteria of the returned rows, ASC ascending order, DESC descending order, USING if the column that defines the order is not in the list (expression) |
UNION ALL SELECT: | This says to add to the results of the first SELECT this second SELECT that can be different tables, but return the same number of columns. |
We have seen that the commands SELECT not only returns items from the DB but it can also modify them:
This will return the pay increment of a pay increase more then 10%.
Let's look at the functions that we have available:
COUNT(): | returns the number of rows that aren't NULL |
SUM(): | returns the total sum of a column of numbers |
AVG(): | returns the average of a column of numbers |
MIN(): | returns the minimum value in a column |
MAX(): | returns the maximum value in a column |
FLOAT(int): | returns a FLOAT8, FLOAT(12345) |
FLOAT4(int): | returns a FLOAT4, FLOAT4(12345) |
INT(float): | returns an INT from a FLOAT/4, INT(123.456) |
LOWER(text): | returns lower-case text |
UPPER(text): | returns upper-case text |
LPAD(text, long, char): | fill to the left with char the length of long the column text |
RPAD(text, long, char): | fill to the right with char the length of long the column text |
LTRIM(text, char): | remove from the left of text all of the characters char |
RTRIM(text, char): | remove from the right of text all of the characters char |
POSITION(string IN text): | extract from text the position of string, but IT DOESN'T WORK |
SUBSTR(text,from[,to]): | extract the substring of text, from the position from to the position of to or the end of the string |
DATETIME(date, hour): | converts a datetime format to date (YYYY-MM-DD) and an hour (HH:MM) |
These were a few of the functions that exist in SQL, these are those that are defined in ANSI SQL and are also present in Postgres95.
Until now, we have seen that the section of WHERE from the SELECT we put things like:
AND column = value
This is a small example of what we put or combine:
AND, OR, NOT, IN, IN ALL, =, !=, >, <, (SELECT....), LIKE also the parenthesis are relevant, example:
WHERE column IN (SELECT DISTINCT column FROM table WHERE ....) column IN ('value1','value2','value3',...) (column = 'value' and column = 'other_value' OR column != 'value')!= the same as saying NOT EQUAL
WHERE column LIKE '%Pepito%'The % is a wildcard, in the example, it's true if "Pepito" is in the string
WHERE column LIKE 'Pepito%'will be true if "Pepito" is at the beginning of the string
WHERE column LIKE '%Pepito'will be true if "Pepito" is at the end of the string
There's not enough time here to list all of the possible options used with WHERE, the limit is in the imagination of the programmer or the limits of the individual process parser.
Now we can leave the SELECT command and concentrate on the last two.
The UPDATE command allows modification of one or more rows, depending on the defined condition in WHERE
SYNTAX:UPDATE table SET column-1 = expression-1 [, column-i = expression-i] [WHERE condition]Where:
table: | the table to modify, only one table can be modified at a time |
column: | the column that will be modified |
expression: | the value that the column will receive. This value can be static or the results of a function |
condition: | the condition that define the modification boundaries, here the defined rules are applied for the SELECT |
DELETE FROM table [WHERE condition]Where:
table: | the table where rows will be deleted, only one table can be deleted from at a time |
condition: | the condition that define the modification boundaries,
here the defined rules are applied for the
SELECT NOTE: without the WHERE ALL of the rows will be deleted from the table |
SQL: Introduction to SQL. Installation of PostgreSQL, first article of the series.
Webpages maintained by the LinuxFocus Editor team
© Manuel Soriano, FDL LinuxFocus.org |
Translation information:
|
2002-10-22, generated by lfparser version 2.32