SQL Tutorial Part II
Summary:
In part II of the tutorial we review several basic statements of the
SQL language
Introduction
This is the second part of the SQL course. In this part, we will
focus
on various SQL commands from the creation of a table,
its modification, and/or deletion.
We will focus, above all, on the SELECT command, which is,
in my judgment, the most important of all.
I hope that this second part will be enjoyable and instructive for
you.
Creating a table
Like we have seen in the first installment, to create a table,
the CREATE command with the TABLE qualifier is used.
In effect, the CREATE command serves to create:
- users: CREATE USER
- tables: CREATE TABLE
- indexes: CREATE INDEX
- views: CREATE VIEW
The CREATE command indicates to the manager that something
is going
to be created. Later we will cover what and how.
What we are interested in now is the creation of a table:
Syntax
CREATE TABLE name ( column type [DEFAULT value] [NOT NULL], ...
[INHERITS (inherits, ...)]
[CONSTRAINT constraints CHECK (test), CHECK (test)] );
Where:
name: |
Is the name that is given to the table and how it is referenced by
any
command |
Column: |
Is the name of the column |
Type: | Is the type of data(varchar, char,
int, date,
time, timestamp), Postgres has other types of data, but they
aren't
compatible with ANSI SQL |
Value: |
The value that it will have by default |
Inherits: |
This is Postgres's own. It defines an inheritance from another
table. This
will create an entity that contains the columns of the table that we
are
creating and the ones inherited |
Nom_cons: |
This defines a rule of integrity to meet each time that a row is
modified
|
Test: |
Conditions to check |
Example:
CREATE TABLE countries (
cod_country integer NOT NULL,
name varchar(30))
CONSTRAINT cod_raro CHECK (cod_country > 0 AND cod_country < 154);
With this example we have created a table of countries. Each time
that
we insert a new row, these conditions will be meet:
Note
What does NULL mean? In SQL there exist two states, data and no
data.
In effect, we can be interested that a field of rows doesn't have
data, like zero (0) as well as the spaces are data. SQL introduces
the
concept of NULL and works with it. A practical example:
I have a table of bills, with the following fields:
customer, value, date_issued, date_paid
When I create the row, I will insert the data: customer, value,
date_issued
I will leave the date paid null; in this way I will know which
bills need to be paid with the following command:
SELECT * FROM bills WHERE date_paid IS NULL;
One can claim that a zero (0) in the field date_paid would
do
the same thing. It's true, except that zero (0) is not a date, and it
prevents creating date_paid of date type and applying
appropriate date functions.
Examples of creating with NULL:
insert into countries values (15, NULL);
Or:
insert into countries (cod_country) values (27);
The absence of the "name" field implies that this will receive a
NULL
value.
Modifying a table
In PostgreSQL, modification only deals with adding new columns.
ALTER TABLE table ADD name type;
Where:
Table |
Name of the table to modify |
Name |
Name of the column to add |
Type | Type of data (see CREATE TABLE) |
Inserting data into a table
Now we will insert data into our table:
SYNTAX:
INSERT INTO table [(column, column, ...)] VALUES (value-1, value-2,
...)
or else:
INSERT INTO table [(column, column, ...)] SELECT ....
As we have seen, there are two forms of inserting data into a
table, either
line by line or the result of a sub-select that can return one or more
rows.
When we insert lines into a table, we will ALWAYS put data in
all the columns including those that we don't mention, these are
created
with NULL values.
If, in the command, we don't specify which columns we are going to
fill,
it is understood that we are going to give data to all, example:
INSERT INTO countries VALUES (34, 'Spain');
This would be incorrect:
INSERT INTO countries VALUES (34);
But, this would be correct:
INSERT INTO countries (cod_country) VALUES (34);
I recommend that a command embedded in "C" programs or in
database functions ALWAYS specify the columns that
we are going to touch, in effect, if we add a new column to the table
(ALTER TABLE), the next insert an error will
occur,
Example:
INSERT INTO countries VALUES (34, 'Spain');
INSERT 18301 1
ALTER TABLE countries add population integer
INSERT INTO countries VALUES (34, 'Spain');
This will give a parser error, given that the population data is
missing
Note
PostgreSQL, doesn't generate an error. It creates the line with
the
"population" field NULL. This is only a particularity of PostgreSQL,
any other
SQL manager would give an error.
We still have another type of INSERT, which is fed from
a sub-select.
This type of insert is used very often to create
temporary tables or tables to carry out a concrete task of speculative
calculations.
The part replaced is that which touches the data itself, this
comes from the SELECT instructions that were carried out
previously and the inserting of the data. The instruction
SELECT can return one or more rows, this instruction
SELECT has the same restrictions of the same
SELECT.
Selecting data
I wanted to arrive at this point! :-))
We have covered required SQL commands, the SQL language without
SELECT would be like beans without sausage.
The SELECT command allows us to access data, but
with the reservation that searches, unions of tables, functions with
the data,
and with the search rules can be carried out.
An example:
select * from countries;
Another example:
SELECT a.name, SUM(population)
FROM countries a, states b, counties c
WHERE b.cod_country = a.cod_country
AND (c.cod_country = b.cod_country
AND c.state_code = b.state_code)
AND population IS NOT NULL
GROUP BY a.name
ORDER BY sum ASC;
Let me explain, I requested the populations of all of the countries
ordered
by the population in an ascending order.
For this I have added a new column (population) to the counties
table.
This would be like this:
create table counties (cod_country int,
state_code int,
county_code int,
county_name varchar(60),
population int);
insert into counties values (1, 1, 1, 'Country 1, State 1, County 1',
5435);
insert into counties values (2, 1, 1, 'Country 2, State 1, County 1',
7832);
insert into counties values (3, 1, 1, 'Country 3, State 1, County 1',
4129);
insert into counties values (1, 2, 1, 'Country 1, State 2, County 1',
76529);
insert into counties values (2, 2, 1, 'Country 2, State 2, County 1',
9782);
insert into counties values (3, 2, 1, 'Country 3, State 2, County 1',
852);
insert into counties values (1, 3, 1, 'Country 1, State 3, County 1',
3433);
insert into counties values (2, 3, 1, 'Country 2, State 3, County 1',
7622);
insert into counties values (3, 3, 1, 'Country 3, State 3, County 1',
2798);
insert into counties values (1, 1, 2, 'Country 1, State 1, County 2',
7789);
insert into counties values (2, 1, 2, 'Country 2, State 1, County 2',
76511);
insert into counties values (3, 1, 2, 'Country 3, State 1, County 2',
98);
insert into counties values (1, 2, 2, 'Country 1, State 2, County 2',
123865);
insert into counties values (2, 2, 2, 'Country 2, State 2, County 2',
886633);
insert into counties values (3, 2, 2, 'Country 3, State 2, County 2',
982345);
insert into counties values (1, 3, 2, 'Country 1, State 3, County 2',
22344);
insert into counties values (2, 3, 2, 'Country 2, State 3, County 2',
179);
insert into counties values (3, 3, 2, 'Country 3, State 3, County 2',
196813);
insert into counties values (1, 1, 3, 'Country 1, State 1, County 3',
491301);
insert into counties values (2, 1, 3, 'Country 2, State 1, County 3',
166540);
insert into counties values (3, 1, 3, 'Country 3, State 1, County 3',
165132);
insert into counties values (1, 2, 3, 'Country 1, State 2, County 3',
0640);
insert into counties values (2, 2, 3, 'Country 2, State 2, County 3',
65120);
insert into counties values (3, 2, 3, 'Country 3, State 2, County 3',
1651462);
insert into counties values (1, 3, 3, 'Country 1, State 3, County 3',
60650);
insert into counties values (2, 3, 3, 'Country 2, State 3, County 3',
651986);
insert into counties values (3, 3, 3, 'Country 3, State 3, County 3',
NULL);
commit work;
Now, we can't do it with ALTER TABLE, but should use
UPDATE and I haven't explained it, so use the "cut & paste" and
everyone will
be happy :-))
Now we can execute the QUERY, and the results should
be:
name | sum
- ---------+-------
country 1| 705559
country 2|1212418
country 3|2804018
(3 rows)
Now we will verify this:
select sum(population) from counties where cod_country = 1;
Results:
sum
- ------
791986
(1 row)
!!!!!! One Difference !!!!!!
Let's look at the states table, it is missing the state 3, we
do:
INSERT INTO states VALUES (3, 1, 'State 3, Country 1');
INSERT INTO states VALUES (3, 2, 'State 3, Country 2');
INSERT INTO states VALUES (3, 3, 'State 3, Country 3');
and repeat the command, resulting in:
name | sum
- ---------+-------
country 1| 791986
country 2|1872205
country 3|3003629
We are missing the state 3 from each country.
Now for those that have been lost, remember that the joins
between tables were EXACT, it only extracts data if the
condition are exact.
Let's look at the first part of the WHERE: b.cod_country =
a.cod_country
This means that I join the country table with the states where
the country code is equal, now remember the country
data that we inserted:
Don't execute this, is only used as an illustration.
create table countries (cod_country integer, name varchar(30));
insert into countries values (1, 'country 1');
insert into countries values (2, 'country 2');
insert into countries values (3, 'country 3');
commit work;
Now the states data:
create table states (state_code int,
cod_country int,
state_name varchar(30));
insert into states values (1, 1, 'State 1, Country 1');
insert into states values (2, 1, 'State 2, Country 1');
insert into states values (1, 2, 'State 1, Country 2');
insert into states values (2, 2, 'State 2, Country 2');
insert into states values (1, 3, 'State 1, Country 3');
insert into states values (2, 3, 'State 2, Country 3');
commit work;
All of the states 3 from every country are missing, but in the country
table the corresponding data of the states 3, so it is normal that
we don't add the country data with the states with code 3 to be
discarded in the second part where:
AND (c.cod_country = b.cod_country
AND c.state_code = b.state_code)
The state exists in the counties table but NOT in the states
table.
For those that haven't understood, take an aspirin, go walk your
dog
(if you don't have a dog, go walk without a dog), breath a little
fresh
air and comeback to begin from the first exercise.
Is is very important to understand how the joining of data is
carried
out, without this the development that we do can have unpredictable
results.
Let's shift gears and start with the SELECT command syntax.
SELECT [DISTINCT] expression1 [AS nom-attribute] {, expression-i [as
nom-attribute-i]}
[INTO TABLE classname]
[FROM from-list]
[WHERE where-clause]
[GROUP BY attr_name1 {, attr_name-i....}]
[ORDER BY attr_name1 [ASC | DESC ] [USING op1 ] {,
nom-attribute-i...}]
[UNION {ALL} SELECT ...]
Step by step:
DISTINCT: |
This is to eliminate duplicate row returned |
expression1: |
what we want returned, normally a column for a table from the
list FROM |
AS nom-attribute: |
an alias for the name of a column, example:
manu=> select cod_country from countries;
cod_country
- -----------
1
2
3
(3 rows)
Now with the alias:
manu=> select cod_country as countr from countries;
countr
- ------
1
2
3
(3 rows)
|
INTO TABLE: |
allow the resulting row to be inserted directly into another table
(see INSERT ... SELECT...) |
FROM: |
list of input tables |
WHERE: | selection statement (union
and selection criteria). |
GROUP BY: | group criteria, certain functions
that
are used in expressions might need grouping
|
ORDER BY: | Order criteria of the returned
rows, ASC ascending order, DESC descending order, USING if the column
that defines the order is not in the list (expression) |
UNION ALL SELECT: | This says to add to the
results of the first SELECT this second SELECT that can be
different tables, but return the same number of columns. |
We have seen that the commands SELECT not only returns
items from the DB but it can also modify them:
SELECT SUM(salary * 1.1) - SUM(salary) AS increment FROM
employees;
This will return the pay increment of a pay increase more then
10%.
Let's look at the functions that we have available:
COUNT(): |
returns the number of rows that aren't NULL |
SUM(): |
returns the total sum of a column of numbers |
AVG(): |
returns the average of a column of numbers |
MIN(): |
returns the minimum value in a column |
MAX(): |
returns the maximum value in a column |
FLOAT(int): |
returns a FLOAT8, FLOAT(12345) |
FLOAT4(int): |
returns a FLOAT4, FLOAT4(12345) |
INT(float): |
returns an INT from a FLOAT/4, INT(123.456) |
LOWER(text): |
returns lower-case text |
UPPER(text): |
returns upper-case text |
LPAD(text, long, char): |
fill to the left with char the length of long
the column text |
RPAD(text, long, char): |
fill to the right with char the length of long
the column text |
LTRIM(text, char): |
remove from the left of text all of the characters
char |
RTRIM(text, char): |
remove from the right of text all of the characters
char |
POSITION(string IN text): | extract
from
text the position of string, but IT DOESN'T
WORK |
SUBSTR(text,from[,to]): | extract the
substring of text, from the position from
to the position of to or the end of the string |
DATETIME(date, hour): |
converts a datetime format to date (YYYY-MM-DD) and an
hour (HH:MM) |
These were a few of the functions that exist in SQL, these are
those
that are defined in ANSI SQL and are also present in Postgres95.
Details of WHERE
Until now, we have seen that the section of WHERE from the
SELECT we put things like:
AND column = value
This is a small example of what we put or combine:
AND, OR, NOT, IN, IN ALL, =, !=, >, <, (SELECT....),
LIKE also the parenthesis are relevant, example:
WHERE
column IN (SELECT DISTINCT column FROM table WHERE ....)
column IN ('value1','value2','value3',...)
(column = 'value' and column = 'other_value' OR column != 'value')
!= the same as saying NOT EQUAL
LIKE permits searching for a string within a column with
wildcards:
WHERE column LIKE '%Pepito%'
The % is a wildcard, in the example, it's true if "Pepito"
is in the string
WHERE column LIKE 'Pepito%'
will be true if "Pepito" is at the beginning of the string
WHERE column LIKE '%Pepito'
will be true if "Pepito" is at the end of the string
There's not enough time here to list all of the possible options
used with WHERE, the limit is in the imagination of the
programmer or the limits of the individual process parser.
Now we can leave the SELECT command and concentrate on
the last two.
UPDATE command
The UPDATE command allows modification of one or more
rows,
depending on the defined condition in WHERE
SYNTAX:
UPDATE table SET column-1 = expression-1
[, column-i = expression-i]
[WHERE condition]
Where:
table: | the table to modify, only one table can be
modified at a time |
column: | the column that will be modified |
expression: | the value that the column will receive.
This value can be static or the results of a function |
condition: | the condition that define the
modification boundaries, here the defined rules are
applied for the SELECT |
DELETE command
The DELETE command allows modification of one or more row of
a table.
SYNTAX
DELETE FROM table
[WHERE condition]
Where:
table: | the table where rows will be deleted, only one
table
can be deleted from at a time |
condition: | the condition that define the
modification boundaries, here the defined rules are
applied for the SELECT
NOTE: without the WHERE ALL of the rows will be deleted
from the table
|
|