SQL Syntax

SQL

Beginner

SQL is case insensitive

Querying Databases with SQL

Writing SQL query is the primary way to interact with the database. SQL is a declarative statement language.

SELECT

SELECT columns01, column02, ... FROM recent_grads;

The query word SELECTis the most basic statement in SQL. The language reads more like English. The semicolon ; specifies where the query ends. Thus we can structure a query in several lines of length.

note: order in which columns are called matters!

WHERE

The WHERE statement allows filtering the database according to a specific constrain.

SELECT col01, col02
FROM db_nm
WHERE col01 > 0.5;

The following operators are defined > >= < <= = <>. The last operator <> is not equal to can also be expressed as !=.

LIMIT

SQL comes with a statement called LIMIT that allows us to specify how many results we’d like the database to return as an integer value.

SELECT col02
FROM db_nm
WHERE col01 > 0.5
LIMIT 10;

Notice that col01 is filtered but the return result is col02.

Logical Operators & Ordering

AND & OR

SELECT [col01,...]
FROM [db_nm]
WHERE [cond01] AND [cond02];

In the above example AND can be interchanged with OR depending which Boolean condition is needed.

ORDER BY

The returned query is ordered according to the original ordering of the data. It is possible to specify desired ordering using a ORDER BY statement.

select [col01,...] from [db_nm] where [cond01,...] order by [col_name] asc;

The above query will be ordered by values in col_name column in an ascending order. Specifying desc will order the results in a descending order.

Aggregation Functions

COUNT()

SELECT COUNT(*) FROM tbl_nm;

Will return a total count of all rows.

SELECT COUNT(col_nm) FROM tbl_nm;

Will return total count of all non-null values in a column.

MIN() & MAX()

SELECT MIN(col_nm) FROM tbl_nm;
SELECT MAX(col_nm) FROM tbl_nm;

SUM() & AVG()

SELECT SUM(col_nm) FROM tbl_nm;
SELECT AVG(col_nm) FROM tbl_nm;

DISTINCT

SELECT DISTINCT col_nm FROM tbl_nm;

This command will select unique values from the specified columns. We could also aggregate unique values by specifying AVG(DISTINCT col_name).

GROUP BY

SELECT col_nm_01, ... FROM tbl_nm GROUP BY col_name;

The above query will group all selected columns by col_name

Renaming Columns with AS

SELECT col_name AS new_col_nm FROM tbl_nm;

Performing Arithmetic

Arithmetic operations are performed on a column or between columns. The expression to perform an operation is intuitive.

SELECT ((col_nm_01 + col_nm_02) / col_nm_03) FROM tbl_nm;

Querying Virtual Columns

HAVING

SELECT col_nm_01, ... FROM tbl_nm GROUP BY col_nm HAVING (condition);

When using GROUP BY we create a virtual column, thus WHERE clause cannot be used. Instead, we use HAVING to filter results and select a subset.

Querying SQLite from Python

Python comes with SQLite module which can be imported.

import sqlite3 as sql

Connecting to a Database

We use connect() function and pass the path to the filename.

conn = sql.connect('db_nm.db')

We also need a Cursor object that will run and execute queries.

cursor = conn.cursor()

To execute a query we pass a string of statements in SQL to an execute() method and then fetch the results into a list of tuples.

query = 'select * from tbl_nm;'
cursor.execute(query)
results = cursor.fetchall()

The variable results will contain all the returned values from the declared query.

Shortcut

There is a way to run queries as a shortcut by-passing a Cursor object altogether.

conn = sql.connect('db_nm.db')
query = 'select * from tbl_nm;'
results = conn.execute(query).fetchall()

Retrieving Specific Number of Results

To retrieve one single result (as a tuple) we use the Cursor method fetchone(). To retrieve n results we use fetchmany(n). Both methods return results in increments meaning by calling fetchone() the second time, second tuple will be returned.

import sqlite3 as sql

conn = sql.connect('db_nm.db')
cursor = conn.cursor()

query = 'select * from tbl_nm;'
cursor.execute(query)

single_result = cursor.fetchone()
five_results = cursor.fetchmany(5)
all_results = cursor.fetchall()

Closing Connection

conn.close()

Intermediate

Modifying Data

INSERT

Inserting data into a table is done through the INSERT statement.

INSERT INTO [tablename] VALUES (primary_key, val01,...);

UPDATE

We can use the UPDATE statement to change any existing values in a table.

UPDATE (tbl_nm)
SET (col_nm_01 = new_val_01,...)
WHERE (col_nm_01 = old_val_01,...);

Note that all rows will be updated if WHERE clause is not specified.

DELETE

If we need to delete any rows from a table we can use the DELETE statement.

DELETE FROM (tbl_nm)
WHERE (cond01,...);

Missing Values

You can retrieve any rows where a specific column is NULL by using the following syntax:

SELECT * from tableName
WHERE columnName IS NULL;

Notice that IS statement is used to specify where values are equal to NULL.

Table Schemas

Adding Columns

We can add a column with the ALTER TABLE statement:

ALTER TABLE tableName
ADD columnName dataType;

The statement ADD will add a column to tableName

Removing Column

We can alter the table to remove a column.

ALTER TABLE tableName
DROP COLUMN colName;

Note that the above command is only possible with SQL, not SQLite.

Creating Tables

CREATE TABLE

The syntax to create a new table is the following

CREATE TABLE db_nm.tb_nm(
id integer PRIMARY KEY,
col_nm_01 dataType,
.
.
.
col_nm_N dataType
);

Relations Between Tables

To query across multiple tables and define relations between columns we can create associations between those columns .

CREATE TABLE db_nm.tbl_nm_01(
id integer PRIMARY KEY,
col_nm_01 dataType,
.
.
.
col_nm_N dataType	
FOREIGN KEY(col_nm_N) REFERENCES tbl_nm_02(col_nm_M)
);

We are referencing col_nm_N from tbl_nm_01 with col_nm_M from tbl_nm_02.

Querying Across Foreign Keys

Types of Joins

INNER JOIN

We can use the INNER JOIN statement to make querying across foreign key relationships easier:

SELECT [column1, column2, ...] from tableName1
INNER JOIN tableName2
ON tableName1.some_col_nm == tableName2.some_col_nm;

The above indicates that we are trying to query columns from tableName1 joined with tableName2 on 3rd column from one table with 4th column of the other.

INNER JOIN displays rows only when there’s a match on the condition in both tables. Non-matching rows are excluded

LEFT OUTER JOIN

Displays NULL for all right side values if left side row didn’t find a match.

SELECT [col1,...] FROM tbl_nm_01
LEFT OUTER JOIN tbl_nm_02
ON tbl_nm_01.some_col_nm == tbl_nm_02.some_col_nm;
RIGHT OUTER JOIN

All left side rows are filled with NULL values if no matching row on the right was found.

SELECT [col01,...] FROM tbl_nm_01
RIGHT OUTER JOIN tbl_nm_02
ON tbl_nm_01.some_col_nm == rbl_nm_02.some_col_nm;

Creating a Database

CREATE DATABASE

CREATE DATABASE db_nm;

OWNER

We can specify which user will be the owner of the database. The owner is the only that can access and modify the database with an exception of superusers.

CREATE DATABASE db_nm OWNER user_nm;

Deleting Database

DROP DATABASE

DROP DATABASE db_nm;

PostgreSQL

Importing, Connecting & Closing

Similar to SQLite we create a connect and a cursor objects. Difference being, we have to specify the database name we wish to connect to as well as the username.

import psycopg2 as postgre
conn = postgre.connect('dbname=db_nm user=usr_nm')
cursor = conn.cursor()
.
.
.
conn.close()

Note: default database and username is postgres

Committing

Changes made are auto rolled back upon closing the connection. To make all queries and changes to data permanent they need to be committed.

import psycopg2 as postgre

conn = postgre.connect("dbname=db_nm user=usr_nm")
curs = conn.cursor()

query = "sql_syntax"

conn.execute(sql_syntax)
curs.commit()
conn.close()

Alternatively,

conn.autocommit = True

Command Line PostgreSQL

Starting, Running Queries & Closing

psql
.
SQL_query_syntax;
.
\q

List of Commands

  • \? Lists all available commands
  • \l Lists all available databases
  • \du Lists users
  • \dt List tables
  • \dp tbl_nm Lists permissions for a specific table

Connecting to a database

psql -d db_nm

Creating Users

CREATE ROLE user_nm WITH LOGIN PASSWORD 'password';

Adding CREATEDB after WITH will allow user_nm to create databases as well.

Complete list of commands are available here.

Adding Permissions

To grant permission to users for running queries the GRANT statement is used.

GRANT SELECT, INSERT, DELETE ON tbl_nm TO user_nm;
GRANT ALL PRIVILIGES ON tbl_nm TO user_nm;

The above queries grant limited or complete priviliges to a specific user.

Removing Permissions

To remove a permission we must revoke it.

REVOKE SELECT, INSERT ON tbl_nm FROM user_nm;
REVOKE ALL PRIVILEGES ON tbl_nm FROM user_nm;

Deleting Users

Once all permissions have been revoked the user can be dropped

DROP ROLE user_nm;

Superusers

Superusers are able to override all access and restrictions.

CREATE ROLE user_nm WITH LOGIN PASSWORD 'psswrd' SUPERUSER;

Advanced

Introduction to Indexing

CREATE INDEX

CREATE INDEX index_nm ON tbl_nm(col_nm);

The above query will create an index based on the specified col_nm within tbl_nm and name the index as index_nm.

To avoid creating an index name that already exists we can use the IF NOT EXISTS clause.

CREAT INDEX IF NOT EXISTS index_nm ON tbl_nm(col_nm);

Multi-Column Indexing

To create a multi-column index, we use the same CREATE INDEX syntax as before but instead specify 2 columns in the ON statement:

CREATE INDEX index_nm ON tbl_nm(col_nm_1, col_nm_2);

Covering Index

When an index contains all of the information necessary to answer a query, it’s called a covering index.

A sample query will produce the following answer: ``` query_plan = conn.execute(
			"EXPLAIN QUERY PLAN 
			SELECT col_01, col_02 
			FROM tbl_nm 
			WHERE col_01 > n 
			AND col_02 < m;").fetchall() ```

Its covering because the index covers all of the selected columns.

Written on July 24, 2017