The SELECT clause is the most basic of the commands associated with data manipulation. We need some data first so we select the database:
USE mydatabase;
Then we create the table:
CREATE TABLE IF NOT EXISTS biodata (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) NOT NULL UNIQUE,
gender VARCHAR(10) NOT NULL,
dob DATE NOT NULL,
mass DECIMAL(5,2) NOT NULL,
height DECIMAL(5,2) NOT NULL
);
The format for the fields in the table is column name DATATYPE CONSTRAINT(S) (optional).
Now run the command:
SELECT * FROM biodata;
It will indicate that there is an empty set, meaning that there are no rows in our table as yet but our table is present. We can verify this using the command:
SHOW TABLES;
Add some rows to the table:
INSERT INTO biodata (name, email, gender, dob, mass, height)
VALUES
("Joash Gobin","joashgobin@example.com","Male","1999-02-03",100.2,176.5);
Retry our SELECT command:
SELECT * FROM biodata;
Add more values:
INSERT INTO biodata (name, email, gender, dob, mass, height)
VALUES
("Mary Sue","mary.sue@example.com","Female","1992-02-03",103.2,106.5),
("John Doe","john.doe@example.com","Male","1997-06-09",133.9,191.5),
("John Doe","john.doe2@example.com","Male","1994-08-01",133.9,188.3);
Retry the SELECT command and see the results:
SELECT * FROM biodata;
Selecting specific columns
Try getting specific columns by listing them instead of using *:
SELECT name,gender,dob FROM biodata;
Computing new variables with SELECT
Run the command:
SELECT name, mass, height FROM biodata;
We can compute a new variable by using a calculation in the set of columns specified in the SELECT:
SELECT name, mass,height,mass/(height/100*height/100) FROM biodata;
Yes, the calculations follow the usual order of operations (BODMAS/PEMDAS/BOPS). We can rename the column using the as keyword:
SELECT name, mass,height,mass/(height/100*height/100) as bmi FROM biodata;
Recall that body mass index (BMI) is given by the formula: $$BMI=\frac{mass}{(height)^2}$$ where mass is given in kilograms ($kg$) and height is measured in meters ($m$).
Selecting distinct values
Sometimes we will want to avoid duplicate rows in our selections. We can use the SELECT DISTINCT command to remove redundant rows. Try the following:
SELECT gender FROM biodata;
Now try the distinct selection:
SELECT DISTINCT gender FROM biodata;
We can also select distinct combinations of columns. Run the following command:
SELECT name,gender FROM biodata;
Notice how there are two ($2$) rows which have name “John Doe” and gender “Male”. We can remove the repeats using the SELECT DISTINCT command:
SELECT DISTINCT name,gender FROM biodata;
Reflection
We completed the following:
- Creating a table to store biodata
- Inserting some entries into the biodata table
- Selecting every column in the table
- Selecting specific columns
- Computing new columns
- Renaming computed columns
- Showing distinct combinations of row data