Dr. Codd's T-Join
In the Second Version of the relational model in 1990, Dr. E. F. Codd introduced a set of new theta operators, called T-operators, which were based on the idea of a best-fit or approximate equality (Codd 1990). The algorithm for the operators is easier to understand with an example modified from Dr. Codd.
The problem is to assign the classes to the available classrooms. We want (class_size < or <= room_size) to be true after the assignments are made. The first < will allow us a few empty seats in each room for late students; the <= can have exact matches.
Th naive approaches Codd gave were: (1) sort the tables in ascending order by classroom size and then matched the number of students in a class. (2) sort the tables in descending order by classroom size and then matched the number of students in a class.
We start with the following tables:
CREATE TABLE Rooms
(room_nbr CHAR(2) NOT NULL PRIMARY KEY,
room_size INTEGER NOT NULL
CHECK (room_size > 0));
INSERT INTO Rooms
VALUES
('r1', 70),
('r2', 40),
('r3', 50),
('r4', 85),
('r5', 30),
('r6', 65),
('r7', 55);
CREATE TABLE Classes
(class_nbr CHAR(2) NOT NULL PRIMARY KEY,
class_size INTEGER NOT NULL
CHECK (class_size > 0));
INSERT INTO Classes (class_nbr, class_size)
VALUES
('c1', 80),
('c2', 70),
('c3', 65),
('c4', 55),
('c5', 50),
('c6', 40);
The best fit problem is to assign classes to rooms with the fewest empty chairs in the room and nobody standing. The starting point is to find all the rooms that can hold a particular class.
SELECT room_nbr, class_nbr, (room_size - class_size) AS fit
FROM Classes, Rooms
WHERE class_size <= room_size;
Now, let's try to find the best fits:
WITH RC --- all legal pairs
AS
(SELECT *, (room_size - class_size) AS fit
FROM Classes, Rooms
WHERE class_size <= room_size)
SELECT * -- best fit
FROM RC
WHERE fit = (SELECT MIN(fit)
FROM RC AS RCa
WHERE RC.class_nbr = RCa.class_nbr);
Results
==============
c1 80 r4 85 5
c2 70 r1 70 0
c3 65 r6 65 0
c4 55 r7 55 0
c5 50 r3 50 0
c6 40 r2 40 0
This is very good, but it depends on the data! Slight change in the legal pairs CTE or the data, to allow for spare seating, will invalidate the results.
WITH RC --- all legal pairs with spare room
AS
(SELECT *, (room_size - class_size) AS fit
FROM Classes, Rooms
WHERE class_size < room_size)
SELECT * -- best fit
FROM RC
WHERE fit = (SELECT MIN(fit)
FROM RC AS RCa
WHERE RC.class_nbr = RCa.class_nbr);
Results
=================
c1 80 r4 85 5 ◄ Opps!
c2 70 r4 85 15 ◄ Opps! Duplicate room number!
c3 65 r1 70 5
c4 55 r6 65 10
c5 50 r7 55 5
c6 40 r3 50 10
Anyone want to try for general solution?