The setup is that you have a table that captures a series of events. There is a select group of ten items I want to look for, but only when they occur within a certain time frame: they have to belong to the same user and fall within, say, five minutes of each other.
CREATE TABLE Events
(item_id INTEGER NOT NULL,
user_id INTEGER NOT NULL,
event_timestamp DATETIME2(0) DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (item_id, user_id, event_timestamp) );
This is a kind of relational division. First set up your divisor (the ten items) in a table:
CREATE TABLE Divisor
(item_id INTEGER NOT NULL PRIMARY KEY);
INSERT INTO Divisor VALUES (1), (2), .., (10);
The basic relational division is easy when you reduce the dividend (Events) table to one row per (user_id, item_id) and just want to find who has all ten, without regard to how long it took them:
SELECT E1.user_id
FROM (SELECT DISTINCT item_id, user_id -- drop the timestamp
FROM Events) AS E1,
Divisor AS D1
WHERE E1.item_id = D1.item_id
GROUP BY E1.user_id
HAVING COUNT(E1.item_id) = (SELECT COUNT(item_id) FROM Divisor);
The question is how that timestamp works. Are these five-minute timeslots that start and end at fixed points in time, such as ('2013-05-25 00:00:00', '2013-05-25 00:04:59'), and so on? Or is it any five-minute frame that I can drop on the data?
The first is fairly easy; the second spec is a bitch. Anyone want to try it?
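For the fixed-timeslot reading, here is a rough sketch of one way to do it, not a tested answer. It assumes the slots are aligned to five-minute clock boundaries (00:00:00, 00:05:00, and so on) and uses '2000-01-01' as an arbitrary base date for the arithmetic; the slot_start column name is mine. Each event is stamped with the start of its slot, then the same division is done per (user_id, slot_start):

```sql
SELECT S.user_id, S.slot_start
  FROM (SELECT DISTINCT E.user_id, E.item_id,
               -- Round the timestamp down to its five-minute slot.
               -- Integer division by 5 drops the minutes past the boundary.
               DATEADD(MINUTE,
                       (DATEDIFF(MINUTE, '2000-01-01', E.event_timestamp) / 5) * 5,
                       CAST('2000-01-01' AS DATETIME2(0))) AS slot_start
          FROM Events AS E, Divisor AS D
         WHERE E.item_id = D.item_id) AS S   -- keep only divisor items
 GROUP BY S.user_id, S.slot_start
HAVING COUNT(S.item_id) = (SELECT COUNT(item_id) FROM Divisor);
```

The DISTINCT in the derived table matters: a user who hits the same item twice in one slot must be counted once, or the COUNT comparison against the divisor is thrown off.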
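For the "any five-minute frame" reading, one possible approach, offered as a sketch under assumptions rather than a definitive answer: if some five-minute frame holds all ten items, then you can slide it forward until it starts on one of that user's own divisor events without losing anything. So it is enough to try each such event as a window start. I am assuming a half-open window, [start, start + 5 minutes), to match the 00:00:00 to 00:04:59 example; window_start is my own column name.

```sql
SELECT E1.user_id, E1.event_timestamp AS window_start
  FROM Events AS E1, Events AS E2, Divisor AS D
 WHERE E1.item_id IN (SELECT item_id FROM Divisor)  -- candidate window starts
   AND E2.user_id = E1.user_id
   AND E2.item_id = D.item_id                       -- only divisor items count
   AND E2.event_timestamp >= E1.event_timestamp
   AND E2.event_timestamp < DATEADD(MINUTE, 5, E1.event_timestamp)
 GROUP BY E1.user_id, E1.event_timestamp
HAVING COUNT(DISTINCT E2.item_id) = (SELECT COUNT(item_id) FROM Divisor);
```

COUNT(DISTINCT ...) is needed here because the same item can show up several times inside one window. This is a self-join on Events, so it can get expensive on a large table; wrap it in SELECT DISTINCT user_id if you only want the users and not the window starts.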