Fill gaps of dates and variable within group - PostgreSQL
up vote
1
down vote
favorite
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
add a comment |
up vote
1
down vote
favorite
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
I have a price table that has gaps in two main variables, dates (sales_date) and the sales channel (channel). I need to fill those gaps for all the combinations of posible skus (ean) and clients (id_client).
At this moment I have been able to fill the dates and channels, but there are some cases where multiple channels are shared across the same date, in those "bizarre" cases my approach is duplicating everything.
Tables
create table prices_master (
id_price serial primary key,
sales_date date,
ean varchar(15),
id_client int,
channel varchar(15),
price float
);
create table channels_master (
id_channel serial primary key,
channel varchar(15)
);
insert into prices_master (sales_date, ean, id_client, channel, price)
values
('2015-07-01', '7506205801143', 7, 'COMERCIAL',47655),
('2015-08-01', '7506205801143', 7, 'COMERCIAL',51655),
('2015-12-01', '7506205801143', 7, 'COMERCIAL', 55667),
('2015-12-01', '7506205801143', 7, 'DISTRIBUIDOR', 35667),
('2015-07-01', '5052197008555', 7, 'DISTRIBUIDOR', 7224),
('2015-10-01', '5052197008555', 7, 'DISTRIBUIDOR', 8224);
insert into channels_master (channel) values
('DISTRIBUIDOR'), ('INSTITUCIONAL'), ('NON_TRADE'), ('COMERCIAL');
My approach
WITH full_dates AS (
WITH min_max AS (
SELECT min(prm.sales_date) AS min_date, ((max(prm.sales_date))) :: date AS max_date
FROM prices_master prm
)
SELECT generate_series((min_max.min_date) :: timestamp with time zone,
(min_max.max_date) :: timestamp with time zone, '1 mon' :: interval) AS sales_date
FROM min_max),
completechannels AS (
SELECT DISTINCT channel
FROM channels_master
),
temp AS (
SELECT prices_master.sales_date,
prices_master.id_client,
prices_master.ean,
prices_master.channel,
prices_master.price,
lead(
prices_master.sales_date) OVER (PARTITION BY prices_master.id_client, prices_master.ean, prices_master.channel ORDER BY prices_master.sales_date) AS next_sales_date
FROM prices_master
ORDER BY prices_master.id_client, prices_master.ean, prices_master.channel, prices_master.sales_date
)
SELECT (full_dates.sales_date) :: date AS sales_date,
temp.id_client,
temp.ean,
completechannels.channel,
price
FROM full_dates
JOIN temp ON full_dates.sales_date >= temp.sales_date AND
(full_dates.sales_date < temp.next_sales_date OR temp.next_sales_date IS NULL)
JOIN completechannels ON 1=1
ORDER BY temp.id_client, temp.ean, completechannels.channel,
full_dates.sales_date;
My problem comes on code 7506205801143 on sales_date 2015-12-01, since this code has prices for both channels DISTRIBUIDOR and COMERCIAL my approach is duplicating the rows:
My approach result (bad)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 35667 |
+------------+-----------+---------------+---------------+-------+
Expected Result (good)
+------------+-----------+---------------+---------------+-------+
| sales_date | id_client | ean | channel | price |
+------------+-----------+---------------+---------------+-------+
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
+------------+-----------+---------------+---------------+-------+
For INSTITUTIONAL and NON_TRADE the max price can be used to fill the gaps.
sql postgresql gaps-and-islands
sql postgresql gaps-and-islands
asked Nov 12 at 3:17
JuanD
204
204
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
up vote
1
down vote
accepted
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date
/client
/ean
tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1
- although normally done via CROSS JOIN
) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN
back (again) to the Price_Date_Range
CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
add a comment |
up vote
1
down vote
You can try to use ROW_NUMBER
window function by sales_date
DESC in a subquery to get max row data for each channel
then use coalesce
and MAX
window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53255540%2ffill-gaps-of-dates-and-variable-within-group-postgresql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date
/client
/ean
tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1
- although normally done via CROSS JOIN
) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN
back (again) to the Price_Date_Range
CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
add a comment |
up vote
1
down vote
accepted
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date
/client
/ean
tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1
- although normally done via CROSS JOIN
) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN
back (again) to the Price_Date_Range
CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
add a comment |
up vote
1
down vote
accepted
up vote
1
down vote
accepted
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date
/client
/ean
tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1
- although normally done via CROSS JOIN
) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN
back (again) to the Price_Date_Range
CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
You're going to find this much easier to do by flipping some of this on its head, and considering the main prices table to be overrides. That is, you want to construct a "base" table of just the (maximum) price for the date
/client
/ean
tuple, and ignoring the channel until later.
First, you need to add the following CTE to the ones you already have (formatting/naming updated to my usual style):
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range -- aka TEMP in your original query
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
This makes the set multiplication from the cartesian product (JOIN completechannels ON 1=1
- although normally done via CROSS JOIN
) work with you: now there won't be extra rows:
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
Maximum_Price_Per_Date.price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 55667 |
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
Now we just have to LEFT JOIN
back (again) to the Price_Date_Range
CTE again, using the price there, if present:
-- Note that you should have a Calendar table, which would remove this.
WITH Date_Range AS (
-- You probably should be using an explicit range here, to account for future dates.
WITH Min_Max AS (
SELECT MIN(sales_date) AS min_date, MAX(sales_date) AS max_date
FROM Prices_Master
),
Timezone_Range AS (
SELECT GENERATE_SERIES(min_date, max_date, CAST('1 mon' AS INTERVAL)) AS sales_date
FROM Min_Max
)
SELECT CAST(sales_date AS DATE) AS sales_date
FROM Timezone_Range
),
-- This would really benefit by being a MQT - materialized query table
Price_Date_Range AS (
SELECT sales_date, lead(sales_date) OVER (PARTITION BY id_client, ean, channel ORDER BY sales_date) AS next_sales_date,
id_client, ean, channel, price
FROM Prices_Master
),
Maximum_Price_Per_Date AS (
SELECT Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean,
MAX(Price_Date_Range.price) AS price
FROM Date_Range
JOIN Price_Date_Range
ON Price_Date_Range.sales_date <= Date_Range.sales_date
AND (Price_Date_Range.next_sales_date > Date_Range.sales_date OR Price_Date_Range.next_sales_date IS NULL)
GROUP BY Date_Range.sales_date, Price_Date_Range.id_client, Price_Date_Range.ean
)
SELECT Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean,
Channel.channel,
COALESCE(Price_Date_Range.price, Maximum_Price_Per_Date.price) AS price
FROM Maximum_Price_Per_Date
CROSS JOIN (SELECT DISTINCT channel
FROM Channels_Master) Channel
LEFT JOIN Price_Date_Range
ON Price_Date_Range.channel = Channel.channel
AND Price_Date_Range.id_client = Maximum_Price_Per_Date.id_client
AND Price_Date_Range.ean = Maximum_Price_Per_Date.ean
AND Price_Date_Range.sales_date <= Maximum_Price_Per_Date.sales_date
AND (Price_Date_Range.next_sales_date > Maximum_Price_Per_Date.sales_date OR Price_Date_Range.next_sales_date IS NULL)
ORDER BY Maximum_Price_Per_Date.sales_date, Maximum_Price_Per_Date.id_client, Maximum_Price_Per_Date.ean, Channel.channel
Fiddle example
(thanks to @D-Shih for the setup)
generates (uninteresting rows omitted):
| sales_date | channel | id_client | ean | price |
|------------|---------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
answered Nov 12 at 5:57
Clockwork-Muse
10k32141
10k32141
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
add a comment |
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
Excellent!. Thanks for the naming convention, actually made things a lot clearer. It is an MQT, just wanted to keep ti simple.
– JuanD
Nov 12 at 13:06
add a comment |
up vote
1
down vote
You can try to use ROW_NUMBER
window function by sales_date
DESC in a subquery to get max row data for each channel
then use coalesce
and MAX
window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
up vote
1
down vote
You can try to use ROW_NUMBER
window function by sales_date
DESC in a subquery to get max row data for each channel
then use coalesce
and MAX
window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
add a comment |
up vote
1
down vote
up vote
1
down vote
You can try to use ROW_NUMBER
window function by sales_date
DESC in a subquery to get max row data for each channel
then use coalesce
and MAX
window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
You can try to use ROW_NUMBER
window function by sales_date
DESC in a subquery to get max row data for each channel
then use coalesce
and MAX
window function to make it.
Query 1:
WITH pricesCTE as (
SELECT price,sales_date,id_client,ean,cm.channel,ROW_NUMBER() OVER(PARTITION BY cm.channel ORDER BY sales_date DESC) rn
FROM (SELECT DISTINCT channel FROM channels_master) cm
LEFT JOIN prices_master pm on pm.channel = cm.channel
)
SELECT
coalesce(sales_date,MAX(sales_date) OVER(ORDER BY coalesce(price,0) DESC)) sales_date,
coalesce(id_client,MAX(id_client) OVER(ORDER BY coalesce(price,0) DESC)) id_client,
coalesce(ean,MAX(ean) OVER(ORDER BY coalesce(price,0) DESC)) ean,
channel,
coalesce(price,MAX(price) OVER(ORDER BY coalesce(price,0) DESC)) price
FROM
(
select *
from pricesCTE
where rn = 1
) t1
Results:
| sales_date | id_client | ean | channel | price |
|------------|-----------|---------------|---------------|-------|
| 2015-12-01 | 7 | 7506205801143 | COMERCIAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | DISTRIBUIDOR | 35667 |
| 2015-12-01 | 7 | 7506205801143 | INSTITUCIONAL | 55667 |
| 2015-12-01 | 7 | 7506205801143 | NON_TRADE | 55667 |
answered Nov 12 at 3:38
D-Shih
25k61431
25k61431
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53255540%2ffill-gaps-of-dates-and-variable-within-group-postgresql%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown