Pivoting a redshift table
I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):
+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+
The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.
Pivot example, ignore col2 & col3..
+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+
After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.
sql pivot amazon-redshift
add a comment |
I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):
+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+
The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.
Pivot example, ignore col2 & col3..
+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+
After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.
sql pivot amazon-redshift
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30
add a comment |
I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):
+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+
The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.
Pivot example, ignore col2 & col3..
+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+
After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.
sql pivot amazon-redshift
I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):
+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+
The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.
Pivot example, ignore col2 & col3..
+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+
After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.
sql pivot amazon-redshift
sql pivot amazon-redshift
edited Nov 8 at 21:40
a_horse_with_no_name
291k46444537
291k46444537
asked Nov 8 at 20:55
cexcelc
155
155
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30
add a comment |
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30
add a comment |
1 Answer
1
active
oldest
votes
Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest
, array
or lateral
is out of the question (I've learned this the hard way).
So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all
and creating a union for each metric column.
select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric
Results
id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange
SQL Fiddle
1
Your colours are in the wrong order...Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53215986%2fpivoting-a-redshift-table%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest
, array
or lateral
is out of the question (I've learned this the hard way).
So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all
and creating a union for each metric column.
select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric
Results
id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange
SQL Fiddle
1
Your colours are in the wrong order...Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
add a comment |
Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest
, array
or lateral
is out of the question (I've learned this the hard way).
So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all
and creating a union for each metric column.
select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric
Results
id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange
SQL Fiddle
1
Your colours are in the wrong order...Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
add a comment |
Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest
, array
or lateral
is out of the question (I've learned this the hard way).
So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all
and creating a union for each metric column.
select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric
Results
id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange
SQL Fiddle
Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest
, array
or lateral
is out of the question (I've learned this the hard way).
So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all
and creating a union for each metric column.
select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric
Results
id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange
SQL Fiddle
answered Nov 12 at 15:35
Daniel Gimenez
10.4k22446
10.4k22446
1
Your colours are in the wrong order...Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
add a comment |
1
Your colours are in the wrong order...Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
1
1
Your colours are in the wrong order...
Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
Your colours are in the wrong order...
Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53215986%2fpivoting-a-redshift-table%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46
@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18
Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27
For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30