Pivoting a redshift table










0














I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):



+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+


The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.



Pivot example, ignore col2 & col3..



+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+


After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.










share|improve this question























  • If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
    – MatBailie
    Nov 12 at 14:46










  • @MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
    – cexcelc
    Nov 13 at 15:18










  • Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
    – MatBailie
    Nov 13 at 15:27










  • For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
    – MatBailie
    Nov 13 at 15:30















0














I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):



+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+


The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.



Pivot example, ignore col2 & col3..



+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+


After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.










share|improve this question























  • If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
    – MatBailie
    Nov 12 at 14:46










  • @MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
    – cexcelc
    Nov 13 at 15:18










  • Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
    – MatBailie
    Nov 13 at 15:27










  • For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
    – MatBailie
    Nov 13 at 15:30













0












0








0







I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):



+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+


The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.



Pivot example, ignore col2 & col3..



+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+


After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.










share|improve this question















I think I am needing to pivot my database... or maybe there is some other function I can use to get the result I am looking for. Below is what my current dataset looks like (I actually have about 15 metrics):



+----------------------------------+---------+------------------------+----------------+
| ID | Metric 1| Metric 2 | Overall Column |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | Yellow | Red |
| 2 | Yellow | Yellow | Yellow |
| 3 | Yellow | | Yellow |
+----------------------------------+---------+------------------------+----------------+


The overall column already has logic in SQL to say 'Red' if any of the Metrics are Red (even if they are Yellow, too), and then 'Yellow' if any are Yellow. There are also cases where Two metrics can be Yellow, Red, etc. What I am looking to do is add a new column that will show specifically which metric (or metrics) caused the overall value of Red or Yellow. What I am thinking is some sort of pivot that will, for each ID, have metrics as a row value and the corresponding color also as a row value (if that makes sense), and then I can do a listagg function and then join that table back on to my original dataset based on the ID.



Pivot example, ignore col2 & col3..



+----------------------------------+---------+------------------------+----------------+
| ID | col1 | col2 | col3 |
+----------------------------------+---------+------------------------+----------------+
| 1 | Red | | |
| 1 | Yellow | | |
| 3 | Yellow | | |
+----------------------------------+---------+------------------------+----------------+


After this I can listagg that table to capture multiple colors and then join it to the original table. The only thing I am leaving out there is if there is both Red and Yellow metric for an individual ID and then I do a listagg, that would bring both Red and Yellow even though the overall value is based on the Red metric. Hoping the SQL experts can help me out here.







sql pivot amazon-redshift






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 8 at 21:40









a_horse_with_no_name

291k46444537




291k46444537










asked Nov 8 at 20:55









cexcelc

155




155











  • If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
    – MatBailie
    Nov 12 at 14:46










  • @MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
    – cexcelc
    Nov 13 at 15:18










  • Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
    – MatBailie
    Nov 13 at 15:27










  • For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
    – MatBailie
    Nov 13 at 15:30
















  • If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
    – MatBailie
    Nov 12 at 14:46










  • @MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
    – cexcelc
    Nov 13 at 15:18










  • Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
    – MatBailie
    Nov 13 at 15:27










  • For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
    – MatBailie
    Nov 13 at 15:30















If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46




If your starting point is the result of SQL, you may be better to show us the starting data and the starting SQL. Then be explicitly clear about the final result you actually want. I've been using Redshift for over 5 years, so I believe I can help, but your question is too vague and rambling, sorry.
– MatBailie
Nov 12 at 14:46












@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18




@MatBailie The starting point is the first table I posted. It is created by a series of joins and case whens. Every 'Metric' column is just a logic field created using other fields. The final result I am wanting is the second table I posted. Table one has no duplicate IDs, with ~15 metrics. Table 2 has duplicate IDs since one ID can have a value for multiple metrics (hence the pivot)
– cexcelc
Nov 13 at 15:18












Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27




Then, as I said, you should be showing the data from before the joins and case whens. That's your starting data. Pivoting is often possible at the same time. Instead of creating the flat-file format, then pivoting, you just create the normalised format to begin with (better performance, often simpler code, etc, etc).
– MatBailie
Nov 13 at 15:27












For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30




For the desired results, you appear to be missing at least one column, the name/identifier of the metic that is red or yellow. You also say that you then want to listagg() over it and join it back to the first table. So that is NOT the final desired result. Please show actual example inputs and actual desired outputs, not incomplete examples of interim steps.
– MatBailie
Nov 13 at 15:30












1 Answer
1






active

oldest

votes


















0














Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest, array or lateral is out of the question (I've learned this the hard way).



So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all and creating a union for each metric column.



select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric


Results



id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange


SQL Fiddle






share|improve this answer
















  • 1




    Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
    – MatBailie
    Nov 13 at 15:31











Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53215986%2fpivoting-a-redshift-table%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









0














Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest, array or lateral is out of the question (I've learned this the hard way).



So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all and creating a union for each metric column.



select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric


Results



id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange


SQL Fiddle






share|improve this answer
















  • 1




    Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
    – MatBailie
    Nov 13 at 15:31
















0














Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest, array or lateral is out of the question (I've learned this the hard way).



So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all and creating a union for each metric column.



select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric


Results



id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange


SQL Fiddle






share|improve this answer
















  • 1




    Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
    – MatBailie
    Nov 13 at 15:31














0












0








0






Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest, array or lateral is out of the question (I've learned this the hard way).



So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all and creating a union for each metric column.



select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric


Results



id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange


SQL Fiddle






share|improve this answer












Redshift is currently based on Postgres 8.03 so it is missing a lot of functionality we've come to expect from Postgres over the last few years. So trying to come up with a solution involving unnest, array or lateral is out of the question (I've learned this the hard way).



So barring the availability of all those new-fangled features, you can unpivot the source table and create a set of each id and its metrics by using union all and creating a union for each metric column.



select a.id, metrics.metric
from tbl a
inner join (
select id, metric1 metric from tbl where metric1 is not null
union all select id, metric2 from tbl where metric2 is not null
union all select id, metric15 from tbl where metric15 is not null
) metrics ON metrics.id = a.id
order by a.id, metrics.metric


Results



id | item
---+--------
1 | red
1 | yellow
2 | blue
2 | green
2 | pink
3 | orange


SQL Fiddle







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 12 at 15:35









Daniel Gimenez

10.4k22446




10.4k22446







  • 1




    Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
    – MatBailie
    Nov 13 at 15:31













  • 1




    Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
    – MatBailie
    Nov 13 at 15:31








1




1




Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31





Your colours are in the wrong order... Red and yellow and pink and green, purple and orange and blue, I can sing a rainbow, sing a rainbow, sing a rainbow too.
– MatBailie
Nov 13 at 15:31


















draft saved

draft discarded
















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53215986%2fpivoting-a-redshift-table%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Top Tejano songwriter Luis Silva dead of heart attack at 64

政党

天津地下鉄3号線