Joining tables based on the maximum id










4















I have found three questions which all seem to ask a similar question:



Getting max value from rows and joining to another table



Select only rows by join tables max value



Joining tables based on the maximum value



But I'm having a hard time wrapping my head around how exactly to join tables keeping only the maximum row of one of the tables when the maximum is in the id or index field itself.



I am looking for an answers that only require joins because this will allow the solution to work in a tool which generates queries for which it is easy to get it to generate the corresponding joins, although sub-queries are probably doable as well with a bit more effort. I found the answer below to be of particular interest:



SELECT DISTINCT b.id, b.filename, a1.name
FROM a a1
JOIN b
ON b.id = a1.id
LEFT JOIN a a2
ON a2.id = a1.id
AND a2.rank > a1.rank
WHERE a2.id IS NULL


However, in my case the ranking column is also the index, e.g. "id". I cannot compare for equality and greater than at the same time, because they will never be true at the same time!



Also, potentially complicating the situation is that a typical query in which I have need of this may join several tables (3-5 is not uncommon). So as a simplified example of my query:



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


And what I want to do is to eliminate duplicates in table3 by only getting the row with the maximum id (e.g. MAX(table3.id)) for each unique combination of all the other fields. That is to say, the above query is returning something like this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | 1,... |
| A | A | A | 2,... |
| A | A | A | 3,... |
| A | A | A | MAX2,...|
| B | B | B | 1,... |
| B | B | B | 2,... |
| B | B | B | 3,... |
| B | B | B | MAX2,...|
+-------+-------+-------+---------+


(I'm just using A and B to denote that I'm talking about all the same values for the fields in table1, table2, and table4 for a particular set of rows.)



and I want to reduce it to this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | MAX1,...|
| B | B | B | MAX2,...|
+-------+-------+-------+---------+









share|improve this question
























  • How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – 500 - Internal Server Error
    Nov 15 '18 at 19:10











  • @500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – Michael
    Nov 15 '18 at 19:11






  • 1





    Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

    – Zohar Peled
    Nov 15 '18 at 19:14











  • @Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

    – Dávid Laczkó
    Nov 15 '18 at 19:16












  • @DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

    – Michael
    Nov 15 '18 at 19:29















4















I have found three questions which all seem to ask a similar question:



Getting max value from rows and joining to another table



Select only rows by join tables max value



Joining tables based on the maximum value



But I'm having a hard time wrapping my head around how exactly to join tables keeping only the maximum row of one of the tables when the maximum is in the id or index field itself.



I am looking for an answers that only require joins because this will allow the solution to work in a tool which generates queries for which it is easy to get it to generate the corresponding joins, although sub-queries are probably doable as well with a bit more effort. I found the answer below to be of particular interest:



SELECT DISTINCT b.id, b.filename, a1.name
FROM a a1
JOIN b
ON b.id = a1.id
LEFT JOIN a a2
ON a2.id = a1.id
AND a2.rank > a1.rank
WHERE a2.id IS NULL


However, in my case the ranking column is also the index, e.g. "id". I cannot compare for equality and greater than at the same time, because they will never be true at the same time!



Also, potentially complicating the situation is that a typical query in which I have need of this may join several tables (3-5 is not uncommon). So as a simplified example of my query:



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


And what I want to do is to eliminate duplicates in table3 by only getting the row with the maximum id (e.g. MAX(table3.id)) for each unique combination of all the other fields. That is to say, the above query is returning something like this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | 1,... |
| A | A | A | 2,... |
| A | A | A | 3,... |
| A | A | A | MAX2,...|
| B | B | B | 1,... |
| B | B | B | 2,... |
| B | B | B | 3,... |
| B | B | B | MAX2,...|
+-------+-------+-------+---------+


(I'm just using A and B to denote that I'm talking about all the same values for the fields in table1, table2, and table4 for a particular set of rows.)



and I want to reduce it to this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | MAX1,...|
| B | B | B | MAX2,...|
+-------+-------+-------+---------+









share|improve this question
























  • How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – 500 - Internal Server Error
    Nov 15 '18 at 19:10











  • @500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – Michael
    Nov 15 '18 at 19:11






  • 1





    Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

    – Zohar Peled
    Nov 15 '18 at 19:14











  • @Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

    – Dávid Laczkó
    Nov 15 '18 at 19:16












  • @DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

    – Michael
    Nov 15 '18 at 19:29













4












4








4








I have found three questions which all seem to ask a similar question:



Getting max value from rows and joining to another table



Select only rows by join tables max value



Joining tables based on the maximum value



But I'm having a hard time wrapping my head around how exactly to join tables keeping only the maximum row of one of the tables when the maximum is in the id or index field itself.



I am looking for an answers that only require joins because this will allow the solution to work in a tool which generates queries for which it is easy to get it to generate the corresponding joins, although sub-queries are probably doable as well with a bit more effort. I found the answer below to be of particular interest:



SELECT DISTINCT b.id, b.filename, a1.name
FROM a a1
JOIN b
ON b.id = a1.id
LEFT JOIN a a2
ON a2.id = a1.id
AND a2.rank > a1.rank
WHERE a2.id IS NULL


However, in my case the ranking column is also the index, e.g. "id". I cannot compare for equality and greater than at the same time, because they will never be true at the same time!



Also, potentially complicating the situation is that a typical query in which I have need of this may join several tables (3-5 is not uncommon). So as a simplified example of my query:



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


And what I want to do is to eliminate duplicates in table3 by only getting the row with the maximum id (e.g. MAX(table3.id)) for each unique combination of all the other fields. That is to say, the above query is returning something like this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | 1,... |
| A | A | A | 2,... |
| A | A | A | 3,... |
| A | A | A | MAX2,...|
| B | B | B | 1,... |
| B | B | B | 2,... |
| B | B | B | 3,... |
| B | B | B | MAX2,...|
+-------+-------+-------+---------+


(I'm just using A and B to denote that I'm talking about all the same values for the fields in table1, table2, and table4 for a particular set of rows.)



and I want to reduce it to this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | MAX1,...|
| B | B | B | MAX2,...|
+-------+-------+-------+---------+









share|improve this question
















I have found three questions which all seem to ask a similar question:



Getting max value from rows and joining to another table



Select only rows by join tables max value



Joining tables based on the maximum value



But I'm having a hard time wrapping my head around how exactly to join tables keeping only the maximum row of one of the tables when the maximum is in the id or index field itself.



I am looking for an answers that only require joins because this will allow the solution to work in a tool which generates queries for which it is easy to get it to generate the corresponding joins, although sub-queries are probably doable as well with a bit more effort. I found the answer below to be of particular interest:



SELECT DISTINCT b.id, b.filename, a1.name
FROM a a1
JOIN b
ON b.id = a1.id
LEFT JOIN a a2
ON a2.id = a1.id
AND a2.rank > a1.rank
WHERE a2.id IS NULL


However, in my case the ranking column is also the index, e.g. "id". I cannot compare for equality and greater than at the same time, because they will never be true at the same time!



Also, potentially complicating the situation is that a typical query in which I have need of this may join several tables (3-5 is not uncommon). So as a simplified example of my query:



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4
INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


And what I want to do is to eliminate duplicates in table3 by only getting the row with the maximum id (e.g. MAX(table3.id)) for each unique combination of all the other fields. That is to say, the above query is returning something like this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | 1,... |
| A | A | A | 2,... |
| A | A | A | 3,... |
| A | A | A | MAX2,...|
| B | B | B | 1,... |
| B | B | B | 2,... |
| B | B | B | 3,... |
| B | B | B | MAX2,...|
+-------+-------+-------+---------+


(I'm just using A and B to denote that I'm talking about all the same values for the fields in table1, table2, and table4 for a particular set of rows.)



and I want to reduce it to this:



+-------+-------+-------+---------+
| table1| table2| table4|table3 |
+-------+-------+-------+---------+
| A | A | A | MAX1,...|
| B | B | B | MAX2,...|
+-------+-------+-------+---------+






sql sql-server tsql join






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 15 '18 at 19:09









Rahul Neekhra

6001627




6001627










asked Nov 15 '18 at 18:57









MichaelMichael

4,17383777




4,17383777












  • How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – 500 - Internal Server Error
    Nov 15 '18 at 19:10











  • @500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – Michael
    Nov 15 '18 at 19:11






  • 1





    Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

    – Zohar Peled
    Nov 15 '18 at 19:14











  • @Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

    – Dávid Laczkó
    Nov 15 '18 at 19:16












  • @DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

    – Michael
    Nov 15 '18 at 19:29

















  • How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – 500 - Internal Server Error
    Nov 15 '18 at 19:10











  • @500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

    – Michael
    Nov 15 '18 at 19:11






  • 1





    Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

    – Zohar Peled
    Nov 15 '18 at 19:14











  • @Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

    – Dávid Laczkó
    Nov 15 '18 at 19:16












  • @DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

    – Michael
    Nov 15 '18 at 19:29
















How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

– 500 - Internal Server Error
Nov 15 '18 at 19:10





How about SELECT table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

– 500 - Internal Server Error
Nov 15 '18 at 19:10













@500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

– Michael
Nov 15 '18 at 19:11





@500-InternalServerError SELECT table1.field1, table1.field2, table1.field3, table2.field1, table2.field2, table2.field3, table3.field1, table3.field2, table3.field3, table4.field1, table4.field2, table4.field3 table1.id, MAX(table3.field3) FROM (<query from above>) GROUP BY table1.id ?

– Michael
Nov 15 '18 at 19:11




1




1





Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

– Zohar Peled
Nov 15 '18 at 19:14





Sample data is best served as DDL + DML. Please edit your question to include it, your current attempt and your desired results. For more details, read this.

– Zohar Peled
Nov 15 '18 at 19:14













@Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

– Dávid Laczkó
Nov 15 '18 at 19:16






@Michael I think it should be SELECT table1.columns, table2.columns, table4.columns, MAX(table3.columns) and GROUP BY table1.columns, table2.columns, table4.columns. If you don't aggregate all columns from table3 then those should be put to the GROUP BY

– Dávid Laczkó
Nov 15 '18 at 19:16














@DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

– Michael
Nov 15 '18 at 19:29





@DávidLaczkó My understanding was that some databases don't guarantee that MAX and GROUP BY will play well together to return data from the same row. For instance, if table 3 has "3, A, RED", "4, B, BLUE", "5, C, WHITE" for id, field1 and field2, then MAX(id) will return 5 but the field1 and field2 values aren't guaranteed to be "C" and "WHITE"...

– Michael
Nov 15 '18 at 19:29












1 Answer
1






active

oldest

votes


















2














You can add a derived table to reduce the matching rows in TABLE3 to one per group. Another method would use a window function but you asked for a JOIN only



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4

--here is the added derived table. Change column names as needed
INNER JOIN (select UID, ID = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID

INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.



INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID


Here is an example. You'll notice that the last three column pairs are identical. You only want the last one, which is the max(id) for that grouping. What ever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining with) is what you'd want to include int he derived table and join condition.



declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')

select * from @table

select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx





share|improve this answer




















  • 1





    Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

    – Michael
    Nov 15 '18 at 19:42











  • UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

    – scsimon
    Nov 15 '18 at 19:45











  • also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

    – scsimon
    Nov 15 '18 at 19:47












  • Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

    – Michael
    Nov 15 '18 at 19:51












  • see the edit @Michael and see if that clears it up

    – scsimon
    Nov 15 '18 at 20:03










Your Answer






StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53326207%2fjoining-tables-based-on-the-maximum-id%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









2














You can add a derived table to reduce the matching rows in TABLE3 to one per group. Another method would use a window function but you asked for a JOIN only



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4

--here is the added derived table. Change column names as needed
INNER JOIN (select UID, ID = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID

INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.



INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID


Here is an example. You'll notice that the last three column pairs are identical. You only want the last one, which is the max(id) for that grouping. What ever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining with) is what you'd want to include int he derived table and join condition.



declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')

select * from @table

select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx





share|improve this answer




















  • 1





    Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

    – Michael
    Nov 15 '18 at 19:42











  • UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

    – scsimon
    Nov 15 '18 at 19:45











  • also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

    – scsimon
    Nov 15 '18 at 19:47












  • Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

    – Michael
    Nov 15 '18 at 19:51












  • see the edit @Michael and see if that clears it up

    – scsimon
    Nov 15 '18 at 20:03















2














You can add a derived table to reduce the matching rows in TABLE3 to one per group. Another method would use a window function but you asked for a JOIN only



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4

--here is the added derived table. Change column names as needed
INNER JOIN (select UID, ID = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID

INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.



INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID


Here is an example. You'll notice that the last three column pairs are identical. You only want the last one, which is the max(id) for that grouping. What ever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining with) is what you'd want to include int he derived table and join condition.



declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')

select * from @table

select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx





share|improve this answer




















  • 1





    Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

    – Michael
    Nov 15 '18 at 19:42











  • UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

    – scsimon
    Nov 15 '18 at 19:45











  • also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

    – scsimon
    Nov 15 '18 at 19:47












  • Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

    – Michael
    Nov 15 '18 at 19:51












  • see the edit @Michael and see if that clears it up

    – scsimon
    Nov 15 '18 at 20:03













2












2








2







You can add a derived table to reduce the matching rows in TABLE3 to one per group. Another method would use a window function but you asked for a JOIN only



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4

--here is the added derived table. Change column names as needed
INNER JOIN (select UID, ID = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID

INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.



INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID


Here is an example. You'll notice that the last three column pairs are identical. You only want the last one, which is the max(id) for that grouping. What ever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining with) is what you'd want to include int he derived table and join condition.



declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')

select * from @table

select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx





share|improve this answer















You can add a derived table to reduce the matching rows in TABLE3 to one per group. Another method would use a window function but you asked for a JOIN only



SELECT
table1.field1, table1.field2, table1.field3,
table2.field1, table2.field2, table2.field3,
table3.field1, table3.field2, table3.field3,
table4.field1, table4.field2, table4.field3
FROM table1
INNER JOIN table2 ON
table1.field1 = table2.field1
AND table1.field2 = table2.field2
AND table2.field3 < 0
INNER JOIN table3 ON
table2.field1 = table3.field1
AND table2.field4 = table3.field4

--here is the added derived table. Change column names as needed
INNER JOIN (select UID, ID = max(ID) from Table3 group by UID) x
on x.UID = table3.UID and x.mx = table3.ID

INNER JOIN table4 ON
table1.field1 = table4.field1
AND table1.field2 = table4.field2


Or, perhaps... something like below. It really depends on your schema and that's hard to understand with the sample data.



INNER JOIN (select field1, field4, mx = max(ID) from Table3 group by field1, field4) x
on x.field1 = table3.field1 and x.field4 = table3.field4 and x.mx = table3.ID


Here is an example. You'll notice that the last three column pairs are identical. You only want the last one, which is the max(id) for that grouping. What ever makes a row unique in relation to the rest of your data (not your primary key, but what you are joining with) is what you'd want to include int he derived table and join condition.



declare @table table (id int identity(1,1), f1 char(1), f2 char(1))
insert into @table
values
('a','b'),
('a','c'),
('a','a'),
('b','b'),
('b','b'),
('b','b')

select * from @table

select t1.*
from @table t1
inner join
(select f1, f2, mx = max(id) from @table group by f1, f2) t2 on
t1.f1 = t2.f1
and t1.f2 = t2.f2
and t1.id = t2.mx






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 15 '18 at 20:02

























answered Nov 15 '18 at 19:37









scsimonscsimon

22.2k51537




22.2k51537







  • 1





    Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

    – Michael
    Nov 15 '18 at 19:42











  • UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

    – scsimon
    Nov 15 '18 at 19:45











  • also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

    – scsimon
    Nov 15 '18 at 19:47












  • Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

    – Michael
    Nov 15 '18 at 19:51












  • see the edit @Michael and see if that clears it up

    – scsimon
    Nov 15 '18 at 20:03












  • 1





    Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

    – Michael
    Nov 15 '18 at 19:42











  • UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

    – scsimon
    Nov 15 '18 at 19:45











  • also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

    – scsimon
    Nov 15 '18 at 19:47












  • Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

    – Michael
    Nov 15 '18 at 19:51












  • see the edit @Michael and see if that clears it up

    – scsimon
    Nov 15 '18 at 20:03







1




1





Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

– Michael
Nov 15 '18 at 19:42





Sorry, a bit confused... I think UID is the id field of the table and ID is the max, so should that be ID=max(UID), and from Table3 group by UID?

– Michael
Nov 15 '18 at 19:42













UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

– scsimon
Nov 15 '18 at 19:45





UID was a poor name choice probably. So, UID here would be a foreign key (most likely to your other tables. So, either field1 or field4... or both if you need them. the ID would be the column where you want the max. If you want the max(ID) for each field1 / field4 pair, then just replace UID with these two columns and group by them and add them to the join clause (i'll add a snippet at the bottom in an edit)

– scsimon
Nov 15 '18 at 19:45













also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

– scsimon
Nov 15 '18 at 19:47






also i had table2 in the derived table, it was supposed to be table3. You are doing a self inner join is the point to limit the rows. and x.id should have been x.mx since that is what i aliased it as.

– scsimon
Nov 15 '18 at 19:47














Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

– Michael
Nov 15 '18 at 19:51






Yeah, sorry I didn't really document the schema, I'm trying to solve the general case as I've got several queries I'm trying to fix, so you can just assume there is an "id" field which is the index (e.g. INTEGER PRIMARY KEY or whatever) and arbitrary numbered fields "field1", "field2" etc. up to however many fields there are for every table. If I'm understanding the select in the self-join should (just) include all the fields the previous table was joined on?

– Michael
Nov 15 '18 at 19:51














see the edit @Michael and see if that clears it up

– scsimon
Nov 15 '18 at 20:03





see the edit @Michael and see if that clears it up

– scsimon
Nov 15 '18 at 20:03



















draft saved

draft discarded
















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53326207%2fjoining-tables-based-on-the-maximum-id%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Top Tejano songwriter Luis Silva dead of heart attack at 64

政党

天津地下鉄3号線