aaronburro Sup, B 53146 Posts user info edit post |
OK, I'm feeling really dumb right now. I've got a View, as so:
-------------- Facility Characterization Revision EffectiveDate ... LastDate --------------
Basically, I want to find the revision for a characterization and facility with the highest effective date such that a given date falls between EffectiveDate and LastDate. if LastDate is NULL, I assume that it is greater than the given date.
Here is the query I have come up with.
Quote : | "SELECT B.Facility, B.Characterization, B.Revision, B.EffectiveDate, B.LastDate FROM (SELECT Facility, Characterization, MAX (EffectiveDate) AS Highest FROM View_CharacterizationRevisions WHERE EffectiveDate <= @Date AND ((LastDate IS NULL) OR (LastDate >= @Date)) GROUP BY Facility, Characterization) A, View_CharacterizationRevisions B WHERE A.Facility = B.Facility AND A.Characterization = B.Characterization AND Highest = B.EffectiveDate " |
This seems awfully complicated. is there a better way to do this so that I don't have to use two freaking SELECT statements? I seem to recall that there was a way I could get the maximum of a group, but when I try that, it always makes me put every stupid field into the GROUP BY clause, which ends up giving me the wrong result. This query is going to run on MS SQL 2005.]2/4/2010 1:22:57 PM |
1985 All American 2175 Posts user info edit post |
If the entries are unique per date, you can just do top 1 and order by date desc
[Edited on February 4, 2010 at 1:34 PM. Reason : h] 2/4/2010 1:34:01 PM |
aaronburro Sup, B 53146 Posts user info edit post |
haha, I did use that for one, but i should have mentioned that I would also like to run this for all characterizations, too. In other words, I'd like to know the revision for each characterization and facility combination that has the highest effective date that matches the WHERE clause. And, effective dates are NOT guaranteed to be unique, for what it's worth] 2/4/2010 1:39:37 PM |
Golovko All American 27023 Posts user info edit post |
Quote : | "OK, I'm feeling really dumb right now" |
Is that because you made a SQL thread when there already is one? 2/4/2010 2:03:41 PM |
aaronburro Sup, B 53146 Posts user info edit post |
there are trillions, lol. I didn't see any current ones 2/4/2010 2:18:03 PM |
Wolfmarsh What? 5975 Posts user info edit post |
Mind posting any sample data and the desired result set? I dont get your explanation very well. 2/4/2010 2:19:49 PM |
aaronburro Sup, B 53146 Posts user info edit post |
Sure. Let's assume facility is always constant...
Row Characterization Revision EffectiveDate LastDate --- ---------------- -------- ------------- ------------- 1 Apples 0 01/01/2001 11/01/2002 2 Apples 1 02/04/2004 NULL 3 Apples 2 10/15/2002 NULL 4 Oranges 0 05/03/1984 NULL 5 Bananas 0 06/25/1981 01/01/2009 6 Bananas 1 01/01/2009 NULL
If I pass a date of 10/16/2002, I expect rows 3, 4, and 5 to be returned, for instance If I pass a date of 3/15/2009, I expect rows 2, 3, and 6 to be returned.]2/4/2010 2:34:28 PM |
Wolfmarsh What? 5975 Posts user info edit post |
In the second example, did you mean you expect 2, 4, and 6? 2/4/2010 2:51:05 PM |
disco_stu All American 7436 Posts user info edit post |
SELECT fields FROM table WHERE date >= EffectiveDate AND date <= COALESCE(LastDate, '12/31/9999')
This will turn NULL LastDates into 12/31/9999.
Which granted will give your query a Y10k bug, but hey, SQL doesn't currently support dates past 9999 anyway. Assuming humanity will even use a decimal calendar at that point.
[Edited on February 4, 2010 at 3:52 PM. Reason : y10k] 2/4/2010 3:35:45 PM |
Novicane All American 15416 Posts user info edit post |
i can hear the programmers in Y10K bitching already at you. 2/4/2010 4:20:44 PM |
aaronburro Sup, B 53146 Posts user info edit post |
wolfmarsh, yeah, 2,4,6 is right.
^^ so that's what coalesce does... interesting. How is that helping to find the MAX(EffectiveDate), though? Is it?] 2/4/2010 6:08:11 PM |
disco_stu All American 7436 Posts user info edit post |
I'm confused; do you want to feed it a date and then determine if that date is after the latest effective date in the table? I thought that you simply wanted records that were between to dates where the end date could be null and null dates should be ignored.
Coalesce effectively converts nulls but what it's really doing is returning the first non-null value in the list you provide.
[Edited on February 4, 2010 at 6:51 PM. Reason : Coalesce ] 2/4/2010 6:49:35 PM |
AngryOldMan Suspended 655 Posts user info edit post |
Just learn ruby so you don't have to think about this shit. 2/4/2010 6:51:39 PM |
aaronburro Sup, B 53146 Posts user info edit post |
no, I'm more concerned with a more efficient / easier way to find the max value and then get the rest of the info from the table without having to run a subquery first. 2/4/2010 10:57:43 PM |
KillaB All American 1652 Posts user info edit post |
My SQL is pretty mediocre, but my understanding is that since you can't do the max() function in the where clause, there's no better way to do this than a subquery. We use effective date a lot at the company I work for and while this is a drawback, the tradeoff of having some historical data and being able to follow the flow easily for a given item, is well worth the slightly higher transaction cost.
Someone please feel free to correct me if I'm way off base here. 2/5/2010 7:29:52 AM |
Wolfmarsh What? 5975 Posts user info edit post |
This is a tough one.
There is a way to do it with joins to eliminate the subquery, but have you actually looked at the performance of this?
Subselects can actually outperform joins in some situations. 2/5/2010 9:51:27 AM |
disco_stu All American 7436 Posts user info edit post |
I am an idiot and cannot determine what you're trying to do. Nor does your resultset above make sense.
You want to feed it a date, determine what the highest effective date is in your table, and do what?
Do you want to get the record that has the highest effective date that is less than the date that you give? What does end date have to do with this? 2/5/2010 10:06:44 AM |
aaronburro Sup, B 53146 Posts user info edit post |
yes, I want the record that has the highest effective date that is less than the date given, but end date must be greater than the date given, too.
I amended the result set above to be 2,4,6 as opposed to 2,3,6, btw. Overall, I'm trying to see if there is an easier way to eliminate the subquery, but apparently there is not. 2/6/2010 4:34:20 PM |
disco_stu All American 7436 Posts user info edit post |
That's what I don't get.
If you want a record that has the highest date, why are you getting multiple records back? 2, 3, and 6 have different dates (with the highest being 6). 2/8/2010 4:40:27 PM |
FroshKiller All American 51913 Posts user info edit post |
i basically didn't read this thread but can i just say 30-day indices
peace 2/8/2010 4:43:08 PM |
1985 All American 2175 Posts user info edit post |
don't use the coalesce on a static number. assuming last date is always in the past, use getdate() + 1, then those Y10K guys will be happy.
disco_stu I think his second example is only for the subquery, then the outer query would find a max on his second example
also, if your effective dates aren't unique for facility and charactarization, your query is going to return multiple results, im assuming this is ok?
[Edited on February 8, 2010 at 7:48 PM. Reason : .]
[Edited on February 8, 2010 at 7:49 PM. Reason : .]
[Edited on February 8, 2010 at 7:50 PM. Reason : .] 2/8/2010 7:36:06 PM |
aaronburro Sup, B 53146 Posts user info edit post |
nah, (2,4,6) gives what I want for the whole query. I'm looking for Max(EffectiveDate (Facility,Characterization)), if that makes any sense so I can find the revision. There should be in the result set, at most, one record for each Facility/Characterization pair that exists in the table. 2/8/2010 8:18:43 PM |
Wolfmarsh What? 5975 Posts user info edit post |
I got queries that do exactly what you want, but wasnt able to do so effectively without subqueries.
Thier performance is comparable to doing it with joins, just use the subquery and be done with it. 2/8/2010 9:09:55 PM |
KillaB All American 1652 Posts user info edit post |
Agree with the above. Plus if indexed properly, the overhead of the subquery is somewhat negated. 2/8/2010 11:42:41 PM |
1985 All American 2175 Posts user info edit post |
I've got a question:
Ive got a table that looks something like :
Quote : | " ID week value -- ---- ----- 1 2 12 1 2 7 2 8 5 5 8 2 5 8 10
" |
and I want to remove the rows with the same id and week that dont contain the maximal value
for example, what I want to remain is:
Quote : | " ID week value -- ---- ----- 1 2 12 2 8 5 5 8 10
" |
2/17/2010 1:03:59 PM |
BigMan157 no u 103354 Posts user info edit post |
make a temporary table SELECT id, week, MAX(value) FROM table GROUP BY id insert into temporary table truncate initial table copy temp table to initial table
i'm sure there's a much simpler way than that though 2/17/2010 1:25:11 PM |
FroshKiller All American 51913 Posts user info edit post |
Do you honestly want to delete the records, or do you just want that output? 2/17/2010 1:51:49 PM |
1985 All American 2175 Posts user info edit post |
I want to delete them. 2/17/2010 1:54:12 PM |
Wolfmarsh What? 5975 Posts user info edit post |
Does this table have an identity column to it, or is that all the data that is there? 2/17/2010 1:57:57 PM |
1985 All American 2175 Posts user info edit post |
It does have an identity column 2/17/2010 2:02:23 PM |
Wolfmarsh What? 5975 Posts user info edit post |
Whats the name of the column and ill give you the sql. 2/17/2010 2:03:15 PM |
1985 All American 2175 Posts user info edit post |
HSID 2/17/2010 2:10:08 PM |
Wolfmarsh What? 5975 Posts user info edit post |
Here is my SQL, including all of my test sql to make sure it worked:
--Create test table CREATE TABLE [dbo].[1985Data]( [HSID] [int] IDENTITY(1,1) NOT NULL, [ID] [int], [week] [int], [value] [int] )
--Clear my test table Delete From [1985Data]
--Add test data Insert Into [1985Data] (ID, week, value) VALUES (1, 2, 12) Insert Into [1985Data] (ID, week, value) VALUES (1, 2, 7) Insert Into [1985Data] (ID, week, value) VALUES (2, 8, 5) Insert Into [1985Data] (ID, week, value) VALUES (5, 8, 2) Insert Into [1985Data] (ID, week, value) VALUES (5, 8, 10)
--Actual delete statement Delete From [1985Data] Where HSID Not In ( Select HSID From [1985Data] Where value = (Select MAX(value) from [1985Data] as D2 where D2.id=[1985Data].ID) )
--Test output Select * From [1985Data] 2/17/2010 2:18:28 PM |
1985 All American 2175 Posts user info edit post |
that looks like it should work, thanks. my table is ~ 50 million records, and the duplicates are probably .05% of that so I was trying to approach it by subquerying the records that should be deleted rather than the ones that shouldn't. I don't know much about optimization, would that even make a difference? 2/17/2010 3:30:11 PM |
FroshKiller All American 51913 Posts user info edit post |
jesus christ dude 2/17/2010 3:34:30 PM |
1985 All American 2175 Posts user info edit post |
- Ya, i know. 2/17/2010 3:46:59 PM |
Wolfmarsh What? 5975 Posts user info edit post |
It wont make that much of a difference. Proper indexing would matter more.
I've had a lot of success using the tuning wizard in the past:
http://msdn.microsoft.com/en-us/library/ms979195.aspx#scalenethowto03_topic5
There may be a more current article, but thats what I have in my favorites. 2/17/2010 3:51:23 PM |
1985 All American 2175 Posts user info edit post |
Thanks a bunch, i appreciate it. 2/17/2010 4:00:16 PM |