[eluser]jleequeen[/eluser]
Hello all,
I didn't want to have to ask a non-codeigniter question, but I'm stumped and if anyone has any ideas, it would really help me out.
Basically I have 6 tables I'm trying to grab info from. The tables are:
Tracking (Holds tracking information like status, year, etc on a yearly basis for each water system)
Water System (Holds information on each water system, and is unique by ID)
Service_Area_Basins (holds yearly information for each sub-basin a water system might fall within. So for instance in 2007 "A" water system could have been in two different basins and then a percent field is also in the table to show how much area was covered within each basin, so it could be 60 40 say.
Service_Area_Counties (holds the same as service area basins, but for county information)
Basins (lookup table with list of basins)
Counties (lookup table with list of counties)
Ok, so basically I want to generate a list that has the following:
All plans still in draft status for 2007 and join the water system information for each along with the largest basin area and largest county area for each.
So the output would look something like...
ID, year, status, water system name, (other water system info), basin, county
I can do a 6 table join based off tracking and get the correct number of systems that are in draft just fine. But it's trying to grab the county and basin that has the highest percentage is when the problem comes in.
I'm trying to do this with joins and group by, but I'm not sure if I need some sub-queries as well. I hope this makes somewhat sense to someone. Like I said, I can get all the records that i should be getting with all the info just fine. It's when I try to add in grabbing the county name and basin and making sure that they are the largest percentage ones that causes the problem.
Here is an example of the SQL I have so far.
Code:
SELECT
tracking.pwsid,
tracking.`year`,
tracking.`status`,
water_systems.name,
water_systems.contact,
water_systems.title,
water_systems.address,
water_systems.city,
water_systems.state,
water_systems.zip,
water_systems.phone,
water_systems.fax,
water_systems.email,
basins.name,
service_area_basins.percent,
service_area_counties.percent,
counties.name
FROM
tracking
Left Join water_systems ON tracking.pwsid = water_systems.pwsid
Left Join service_area_basins ON tracking.pwsid = service_area_basins.pwsid AND tracking.`year` = service_area_basins.`year`
Left Join basins ON basins.id = service_area_basins.basin_id
Left Join service_area_counties ON service_area_counties.pwsid = tracking.pwsid AND service_area_counties.`year` = tracking.`year`
Left Join counties ON counties.id = service_area_counties.county_id
WHERE
tracking.`year` = '2007' AND
tracking.`status` = 'Draft'
GROUP BY
tracking.pwsid,
tracking.`year`
ORDER BY
tracking.pwsid ASC,
service_area_basins.percent DESC,
service_area_counties.percent DESC
This query gives me all 106 rows correctly, but in a situation where a water system might be within two basins, it's not giving me the highest percentage in all cases. I think the problem is somewhere in the ORDER BY. I've tried using MAX on the service_area_X.percents but that didn't get me out the highest percentage only.
I know this is a long post, but I've spent almost a day and a half on this and can't quite figure it out. Any suggestion would be awesome. I want to get this thing solved, so I can get back to writing CI code!