As the last sliver of sunlight cuts through the broken tape that’s supposed to keep my blackout curtains closed, dusk rolls in like an afternoon fog over Sausalito. It tumbles down the hills like cigar smoke rolls out of the mouth of the face that always seems to be at least fifty-percent obscured by shadow while thirty percent is overexposed leaving just enough to know their demands are serious. As the darkness leaks into my room, the only remaining light is the green diode from my charging cable. It’s time to start.
I lift the screen with my thumbs and I’m immediately blinded. Rookie mistake, I say to myself aloud — forgot to dim the screen. As my pupils slowly relax from their tack sharp constricted states, I am faced with something far worse than I could have ever imagined. Someone hacked my computer. There’s no way…
I’m immediately distracted by a wet sleeve. What the fuck? There’s not enough light to verify the origin or nature of the substance. I lift the sleeve of my robe to my nose, not sure what to expect. Something metallic with notes of honey and vanillin and maybe cyanin with cinnamaldehyde or phenol and smoke which I can’t exactly place, dare I, no no, don’t lick the sleeve it could be anything, ah fuck it I’m going to taste it, why would I lick the sleeve that makes no sense, just find the source and dip a finger and get a sample.
I swirl my finger around the surface of my desk tracing out the shape of a spiral before lifting the tip of my smallest digit to the tip of my anticipatory tongue, which is now telling me this is, nee was, a drink: bourbon with smoked salt and a dash of orgeat. The first time I tried that drink was a sweltering day in Laguna where the only shade was at a bar that had no indoor seating. Do I even own smoked salt?
Even though my IDE is set to Dark Modern, my eyes are still struggling to focus on the scrub jay blue and butter-butt yellow, as they battle for my attention as if I’ve forgotten to fill the birdfeeders again. Wait, have I? And now this? What in Hell am I looking at? There’s simply no way…
Let’s investigate:
Is this even my code?
No comments
Inconsistent capitalization
Subqueries
Nested Subqueries
No smartass naming
No other files or saved queries
There may as well be a cup of tea sitting here with the bag still soaking to create the equivalent of a liquid ash tray.
SELECT cast(100*Laxer_Federal_Maximum_Count /
(SELECT count(*)
FROM state_regulations
WHERE federal_mcl IS NOT NULL) AS DECIMAL(5,1)) AS Question_1,
cast(100*Stricter_State_Maximum_Count /
(SELECT count(*)
FROM (select contaminant,
state_mcl,
federal_mcl
FROM state_regulations
) AS SSM) AS DECIMAL(5,1)) AS Question_2,
CAST(100*Stricter_State_Coalesced_Count/
(SELECT COUNT(*)
FROM (SELECT contaminant,
state_mcl,
coalesce(federal_mcl, state_mcl + 1) AS federal_not_null
FROM state_regulations) AS UFC
WHERE federal_not_null > state_mcl) AS DECIMAL(5,1)) AS Question_3
FROM (
SELECT COUNT(LFM.contaminant) AS Laxer_Federal_Maximum_Count,
count(FM.contaminant) AS Federal_Maximum_Count,
count(SSM.contaminant) AS Stricter_State_Maximum_Count,
count(SSC.contaminant) AS Stricter_State_Coalesced_Count,
COUNT(UFC.contaminant) AS Unregulated_Federal_Coalesced_Count
FROM (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl IS NOT null) AS FM
LEFT JOIN (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl > state_mcl) AS SSM
ON FM.contaminant = SSM.contaminant
LEFT JOIN (
select contaminant,
state_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_not_null
from state_regulations) AS UFC
ON FM.contaminant = UFC.contaminant
LEFT JOIN (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl > state_mcl) AS SSC
ON UFC.contaminant = SSC.contaminant
LEFT JOIN (
SELECT contaminant,
state_mcl,
federal_mcl
FROM (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl IS NOT NULL) AS FM
WHERE federal_mcl > state_mcl) AS LFM
ON UFC.contaminant = LFM.contaminant
) AS All_Counts;
Hypotheses:
Someone broke in and wrote really shitty code while I rested my eyes while… hell’s bells, I just noticed my shoes are still on.
I was poisoned by cyanide and forced to write absolutely terrible code and somehow survived…
My computer was hacked and the hackers changed my beautiful code into something functional but absolutely disgusting as we see… wait, what is that?
Savvy. This is why I always keep my notes on paper — call me old school, but when the power fails or someone hacks everything you have, you can always rely on some good old ink and paper.
The last two pages are stuck together from mostly dried black ink. I must have been using the fountain pen again. Yes, here. The ink spill in the lower right corner, with a smear is holding the pages together. I lift up my hand and notice it’s not just a shadow darkening the fleshy hypothenar eminence, there’s clear evidence of ink that didn’t completely rinse off with soap and water. Looking back to the page, I can see the handwritten outline of a query in the upper left that starts with a battery of WITH clauses but it’s cut off by, what the fuck is this? A drink recipe for the smoked salt bourbon drink. My handwriting declines as I work my way down the page. At the bottom of the page, I have written a note to myself: this is going to be worse than a hangover, followed by an outline that ‘fixes’ the above code into no longer an organized string of queries but into a bewitching nest of chaos belched up from mid-summer sewage.
I go back and look at the code trying to determine where the sewer even starts.
WHAT
IN
THE
EIGHTH
LAYER
OF
HELL?
At least I have a repository that tracked all of the changes, right? I can feel a drop of sweat forming on the back of my neck as that telltale feeling in my stomach akin to stage-fright rears its ugly head because I already know the answer, but I’m just really holding onto the hope that I wasn’t as big a nincompoop as I
Wrong: 2 commits. The second immortalizes the atrocity above. The first commit brings us back to… why the hell would I commit this?
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
LIMIT 1000
My soul is telling me there’s at least a slight shred of dignity in not committing a limitless SELECT asshole¹ query. It’s the kind of code that has you dangling from the rooftop with greasy fingers. But is there any justification that excuses the complete lack of comments, documentation, and procedural changes that led from point A to point damnation?
¹The late and Kurt Vonnegut immortalized the use of the asterisk, *, to represent an asshole in its most literal sense, medically referred to as an anus.
Memo to Self: Don’t commit an asshole in your SELECT statement.
No point in dwelling now. We’re on the clock and we need to get answers. Let’s go back to the assignment:
Part 1: A dark day in sunny California
A private investigator contacted me regarding a string of suspicious deaths in the sunny state of California. Each victim shared a similar profile—young, athletic, healthy, had elevated amounts of common toxins in their blood at the time of death. I have been tasked to chart the contaminant levels across the state and weigh them against what turned up in the victims’ blood. The null hypothesis is simple, if a little too neat: their exposure falls within the normal range anyone might swallow from the tap in those locations. Nothing more sinister than geography and bad plumbing.
The state of California is usually much stricter on contaminant levels than the federal government, so within this investigation, we’ll need to verify this as well as identify anywhere that the contamination levels meet or (if any) exceed the allowable safe levels set by the federal and state environmental protection agencies. Government agencies have always been forthcoming in informing the citizens when their drinking water is less than safe.
This seems like a good place to start. Let’s verify that state levels are stricter than federal levels:
SELECT *
FROM state_regulations
WHERE federal_mcl < state_mcl
LIMIT 1000;
Well this is a great start. An empty set staring right back at me like I’ve got something on my face. Wait, do I have something on my face? I can see something in the reflection on my screen.
There is ink on my lip. Oh shit, the ink spilled on the desk with the bourbon, and it’s also smeared on the sleeve of my robe. I guess silk was a bad choice. So was the ‘less than’ sign.
Stricter means the levels are expected to be lower, not greater for the state levels. Let’s go ahead and switch that and see what we get. Any values that don’t show up should be equal, since we’ve already found that none of the federal levels are lower than state levels.
This brings us to the next question. How many of the state standards are actually stricter than the federal standards? This should be easy, just throw in a COUNT function.
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
FROM state_regulations
WHERE federal_mcl > state_mcl
There are 27 stricter maxima. But let’s consider the ubiquity of Prop 65 labels on every product sold in California warning you that it’s a chemical known to cause some horrible death by the Golden State — if California recognizes more chemicals to be dangerous than the federal government even tracks, then the inequality won’t tell us anything about those because null values aren’t compared to non-null values with mathematical operators. Let’s tweak the query to look into this:
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE state_mcl IS NOT NULL AND federal_mcl IS NULL;
How many did you find?
Regardless, it’s more than none, so we have to address this. Typical imputation suggests setting values to 0. Does this seem reasonable?
Imputation of a value in place of null value is as easy as convincing a child that Santa Claus exists. Blissful ignorance is now just replaced by the COALESCE function.
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, 0) AS federal_not_null
FROM state_regulations
If federal_mcl is null, it replaces the null with 0. Run it.
Wonderful. Now test it with the earlier comparison to verify that we have more than 27 stricter maxima, since it should be considered that any limit is more strict than having no limit. To do this, we’ll have to add the coalesced value into the where clause:
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, 0) AS federal_max_coalesced
FROM state_regulations
WHERE federal_max_coalesced > state_mcl
Perfect.
This is why I hate running queries in the morning. Hold on, it’s not morning; we just watched the sun’s rays get snuffed out behind the horizon. Regardless, I just woke up and haven’t yet been kissed by caffeine. So what’s the problem here?
Damn it, of course:
Query order doesn’t follow the way it’s written.
Part 2: The order of execution
The order in which SQL processes this query is not how it’s written, but in the following manner:
FROM state_regulations
WHERE federal_max_coalesced > state_mcl
SELECT
contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, 0) AS federal_max_coalesced
The problem: We are trying to filter by federal_max_coalesced before it has been defined.
The solution: We can include the COALESCE function in the filter part of the query:
FROM state_regulations
WHERE COALESCE(federal_mcl, 0) > state_mcl
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, 0) AS federal_max_coalesced
So the query should read:
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, 0) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, 0) > state_mcl
Run this now. Did you get what we expected, exactly 27 records? Why should or shouldn’t this be expected? Is there a problem²? If so, how do we fix it?
²You damn well bet there’s a problem!
If you decided this was a bad idea from the beginning, what number did you pick? We could pick a ridiculously high number, but this may cause problems when plotting downstream. In this case, I’d rather keep it similar but not equal. Since we know that some of the values are really low, like in the thousandths, we can add 0.0001 to the state value and it will always be greater. Another possibility is multiplying by something like 1.0001 to guarantee we always get a slightly higher number. For simplicity here, I am just going to add 1.
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_not_null
FROM state_regulations
Why don’t you go ahead and change the above query to include the filter (WHERE clause) and verify we have more than 27 stricter state values than federal values. Counting values is easy when we’re looking at a small output, but we should really incorporate the COUNT function to do this for those times when you’re counting more than fingers and toes you found on a bloody basement floor and more on the magnitude of separating different hairs from the floor of a barbershop looking for one that matches. So why can’t we just add a COUNT(*) to our query?
If the answer in your head was that mixing aggregates and non-aggregates is like mixing gasoline with fertilizer, then we think alike. It’s the best way to get your query to cause massive problems. What we can do instead is use a subquery. Consider what we want to do:
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalescedFROM results from steps 1–3
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
This is translated from processing order into structured query language as follows:
SELECT COUNT(subquery.contaminant) AS Number_of_Stricter_State_Maximums
FROM (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
) AS subquery
Glancing at my notebook, I can see a handwritten note:
Memo to Self: We MUST add an alias to all subqueries. Also, don’t name a subquery something meaningless like ‘subquery.’ Only assholes name their subqueries ‘subquery.’
I raised a good point. Let’s fix that.
SELECT COUNT(Stricter_State_Maximums.contaminant) AS Number_of_Stricter_State_Maximums
FROM (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
) AS Stricter_State_Maximums
Well that looks much…
No.
It looks disgusting. Looking at subqueries makes me feel like I’m cleaning out a cat’s litter box and while I’m leaning over to clean it out, a cat catted up a hairball on my back, and while it’s all warm, gooey, and wet, I know that there’s no way I can safely stand up without making more of a mess. Thanks, Morton. I know it was you — I know you’re the one that barfed a hairball on my back because Milton didn’t have the balance to do it!
I’m adding a new comment to my notebook:
Memo to Self: Subqueries stink. Really bad, like coming home to the foul, fetid odor of spoilt milk, rotten eggs, and gray maggot-covered meat emanating from the refrigerator with door ajar upon returning from a 3 week case.
Part 3: Chopping the body into pieces
How do we make this more…digestible? More digestible than cat fur shouldn’t be a hard sell. I’m just thinking of the painted advertisement on the side of a large brick building. Maybe near some smoke stacks or a shipyard.
The WITH clause: more digestible than cat fur!
Reduce h̶a̶i̶r̶b̶a̶l̶l̶s common table expulsions WITH common table expressions!
Let’s put this together again; start with the steps we established earlier:
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalescedFROM results from steps 1–3
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
When we create a common table expression, we can actually do this in order by chopping up the subqueries in the body of our query. Our first query runs steps 1–3 in its own containment, like an interrogation room. We will capture its results inside of a WITH clause. Just like with the subquery, we will have to name it, only this happens at the beginning instead of at the end:
WITH cte AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
)
Then we add our query after this, citing the cte in our FROM clause:
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
FROM cte
Extremely Important Memo to Self: Only assholes name their common table expressions ‘cte’.
Asshole approach:
WITH cte AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
)
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
FROM cte
I’m not an asshole, so I’ll include a more meaningful name so you know exactly the purpose of the query within the WITH clause:
Not An Asshole Approach:
WITH Stricter_State_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
)
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Maximums
FROM Stricter_State_Maximums
Organized. Methodical. Readable.
Like a serial killer’s notes. Or like a detective’s notes when tracking down a serial killer.
Using this approach, we can assemble small queries like a circuit, all building up to one final query that is executed at the end. One final grand finale output that explodes like a firework display across the ceiling of your competitor’s office after they did an indoor training to demonstrate the proper way to use a fire extinguisher and failed. Bottom line: it’s exactly what you expected. The subquery approach often ends up like the San Diego firework display of 2013. I digress. Different case, different year, different computational problem.
Part 4: Stitching the pieces together
Now where were we? We were discussing the number of stricter state maximums. You know what would be really nice (other than a 3oz pour of Redbreast 27)? If we had all of the information provided in a single table: the number where federal is stricter than state (0), the number where state is stricter than federal (you should have found this is 42), and the last piece is where they are the same. Go ahead and query that, you’ll need it for the next step.
Adding additional queries as common table expressions is easy. We don’t repeat the WITH clause — this actually sets up the circuit that ends with the final SELECT clause. In between, however, we will need to separate each new query with a comma and an alias.
WITH
Stricter_State_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
),
Stricter_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) < state_mcl
)
SELECT COUNT(contaminant) AS Number_of_Stricter_State_Standards
FROM stricter_state_maximums
Ignore the last SELECT clause, it’s not important here. Focus on the comma before Stricter_Federal_Maximums.
What we really want is to have all of the counts in a single output.
We have 2 main options:
Use a series of JOINs
Use a series of UNIONs
Consider this superficially:
The Joins will yield 1 row with 3 columns
The Union will yield 3 rows and 2 columns
Consider this computationally:
I will have to run 3 joins on every row of each dataset
This requires value matching, filtering, and shuffle
Size change depends on the type of join
I will just add new rows to existing tables in each column
No matching, no filtering, no shuffle
Size increases by the number of rows added by each union
For smaller datasets, a JOIN won’t be a problem:
WITH
Stricter_State_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
),
Stricter_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) < state_mcl
),
Same_State_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) = state_mcl
)
SELECT COUNT(SSM.contaminant) AS Number_of_Stricter_State_Standards,
COUNT(SFM.contaminant) AS Number_of_Stricter_Federal_Standards,
COUNT(SM.contaminant) AS Number_of_Equal_Standards
FROM state_regulations AS SR
LEFT JOIN Stricter_State_Maximums AS SSM
ON SR.contaminant = SSM.contaminant
LEFT JOIN Stricter_Federal_Maximums AS SFM
ON SR.contaminant = SFM.contaminant
LEFT JOIN Same_State_Federal_Maximums AS SM
ON SR.contaminant = SM.contaminant
A UNION will be better for scaling up; as the tables get larger, the number of computations is additive with unions, whereas it is multiplicative with joins. The requirement for unions is that you have not only the same number of columns, they must be the same name and data type. If you’re keen on not following their requirements, say goodbye to your kneecaps.
WITH
Stricter_State_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) > state_mcl
),
Stricter_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) < state_mcl
),
Same_State_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_max_coalesced
FROM state_regulations
WHERE COALESCE(federal_mcl, state_mcl + 1) = state_mcl
)
SELECT COUNT(contaminant) AS Number_of_Levels, 'State' AS Stricter_Restriction_Levels
FROM Stricter_State_Maximums
UNION ALL
SELECT COUNT(contaminant), 'Federal'
FROM Stricter_Federal_Maximums
UNION ALL
SELECT COUNT(contaminant), 'Equal'
FROM Same_State_Federal_Maximums;
So now we have these values. Does anyone other than the person running the query know whether these numbers are high, low, meaningful, or are they as useful as handcuffs on a ghost? Well let me tell ya; the answer is a resounding no. For the longest time growing up, I actually thought my name was ‘no’ because it’s the word my parents said the most. Anyway, if I’m going to make my case to someone else, I should probably use something more relatable and universal, like percentages. So let’s refine this to show the following:
1. What percentage of federal restrictions are less strict (higher) than state restrictions?
2. What percentage of state restrictions are stricter (lower) than the set federal restrictions?
3. What percentage of all contaminants have a stricter state maximum than federal, assuming a NULL value represents a known contaminant that is unregulated at the federal level?
We’ll need to use our handy function CAST to convert some values to get them in the right format for calculations, but let’s focus on the query structure now. I’ll go with joins for this, because I’m going to need the values in different columns, not in different rows.
We’ll still use a few subqueries at the end, organized with a similar structure. We can continue to make ctes instead of doing this, but that itself will still get complicated with unions or joins. Before getting to the final SELECT clause, let’s look at the computation we’re trying to get for each of these questions:
Laxer_Federal_Maximum_Count / All_Federal_Maximum_Count * 100 AS Question_1
To get the value for ALL_Federal_Maximum_Count, we need to select the Federal Maximum Count from All_Counts. This number will be an integer value, which when divided will return an integer unless we explicitly cast it in the desired output format. This is the purpose of the CAST() AS DECIMAL(5,1) shell around the subquery.
CAST(SELECT Federal_Maximum_Count FROM All_Counts) AS DECIMAL(5,1)
So here it is..
WITH
Unregulated_Federal_Coalesced AS (
SELECT contaminant,
state_mcl,
COALESCE(federal_mcl, state_mcl + 1) AS federal_not_null
FROM state_regulations
),
Stricter_State_Coalesced AS (
SELECT contaminant,
state_mcl,
federal_not_null
FROM Unregulated_Federal_Coalesced
WHERE federal_not_null > state_mcl
),
Stricter_State_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl > state_mcl
),
Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl
FROM state_regulations
WHERE federal_mcl IS NOT NULL
),
Laxer_Federal_Maximums AS (
SELECT contaminant,
state_mcl,
federal_mcl
FROM Federal_Maximums
WHERE federal_mcl > state_mcl
),
All_Counts AS (
SELECT COUNT(LFM.contaminant) AS Laxer_Federal_Maximum_Count,
COUNT(FM.contaminant) AS Federal_Maximum_Count,
COUNT(SSM.contaminant) AS Stricter_State_Maximum_Count,
COUNT(SSC.contaminant) AS Stricter_State_Coalesced_Count,
COUNT(UFC.contaminant) AS Unregulated_Federal_Coalesced_Count
FROM Unregulated_Federal_Coalesced AS UFC
LEFT JOIN
Stricter_State_Coalesced AS SSC
ON UFC.contaminant = SSC.contaminant
LEFT JOIN
Stricter_State_Maximums AS SSM
ON UFC.contaminant = SSM.contaminant
LEFT JOIN
Federal_Maximums AS FM
ON UFC.contaminant = FM.contaminant
LEFT JOIN
Laxer_Federal_Maximums AS LFM
ON UFC.contaminant = LFM.contaminant
)
SELECT
CAST(100*Laxer_Federal_Maximum_Count/
(SELECT Federal_Maximum_Count FROM All_Counts) AS DECIMAL(5,1)) AS Question_1,
CAST(100*Stricter_State_Maximum_Count/
(SELECT Unregulated_Federal_Coalesced_Count FROM All_Counts) AS DECIMAL(5,1)) AS Question_2,
CAST(100*Stricter_State_Coalesced_Count/
(SELECT Unregulated_Federal_Coalesced_Count FROM All_Counts) AS DECIMAL(5,1)) AS Question_3
FROM All_Counts
So now is the moment of truth. What are the answers to our questions? How will these answers bring us closer to our actual task? What is the next step?
The correct next step would be to change those fetid subqueries into common table expressions, and I expect you’ll do that.
I don’t know about you, but I need to step out and get some fresh air. As fresh air as one can get as the night air thins and the smog descends, obscuring the view across the orange glow of the street lamps. Each one harbors a cloud of insects each trying to penetrate the glass, not realizing its life would end upon reaching its sought after destination.
I imagine this isn’t what our victims were seeking. These weren’t overdoses on some narcotic bliss gone wrong. These were likely slow, painful deaths — each victim drinking the liquid that sustains life only while simultaneously snuffing it out.
Knowing where the safety thresholds stand may not seem important, but we will need to know what is deemed safe when looking at the specific stations that test consumers’ drinking water. We can’t trust the safety of the water, but we can trust the data. There’s something dark coming out of these artesian taps. To find out what that is, we’re going to need to investigate row by row with a fine-toothed query, but you can’t use aggregations on a row-by-row basis.
Or can you?