- Date format will be yyyymmdd (if not, assume that we converted in Oracle or Sybase)
SELECT a.name, SUM(s.price) FROM authors a
INNER JOIN books b ON (a.id=b.author_id)
LEFT OUTER JOIN sales s ON (b.id = s.book_id)
GROUP BY a.name
HAVING (s.date >= '20131101' AND s.date <= '20131130')
In SQL, this would be an issue due to the date column not existing in the group by. Additionally, because of the having statement, the left outer join is being treated as an inner join with respect to the results given.
Something like this could work (without going for an alternate approach such as Unions or sub-selects):
SELECT a.name, ISNULL(SUM(s.price), 0.0) FROM authors a
INNER JOIN books b ON (a.id=b.author_id)
LEFT OUTER JOIN sales s ON (b.id = s.book_id)
GROUP BY a.name, s.date
HAVING (ISNULL(s.date, '20131101') >= '20131101' AND ISNULL(s.date, '20131101') <= '20131130')
Edit: Actually might be better to keep the null in the results like the original query. It's definitely better to know whether a book sold 1000 copies for $0.00 each or if no copies were sold.
Instead of the Left Outer Join, couldn't you also make it an Inner Join to S.BOOK_ID? The question, if I read it correctly, only requires you to return sales in Nov 2013. If there is no data for a book_id, you want to exclude it from your returned data, no?
I would imagine it is up to interpretation. It would probably be a good thing to ask the interviewer about to show that you are thinking about edge cases.
If it were me, I'd prefer to see those that made no sales based on the question. I guess if I didn't want to see them, I would ask for "where sales are greater than $0.00".
SELECT authors.name, SUM(sales.price) FROM authors
INNER JOIN books ON authors.id = books.author_id
INNER JOIN sales ON books.id = sales.book_id
WHERE sales.date >= Nov 2013 AND sales.date < Dec 2013
GROUP BY authors.name
Although a coworker pointed out that inner joins would drop authors who haven't written books, and books that have no sales. If you want to include all of those, then use LEFT OUTER JOIN instead of INNER JOINs.
I mean total revenue, but I leave it unspecified in the hope that someone will ask for clarification. Same with if they should include authors with no sales during the time period.
I like this question because the base query, however you interpret the unclear parts, is easy, but it still allows candidates to impress me by asking about the "edge" cases. And even if they just makes assumptions, I can still ask them to do the other versions to see if they really know what they're talking about. I'm always amazed at how many people can't convert an inner join to a left join, or can't change a sum() to a count().
Given three tables:
find the total sales by author in November 2013. The result set should have two columns: author's name and their total sales.