subquery interview questions
Top subquery frequently asked interview questions
I tried:
UPDATE giveaways SET winner = '1' WHERE ID = (SELECT MAX(ID) FROM giveaways)
But it gives:
#1093 - You can't specify target table 'giveaways' for update in FROM
clause
This article seems relevant but I can't adapt it to my query. How can I get it to work?
Source: (StackOverflow)
I am an old-school MySQL user and have always preferred JOIN
over sub-query. But nowadays everyone uses sub-query and I hate it, I don't know why.
I lack the theoretical knowledge to judge for myself if there is any difference.
Is a sub-query as good as a JOIN
and therefore there is nothing to worry about?
Source: (StackOverflow)
I'm trying to execute this query:
SELECT mac, creation_date
FROM logs
WHERE logs_type_id=11
AND mac NOT IN (select consols.mac from consols)
But I get no results. I tested it, and I know that there is something wrong with the syntax. In MySQL such a query works perfectly. I've added a row to be sure that there is one mac
which does not exist in the consols
table, but still it isn't giving any results.
Source: (StackOverflow)
I have a table story_category
in my database with corrupt entries. The next query returns the corrupt entries:
SELECT *
FROM story_category
WHERE category_id NOT IN (
SELECT DISTINCT category.id
FROM category INNER JOIN
story_category ON category_id=category.id);
I tried to delete them executing:
DELETE FROM story_category
WHERE category_id NOT IN (
SELECT DISTINCT category.id
FROM category
INNER JOIN story_category ON category_id=category.id);
But I get the next error:
#1093 - You can't specify target table 'story_category' for update in FROM clause
How can I overcome this?
Source: (StackOverflow)
Using postgres 8.4, My goal is to update existing table:
CREATE TABLE public.dummy
(
address_id SERIAL,
addr1 character(40),
addr2 character(40),
city character(25),
state character(2),
zip character(5),
customer boolean,
supplier boolean,
partner boolean
)
WITH (
OIDS=FALSE
);
Initially i tested my query using insert statement:
insert into address customer,supplier,partner
SELECT
case when cust.addr1 is not null then TRUE else FALSE end customer,
case when suppl.addr1 is not null then TRUE else FALSE end supplier,
case when partn.addr1 is not null then TRUE else FALSE end partner
from (
SELECT *
from address) pa
left outer join cust_original cust
on (pa.addr1=cust.addr1 and pa.addr2=cust.addr2 and pa.city=cust.city
and pa.state=cust.state and substring(cust.zip,1,5) = pa.zip )
left outer join supp_original suppl
on (pa.addr1=suppl.addr1 and pa.addr2=suppl.addr2 and pa.city=suppl.city
and pa.state=suppl.state and pa.zip = substring(suppl.zip,1,5))
left outer join partner_original partn
on (pa.addr1=partn.addr1 and pa.addr2=partn.addr2 and pa.city=partn.city
and pa.state=partn.state and pa.zip = substring(partn.zip,1,5) )
where pa.address_id = address_id
being Newbie I'm failing converting to update statement ie., updating existing rows with values returned by select statement.
Any help is highly appreciated.
Source: (StackOverflow)
I've got a couple of duplicates in a database that I want to inspect, so what I did to see which are duplicates, I did this:
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
This way, I will get all rows with relevant_field occuring more than once. This query takes milliseconds to execute.
Now, I wanted to inspect each of the duplicates, so I thought I could SELECT each row in some_table with a relevant_field in the above query, so I did like this:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1
)
This turns out to be extreeeemely slow for some reason (it takes minutes). What exactly is going on here to make it that slow? relevant_field is indexed.
Eventually I tried creating a view "temp_view" from the first query (SELECT relevant_field FROM some_table GROUP BY relevant_field HAVING COUNT(*) > 1)
, and then making my second query like this instead:
SELECT *
FROM some_table
WHERE relevant_field IN
(
SELECT relevant_field
FROM temp_view
)
And that works just fine. MySQL does this in some milliseconds.
Any SQL experts here who can explain what's going on?
Source: (StackOverflow)
With sequel I can easily do subqueries like this
User.where(:id => Account.where(..).select(:user_id))
This produces:
SELECT * FROM users WHERE id IN (SELECT user_id FROM accounts WHERE ..)
How can I do this using rails' 3 activerecord/ arel/ meta_where?
I do need/ want real subqueries, no ruby workarounds (using several queries).
Source: (StackOverflow)
Imagine the following (simplified) database layout:

We have many "holiday" records that relate to going to a particular Accommodation on a certain date etc.
I would like to pull from the database the "best" holiday going to each accommodation (i.e. lowest price), given a set of search criteria (e.g. duration, departure airport etc).
There will be multiple records with the same price, so then we need to choose by offer saving (descending), then by departure date ascending.
I can write SQL to do this that looks like this (I'm not saying this is necessarily the most optimal way):
SELECT *
FROM Holiday h1 INNER JOIN (
SELECT h2.HolidayID,
h2.AccommodationID,
ROW_NUMBER() OVER (
PARTITION BY h2.AccommodationID
ORDER BY OfferSaving DESC
) AS RowNum
FROM Holiday h2 INNER JOIN (
SELECT AccommodationID,
MIN(price) as MinPrice
FROM Holiday
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
GROUP BY AccommodationID
) mp
ON mp.AccommodationID = h2.AccommodationID
AND mp.MinPrice = h2.price
WHERE TradeNameID = 58001
/*** Other Criteria Here ***/
) x on h1.HolidayID = x.HolidayID and x.RowNum = 1
As you can see, this uses a subquery within another subquery.
However, for several reasons my preference would be to achieve this same result in NHibernate.
Ideally, this would be done with QueryOver - the reason being that I build up the search criteria dynamically and this is much easier with QueryOver's fluent interface. (I had started out hoping to use NHibernate Linq, but unfortunately it's not mature enough).
After a lot of effort (being a relative newbie to NHibernate) I was able to re-create the very inner query that fetches all accommodations and their min price.
public IEnumerable<HolidaySearchDataDto> CriteriaFindAccommodationFromPricesForOffers(IEnumerable<IHolidayFilter<PackageHoliday>> filters, int skip, int take, out bool hasMore)
{
IQueryOver<PackageHoliday, PackageHoliday> queryable = NHibernateSession.CurrentFor(NHibernateSession.DefaultFactoryKey).QueryOver<PackageHoliday>();
queryable = queryable.Where(h => h.TradeNameId == website.TradeNameID);
var accommodation = Null<Accommodation>();
var accommodationUnit = Null<AccommodationUnit>();
var dto = Null<HolidaySearchDataDto>();
// Apply search criteria
foreach (var filter in filters)
queryable = filter.ApplyFilter(queryable, accommodationUnit, accommodation);
var query1 = queryable
.JoinQueryOver(h => h.AccommodationUnit, () => accommodationUnit)
.JoinQueryOver(h => h.Accommodation, () => accommodation)
.SelectList(hols => hols
.SelectGroup(() => accommodation.Id).WithAlias(() => dto.AccommodationId)
.SelectMin(h => h.Price).WithAlias(() => dto.Price)
);
var list = query1.OrderByAlias(() => dto.Price).Asc
.Skip(skip).Take(take+1)
.Cacheable().CacheMode(CacheMode.Normal).List<object[]>();
// Cacheing doesn't work this way...
/*.TransformUsing(Transformers.AliasToBean<HolidaySearchDataDto>())
.Cacheable().CacheMode(CacheMode.Normal).List<HolidaySearchDataDto>();*/
hasMore = list.Count() == take;
var dtos = list.Take(take).Select(h => new HolidaySearchDataDto
{
AccommodationId = (string)h[0],
Price = (decimal)h[1],
});
return dtos;
}
So my question is...
Any ideas on how to achieve what I want using QueryOver, or if necessary Criteria API?
I'd prefer not to use HQL but if it is necessary than I'm willing to see how it can be done with that too (it makes it harder (or more messy) to build up the search criteria though).
If this just isn't doable using NHibernate, then I could use a SQL query. In which case, my question is can the SQL be improved/optimised?
Source: (StackOverflow)
This code doesn't work for MySQL 5.0, how to re-write it to make it work
DELETE FROM posts where id=(SELECT id FROM posts GROUP BY id HAVING ( COUNT(id) > 1 ))
I want to delete columns that dont have unique id. I will add that most of the time its only one id(I tried the in syntax and it doesnt work as well).
Source: (StackOverflow)
Is there any SQL subquery syntax that lets you define, literally, a temporary table?
For example, something like
SELECT
MAX(count) AS max,
COUNT(*) AS count
FROM
(
(1 AS id, 7 AS count),
(2, 6),
(3, 13),
(4, 12),
(5, 9)
) AS mytable
INNER JOIN someothertable ON someothertable.id=mytable.id
This would save having to do two or three queries: creating temporary table, putting data in it, then using it in a join.
I am using MySQL but would be interested in other databases that could do something like that.
Source: (StackOverflow)
I'm creating an SQL statement that will return a month by month summary on sales.
The summary will list some simple columns for the date, total number of sales and the total value of sales.
However, in addition to these columns, i'd like to include 3 more that will list the months best customer by amount spent. For these columns, I need some kind of inline subquery that can return their ID, Name and the Amount they spent.
My current effort uses an inline SELECT
statement, however, from my knowledge on how to implement these, you can only return one column and row per in-line statement.
To get around this with my scenario, I can of course create 3 separate in-line statements, however, besides this seeming impractical, it increases the query time more that necessary.
SELECT
DATE_FORMAT(OrderDate,'%M %Y') AS OrderMonth,
COUNT(OrderID) AS TotalOrders,
SUM(OrderTotal) AS TotalAmount,
(SELECT SUM(OrderTotal) FROM Orders WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS TotalCustomerAmount,
(SELECT OrderCustomerFK FROM Orders WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS CustomerID,
(SELECT CustomerName FROM Orders INNER JOIN Customers ON OrderCustomerFK = CustomerID WHERE DATE_FORMAT(OrderDate,'%M %Y') = OrderMonth GROUP BY OrderCustomerFK ORDER BY SUM(OrderTotal) DESC LIMIT 1) AS CustomerName
FROM Orders
GROUP BY DATE_FORMAT(OrderDate,'%m%y')
ORDER BY DATE_FORMAT(OrderDate,'%y%m') DESC
How can i better structure this query?
FULL ANSWER
After some tweaking of Dave Barkers solution, I have a final version for anyone in the future looking for help.
The solution by Dave Barker worked perfectly with the customer details, however, it made the simpler Total Sales and Total Sale Amount columns get some crazy figures.
SELECT
Y.OrderMonth, Y.TotalOrders, Y.TotalAmount,
Z.OrdCustFK, Z.CustCompany, Z.CustOrdTotal, Z.CustSalesTotal
FROM
(SELECT
OrdDate,
DATE_FORMAT(OrdDate,'%M %Y') AS OrderMonth,
COUNT(OrderID) AS TotalOrders,
SUM(OrdGrandTotal) AS TotalAmount
FROM Orders
WHERE OrdConfirmed = 1
GROUP BY DATE_FORMAT(OrdDate,'%m%y')
ORDER BY DATE_FORMAT(OrdDate,'%Y%m') DESC)
Y INNER JOIN
(SELECT
DATE_FORMAT(OrdDate,'%M %Y') AS CustMonth,
OrdCustFK,
CustCompany,
COUNT(OrderID) AS CustOrdTotal,
SUM(OrdGrandTotal) AS CustSalesTotal
FROM Orders INNER JOIN CustomerDetails ON OrdCustFK = CustomerID
WHERE OrdConfirmed = 1
GROUP BY DATE_FORMAT(OrdDate,'%m%y'), OrdCustFK
ORDER BY SUM(OrdGrandTotal) DESC)
Z ON Z.CustMonth = Y.OrderMonth
GROUP BY DATE_FORMAT(OrdDate,'%Y%m')
ORDER BY DATE_FORMAT(OrdDate,'%Y%m') DESC
Source: (StackOverflow)
I have this query i have written in postgresql
that returns and error saying:
[Err] ERROR:
LINE 3: FROM (SELECT DISTINCT (identifiant) AS made_only_recharge
This is the whole query:
SELECT COUNT (made_only_recharge) AS made_only_recharge
FROM (
SELECT DISTINCT (identifiant) AS made_only_recharge
FROM cdr_data
WHERE CALLEDNUMBER = '0130'
EXCEPT
SELECT DISTINCT (identifiant) AS made_only_recharge
FROM cdr_data
WHERE CALLEDNUMBER != '0130'
)
I have a similar query in oracle that works fine only change is where i have except , in oracle i place the minus key word. I am new to postgres and dont know what it is asking for. Whats the correct way of handling this?
Source: (StackOverflow)
Is there a way to specify the parent query field from within a subquery in mySQL?
For Example:
I have written a basic Bulletin Board type program in PHP.
In the database each post contains: id(PK) and parent_id(the id of the parent post). If the post is itself a parent, then its parent_id is set to 0.
I am trying to write a mySQL query that will find every parent post and the number of children that the parent has.
$query = "SELECT id, (
SELECT COUNT(1)
FROM post_table
WHERE parent_id = id
) as num_children
FROM post_table
WHERE parent_id = 0";
The tricky part is that the first id doesn't know that it should be referring to the second id that is outside of the subquery. I know that I can do SELECT id AS id_tmp and then refer to it inside the subquery, but then if I want to also return the id and keep "id" as the column name, then I'd have to do a query that returns me 2 columns with the same data (which seems messy to me)
$query = "SELECT id, id AS id_tmp,
(SELECT COUNT(1)
FROM post_table
WHERE parent_id = id_tmp) as num_children
FROM post_table
WHERE parent_id = 0";
The messy way works fine, but I feel an opportunity to learn something here so I thought I'd post the question.
Source: (StackOverflow)
This seems if it should be fairly simple, but I'm stumbling in trying to find a solution that works for me.
I have a member_contracts table that has the following (simplified) structure.
MemberID | ContractID | StartDate | End Date |
------------------------------------------------
1 1 2/1/2002 2/1/2003
2 2 3/1/2002 3/1/2003
3 3 4/1/2002 4/1/2003
1 4 2/1/2002 2/1/2004
2 5 3/1/2003 2/1/2004
3 6 4/1/2003 2/1/2004
I'm trying to create a query that will select the most recent contracts from this table. That being the following output for this small example:
MemberID | ContractID | StartDate | End Date |
------------------------------------------------
1 4 2/1/2002 2/1/2004
2 5 3/1/2003 2/1/2004
3 6 4/1/2003 2/1/2004
Doing this on a per-user basis is extremely simple since I can just use a subquery to select the max contractID for the specified user. I am using SQL server, so if there's a special way of doing it with that flavor, I'm open to using it. Personally, I'd like something that was engine agnostic.
But, how would I go about writing a query that would accomplish the goal for all the users?
EDIT: I should also add that I'm looking for the max contractID value for each user, not the most recent dates.
Source: (StackOverflow)
How can I make this query in Laravel:
SELECT
`p`.`id`,
`p`.`name`,
`p`.`img`,
`p`.`safe_name`,
`p`.`sku`,
`p`.`productstatusid`
FROM `products` p
WHERE `p`.`id` IN (
SELECT
`product_id`
FROM `product_category`
WHERE `category_id` IN ('223', '15')
)
AND `p`.`active`=1
I could also do this with a join, but I need this format for performance.
Thanks in advance!
Cheers,
Marc
Source: (StackOverflow)