Top 50 SQL Interview Questions with Answers
SQL is the universal language of data. Whether you are applying for a data analyst, data scientist, data engineer, or business analyst role, you will face SQL questions in your interview. This guide covers 50 essential SQL interview questions organized by difficulty and topic, with clear explanations and code examples.
Basic SQL Questions
1. What is the difference between WHERE and HAVING?
WHERE filters rows before grouping, while HAVING filters groups after aggregation. Use WHERE for conditions on individual rows and HAVING for conditions on aggregated values.
-- WHERE filters before aggregation
SELECT department, COUNT(*) as emp_count
FROM employees
WHERE salary > 50000
GROUP BY department;
-- HAVING filters after aggregation
SELECT department, COUNT(*) as emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
2. Explain the different types of JOINs
INNER JOIN: Returns only rows that have matching values in both tables.
LEFT JOIN: Returns all rows from the left table and matched rows from the right table. Unmatched right rows return NULL.
RIGHT JOIN: Returns all rows from the right table and matched rows from the left table.
FULL OUTER JOIN: Returns all rows when there is a match in either table.
CROSS JOIN: Returns the Cartesian product of both tables.
-- Find all customers and their orders (including customers with no orders)
SELECT c.customer_name, o.order_id, o.order_date
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
3. What is the difference between UNION and UNION ALL?
UNION combines result sets and removes duplicates. UNION ALL combines result sets and keeps all rows including duplicates. UNION ALL is faster because it skips the deduplication step.
-- UNION removes duplicates
SELECT city FROM customers
UNION
SELECT city FROM suppliers;
-- UNION ALL keeps all rows
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;
4. How do you handle NULL values in SQL?
NULL represents missing or unknown data. Use IS NULL or IS NOT NULL for comparisons, as NULL = NULL returns false. Use COALESCE or IFNULL to provide default values.
-- Check for NULL
SELECT * FROM employees WHERE manager_id IS NULL;
-- Replace NULL with default value
SELECT name, COALESCE(phone, 'No phone') as phone
FROM customers;
5. What is the order of SQL clause execution?
The logical order of SQL execution is: FROM, JOIN, WHERE, GROUP BY, HAVING, SELECT, DISTINCT, ORDER BY, LIMIT. This is why you cannot use column aliases defined in SELECT within the WHERE clause.
Intermediate SQL Questions
6. Write a query to find the second highest salary
-- Using subquery
SELECT MAX(salary) as second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
-- Using LIMIT and OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;
-- Using window function
SELECT salary
FROM (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) as rnk
FROM employees
) ranked
WHERE rnk = 2;
7. Find duplicate records in a table
SELECT email, COUNT(*) as count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
8. What is a self-join and when would you use it?
A self-join is when a table is joined with itself. Common use cases include finding hierarchical relationships (like employees and their managers) or comparing rows within the same table.
-- Find employees and their managers
SELECT e.name as employee, m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;
9. Write a query to find customers who have never placed an order
-- Using LEFT JOIN
SELECT c.customer_id, c.customer_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
-- Using NOT EXISTS
SELECT customer_id, customer_name
FROM customers c
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);
-- Using NOT IN
SELECT customer_id, customer_name
FROM customers
WHERE customer_id NOT IN (SELECT customer_id FROM orders);
10. Explain the difference between DELETE, TRUNCATE, and DROP
DELETE: Removes specific rows based on a condition. Can be rolled back. Triggers fire.
TRUNCATE: Removes all rows from a table. Faster than DELETE. Usually cannot be rolled back. Resets auto-increment.
DROP: Removes the entire table structure and data permanently.
Window Functions
11. What are window functions and why are they useful?
Window functions perform calculations across a set of rows related to the current row without collapsing them into a single output row like GROUP BY does. They are essential for running totals, rankings, and comparing rows.
12. Explain ROW_NUMBER, RANK, and DENSE_RANK
SELECT
name,
score,
ROW_NUMBER() OVER (ORDER BY score DESC) as row_num,
RANK() OVER (ORDER BY score DESC) as rank_val,
DENSE_RANK() OVER (ORDER BY score DESC) as dense_rank_val
FROM students;
-- If scores are 100, 100, 90, 85:
-- ROW_NUMBER: 1, 2, 3, 4 (always unique)
-- RANK: 1, 1, 3, 4 (skips numbers after ties)
-- DENSE_RANK: 1, 1, 2, 3 (no gaps after ties)
13. Calculate running total using window functions
SELECT
order_date,
amount,
SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;
-- Running total by customer
SELECT
customer_id,
order_date,
amount,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
) as customer_running_total
FROM orders;
14. Find the difference between current and previous row values
SELECT
month,
revenue,
LAG(revenue, 1) OVER (ORDER BY month) as prev_revenue,
revenue - LAG(revenue, 1) OVER (ORDER BY month) as revenue_change
FROM monthly_sales;
15. Find the top 3 products by sales in each category
WITH ranked_products AS (
SELECT
category,
product_name,
total_sales,
ROW_NUMBER() OVER (
PARTITION BY category
ORDER BY total_sales DESC
) as rn
FROM products
)
SELECT category, product_name, total_sales
FROM ranked_products
WHERE rn <= 3;
Common Table Expressions (CTEs)
16. What is a CTE and why use it over subqueries?
A CTE (Common Table Expression) is a temporary named result set defined within a query using the WITH clause. CTEs improve readability, can be referenced multiple times, and support recursive queries. They are often clearer than nested subqueries.
WITH high_value_customers AS (
SELECT customer_id, SUM(amount) as total_spent
FROM orders
GROUP BY customer_id
HAVING SUM(amount) > 10000
)
SELECT c.name, hvc.total_spent
FROM customers c
JOIN high_value_customers hvc ON c.customer_id = hvc.customer_id;
17. Write a recursive CTE to find all employees under a manager
WITH RECURSIVE org_chart AS (
-- Base case: the manager
SELECT employee_id, name, manager_id, 0 as level
FROM employees
WHERE employee_id = 1
UNION ALL
-- Recursive case: employees reporting to previous level
SELECT e.employee_id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT * FROM org_chart;
Query Optimization
18. How do indexes improve query performance?
Indexes create a data structure that allows the database to find rows without scanning the entire table. They speed up SELECT queries and WHERE clause filtering but slow down INSERT, UPDATE, and DELETE operations because the index must be updated.
19. When should you use indexes?
- Columns frequently used in WHERE clauses
- Columns used in JOIN conditions
- Columns used in ORDER BY
- Columns with high cardinality (many unique values)
Avoid indexing columns that are frequently updated, have low cardinality, or are rarely used in queries.
20. How do you read an execution plan?
Use EXPLAIN or EXPLAIN ANALYZE to view the query execution plan. Look for table scans (slow), index scans (faster), and nested loops (can be expensive). High cost values indicate potential optimization opportunities.
21. What causes slow queries and how do you fix them?
- Missing indexes: Add indexes on filtered or joined columns
- SELECT *: Select only needed columns
- Subqueries in SELECT: Rewrite as JOINs when possible
- Functions on indexed columns: WHERE YEAR(date_col) = 2025 cannot use an index on date_col
- Implicit type conversions: Comparing mismatched types prevents index usage
Advanced SQL Questions
22. Write a query to calculate month-over-month growth rate
WITH monthly_revenue AS (
SELECT
DATE_TRUNC('month', order_date) as month,
SUM(amount) as revenue
FROM orders
GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) as prev_month_revenue,
ROUND(
(revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 /
LAG(revenue) OVER (ORDER BY month),
2
) as growth_rate_pct
FROM monthly_revenue;
23. Find users who logged in on consecutive days
WITH login_with_prev AS (
SELECT
user_id,
login_date,
LAG(login_date) OVER (
PARTITION BY user_id
ORDER BY login_date
) as prev_login
FROM logins
)
SELECT DISTINCT user_id
FROM login_with_prev
WHERE login_date = prev_login + INTERVAL '1 day';
24. Calculate 7-day rolling average
SELECT
date,
daily_sales,
AVG(daily_sales) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) as rolling_7_day_avg
FROM daily_sales_table;
25. Pivot data from rows to columns
-- Using CASE statements
SELECT
product_id,
SUM(CASE WHEN quarter = 'Q1' THEN sales ELSE 0 END) as Q1,
SUM(CASE WHEN quarter = 'Q2' THEN sales ELSE 0 END) as Q2,
SUM(CASE WHEN quarter = 'Q3' THEN sales ELSE 0 END) as Q3,
SUM(CASE WHEN quarter = 'Q4' THEN sales ELSE 0 END) as Q4
FROM quarterly_sales
GROUP BY product_id;
Practice Questions 26-50
Aggregation and Grouping (26-30)
26. Find the average order value by customer segment.
27. Count the number of orders per day of the week.
28. Find products that have never been ordered.
29. Calculate the percentage of total sales for each product.
30. Find the customer with the highest total spend in each city.
Date and Time (31-35)
31. Find all orders placed in the last 30 days.
32. Calculate the age of each customer in years.
33. Find the first and last order date for each customer.
34. Group sales by fiscal quarter (April-March).
35. Find the time difference between consecutive orders for each customer.
Advanced JOINs (36-40)
36. Find products that are more expensive than the average in their category.
37. Identify customers who have ordered all products in a specific category.
38. Find pairs of customers from the same city.
39. Calculate the overlap between two user segments.
40. Find the manager with the most direct reports.
Window Functions Advanced (41-45)
41. Calculate the median salary by department.
42. Find the longest streak of consecutive login days per user.
43. Identify users whose purchase amount increased for 3 consecutive months.
44. Calculate the cumulative distribution of order values.
45. Find the first order of each type for every customer.
Real-World Scenarios (46-50)
46. Calculate customer retention rate month over month.
47. Build a cohort analysis showing user retention by signup month.
48. Find users who churned (no activity in 30+ days after being active).
49. Calculate the conversion rate from trial to paid subscription.
50. Build a recommendation query: find products frequently bought together.
Tips for SQL Interview Success
- Think out loud: Explain your approach before writing code
- Start simple: Get a working solution first, then optimize
- Handle edge cases: Ask about NULLs, duplicates, and empty results
- Use meaningful aliases: Make your query readable
- Practice without IDE help: Many interviews use plain text editors
To prepare for your data interview, make sure your resume highlights your SQL proficiency with specific examples of complex queries and data analysis projects you have completed.