How important is SQL for data analyst interviews?

SQL is the most critical technical skill for data analyst interviews. Nearly every data analyst interview includes SQL questions, and many companies make SQL proficiency a hard requirement. You should be comfortable writing queries without IDE assistance and be able to solve problems involving JOINs, aggregations, and window functions.

Should I memorize SQL syntax for interviews?

You should know core SQL syntax well enough to write queries without looking up basic commands. However, interviewers understand that minor syntax variations exist between databases. Focus on understanding concepts and logic rather than perfect memorization. Being able to explain what a query does matters more than perfect semicolon placement.

What SQL topics are most commonly tested in interviews?

The most commonly tested SQL topics are JOINs (especially LEFT JOIN scenarios), GROUP BY with HAVING clauses, subqueries and CTEs, window functions (ROW_NUMBER, RANK, LAG/LEAD), and query optimization. Date manipulation and handling NULL values also appear frequently.

Top 50 SQL Interview Questions with Answers

Published February 11, 2026

SQL is the universal language of data. Whether you are applying for a data analyst, data scientist, data engineer, or business analyst role, you will face SQL questions in your interview. This guide covers 50 essential SQL interview questions organized by difficulty and topic, with clear explanations and code examples.

Basic SQL Questions

1. What is the difference between WHERE and HAVING?

WHERE filters rows before grouping, while HAVING filters groups after aggregation. Use WHERE for conditions on individual rows and HAVING for conditions on aggregated values.

-- WHERE filters before aggregation
SELECT department, COUNT(*) as emp_count
FROM employees
WHERE salary > 50000
GROUP BY department;

-- HAVING filters after aggregation
SELECT department, COUNT(*) as emp_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;

2. Explain the different types of JOINs

INNER JOIN: Returns only rows that have matching values in both tables.

LEFT JOIN: Returns all rows from the left table and matched rows from the right table. Unmatched right rows return NULL.

RIGHT JOIN: Returns all rows from the right table and matched rows from the left table.

FULL OUTER JOIN: Returns all rows when there is a match in either table.

CROSS JOIN: Returns the Cartesian product of both tables.

-- Find all customers and their orders (including customers with no orders)
SELECT c.customer_name, o.order_id, o.order_date
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;

3. What is the difference between UNION and UNION ALL?

UNION combines result sets and removes duplicates. UNION ALL combines result sets and keeps all rows including duplicates. UNION ALL is faster because it skips the deduplication step.

-- UNION removes duplicates
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- UNION ALL keeps all rows
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;

4. How do you handle NULL values in SQL?

NULL represents missing or unknown data. Use IS NULL or IS NOT NULL for comparisons, as NULL = NULL returns false. Use COALESCE or IFNULL to provide default values.

-- Check for NULL
SELECT * FROM employees WHERE manager_id IS NULL;

-- Replace NULL with default value
SELECT name, COALESCE(phone, 'No phone') as phone
FROM customers;

5. What is the order of SQL clause execution?

The logical order of SQL execution is: FROM, JOIN, WHERE, GROUP BY, HAVING, SELECT, DISTINCT, ORDER BY, LIMIT. This is why you cannot use column aliases defined in SELECT within the WHERE clause.

Intermediate SQL Questions

6. Write a query to find the second highest salary

-- Using subquery
SELECT MAX(salary) as second_highest
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);

-- Using LIMIT and OFFSET
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;

-- Using window function
SELECT salary
FROM (
    SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) as rnk
    FROM employees
) ranked
WHERE rnk = 2;

7. Find duplicate records in a table

SELECT email, COUNT(*) as count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

8. What is a self-join and when would you use it?

A self-join is when a table is joined with itself. Common use cases include finding hierarchical relationships (like employees and their managers) or comparing rows within the same table.

-- Find employees and their managers
SELECT e.name as employee, m.name as manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

9. Write a query to find customers who have never placed an order

-- Using LEFT JOIN
SELECT c.customer_id, c.customer_name
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;

-- Using NOT EXISTS
SELECT customer_id, customer_name
FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);

-- Using NOT IN
SELECT customer_id, customer_name
FROM customers
WHERE customer_id NOT IN (SELECT customer_id FROM orders);

10. Explain the difference between DELETE, TRUNCATE, and DROP

DELETE: Removes specific rows based on a condition. Can be rolled back. Triggers fire.

TRUNCATE: Removes all rows from a table. Faster than DELETE. Usually cannot be rolled back. Resets auto-increment.

DROP: Removes the entire table structure and data permanently.

Window Functions

11. What are window functions and why are they useful?

Window functions perform calculations across a set of rows related to the current row without collapsing them into a single output row like GROUP BY does. They are essential for running totals, rankings, and comparing rows.

12. Explain ROW_NUMBER, RANK, and DENSE_RANK

SELECT
    name,
    score,
    ROW_NUMBER() OVER (ORDER BY score DESC) as row_num,
    RANK() OVER (ORDER BY score DESC) as rank_val,
    DENSE_RANK() OVER (ORDER BY score DESC) as dense_rank_val
FROM students;

-- If scores are 100, 100, 90, 85:
-- ROW_NUMBER: 1, 2, 3, 4 (always unique)
-- RANK: 1, 1, 3, 4 (skips numbers after ties)
-- DENSE_RANK: 1, 1, 2, 3 (no gaps after ties)

13. Calculate running total using window functions

SELECT
    order_date,
    amount,
    SUM(amount) OVER (ORDER BY order_date) as running_total
FROM orders;

-- Running total by customer
SELECT
    customer_id,
    order_date,
    amount,
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
    ) as customer_running_total
FROM orders;

14. Find the difference between current and previous row values

SELECT
    month,
    revenue,
    LAG(revenue, 1) OVER (ORDER BY month) as prev_revenue,
    revenue - LAG(revenue, 1) OVER (ORDER BY month) as revenue_change
FROM monthly_sales;

15. Find the top 3 products by sales in each category

WITH ranked_products AS (
    SELECT
        category,
        product_name,
        total_sales,
        ROW_NUMBER() OVER (
            PARTITION BY category
            ORDER BY total_sales DESC
        ) as rn
    FROM products
)
SELECT category, product_name, total_sales
FROM ranked_products
WHERE rn <= 3;

Common Table Expressions (CTEs)

16. What is a CTE and why use it over subqueries?

A CTE (Common Table Expression) is a temporary named result set defined within a query using the WITH clause. CTEs improve readability, can be referenced multiple times, and support recursive queries. They are often clearer than nested subqueries.

WITH high_value_customers AS (
    SELECT customer_id, SUM(amount) as total_spent
    FROM orders
    GROUP BY customer_id
    HAVING SUM(amount) > 10000
)
SELECT c.name, hvc.total_spent
FROM customers c
JOIN high_value_customers hvc ON c.customer_id = hvc.customer_id;

17. Write a recursive CTE to find all employees under a manager

WITH RECURSIVE org_chart AS (
    -- Base case: the manager
    SELECT employee_id, name, manager_id, 0 as level
    FROM employees
    WHERE employee_id = 1

    UNION ALL

    -- Recursive case: employees reporting to previous level
    SELECT e.employee_id, e.name, e.manager_id, oc.level + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT * FROM org_chart;

Query Optimization

18. How do indexes improve query performance?

Indexes create a data structure that allows the database to find rows without scanning the entire table. They speed up SELECT queries and WHERE clause filtering but slow down INSERT, UPDATE, and DELETE operations because the index must be updated.

19. When should you use indexes?

Columns frequently used in WHERE clauses
Columns used in JOIN conditions
Columns used in ORDER BY
Columns with high cardinality (many unique values)

Avoid indexing columns that are frequently updated, have low cardinality, or are rarely used in queries.

20. How do you read an execution plan?

Use EXPLAIN or EXPLAIN ANALYZE to view the query execution plan. Look for table scans (slow), index scans (faster), and nested loops (can be expensive). High cost values indicate potential optimization opportunities.

21. What causes slow queries and how do you fix them?

Missing indexes: Add indexes on filtered or joined columns
SELECT *: Select only needed columns
Subqueries in SELECT: Rewrite as JOINs when possible
Functions on indexed columns: WHERE YEAR(date_col) = 2025 cannot use an index on date_col
Implicit type conversions: Comparing mismatched types prevents index usage

Advanced SQL Questions

22. Write a query to calculate month-over-month growth rate

WITH monthly_revenue AS (
    SELECT
        DATE_TRUNC('month', order_date) as month,
        SUM(amount) as revenue
    FROM orders
    GROUP BY DATE_TRUNC('month', order_date)
)
SELECT
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) as prev_month_revenue,
    ROUND(
        (revenue - LAG(revenue) OVER (ORDER BY month)) * 100.0 /
        LAG(revenue) OVER (ORDER BY month),
        2
    ) as growth_rate_pct
FROM monthly_revenue;

23. Find users who logged in on consecutive days

WITH login_with_prev AS (
    SELECT
        user_id,
        login_date,
        LAG(login_date) OVER (
            PARTITION BY user_id
            ORDER BY login_date
        ) as prev_login
    FROM logins
)
SELECT DISTINCT user_id
FROM login_with_prev
WHERE login_date = prev_login + INTERVAL '1 day';

24. Calculate 7-day rolling average

SELECT
    date,
    daily_sales,
    AVG(daily_sales) OVER (
        ORDER BY date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) as rolling_7_day_avg
FROM daily_sales_table;

25. Pivot data from rows to columns

-- Using CASE statements
SELECT
    product_id,
    SUM(CASE WHEN quarter = 'Q1' THEN sales ELSE 0 END) as Q1,
    SUM(CASE WHEN quarter = 'Q2' THEN sales ELSE 0 END) as Q2,
    SUM(CASE WHEN quarter = 'Q3' THEN sales ELSE 0 END) as Q3,
    SUM(CASE WHEN quarter = 'Q4' THEN sales ELSE 0 END) as Q4
FROM quarterly_sales
GROUP BY product_id;

Practice Questions 26-50

Aggregation and Grouping (26-30)

26. Find the average order value by customer segment.

27. Count the number of orders per day of the week.

28. Find products that have never been ordered.

29. Calculate the percentage of total sales for each product.

30. Find the customer with the highest total spend in each city.

Date and Time (31-35)

31. Find all orders placed in the last 30 days.

32. Calculate the age of each customer in years.

33. Find the first and last order date for each customer.

34. Group sales by fiscal quarter (April-March).

35. Find the time difference between consecutive orders for each customer.

Advanced JOINs (36-40)

36. Find products that are more expensive than the average in their category.

37. Identify customers who have ordered all products in a specific category.

38. Find pairs of customers from the same city.

39. Calculate the overlap between two user segments.

40. Find the manager with the most direct reports.

Window Functions Advanced (41-45)

41. Calculate the median salary by department.

42. Find the longest streak of consecutive login days per user.

43. Identify users whose purchase amount increased for 3 consecutive months.

44. Calculate the cumulative distribution of order values.

45. Find the first order of each type for every customer.

Real-World Scenarios (46-50)

46. Calculate customer retention rate month over month.

47. Build a cohort analysis showing user retention by signup month.

48. Find users who churned (no activity in 30+ days after being active).

49. Calculate the conversion rate from trial to paid subscription.

50. Build a recommendation query: find products frequently bought together.

Tips for SQL Interview Success

Think out loud: Explain your approach before writing code
Start simple: Get a working solution first, then optimize
Handle edge cases: Ask about NULLs, duplicates, and empty results
Use meaningful aliases: Make your query readable
Practice without IDE help: Many interviews use plain text editors

To prepare for your data interview, make sure your resume highlights your SQL proficiency with specific examples of complex queries and data analysis projects you have completed.