Top SQL Interview Questions and Answers for 2025



Lupa will help you hire top talent in Latin America.
Book a Free ConsultationLupa helps you build, manage, and pay your remote team. We deliver pre-vetted candidates within a week!
Book a Free ConsultationSQL remains one of the most critical skills for tech professionals in 2025. Whether you're a beginner preparing for your first technical interview or an experienced database professional looking to refresh your knowledge, mastering sql interview questions is essential for career advancement. In this comprehensive guide, we'll cover 75 of the most common SQL interview questions and answers, organized by category to help you focus on specific areas that need strengthening.
SQL Fundamentals (Basic Level Questions)
1. What is SQL and why is it important?
SQL (Structured Query Language) is a standardized programming language used for managing relational databases and performing various operations on the data they contain. It's important because it allows users to access, manipulate, and retrieve data from databases efficiently. SQL is the foundation for most data operations in business applications.
2. What is the difference between SQL and MySQL?
SQL is a language used to communicate with databases, while MySQL is a specific database management system (DBMS) that uses SQL. Think of SQL as the language and MySQL as a particular platform that implements that language, similar to how HTML is a language and Chrome is a platform that interprets it.
3. What are the main categories of SQL commands?
SQL commands are divided into several categories:
- DDL (Data Definition Language): Commands like CREATE, ALTER, DROP, TRUNCATE that define database structure
- DML (Data Manipulation Language): Commands like SELECT, INSERT, UPDATE, DELETE that manipulate data
- DCL (Data Control Language): Commands like GRANT, REVOKE that control access to data
- TCL (Transaction Control Language): Commands like COMMIT, ROLLBACK that manage transactions
4. What is a primary key in SQL?
A primary key is a column or combination of columns that uniquely identifies each row in a table. It must contain unique values and cannot contain NULL values. Primary keys enforce entity integrity and are essential for establishing relationships between tables.
5. What is a foreign key in SQL?
A foreign key is a column or group of columns in a table that references the primary key of another table. It creates a link between two tables, enforcing referential integrity and establishing relationships between related data.
6. What are constraints in SQL?
Constraints are rules enforced on data columns in tables to ensure data integrity. Common constraints include:
- NOT NULL: Ensures a column cannot have NULL values
- UNIQUE: Ensures all values in a column are different
- PRIMARY KEY: Uniquely identifies each row
- FOREIGN KEY: Links data between tables
- CHECK: Ensures values in a column meet specific conditions
- DEFAULT: Sets a default value for a column
7. Explain the basic SELECT statement syntax.
The basic SELECT statement retrieves data from a database:
SELECT column1, column2, ...
FROM table_name
WHERE condition
ORDER BY column1 [ASC|DESC];- SELECT specifies which columns to retrieve
- FROM specifies the table
- WHERE filters rows based on conditions
- ORDER BY sorts the result set
8. What is the difference between DELETE and TRUNCATE commands?
- DELETE removes specific rows based on a condition and can be rolled back (it's a DML command)
- TRUNCATE removes all rows from a table without logging individual row deletions (it's a DDL command), making it faster but not rollback-capable in most databases
9. How do you insert data into a table?
Using the INSERT INTO statement:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);10. How do you update data in a table?
Using the UPDATE statement:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;11. What is the WHERE clause used for?
The WHERE clause filters records based on specified conditions, returning only the rows that match those conditions. It's used with SELECT, UPDATE, and DELETE statements.
12. What is the difference between CHAR and VARCHAR data types?
- CHAR is fixed-length: if you define CHAR(10), it always uses 10 bytes of storage regardless of the string length
- VARCHAR is variable-length: VARCHAR(10) uses only the space needed for the actual string plus some overhead
13. What is a NULL value in SQL?
NULL represents a missing or unknown value. It's not the same as zero, an empty string, or a space. NULL values require special handling in queries using IS NULL or IS NOT NULL operators.
14. How do you create a table in SQL?
Using the CREATE TABLE statement:
CREATE TABLE table_name (
column1 datatype constraints,
column2 datatype constraints,
...
);
15. What is the difference between DROP and TRUNCATE?
- DROP completely removes a table structure from the database
- TRUNCATE removes all rows from a table but keeps the table structure intact
SQL Query Components and Data Manipulation
16. What is the difference between WHERE and HAVING clauses?
- WHERE filters rows before grouping in a GROUP BY clause
- HAVING filters groups after the GROUP BY clause has been applied
- WHERE works on individual rows, while HAVING works on grouped rows
17. Explain the different types of JOINs in SQL.
- INNER JOIN: Returns records with matching values in both tables
- LEFT JOIN: Returns all records from the left table and matched records from the right table
- RIGHT JOIN: Returns all records from the right table and matched records from the left table
- FULL JOIN: Returns all records when there's a match in either table
- SELF JOIN: Joins a table to itself
18. What are aggregate functions in SQL?
Aggregate functions perform calculations on a set of values and return a single value. Common aggregate functions include:
- COUNT(): Counts the number of rows
- SUM(): Calculates the sum of values
- AVG(): Calculates the average of values
- MIN(): Finds the minimum value
- MAX(): Finds the maximum value
19. How does the GROUP BY clause work?
GROUP BY groups rows that have the same values in specified columns into summary rows. It's often used with aggregate functions to perform calculations on each group rather than the entire result set.
SELECT department, COUNT(*) as employee_count
FROM employees
GROUP BY department;20. What is the ORDER BY clause used for?
ORDER BY sorts the result set by one or more columns, either in ascending (ASC, the default) or descending (DESC) order.
21. What is a subquery in SQL?
A subquery is a query nested inside another query. It can be used in SELECT, INSERT, UPDATE, or DELETE statements, typically in the WHERE clause. Subqueries can return single values, lists, or result sets that the outer query can use.
SELECT employee_name
FROM employees
WHERE department_id IN (SELECT id FROM departments WHERE location = 'New York');22. What is the difference between UNION and UNION ALL?
- UNION combines the results of two or more SELECT statements and removes duplicate rows
- UNION ALL combines the results without removing duplicates, making it faster when duplicates aren't a concern
23. How do you use the LIKE operator for pattern matching?
LIKE is used with the WHERE clause for pattern matching with wildcards:
- % represents zero or more characters
- _ represents a single character
SELECT * FROM customers WHERE last_name LIKE 'Sm%'; -- Names starting with "Sm"
SELECT * FROM products WHERE code LIKE 'A_C'; -- Codes like "ABC", "ADC", etc.24. What are Common Table Expressions (CTEs)?
CTEs provide a way to write auxiliary statements for use in a larger query. They're defined using the WITH clause and can make complex queries more readable and maintainable.
WITH employee_counts AS (
SELECT department_id, COUNT(*) as count
FROM employees
GROUP BY department_id
)
SELECT d.name, ec.count
FROM departments d
JOIN employee_counts ec ON d.id = ec.department_id;25. How do you use the CASE statement in SQL?
CASE provides conditional logic in SQL queries, similar to if-then-else statements in other programming languages:
SELECT product_name,
CASE
WHEN stock_quantity = 0 THEN 'Out of stock'
WHEN stock_quantity < 10 THEN 'Low stock'
ELSE 'In stock'
END AS stock_status
FROM products;26. What is the difference between INNER JOIN and LEFT JOIN?
- INNER JOIN returns only the rows where there's a match in both tables
- LEFT JOIN returns all rows from the left table and matching rows from the right table (with NULL values for non-matches)
27. How do you join three or more tables in SQL?
By chaining multiple JOIN clauses:
SELECT o.order_id, c.customer_name, p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;28. What are self-joins used for?
Self-joins are used when a table needs to be joined with itself, typically when a table contains hierarchical data or when rows in the same table are related to each other.
-- Finding employees and their managers
SELECT e.name as employee, m.name as manager
FROM employees e
JOIN employees m ON e.manager_id = m.employee_id;29. How do you use the COALESCE function?
COALESCE returns the first non-NULL expression in a list, useful for handling NULL values:
SELECT product_name, COALESCE(discount, 0) as discount
FROM products;30. What is the difference between IN and EXISTS?
- IN compares a value to a list of specified values
- EXISTS checks whether a subquery returns any rows
IN is often better when the subquery result set is small, while EXISTS can be more efficient for large result sets.
Database Design and Normalization
31. What is normalization in database design?
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them according to rules designed to protect the data and make the database more flexible.
32. Explain the First Normal Form (1NF).
A table is in First Normal Form if:
- It has no repeating groups or arrays
- All column values are atomic (indivisible)
- Each row is unique (typically ensured by a primary key)
Example violation: A "Phone_Numbers" column containing multiple phone numbers separated by commas.
33. Explain the Second Normal Form (2NF).
A table is in Second Normal Form if:
- It's already in 1NF
- All non-key attributes are fully functionally dependent on the entire primary key (not just part of it)
This is mainly relevant for tables with composite primary keys.
34. Explain the Third Normal Form (3NF).
A table is in Third Normal Form if:
- It's already in 2NF
- All non-key attributes are directly dependent on the primary key and not on other non-key attributes (no transitive dependencies)
Example violation: If a table has columns for ZIP code and City, where City depends on ZIP code rather than directly on the primary key.
35. What is Boyce-Codd Normal Form (BCNF)?
BCNF is a stricter version of 3NF. A table is in BCNF if for every functional dependency X → Y, X is a superkey (meaning X can uniquely identify all rows in the table).
36. What is denormalization and when would you use it?
Denormalization is the process of adding redundant data to tables to improve read performance. It's used when:
- Read performance is more critical than write performance
- Complex joins are slowing down queries
- The application is read-heavy with relatively few updates
37. What is an Entity-Relationship Diagram (ERD)?
An ERD is a visual representation of the data structure in a database. It shows entities (tables), attributes (columns), and relationships between entities, helping to design and understand database schemas.
38. Explain the different types of relationships in database design.
- One-to-One (1:1): Each record in Table A relates to exactly one record in Table B
- One-to-Many (1:N): Each record in Table A relates to multiple records in Table B
- Many-to-Many (M:N): Multiple records in Table A relate to multiple records in Table B (implemented using a junction table)
39. What is referential integrity?
Referential integrity ensures that relationships between tables remain consistent. It means that if a foreign key exists, it either refers to a valid primary key in the parent table or is NULL (if allowed).
40. How do you implement a Many-to-Many relationship in SQL?
Using a junction (or bridge) table that contains foreign keys to both related tables:
-- Books and Authors have a many-to-many relationship
CREATE TABLE books (
book_id INT PRIMARY KEY,
title VARCHAR(100)
);
CREATE TABLE authors (
author_id INT PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE book_authors (
book_id INT,
author_id INT,
PRIMARY KEY (book_id, author_id),
FOREIGN KEY (book_id) REFERENCES books(book_id),
FOREIGN KEY (author_id) REFERENCES authors(author_id)
);
41. What is cardinality in database relationships?
Cardinality defines the numerical relationship between entities, specifying how many instances of one entity can be related to instances of another entity (one-to-one, one-to-many, many-to-many).
42. What is a composite key?
A composite key is a primary key composed of multiple columns that together uniquely identify each row in a table.
Advanced SQL Features and Functions
43. What are window functions in SQL?
Window functions perform calculations across a set of table rows related to the current row, similar to aggregate functions but without grouping rows into a single output row. Common window functions include:
- ROW_NUMBER(): Assigns a unique number to each row
- RANK(): Assigns a rank with gaps for ties
- DENSE_RANK(): Assigns a rank without gaps for ties
- NTILE(): Divides rows into specified number of groups
SELECT employee_name, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as salary_rank
FROM employees;44. How do LAG and LEAD functions work?
- LAG accesses data from previous rows in the result set
- LEAD accesses data from subsequent rows
SELECT order_date, order_amount,
LAG(order_amount) OVER (ORDER BY order_date) as previous_order_amount,
LEAD(order_amount) OVER (ORDER BY order_date) as next_order_amount
FROM orders;45. How do you write a recursive query in SQL?
Using Common Table Expressions (CTEs) with a recursive member:
-- Finding all employees in a hierarchical organization
WITH RECURSIVE employee_hierarchy AS (
-- Base case: top-level employees with no manager
SELECT employee_id, name, manager_id, 1 as level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees with managers
SELECT e.employee_id, e.name, e.manager_id, eh.level + 1
FROM employees e
JOIN employee_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT * FROM employee_hierarchy;46. What is a PIVOT operation in SQL?
PIVOT transforms rows into columns, converting unique values from one column into multiple columns in the output:
-- Converting quarterly sales data from rows to columns
SELECT product_name,
[Q1] as Q1_Sales, [Q2] as Q2_Sales,
[Q3] as Q3_Sales, [Q4] as Q4_Sales
FROM (
SELECT product_name, quarter, sales_amount
FROM sales
) as source_data
PIVOT (
SUM(sales_amount)
FOR quarter IN ([Q1], [Q2], [Q3], [Q4])
) as pivot_table;47. What are JSON functions in SQL?
Modern SQL databases provide functions for working with JSON data, allowing you to:
- Extract values from JSON documents
- Check if a path exists in JSON
- Modify JSON documents
- Convert between JSON and relational formats
-- PostgreSQL example
SELECT id, data->>'name' as customer_name
FROM customers
WHERE data->>'status' = 'active';48. How do you handle date and time operations in SQL?
SQL provides various functions for date/time manipulation:
-- Extract parts of dates
SELECT EXTRACT(YEAR FROM order_date) as order_year,
EXTRACT(MONTH FROM order_date) as order_month
FROM orders;
-- Date arithmetic
SELECT order_date,
order_date + INTERVAL '7 days' as delivery_date
FROM orders;
-- Date difference
SELECT DATEDIFF(day, order_date, ship_date) as processing_days
FROM orders;49. What are stored procedures and when would you use them?
Stored procedures are precompiled collections of SQL statements stored in the database that can be executed as a single unit. They're useful for:
- Encapsulating complex business logic
- Improving security by controlling access to data
- Reducing network traffic by executing multiple statements at once
- Enabling code reuse
CREATE PROCEDURE update_product_price(
IN product_id INT,
IN new_price DECIMAL(10,2)
)
BEGIN
UPDATE products
SET price = new_price
WHERE id = product_id;
END;
50. What are triggers in SQL?
Triggers are special stored procedures that automatically execute when specific events occur in the database, such as INSERT, UPDATE, or DELETE operations. They're useful for:
- Enforcing complex business rules
- Auditing changes to data
- Maintaining derived data automatically
CREATE TRIGGER update_inventory
AFTER INSERT ON order_items
FOR EACH ROW
BEGIN
UPDATE inventory
SET quantity = quantity - NEW.quantity
WHERE product_id = NEW.product_id;
END;
Database Performance Optimization
51. What are indexes in SQL and how do they improve performance?
Indexes are database structures that improve the speed of data retrieval operations by providing quick access paths to data. They work similar to a book's index, allowing the database to find data without scanning the entire table.
52. What is the difference between clustered and non-clustered indexes?
- Clustered indexes determine the physical order of data in a table. A table can have only one clustered index.
- Non-clustered indexes create a separate structure that points to the data. A table can have multiple non-clustered indexes.
Clustered indexes are typically faster for range queries, while non-clustered indexes are better for selective queries.
53. What is query optimization?
Query optimization is the process of improving SQL query performance by:
- Rewriting queries to be more efficient
- Adding appropriate indexes
- Restructuring database schema
- Analyzing and improving execution plans
54. How do you analyze query performance in SQL?
Using EXPLAIN or EXPLAIN PLAN statements to view the execution plan:
-- PostgreSQL
EXPLAIN ANALYZE SELECT * FROM customers WHERE region = 'West';
-- MySQL
EXPLAIN SELECT * FROM customers WHERE region = 'West';
-- SQL Server
SET SHOWPLAN_ALL ON;
GO
SELECT * FROM customers WHERE region = 'West';
GO
SET SHOWPLAN_ALL OFF;55. What is database partitioning?
Partitioning divides large tables into smaller, more manageable pieces called partitions, while still being treated as a single logical table. Types include:
- Horizontal partitioning (sharding): Splits rows across multiple tables
- Vertical partitioning: Splits columns across multiple tables
- Range partitioning: Partitions based on ranges of values
- List partitioning: Partitions based on lists of discrete values
- Hash partitioning: Distributes rows based on a hash function
56. What are materialized views?
Materialized views store the results of a query physically, unlike regular views which are virtual. They improve performance for complex queries by pre-computing and storing results, but require refreshing when underlying data changes.
57. How can you optimize JOIN operations?
- Join on indexed columns
- Use appropriate join types (INNER vs. OUTER)
- Filter data before joining (in the WHERE clause)
- Consider denormalizing for frequently joined tables
- Use covering indexes that include all columns referenced in the query
58. What is query caching?
Query caching stores the results of frequently executed queries to avoid repeated computation. When the same query is executed again, results can be retrieved from the cache instead of re-executing the query.
59. How do you handle large result sets efficiently?
- Use pagination (LIMIT/OFFSET or ROW_NUMBER)
- Stream results rather than loading everything into memory
- Filter data as much as possible before retrieving
- Consider asynchronous processing for very large datasets
- Use cursors for row-by-row processing when appropriate
60. What is the impact of NULL values on index performance?
NULL values can affect index performance differently depending on the database system:
- Some systems don't include NULL values in indexes
- Others require special handling for NULL values
- Composite indexes may not be used if any component column contains NULL
In general, avoiding NULL values in indexed columns can improve performance.
Transaction Management and ACID Properties
61. What are ACID properties in database transactions?
ACID is an acronym representing the four key properties that guarantee reliable processing of database transactions:
- Atomicity: A transaction is treated as a single, indivisible unit that either completes entirely or fails entirely
- Consistency: A transaction brings the database from one valid state to another
- Isolation: Concurrent transactions execute as if they were sequential
- Durability: Once a transaction is committed, it remains so even in the event of power loss or crashes
62. Explain transaction isolation levels.
Transaction isolation levels determine how transaction integrity is visible to other users and systems:
- Read Uncommitted: Allows dirty reads (seeing uncommitted changes)
- Read Committed: Prevents dirty reads but allows non-repeatable reads
- Repeatable Read: Prevents dirty and non-repeatable reads but allows phantom reads
- Serializable: Prevents all concurrency issues but with the lowest performance
63. What is a deadlock and how can it be prevented?
A deadlock occurs when two or more transactions are waiting for each other to release locks, resulting in a standstill. Prevention strategies include:
- Accessing tables in the same order in all transactions
- Keeping transactions short and simple
- Using appropriate isolation levels
- Setting deadlock detection timeouts
- Using optimistic concurrency control where appropriate
64. What is the difference between pessimistic and optimistic locking?
- Pessimistic locking locks resources when they're first accessed and holds locks until the transaction completes, preventing other transactions from modifying the data
- Optimistic locking allows multiple transactions to proceed without locking, but checks at commit time if data has been modified by another transaction
65. How do savepoints work in transactions?
Savepoints mark a point within a transaction to which you can later roll back without affecting the entire transaction:
BEGIN TRANSACTION;
INSERT INTO orders VALUES (1, 'Customer A', CURRENT_DATE);
SAVEPOINT after_order;
INSERT INTO order_items VALUES (1, 101, 2);
-- If something goes wrong with order items
ROLLBACK TO SAVEPOINT after_order;
-- Try again with different items
INSERT INTO order_items VALUES (1, 102, 1);
COMMIT;
PostgreSQL-Specific Interview Questions
66. What makes PostgreSQL different from other database systems?
PostgreSQL offers several distinctive features:
- Advanced data types (arrays, JSON, hstore, geometric types)
- Table inheritance
- Sophisticated locking mechanism
- Multi-Version Concurrency Control (MVCC)
- Extensibility through custom functions, operators, and data types
- Support for foreign data wrappers
- Robust security features
- Strong standards compliance
67. Explain PostgreSQL's MVCC (Multi-Version Concurrency Control).
MVCC allows PostgreSQL to handle concurrent access to the database without using read locks. Instead of overwriting data directly, it creates new versions of rows when they're modified. This allows:
- Readers to see a consistent snapshot of data without being blocked by writers
- Writers to modify data without blocking readers
- Better scalability for read-heavy workloads
68. What is Write-Ahead Logging (WAL) in PostgreSQL?
WAL is PostgreSQL's mechanism for ensuring data integrity. Changes are first recorded in the WAL before being applied to the actual data files. This provides:
- Crash recovery: After a crash, the database can be restored to a consistent state by replaying WAL records
- Point-in-time recovery: The database can be restored to any point in time
- Replication: WAL records can be sent to standby servers to keep them in sync
69. How does PostgreSQL handle full-text search?
PostgreSQL provides sophisticated full-text search capabilities:
- Text search configurations for different languages
- Ranking of search results
- Phrase searching
- Stemming and stop words
- Indexing for performance
-- Creating a full-text search index
CREATE INDEX idx_fts_product_description
ON products USING GIN (to_tsvector('english', description));
-- Performing a full-text search
SELECT title, description
FROM products
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'comfortable & chair');70. What are PostgreSQL extensions and how do they work?
Extensions are packages that add functionality to PostgreSQL without modifying core code. Popular extensions include:
- PostGIS for spatial data
- pgcrypto for cryptographic functions
- hstore for key-value pairs
- pg_stat_statements for query performance monitoring
-- Installing an extension
CREATE EXTENSION postgis;
Real-World SQL Problem-Solving Questions
71. How would you find duplicate records in a table?
Using GROUP BY with HAVING:
SELECT email, COUNT(*)
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;72. How would you calculate running totals in SQL?
Using window functions:
SELECT order_date, order_amount,
SUM(order_amount) OVER (ORDER BY order_date) as running_total
FROM orders;73. How would you find gaps in sequential data?
Using self-joins or window functions:
-- Finding gaps in employee IDs
WITH numbers AS (
SELECT generate_series(
(SELECT MIN(employee_id) FROM employees),
(SELECT MAX(employee_id) FROM employees)
) AS id
)
SELECT n.id AS missing_id
FROM numbers n
LEFT JOIN employees e ON n.id = e.employee_id
WHERE e.employee_id IS NULL;74. How would you pivot data from rows to columns dynamically?
Using dynamic SQL or CROSSTAB functions:
-- PostgreSQL example with crosstab
SELECT * FROM crosstab(
'SELECT product_id, quarter, sales
FROM quarterly_sales
ORDER BY 1, 2',
'SELECT DISTINCT quarter FROM quarterly_sales ORDER BY 1'
) AS ct (product_id INT, q1 NUMERIC, q2 NUMERIC, q3 NUMERIC, q4 NUMERIC);75. How would you implement a hierarchical query to show an organizational structure?
Using recursive CTEs:
WITH RECURSIVE org_hierarchy AS (
-- Base case: CEO (no manager)
SELECT id, name, manager_id, 0 as level, ARRAY[name] as path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: all other employees
SELECT e.id, e.name, e.manager_id, oh.level + 1, oh.path || e.name
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.id
)
SELECT level, repeat(' ', level) || name as org_chart
FROM org_hierarchy
ORDER BY path;
Looking to Hire Top SQL Talent for Your Team?
Finding skilled database professionals who can write efficient queries and design robust database systems is challenging. At Lupa, we specialize in connecting US companies with premium SQL talent from Latin America who are fully aligned on timezone, culture, and technical skills.
Our rigorous vetting process ensures you get access to professionals who can ace these sql interview questions and deliver real value to your team. Unlike traditional recruiters focused on volume, we prioritize match quality and deep technical assessment to find candidates who are truly the right fit for your specific needs.
Book a discovery call with Lupa today to learn how our strategic, high-integrity recruiting approach can help you build your dream data team without the timezone friction or cultural misalignment that comes with traditional offshore hiring.




















