Q1
Walk us through how you would write a SQL query to identify duplicate records in a customer table and then remove them without losing valid data.
Why they ask this: They're testing your SQL proficiency, data cleaning skills, and understanding of data integrity, all core competencies for a Data Analyst.
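One way to structure an answer is "find first, delete second." The sketch below runs the SQL through Python's built-in `sqlite3` module so it can be tried end to end. The `customers` table, its columns, and the rule that two rows are duplicates when `email` and `full_name` both match are assumptions made for illustration; in a real interview you would first ask which columns define a duplicate and which copy should survive.

```python
import sqlite3

# Hypothetical customer table; the schema and sample rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT,
        full_name   TEXT
    );
    INSERT INTO customers (customer_id, email, full_name) VALUES
        (1, 'ana@example.com', 'Ana Ruiz'),
        (2, 'ben@example.com', 'Ben Cole'),
        (3, 'ana@example.com', 'Ana Ruiz'),
        (4, 'cy@example.com',  'Cy Dean'),
        (5, 'ben@example.com', 'Ben Cole');
""")

# Step 1: identify duplicate groups before touching any data.
duplicate_groups = conn.execute("""
    SELECT email, full_name, COUNT(*) AS copies
    FROM customers
    GROUP BY email, full_name
    HAVING COUNT(*) > 1
    ORDER BY email
""").fetchall()

# Step 2: keep the lowest customer_id in each group and delete the rest.
# The `with conn:` block wraps the delete in a transaction, so an error
# rolls the change back instead of leaving the table half cleaned.
with conn:
    conn.execute("""
        DELETE FROM customers
        WHERE customer_id NOT IN (
            SELECT MIN(customer_id)
            FROM customers
            GROUP BY email, full_name
        )
    """)

remaining_ids = [
    row[0]
    for row in conn.execute("SELECT customer_id FROM customers ORDER BY customer_id")
]
print(duplicate_groups)
print(remaining_ids)
```

Keeping the lowest `customer_id` is only one possible survivor rule. If other tables reference the deleted IDs, those references would need to be repointed first, which is worth saying out loud in the interview.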
Q2
Explain the difference between INNER JOIN, LEFT JOIN, and RIGHT JOIN. When would you use each one in a real analysis scenario?
Why they ask this: This tests foundational SQL knowledge and your ability to think about data relationships, which is critical for extracting insights from multiple data sources.
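A small worked example can make the difference concrete. The sketch below uses Python's built-in `sqlite3` module with two hypothetical tables, `customers` and `orders`, invented for illustration. Because not every SQLite build accepts `RIGHT JOIN`, the right join is written here as its equivalent: a `LEFT JOIN` with the two tables swapped.

```python
import sqlite3

# Hypothetical tables: Cy has no orders, and order 104 points at a missing customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (101, 1), (102, 1), (103, 2), (104, 9);
""")

# INNER JOIN: only rows with a match on both sides.
inner_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# LEFT JOIN: every customer, with NULL where no order exists.
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# Equivalent of "customers RIGHT JOIN orders": every order,
# with NULL where the customer is missing.
right_equivalent_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM orders o
    LEFT JOIN customers c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

print(inner_rows)
print(left_rows)
print(right_equivalent_rows)
```

In analysis terms: the inner join answers "which customers bought something," the left join answers "which customers never ordered" (filter on the NULL side), and the right-join form surfaces orphaned orders that point at no known customer.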
Q3
You have a dataset with missing values in a key column. What methods would you consider to handle this, and what are the trade-offs of each approach?
Why they ask this: They want to assess your problem-solving approach to data quality issues and whether you understand how different imputation methods can bias analysis results.
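To make the trade-offs tangible, here is a minimal sketch comparing three common options on a made-up `ages` column: dropping incomplete rows, mean imputation, and median imputation. The data is invented for illustration, and these three are not an exhaustive list of methods.

```python
import statistics

# Hypothetical column with two missing values and one outlier (90).
ages = [34, 29, None, 41, None, 38, 90]
observed = [v for v in ages if v is not None]

# Option 1: drop incomplete rows. Simple, but loses 2 of 7 records and
# can bias results if the values are not missing at random.
dropped = observed

# Option 2: mean imputation. Keeps every row, but the fill value is
# pulled upward by the outlier.
mean_fill = statistics.mean(observed)
mean_imputed = [mean_fill if v is None else v for v in ages]

# Option 3: median imputation. More robust to the outlier.
median_fill = statistics.median(observed)
median_imputed = [median_fill if v is None else v for v in ages]

# Optional: keep a flag so later analysis can tell real values from filled ones.
was_missing = [v is None for v in ages]

# Filling with a constant shrinks the spread compared with the observed data.
print(statistics.stdev(observed), statistics.stdev(mean_imputed))
```

The last line shows the main cost of constant-value imputation: the standard deviation of the imputed column is smaller than that of the observed values, which can make downstream confidence intervals look tighter than they should.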
Q4
How would you approach analyzing a dataset with 5 million rows in Excel versus a database tool like SQL or Python? What are the limitations and advantages of each?
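A concrete point to raise in an answer is that an Excel worksheet holds at most 1,048,576 rows, so 5 million rows cannot be loaded into a single sheet, whereas a script or database can stream through the data without holding it all in memory. The sketch below illustrates the streaming idea with Python's `csv` module. The `region` and `revenue` columns and the tiny in-memory sample are assumptions for illustration; in practice the function would be given an open file handle instead.

```python
import csv
import io
from collections import defaultdict

EXCEL_MAX_ROWS = 1_048_576  # row limit of a single Excel worksheet


def revenue_by_region(lines):
    """Aggregate revenue per region one row at a time (constant memory)."""
    totals = defaultdict(float)
    row_count = 0
    for row in csv.DictReader(lines):
        totals[row["region"]] += float(row["revenue"])
        row_count += 1
    return dict(totals), row_count


# Tiny stand-in for a large file; a real run would pass open("sales.csv").
sample = io.StringIO("region,revenue\nnorth,10.5\nsouth,4.0\nnorth,2.5\n")
totals, row_count = revenue_by_region(sample)

# Data rows plus one header row must fit within the worksheet limit.
fits_in_one_sheet = row_count + 1 <= EXCEL_MAX_ROWS
print(totals, row_count, fits_in_one_sheet)
```

The same aggregation in SQL would be a `GROUP BY region` query, which pushes the work to the database engine. Excel remains useful for exploring the much smaller summarized output.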