Data & Databricks Test Automation Engineer
About the job
Company Overview
Our client is a global leader in financial and technology-driven services, delivering innovative data solutions to some of the world's largest institutional clients. They are seeking a Test Automation Engineer with a strong focus on Databricks and modern data platforms to ensure the quality, performance, and reliability of their data solutions.
Role Overview
As a Data & Databricks Test Automation Engineer, you will be responsible for developing and implementing automated testing frameworks across Databricks-based data platforms. You'll collaborate closely with data engineering teams to validate pipelines, ensure data quality, and strengthen the integrity of a modern Lakehouse architecture.
Key Responsibilities
Databricks Testing
- Design and implement automated testing for Databricks notebooks and workflows.
- Develop test frameworks for Delta Lake tables and ACID transactions.
- Build automated validation processes for structured streaming pipelines.
- Validate Delta Live Tables implementations.
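Duties like these typically boil down to assertion-based checks over pipeline output. A minimal sketch of such a validation, in plain Python rather than a live Spark session (in practice the rows would come from a Delta table via PySpark, and the column names here are purely hypothetical):

```python
# Sketch of an automated notebook/workflow validation.
# Plain Python dicts stand in for DataFrame rows; column names are made up.

def validate_records(records, required_columns, non_null_columns):
    """Return a list of human-readable data-quality violations."""
    violations = []
    for i, row in enumerate(records):
        missing = required_columns - row.keys()
        if missing:
            violations.append(f"row {i}: missing columns {sorted(missing)}")
        for col in non_null_columns & row.keys():
            if row[col] is None:
                violations.append(f"row {i}: null in non-null column '{col}'")
    return violations

sample = [
    {"trade_id": 1, "amount": 100.0},
    {"trade_id": None, "amount": 50.0},  # should be flagged
]
issues = validate_records(sample, {"trade_id", "amount"}, {"trade_id"})
assert issues == ["row 1: null in non-null column 'trade_id'"]
```

In a real suite, checks like this would run under pytest as part of CI against representative Delta tables.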
Data Pipeline Testing
- Automate testing for ETL/ELT processes within Databricks.
- Implement Spark job testing and performance validation.
- Create and manage test cases for data ingestion and transformation workflows.
- Test Unity Catalog configurations, access controls, and governance models.
Quality Assurance
- Design and execute data quality test strategies and reconciliation processes.
- Implement performance testing for large-scale Spark jobs.
- Ensure compliance with internal data governance and quality standards.
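The reconciliation work mentioned above usually means comparing source and target datasets on counts, totals, and keys. A minimal sketch, again in plain Python with made-up figures (real implementations would compute these aggregates with Spark SQL over Delta tables):

```python
# Sketch of a source-to-target reconciliation check.
# Field names ("id", "amt") and the sample data are hypothetical.

def reconcile(source_rows, target_rows, key, measure):
    """Compare row counts, a summed measure, and key coverage between datasets."""
    report = {
        "count_match": len(source_rows) == len(target_rows),
        "sum_match": (
            sum(r[measure] for r in source_rows)
            == sum(r[measure] for r in target_rows)
        ),
    }
    source_keys = {r[key] for r in source_rows}
    target_keys = {r[key] for r in target_rows}
    report["missing_in_target"] = sorted(source_keys - target_keys)
    return report

src = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
tgt = [{"id": 1, "amt": 10}]
result = reconcile(src, tgt, key="id", measure="amt")
assert result == {"count_match": False, "sum_match": False, "missing_in_target": [2]}
```

A report like this is what typically feeds the automated quality dashboards described below.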
Monitoring & Reporting
- Develop test monitoring frameworks and dashboards.
- Automate quality reporting and produce actionable test metrics.
- Maintain clear test documentation and version control across projects.
Requirements & Qualifications
Education
- Bachelor's degree in Computer Science, Data Science, Engineering, or a related field.
- Certifications in Databricks or data testing tools are advantageous.
Technical Skills
- 2+ years' hands-on experience with Databricks (Spark).
- Strong programming experience with Python (PySpark) and SQL.
- Exposure to data testing frameworks and tools.
- Familiarity with AWS services (S3, Glue, Lambda) or similar cloud platforms.
- Knowledge of Delta Lake and Lakehouse architectures.
- Proficiency with version control systems such as Git.
Additional Skills
- Strong analytical and problem-solving mindset.
- Experience in large-scale data processing environments.
- Understanding of data governance, compliance, and data quality best practices.
- Previous experience working within Agile or DevOps teams.
Platform Knowledge
- Databricks workspace and notebook development.
- Delta Lake and Delta Live Tables.
- Unity Catalog testing for governance and permissions.
- Spark optimization and performance analysis.
