Job Overview
We are seeking a highly motivated and skilled Python Data Engineer to join our dynamic team at Virtusa. As a Data Engineer, you will play a pivotal role in designing, developing, and maintaining our data infrastructure and pipelines. This entry-level role is ideal for someone who is passionate about data, software development, and cloud technologies and eager to contribute to high-impact projects.
Key Deliverables
- Design, develop, and maintain scalable and reliable data pipelines using Python, DBT, and Snowflake to support data ingestion, transformation, and storage, ensuring high data quality and availability (a representative sketch of this kind of work follows this list).
- Develop and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation to automate the setup and management of data infrastructure components on AWS.
- Implement and maintain CI/CD pipelines using tools like GitHub Actions or Jenkins to automate testing, integration, and deployment of data pipelines.
- Conduct thorough code reviews and adhere to coding best practices to ensure code quality, maintainability, and security across all data engineering projects.
- Proactively monitor data pipelines for performance bottlenecks and cost inefficiencies, implementing optimizations to improve throughput, reduce latency, and minimize operational expenditures.
- Create and maintain comprehensive documentation for data pipelines, software designs, and operational procedures to facilitate knowledge sharing and ensure audit readiness.
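For candidates wondering what the day-to-day pipeline work above looks like in practice, the following is a minimal, hypothetical sketch of a Pandas-based extract-transform-load step. The file names, columns, and data-quality rules are invented for illustration only and do not describe Virtusa's actual pipelines; writing Parquet assumes a library such as pyarrow is installed.

```python
# Minimal, hypothetical sketch of a batch ingest-and-transform step.
# File paths, column names, and business rules are illustrative only.
import pandas as pd


def extract(path: str) -> pd.DataFrame:
    """Read raw order events from a CSV export (e.g. staged from object storage)."""
    return pd.read_csv(path, parse_dates=["order_date"])


def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply basic data-quality rules and reshape for the warehouse."""
    return (
        raw.dropna(subset=["order_id", "customer_id"])   # drop incomplete rows
           .drop_duplicates(subset=["order_id"])          # enforce key uniqueness
           .assign(order_month=lambda df: df["order_date"].dt.to_period("M").astype(str))
    )


def load(df: pd.DataFrame, path: str) -> None:
    """Write the curated dataset to columnar storage for downstream models."""
    df.to_parquet(path, index=False)


if __name__ == "__main__":
    load(transform(extract("raw_orders.csv")), "curated_orders.parquet")
```

In a production setting, steps like these would typically be version-controlled, covered by automated tests, and scheduled by an orchestrator rather than run by hand.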
Essential Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- Strong proficiency in the Python programming language, with experience in data manipulation libraries such as Pandas and data pipeline frameworks such as Apache Beam or Airflow (see the DAG sketch after this list).
- Solid understanding of SQL and experience with relational databases such as PostgreSQL or MySQL, or with cloud data warehouses such as Snowflake or Amazon Redshift.
- Familiarity with cloud computing platforms, particularly AWS and its data-related services such as S3, Glue, Lambda, Athena, and IAM.
- Basic understanding of data modeling concepts and experience designing schemas for analytical and transactional databases.
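As context for the Airflow experience listed above, a minimal DAG skeleton of the kind used to orchestrate such pipelines might look like the sketch below. It assumes Airflow 2.x and uses invented DAG, task, and function names purely for illustration; it is not a description of an actual Virtusa workflow.

```python
# Minimal, hypothetical Airflow 2.x DAG skeleton; names and schedule are
# illustrative only and do not describe an actual Virtusa pipeline.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders() -> None:
    print("pull raw data from the source system")


def transform_orders() -> None:
    print("clean and reshape the raw data")


def load_orders() -> None:
    print("publish curated tables to the warehouse")


with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # use schedule_interval on Airflow versions before 2.4
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_orders)
    transform = PythonOperator(task_id="transform", python_callable=transform_orders)
    load = PythonOperator(task_id="load", python_callable=load_orders)

    extract >> transform >> load  # simple linear dependency chain
```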
Preferred Qualifications
- Experience with DBT (data build tool) for data transformation and modeling.
- Experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation to automate infrastructure provisioning and management.
- Familiarity with CI/CD pipelines and automated testing frameworks.
- Exposure to Agile development methodologies and experience working in cross-functional teams.
- Knowledge of data governance principles and data quality management practices.
Skills
Must-Have Skills
- Technical: Python (proficient in data manipulation libraries), SQL (advanced), AWS (S3, Glue, Lambda, Athena), DBT, Git
- Domain Knowledge: Data warehousing principles, ETL processes, Data modeling
- Behavioral & Interpersonal: Strong communication skills, Collaboration within Agile teams, Problem-solving attitude
- Process & SOP: Experience in documenting data pipeline processes, Code review participation, Adherence to coding standards
- Analytical & Problem-Solving: Ability to debug and optimize data pipelines, Root cause analysis
Good-to-Have Skills
- Advanced Technical: Experience with orchestration tools like Airflow, Knowledge of Spark or other distributed processing frameworks, Proficiency in IaC
- Additional Certifications: AWS Certified Data Engineer, Python certifications
- Cross-Functional Exposure: Interacting with data scientists and business stakeholders
- Leadership Traits: Mentoring junior data engineers
- Continuous Improvement: Familiarity with data quality monitoring techniques
Additional Information
- This is a full-time, onsite position based in Chennai, Tamil Nadu, India.
- Standard working hours apply, with potential for flexible scheduling arrangements after the initial probation period.
- You will report to a Senior Data Engineer or Engineering Manager within the Data Engineering team.
- Standard onboarding procedures and tool training will be provided upon joining.