Qualifications & Experience:
Minimum of 8 years of experience as a Data Scientist/Engineer with demonstrated expertise in data engineering and cloud computing technologies
Technical Skills & Responsibilities:
- Advanced proficiency in Python, with a commitment to continually deepening technical skills
- Extensive experience with NLP and image processing concepts
- Proficiency with version control systems such as Git
- In-depth understanding of Azure deployments
- Expertise in OCR, ML model training, and transfer learning
- Experience working with unstructured data formats such as PDFs, DOCX, and images
- Strong familiarity with data science best practices and the ML lifecycle
- Strong experience with data pipeline development, ETL processes, and data engineering tools such as Apache Airflow, PySpark, or Databricks
- Familiarity with cloud computing platforms such as Azure, AWS, or GCP, including services like Azure Data Factory, AWS S3 and Lambda, and Google BigQuery
- Tool Exposure: Advanced understanding of and hands-on experience with Git, Azure, Python, R, and data engineering tools such as Snowflake, Databricks, or PySpark
- Data mining, cleaning, and engineering: Leading the identification and merging of relevant data sources, ensuring data quality, and resolving data inconsistencies
- Cloud Solutions Architecture: Designing and deploying scalable data engineering workflows on cloud platforms such as Azure, AWS, or GCP
- Data Analysis: Executing complex analyses against business requirements using appropriate tools and technologies
- Software Development: Leading the development of reusable, version-controlled code under minimal supervision
- Big Data Processing: Developing solutions to handle large-scale data processing using tools like Hadoop, Spark, or Databricks
Principal Duties & Key Responsibilities:
- Leading data extraction from multiple sources, including PDFs, images, databases, and APIs
- Driving optical character recognition (OCR) processes to digitize data from images
- Applying advanced natural language processing (NLP) techniques to understand complex data
- Developing and implementing accurate statistical models and data engineering pipelines to support critical business decisions, and continuously monitoring their performance
- Designing and managing scalable cloud-based data architectures using Azure, AWS, or GCP services
- Collaborating closely with business domain experts to identify and pursue key drivers of business value
- Documenting model design choices, algorithm selection processes, and dependencies
- Effectively collaborating in cross-functional teams within the CoE and across the organization
- Proactively seeking opportunities to contribute beyond assigned tasks
Required Competencies:
- Exceptional communication and interpersonal skills
- Proficiency in Microsoft Office 365 applications
- Ability to work independently, demonstrate initiative, and provide strategic guidance
- Strong networking and relationship-building skills
- Outstanding organizational skills and a collaborative, team-oriented approach
- Excellent technical writing skills
- Effective problem-solving abilities
- Adaptability, including the willingness to work flexible hours as required
Key Competencies / Values:
- Client Focus: Understanding client needs and tailoring our skills to deliver exceptional results
- Excellence: Striving for excellence as defined by our clients and delivering consistently high-quality work
- Trust: Building and retaining trust with clients, colleagues, and partners
- Teamwork: Collaborating effectively to achieve collective success
- Responsibility: Taking ownership of performance and safety, ensuring accountability
- People: Creating an inclusive environment