Key Responsibilities:
- Work closely with clients to understand their business requirements and design data solutions that meet their needs.
- Develop and implement end-to-end data solutions that include data ingestion, data storage, data processing, and data visualization components.
- Design and implement data architectures that are scalable, secure, and compliant with industry standards.
- Work with data engineers, data analysts, and other stakeholders to ensure the successful delivery of data solutions.
- Participate in presales activities, including solution design, proposal creation, and client presentations.
- Act as a technical liaison between the client and our internal teams, providing technical guidance and expertise throughout the project lifecycle.
- Stay up-to-date with industry trends and emerging technologies related to data architecture and engineering.
- Develop and maintain relationships with clients to ensure their ongoing satisfaction and identify opportunities for additional business.
- Understand the entire end-to-end AI lifecycle, from ingestion through inferencing, including operations.
- Exposure to emerging Gen AI technologies.
- Exposure to the Kubernetes platform, with hands-on experience deploying and containerizing applications.
- Good knowledge of data governance, data warehousing, and data modeling.
Requirements:
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- 10+ years of experience as a Data Solution Architect, with a proven track record of designing and implementing end-to-end data solutions.
- Strong technical background in data architecture, data engineering, and data management.
- Extensive experience working with any of the Hadoop distributions, preferably Data Fabric.
- Experience with presales activities such as solution design, proposal creation, and client presentations.
- Familiarity with cloud-based data platforms (e.g., AWS, Azure, Google Cloud) and related technologies such as data warehousing, data lakes, and data streaming.
- Experience with Kubernetes and Gen AI tools and tech stack.
- Excellent communication and interpersonal skills, with the ability to effectively communicate technical concepts to both technical and non-technical audiences.
- Strong problem-solving skills, with the ability to analyze complex data systems and identify areas for improvement.
- Strong project management skills, with the ability to manage multiple projects simultaneously and prioritize tasks effectively.
Tools and Tech Stack:
Data Architecture and Engineering:
Hadoop Ecosystem:
- Preferred: Cloudera Data Platform (CDP) or Data Fabric.
- Tools: HDFS, Hive, Spark, HBase, Oozie.
Data Warehousing:
- Cloud-based: Azure Synapse, Amazon Redshift, Google BigQuery, Snowflake, and Azure Databricks
- On-premises: Teradata, Vertica
Data Integration and ETL Tools:
- Apache NiFi, Talend, Informatica, Azure Data Factory, AWS Glue.
Cloud Platforms:
- Azure (preferred for its Data Services and Synapse integration), AWS, or GCP.
Cloud-native Components:
- Data Lakes: Azure Data Lake Storage, AWS S3, or Google Cloud Storage.
- Data Streaming: Apache Kafka, Azure Event Hubs, AWS Kinesis.
HPE Platforms:
- Data Fabric, AI Essentials or Unified Analytics, HPE MLDM, and HPE MLDE
AI and Gen AI Technologies:
AI Lifecycle Management:
- MLOps: MLflow, Kubeflow, Azure ML, SageMaker, or Ray
- Inference tools: TensorFlow Serving, KServe, Seldon
Generative AI:
- Frameworks: Hugging Face Transformers, LangChain.
- Tools: OpenAI API (e.g., GPT-4)
Orchestration and Deployment:
Kubernetes:
- Platforms: Azure Kubernetes Service (AKS), Amazon EKS, Google Kubernetes Engine (GKE), or open-source Kubernetes (K8s)
- Tools: Helm
CI/CD for Data Pipelines and Applications:
- Jenkins, GitHub Actions, GitLab CI, or Azure DevOps