AI Governance & Secure AI Systems Engineer

EvonSys

Hyderabad, India

2-5 Years

Job Description

Designation: AI Governance & Secure AI Systems Engineer

(Engineering the controls, evaluation infrastructure, and assurance practices that govern production AI systems.)

Experience: 2 - 5 years (in ML Engineering, AI security, or AI/ML Governance, with hands-on exposure to production ML or LLM systems)

Location: Hyderabad / Chennai - India (Hybrid)

Employment Type: Full-Time, Permanent

Work Mode: Hybrid

Reports To: Head of IT

Operational Scope: Global; cross-functional collaboration with Data Science, ML Engineering, Security, and Product

About the Role

As AI systems move into production, the associated engineering risks become material: model drift, unintended agent behavior, prompt injection, and training-data integrity failures. Teams delivering AI capabilities require a technical counterpart capable of addressing these risks through rigorous, evidence-based engineering.

We are seeking an AI Governance & Secure AI Systems Engineer to operationalize our AI governance program. The role encompasses designing the control framework, building evaluation infrastructure, conducting adversarial testing against production systems, and embedding governance throughout the ML lifecycle, all anchored in NIST AI RMF 1.0, ISO/IEC 42001, OWASP LLM Top 10, and MITRE ATLAS. This is a deeply technical position at the intersection of ML engineering and governance.

Key Responsibilities

Operationalize the AI governance program: implement and evolve controls mapped to NIST AI RMF 1.0 and ISO/IEC 42001, maintain the control-to-evidence mapping, and automate evidence collection where feasible.
Maintain the AI/ML system inventory: model cards, datasheets for datasets, system cards, intended-use documentation, and EU AI Act risk-tier classifications for every production model and agentic system.
Lead AI red-teaming and adversarial testing: prompt injection (direct and indirect), jailbreak resistance, data exfiltration, training-data poisoning, model inversion, membership inference, and tool-use abuse in agentic systems, mapped to OWASP LLM Top 10 and MITRE ATLAS.
Engineer model evaluation and assurance: design fairness, robustness, and calibration tests using Fairlearn and AI Fairness 360; conduct adversarial robustness assessments; and produce interpretability artifacts using SHAP, LIME, and integrated gradients for in-scope models.
Build and maintain evaluation pipelines: establish reproducible, regression-tested evaluation harnesses using Promptfoo, LangSmith, DeepEval, Garak, and PyRIT; integrate evaluations into CI/CD so that every model and prompt change is gated by quantitative criteria.
Embed secure MLOps practices: partner with ML Engineering on signed models, training-data integrity, hardened inference endpoints, model registries, drift and performance monitoring, rollback paths, and runtime guardrails for tool-using agents.
Govern third-party and open-source AI artifacts: review foundation-model APIs and Hugging Face artifacts; enforce model and dataset provenance, artifact scanning, and ML supply-chain controls; manage content-moderation guardrails such as Llama Guard, NeMo Guardrails, and Azure AI Content Safety.
Produce audit evidence for AI-specific obligations: ISO/IEC 42001, EU AI Act, and customer AI assurance questionnaires, supported by concrete technical artifacts in addition to policy documentation.
Conduct AI governance reviews: serve as the embedded technical reviewer for new AI use cases, providing sign-off on architecture, evaluation plans, and control coverage prior to launch.

Required Qualifications

Two or more years of experience in ML engineering, AI security, or AI/ML governance, with hands-on exposure to ML or LLM systems in production.
Demonstrated proficiency with NIST AI RMF 1.0, ISO/IEC 42001, OWASP LLM Top 10, MITRE ATLAS, and the EU AI Act risk-tier framework, including the ability to map controls to specific technical implementations.
Practical experience evaluating or testing LLM-based applications (RAG, agents, fine-tuned models) for safety, robustness, bias, and adversarial resilience.
Strong Python skills and proficiency with at least one ML framework (PyTorch, TensorFlow, or Hugging Face Transformers), including the ability to read and modify training, fine-tuning, and inference code.
Working knowledge of MLOps tooling, including model registries, experiment tracking, CI/CD for ML, and monitoring systems such as MLflow, Weights & Biases, or Kubeflow.
Proven ability to design and document technical control frameworks, evaluation methodologies, and risk assessments suitable for adoption by engineering teams.

Preferred Qualifications

Hands-on experience with AI red-teaming and evaluation tools such as Garak, PyRIT, Promptfoo, DeepEval, and Inspect AI.
Experience with adversarial robustness libraries (ART, CleverHans, TextAttack) and formal model evaluation methodology.
Substantive ML supply-chain security experience, including model signing (e.g., Sigstore for models), dataset provenance, Hugging Face artifact scanning, and SBOM/MBOM for AI systems.
Experience operationalizing agentic AI safety, including tool-use sandboxing, capability scoping, and runtime guardrails.
Contributions to AI safety, evaluation, or governance communities, such as published research, working group participation, or open-source evaluation suites.
Familiarity with content-moderation guardrail systems (Llama Guard, Azure AI Content Safety, NeMo Guardrails) deployed at scale.

Preferred Certifications

Any of the following is considered an asset: ISO/IEC 42001 Lead Implementer; AWS Certified Machine Learning Engineer – Associate (MLA-C01); AWS Certified AI Practitioner (AIF-C01); Microsoft AI-102; or a recognized AI red-teaming or adversarial ML credential.

Success Metrics (First 12 Months)

100% of production AI systems inventoried with current model cards, system cards, and EU AI Act risk classifications.
AI risk reviews completed within agreed service levels, targeting no more than 10 business days for new use cases.
Automated evaluation harness covering every production LLM feature, with model and prompt changes gated in CI/CD.
Zero unmitigated High or Critical findings in deployed AI systems at any point in time.
Audit-evidence readiness for ISO/IEC 42001 and applicable AI regulations, with the control-to-evidence mapping maintained on an ongoing basis.
A minimum of two AI red-team exercises per quarter, with documented remediation and regression tests added to the evaluation suite.

What We Offer

A foundational technical role in building responsible AI within a rapidly evolving organization.
Budget and autonomy to select the evaluation, red-teaming, and MLOps tooling appropriate to the program.
Direct engagement with leading AI safety research, tooling, and professional communities.
A hybrid work model, comprehensive benefits, and a culture that values rigorous engineering.

Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for every employee. All qualified applicants will receive consideration without regard to race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, or disability.