
Apex Systems

Senior Machine Learning Engineer

  • Posted a day ago

Job Description

As a Senior Machine Learning Engineer (Model Training & Evaluation), you will own the end-to-end training and evaluation cycle for our document AI models.

Working closely with the Principal Machine Learning Engineer, you will transform research direction into reliable, reproducible, and scalable experimentation pipelines, ensuring model improvements are measurable and production ready.

This role is ideal for engineers who thrive at the intersection of applied ML research and production-grade engineering, combining deep technical expertise with strong experimental rigor.

Key Responsibilities:

· Own the end-to-end training pipeline, including data ingestion, orchestration, checkpointing, and result logging

· Execute large-scale experiments with strong emphasis on reproducibility and traceability

· Implement and validate new optimization techniques and training objectives in collaboration with senior ML leadership

· Continuously improve pipeline efficiency to reduce iteration time while maintaining experiment quality

Evaluation & Benchmarking

· Design and maintain comprehensive evaluation and benchmarking frameworks

· Define clear success metrics across accuracy, latency, memory usage, and domain coverage

· Build automated evaluation pipelines to detect regressions across model checkpoints

· Analyze results to identify patterns in model performance and quality trade-offs

· Partner with Data teams to ensure improvements in training data translate to measurable gains

· Maintain and evolve benchmarking methodologies aligned with industry best practices

Infrastructure & Collaboration

· Partner with Platform Engineering on distributed training infrastructure and experiment tracking systems

· Develop internal tooling to support model analysis and research workflows

· Contribute to team standards around reproducibility, experiment tracking, and documentation

· Collaborate with Platform teams to support model deployment, optimization, and serving

Education & Experience

· MS or PhD in Computer Science, Engineering, Mathematics, or related field

· 5+ years of experience in Machine Learning, Applied AI, or related areas

· Proven experience training and evaluating large-scale language and/or vision-language models

· Strong background in building evaluation frameworks and benchmarking systems

· Experience with model optimization or efficient training techniques

Technical Expertise

· Deep understanding of model optimization and compression (e.g., quantization, pruning)

· Strong proficiency in Python and PyTorch, including distributed training frameworks (e.g., DeepSpeed, FSDP)

· Expertise in evaluation methodology and benchmark design

· Experience with experiment tracking and reproducibility practices

· Familiarity with vision-language model architectures and document AI challenges


Job ID: 149017067
