
TestUnity - A Crowdsourced Testing Platform

Machine Learning Engineer

This job is no longer accepting applications

  • Posted 13 days ago

Job Description

Mandatory Skills: Python (3.9+), PySpark & Spark Internals, Databricks, Statistics/ML Libraries (statsmodels, scikit-learn, SciPy, pandas, NumPy), Causal Inference & Experimentation (DID, Synthetic Control, A/B testing, hypothesis testing, panel data methods), API Development, Azure Cloud Platform, Kubernetes, Docker, pytest.

Role Overview:

We're looking for an ML Engineer to join our Test & Learn Platform team. You'll build and scale our experimentation and causal inference services, from statistical engines to API integrations and cloud pipelines, empowering business teams globally to make data-driven decisions.

Responsibilities:

  • Develop and maintain statistical/ML modules (DID, Synthetic Control, A/B Testing, Multi-Treatment Effects) in Python

  • Build and extend FastAPI services and integrate them with our web application via SDK wrappers
  • Design and optimize large-scale data pipelines using PySpark, Delta Lake, and Azure Data Lake
  • Profile and resolve OOM issues in PySpark jobs - optimize memory allocation, partitioning, broadcast joins, caching strategies, and Spark configurations

  • Deploy and manage workloads on Databricks, including job clusters, notebooks, and Delta Lake tables
  • Containerize and deploy services using Docker, Kubernetes, and CI/CD pipelines
  • Ensure code quality and security via SonarCloud, Snyk, and pytest
  • Collaborate with data scientists and product teams to translate research into production-ready modules

Requirements:

  • 3+ years of production experience in Python (3.9+)
  • PySpark & Spark Internals - strong experience with Spark memory model, executor tuning, shuffle optimization, and diagnosing/resolving OOM errors (broadcast thresholds, partition skew, spill-to-disk, GC tuning)

  • Databricks - hands-on with job orchestration, cluster configuration, notebook workflows, and Delta Lake optimization (Z-ordering, compaction, caching)

  • Causal Inference & Experimentation - DID, synthetic control, A/B testing, hypothesis testing
  • Statistics/ML Libraries - statsmodels, scikit-learn, SciPy, pandas, NumPy
  • API Development - building RESTful services with FastAPI (or similar)
  • Cloud (Azure) - Azure Storage, Azure ML, Data Lake
  • Docker & Kubernetes - containerization and orchestration for ML workloads
  • Testing - writing robust unit/integration tests with pytest

Good-to-Have:

  • Experience with Celery/Redis for async task orchestration
  • Familiarity with Polars, PyArrow, or SQLAlchemy
  • Background in econometrics or experimental design
  • Spark UI profiling and performance benchmarking
  • CI/CD tooling (SonarCloud, Snyk, GitHub Actions)

Skills: lake, spark, azure, data, api, a/b testing, cloud, ml, testing, docker

More Info


Job ID: 149167311
