Job Overview:
The ideal candidate will turn warehouse, manufacturing, and quality data into production-grade planning models, optimization engines, and computer vision systems that measurably improve throughput, labor productivity, fulfillment accuracy, yield, and defect detection across Callaway's Global Operations. The role centers on warehouse planning and optimized execution and on computer vision for quality and operations, with a growing focus on generative AI, large language models, and intelligent agents that automate operational workflows and decision support. Solutions are built end to end on Microsoft Azure and Snowflake. The Data Scientist will partner closely with warehouse operations, manufacturing engineering, quality, and supply planning teams across the US, Mexico, and Asia to scope problems, define success metrics tied to operational KPIs, and deploy and monitor solutions in production.
Key Responsibilities:
- Design, build, and deploy optimization models for warehouse planning and execution — slotting and SKU-to-location assignment, pick-path and travel-distance minimization, wave and batch planning, order batching and zoning, and dock and door scheduling.
- Develop labor and capacity planning models, including task-level workload prediction, staffing and shift optimization, and dynamic work allocation, to balance pick, pack, and putaway throughput against service-level targets.
- Build and deploy computer vision models for automated quality inspection — surface-defect, dimensional, and cosmetic inspection and pass/fail classification across club, ball, and component lines.
- Apply computer vision to operations, including package and label verification, pick and putaway confirmation, carton and pallet counting and dimensioning, OCR of serials and lot codes, and process-compliance and safety monitoring.
- Formulate and solve linear, mixed-integer, and constraint-optimization problems, and design discrete-event and Monte Carlo simulations to stress-test layout changes, automation investments, and peak-season scenarios before physical implementation.
- Build and maintain data and image pipelines on Microsoft Azure, and productionize models and inference endpoints with monitoring, drift detection, and automated retraining and fallback logic.
- Design and deploy generative-AI and agentic solutions for operations — LLM-powered assistants and multi-agent workflows that retrieve and reason over SOPs, work instructions, quality records, and ERP/WMS data, automate document and email extraction, and deliver decision support to planners and floor teams.
- Translate model and optimization output into Power BI dashboards and operational workflows, and communicate findings, assumptions, and trade-offs clearly to technical and non-technical stakeholders.
Requirements:
- Bachelor's or Master's degree in Computer Science, Data Science, Statistics, Operations Research, Industrial or Mechanical Engineering, Applied Mathematics, or a related field.
- 3+ years of experience in a data science role, preferably in a warehouse, manufacturing, or supply chain environment.
- Strong proficiency in Python for data science and machine learning (pandas, NumPy, scikit-learn) and at least one deep-learning framework (PyTorch or TensorFlow).
- Demonstrated computer vision experience — building, training, and evaluating image classification, object detection, or segmentation models on real-world, preferably industrial or inspection, imagery.
- Experience with optimization and operations research — linear, integer, or constraint programming applied to a real planning, scheduling, or routing problem.
- Hands-on experience with Microsoft Azure for data and machine learning workloads (Azure Machine Learning, Azure Data Factory or Synapse, and Azure AI / Computer Vision).
- Experience building applications with large language models and generative AI — prompt engineering, retrieval-augmented generation (RAG), function and tool calling, and agentic workflows, ideally on Azure OpenAI or Azure AI Foundry.
- Strong SQL and experience with a cloud data warehouse, with Snowflake strongly preferred.
- Excellent communication skills and the ability to work collaboratively across cross-functional, multi-time-zone teams.
Technical Skills:
- Warehouse planning and optimized execution: The candidate should be able to model and solve real warehouse problems (slotting, pick-path optimization, wave and batch planning, labor and capacity planning, and in-warehouse replenishment) using mixed-integer and constraint solvers and modeling libraries such as OR-Tools, Gurobi, CPLEX, or PuLP, and to validate designs with discrete-event and Monte Carlo simulation tools such as Simio, AnyLogic, or SimPy.
- Computer vision: The candidate should have hands-on expertise with modern CV architectures and toolchains, including CNNs, object detection and segmentation (YOLO, Detectron2, Mask R-CNN), defect and anomaly detection with limited labels, and transfer learning, using PyTorch or TensorFlow alongside Azure AI Vision and Custom Vision. They should own the full lifecycle from image-acquisition strategy and annotation through edge or cloud deployment and human-in-the-loop feedback.
- Azure cloud and MLOps: The candidate should be proficient with Microsoft Azure for data and ML, including Azure Machine Learning (training, model registries, and managed or edge endpoints), Azure Data Factory and Synapse, Blob and Data Lake storage, and IoT Edge or AKS for inference, supported by Git, MLflow, Docker, and CI/CD, with model and drift monitoring and retraining in place.
- Data engineering and SQL: The candidate should write performant, well-tested SQL against Snowflake (window functions, CTEs, semi-structured VARIANT/JSON handling, and query optimization for large fact tables) and build reproducible feature and image pipelines with clear data lineage and documentation.
- Generative AI, agents, and LLMs: The candidate should be able to design and ship LLM-based applications and autonomous or human-in-the-loop agents — using retrieval-augmented generation, function and tool calling, and orchestration frameworks such as LangChain, LlamaIndex, or Semantic Kernel — grounded in operational data and deployed on Azure OpenAI or Azure AI Foundry, with attention to evaluation, guardrails, latency, and cost.
- Operations analytics and visualization: The candidate should apply classification, regression, anomaly detection, and statistical process control to quality and process data, and translate results into Power BI semantic models and DAX (within Microsoft Fabric) to deliver decision-ready dashboards that communicate complex analytical results clearly to non-technical stakeholders.