RealPage, Inc.

Senior Data Scientist

  • Posted 2 days ago

Job Description

Overview

We are looking for an end-to-end Data Scientist to design, build, and maintain ML-powered systems that solve core data quality and classification problems across the business. You will own the full lifecycle, from exploratory analysis and feature engineering through model training, deployment, and ongoing performance monitoring. The work spans entity resolution (identifying duplicate records across large datasets) and multi-class classification models that drive decision-making across a variety of business domains.

Responsibilities

What You'll Do

  • Own the end-to-end model lifecycle: problem framing, data exploration, feature engineering, model training, evaluation, deployment, and monitoring
  • Build and maintain entity resolution systems that detect duplicate records using supervised ML and string similarity techniques
  • Develop classification models that categorize unstructured or semi-structured data into meaningful business categories
  • Engineer features from messy, real-world text data — names, addresses, free-text fields — using string matching algorithms, phonetic encoding, n-grams, and other NLP techniques
  • Design candidate retrieval and indexing strategies to make models performant at scale
  • Tune thresholds, scoring logic, and rule-based overrides to balance precision and recall for production use cases
  • Maintain production model artifacts and data pipelines, ensuring models stay current as underlying data evolves
  • Collaborate with engineering and product teams to understand requirements and translate business problems into well-scoped modeling tasks
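
To give a flavor of the feature engineering described above, here is a minimal sketch of pairwise string-similarity features for two candidate duplicate records. It is an illustration only, not the team's actual pipeline: the function names (`edit_distance`, `shingles`, `jaccard`, `pair_features`) and the choice of character trigrams are assumptions made for this example, and it uses only the Python standard library.

```python
from difflib import SequenceMatcher


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def shingles(text: str, n: int = 3) -> set[str]:
    """Character n-grams (shingles) of a lowercased, whitespace-collapsed string."""
    t = " ".join(text.lower().split())
    if len(t) < n:
        return {t} if t else set()
    return {t[i:i + n] for i in range(len(t) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two shingle sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_features(left: str, right: str) -> dict[str, float]:
    """Hypothetical feature vector for one record pair (assumed feature names)."""
    l, r = left.lower().strip(), right.lower().strip()
    longest = max(len(l), len(r)) or 1
    return {
        "edit_sim": 1.0 - edit_distance(l, r) / longest,
        "seq_ratio": SequenceMatcher(None, l, r).ratio(),
        "trigram_jaccard": jaccard(shingles(l), shingles(r)),
    }
```

Features like these would typically be fed to a supervised classifier (for example, a gradient boosting model) that scores each pair as duplicate or not; the decision threshold on that score is what gets tuned for precision and recall.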

Qualifications

  • 10+ years of experience building and deploying ML models end-to-end (not just notebooks)
  • Strong Python skills — pandas, NumPy, scikit-learn, XGBoost or similar gradient boosting frameworks
  • Hands-on experience with record linkage, entity resolution, or deduplication problems
  • Experience building classification models (binary and multi-class) on structured and semi-structured data
  • Deep familiarity with string similarity algorithms: edit distance, sequence matching, phonetic encoding, shingling
  • Strong feature engineering instincts — ability to extract signal from noisy, inconsistently formatted data
  • Comfort working with large serialized data structures and understanding memory/performance tradeoffs in production contexts
  • Experience with SQL and relational databases (PostgreSQL or similar)
  • Clear communication skills — ability to explain model behavior and tradeoffs to non-technical stakeholders

Nice to Have

  • Experience with blocking and indexing strategies for scalable record linkage
  • Background in NLP, text normalization, or information extraction
  • Familiarity with model serving in API contexts (Flask, FastAPI, or similar)
  • Experience in data quality, master data management, or marketplace domains
  • Exposure to deep learning frameworks (PyTorch, TensorFlow) for text classification
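
As a rough illustration of the blocking strategies mentioned in this list, the sketch below groups records by a cheap key and only generates candidate pairs within each group, which avoids comparing every record against every other. The record fields (`name`, `zip`) and the key definition are hypothetical choices for this example, not a description of the production system.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> str:
    # Hypothetical key: first three alphanumeric characters of the
    # lowercased name, plus the postal code.
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    return f"{name[:3]}|{record['zip']}"


def candidate_pairs(records: list[dict]) -> set[tuple[int, int]]:
    """Return index pairs of records that share a blocking key."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    pairs = set()
    for members in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.update(combinations(members, 2))
    return pairs
```

A key this strict trades recall for speed: records whose names differ in the first three characters never meet. In practice, several looser keys are often combined so that a typo in one field does not hide a true duplicate.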

Job ID: 148220045
