We're hiring a Data Scientist / Data Engineer to help us turn raw data into reliable datasets, insights, and models that drive real decisions. This role blends strong data engineering (pipelines, quality, orchestration) with hands-on data science (analysis, experimentation, forecasting, ML when needed). You'll work closely with product and engineering teams to build data products that are accurate, scalable, and actionable.
What you'll do
- Design and build end-to-end data pipelines (batch and, if applicable, streaming); a minimal sketch of what this looks like follows this list.
- Collect, clean, transform, and model data into well-structured datasets for analytics and ML.
- Develop and maintain a data warehouse/lake model (dimensional modeling, data marts, curated layers).
- Implement data quality checks, observability, lineage, and monitoring.
- Perform exploratory analysis and deliver insights via dashboards, notebooks, and stakeholder-ready summaries.
- Build and deploy ML models when needed (forecasting, churn/segmentation, anomaly detection, recommendations).
- Support experimentation and A/B testing (metric definitions, evaluation, statistical validity).
- Collaborate with backend teams to define event schemas, tracking plans, and data contracts.
- Optimize performance and cost across storage, compute, and queries.
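To make the pipeline and data-quality bullets above concrete, here is a minimal sketch of the kind of daily batch job this role owns, written against Airflow's TaskFlow API (Airflow 2.4+). The dataset, field names, and checks are invented for illustration, not a description of our actual stack:

```python
# A daily batch pipeline with an explicit data quality gate.
# Table/field names and thresholds are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_pipeline():
    @task
    def extract() -> list[dict]:
        # In practice: pull from an API, a source database, or object storage.
        return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 13.5}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Clean and reshape into the curated schema.
        return [r for r in rows if r["amount"] > 0]

    @task
    def quality_check(rows: list[dict]) -> None:
        # Fail the run loudly instead of loading bad data downstream.
        assert rows, "quality check failed: no rows survived transformation"
        assert all(r.get("order_id") is not None for r in rows), "null order_id"

    quality_check(transform(extract()))


orders_pipeline()
```

The same shape carries over to Dagster or Prefect; what we care about is idempotent tasks and an explicit quality gate before anything lands in the warehouse.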
Must-have skills
- Strong SQL and solid programming skills (Python preferred).
- Experience building pipelines with an orchestrator (Airflow / Dagster / Prefect, or equivalent).
- Strong knowledge of data modeling (star schema, slowly changing dimensions, event modeling).
- Experience with at least one of: PostgreSQL / MySQL / BigQuery / Snowflake / Redshift.
- Proven ability to validate data correctness and implement data quality frameworks (see the validation sketch after this list).
- Comfortable communicating insights and technical trade-offs to non-technical stakeholders.
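As a rough illustration of the data-quality expectation, here is a small pandas sketch of the kind of lightweight checks we mean. Column names and rules are invented for the example; in practice you might express these as dbt tests or Great Expectations suites instead:

```python
# Lightweight correctness checks on a curated dataset (illustrative only;
# column names and rules are hypothetical).
import pandas as pd


def validate_orders(df: pd.DataFrame) -> list[str]:
    """Return human-readable violations; an empty list means the data passed."""
    violations = []
    if df["order_id"].isna().any():
        violations.append("order_id contains nulls")
    if df["order_id"].duplicated().any():
        violations.append("order_id is not unique")
    if (df["amount"] < 0).any():
        violations.append("amount contains negative values")
    if df["created_at"].max() > pd.Timestamp.now():
        violations.append("created_at contains future timestamps")
    return violations


df = pd.DataFrame({
    "order_id": [1, 2, 2],
    "amount": [42.0, -1.0, 13.5],
    "created_at": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-02"]),
})
print(validate_orders(df))  # ['order_id is not unique', 'amount contains negative values']
```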
Nice-to-have skills
- Streaming: Kafka / Kinesis / Pub/Sub, real-time processing (Spark Structured Streaming / Flink); see the consumer sketch after this list.
- Big data: Spark, distributed compute, partitioning strategies.
- Lakehouse: Iceberg / Delta / Hudi, object storage (S3/GCS/Azure Blob).
- MLOps: MLflow, model monitoring, feature stores, deployment pipelines.
- BI: Superset / Power BI / Looker / Metabase, semantic layers.
- Cloud: AWS/Azure/GCP (IAM, networking basics, managed data services).
- Experience with privacy/security compliance (PII handling, retention policies, access controls).
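None of these are required on day one. For a sense of the streaming work, here is roughly the shape of a consumer, sketched with the kafka-python client; the topic name, broker address, and event fields are placeholders:

```python
# Skeleton of a streaming consumer (kafka-python).
# Topic, broker, and event fields are hypothetical placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events.orders",                     # hypothetical topic
    bootstrap_servers="localhost:9092",  # placeholder broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    # A real pipeline would validate against the event schema (data contract)
    # and write to a sink; printing stands in for that here.
    print(event.get("order_id"), event.get("amount"))
```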
What we value
- Ownership: you build reliable systems, not just one-off scripts.
- Curiosity: you ask why metrics look the way they do and propose better approaches.
- Practicality: you can balance speed vs correctness and deliver iteratively.
- Collaboration: you work well with engineers, product, and leadership.