Persistent

Sr Data Engineer

  • Posted 3 days ago
  • Over 50 applicants

Job Description

What You'll Do:

  • Design and Develop: Analytics workloads using Apache Spark and Scala for big data processing.
  • Create and Optimize: Data transformation pipelines using Spark or Apache Flink.
  • Migrate Workloads: From cloud platforms to open-source Apache Spark infrastructure on Kubernetes.
  • Implement Optimization: Performance techniques for large-scale data processing.

Expertise You'll Bring:

  • Scala Programming: Focus on functional programming paradigm.
  • Apache Spark: Extensive experience with core concepts and APIs, including:
      • Spark SQL and DataFrame APIs
      • Spark Structured Streaming
      • Spark MLlib for analytics
  • Distributed Computing: Strong understanding of big data processing frameworks.
  • Data Modeling: Expertise in optimization techniques for large-scale datasets.
  • Performance Tuning: Proficiency in optimizing Spark jobs.
  • Lakehouse Storage: Good understanding of technologies like Delta Lake and Apache Iceberg.

About Company

We are a trusted Digital Engineering and Enterprise Modernization partner, combining deep technical expertise and industry experience to help our clients anticipate what's next. Our offerings and proven solutions create a unique competitive advantage for our clients by giving them the power to see beyond and rise above. We work with many industry-leading organizations across the world, including 12 of the 30 most innovative global companies, 60% of the largest banks in the US and India, and numerous innovators across the healthcare ecosystem.

Job ID: 107757673
