
About the Company: Straive
Straive is a market-leading Content and Data Technology company providing data services, subject matter expertise, and technology solutions across multiple domains.
Data Analytics & AI Solutions, Data AI Powered Operations, and Education & Learning form the core pillars of the company's long-term vision. The company is a specialized solutions provider to business information providers in finance, insurance, legal, real estate, life sciences, and logistics. Straive continues to be the leading content services provider to research and education publishers.
Data Analytics & AI Services: Our Data Solutions business has become critical to our clients' success. We use technology and AI, with human experts in the loop, to create data assets that our clients use to power their data products and their end customers' workflows. As our clients expect us to become their future-fit Analytics and AI partner, they look to us for help in building enterprise data analytics and AI capabilities. With a client base spanning 30 countries worldwide, Straive's multi-geographical resource pool is strategically located across India, the Philippines, the USA, Nicaragua, Vietnam, the United Kingdom, and the company headquarters in Singapore.
Website: https://www.straive.com/
Job Overview:
Key Responsibilities:
Pipeline Orchestration: Design, develop, and maintain scalable data pipelines using Apache Airflow (Cloud Composer) and GCP services (Dataflow, Pub/Sub, Cloud Functions).
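At its core, an orchestrator like Airflow resolves task dependencies into a valid execution order before scheduling anything. A minimal sketch of that ordering using only the Python standard library (the task names here are hypothetical, not from this role):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline stages: extract feeds both a transform step and a
# parallel quality-check step; load waits for both to finish.
deps = {
    "transform": {"extract"},
    "quality_check": {"extract"},
    "load": {"transform", "quality_check"},
}

# static_order() yields tasks so that every dependency runs first,
# mirroring how a DAG scheduler decides what is runnable.
order = list(TopologicalSorter(deps).static_order())
```

In Airflow the same dependency graph would be declared with operators and `>>` relationships inside a DAG; the ordering principle is identical.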
Data Modeling: Design and optimize sophisticated BigQuery data schemas that support both real-time (streaming) and batch processing. Implement advanced Partitioning and Clustering strategies to maximize query performance, minimize data scan costs, and ensure a scalable foundation for downstream analytics.
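Partitioning and clustering are declared at table-creation time in BigQuery DDL. A sketch that builds such a statement as a string (project, dataset, and column names are placeholders, not from the posting):

```python
def partitioned_table_ddl(table: str, partition_col: str, cluster_cols: list) -> str:
    """Build a BigQuery CREATE TABLE statement with daily time partitioning
    and clustering. Column list is a fixed illustrative schema."""
    return (
        f"CREATE TABLE `{table}` (\n"
        f"  event_ts TIMESTAMP,\n"
        f"  customer_id STRING,\n"
        f"  amount NUMERIC\n"
        f")\n"
        f"PARTITION BY DATE({partition_col})\n"   # prunes scanned partitions
        f"CLUSTER BY {', '.join(cluster_cols)}"   # co-locates rows by key
    )

ddl = partitioned_table_ddl("proj.ds.sales", "event_ts", ["customer_id"])
```

Queries that filter on the partition column only scan matching partitions, which is the main lever for minimizing data scan costs.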
Quality & Governance: Implement automated data quality checks and monitoring to ensure the integrity of the data ecosystem.
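Automated quality checks typically assert invariants such as "required fields are non-null" and "keys are unique" before data is published downstream. A minimal sketch of that idea (field names and the `id` key are illustrative assumptions):

```python
def run_quality_checks(rows: list, required_fields: list) -> list:
    """Return a list of human-readable issue strings: one per missing
    required field and one per duplicate 'id' value."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) is None:
                issues.append(f"row {i}: missing {field}")
        key = row.get("id")
        if key in seen_ids:
            issues.append(f"row {i}: duplicate id {key}")
        seen_ids.add(key)
    return issues
```

In a pipeline, a non-empty result would fail the run or route records to a quarantine table rather than silently loading bad data.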
Advanced Analytics & SQL: Write and optimize highly complex SQL queries for data modeling, performance tuning, and troubleshooting deep-seated data issues.
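A common "complex SQL" pattern in BigQuery is latest-record deduplication via a window function. Below is that pattern as a SQL string alongside a plain-Python equivalent of the same logic (table and column names are placeholders):

```python
# BigQuery-style SQL kept as a string; `proj.ds.events`, entity_id, and
# updated_at are illustrative names, not from the posting.
LATEST_RECORD_SQL = """
SELECT *
FROM `proj.ds.events`
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY entity_id
  ORDER BY updated_at DESC
) = 1
"""

def latest_per_key(rows: list) -> list:
    """Python equivalent of the window-function dedup above:
    keep only the most recent row per entity_id."""
    latest = {}
    for row in rows:
        key = row["entity_id"]
        if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
            latest[key] = row
    return list(latest.values())
```

Reasoning through the equivalent procedural logic is often how deep-seated data issues (e.g. unexpected duplicates) get diagnosed before the SQL is tuned.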
Data Visualization: Build and maintain Power BI dashboards. This includes configuring the On-premises Data Gateway (if applicable) or using the Power BI Google BigQuery connector with DirectQuery/Import modes to ensure low-latency reporting.
Performance Tuning: Monitor and optimize Dataflow (Apache Beam) jobs and Dataproc (Spark/Hadoop) clusters, focusing on auto-scaling configurations and minimizing shuffle to reduce latency and cloud compute costs.
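"Minimizing shuffle" usually means pre-aggregating records on each worker so that only small per-key partial results cross the network, the idea behind Beam's `CombinePerKey` and Spark combiners. A pure-Python simulation of that pattern (partition contents are illustrative):

```python
from collections import Counter

def combine_locally(partitions: list) -> dict:
    """Count keys across partitions with a map-side combine: each
    partition is reduced to per-key partial counts first, so the
    'shuffle' only moves small Counters, not raw records."""
    partials = [Counter(p) for p in partitions]  # local pre-aggregation
    merged = Counter()
    for partial in partials:                     # small shuffle payload
        merged.update(partial)
    return dict(merged)
```

With skewed or high-volume keys, this pre-aggregation is what keeps shuffle bytes, and therefore latency and compute cost, down.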
Requirements:
Job Details:
Education:
Technical Skills Needed:
Keywords:
Email: [Confidential Information]
Job ID: 143227111