Rimini Street

AI Data Engineer

  • Posted 4 days ago

Job Description

About Rimini Street, Inc.

Rimini Street, Inc. (Nasdaq: RMNI), a Russell 2000® Company, is a proven, trusted global provider of end-to-end, mission-critical enterprise software support, managed services and innovative Agentic AI ERP solutions, and is the leading third-party support provider for Oracle, SAP and VMware software.

Our comprehensive portfolio of unified solutions helps run, manage, support, customize, configure, connect, protect, monitor, and optimize enterprise application, database and technology software, enabling our clients to achieve better business outcomes, significantly reduce costs and reallocate resources towards strategic projects.

The Company has signed thousands of contracts with Fortune Global 100, Fortune 500, midmarket, public sector and government organizations who selected Rimini Street as their trusted, proven mission-critical enterprise software solutions provider and achieved better operational outcomes, realized billions of US dollars in savings and funded AI and other innovation investments.

We are actively seeking an AI Data Engineer – Agentic ERP Platform. This role is based in India.

About Rimini Street, India, GCC.

Rimini Street, Inc., headquartered in Las Vegas, NV, USA, is a disruptor in third-party ERP support services. Having established undisputed leadership in that market, the company entered India as a natural progression: Rimini Street, India GCC started operations in Hyderabad in 2013 with Global Client Onboarding Services, IT shared services and Global Service Development. Rimini Street, India GCC soon started Bengaluru operations, moving up the value chain with more complex product development (Oracle, SAP, PeopleSoft, JDE, etc.) and advanced services (managed services, professional services, security managed services, etc.).

Rimini Street, India GCC has played a valuable part in building Rimini Street, Inc.'s reputation as a global provider of unified support and managed service solutions for enterprise software. Today, Rimini Street, India GCC is a family of 800+ talented full-time employees who have supported this expansion.

Rimini Street, India has emerged as a Global Capability Centre (GCC) and proudly says: if you are the best of the best, you belong at Rimini. We are on a mission to contribute significantly to our Rimini ONE program, a turnkey Rimini Street service program that offers a comprehensive set of unified, integrated services that can run, manage, support, customize, configure, connect, protect, monitor, and optimize your Oracle and SAP ERP, database, and technology software.

Position Summary

The AI Data Engineer is responsible for building the knowledge layer of Rimini Street's Agentic ERP Platform—the data pipelines, RAG (Retrieval-Augmented Generation) systems, and embedding infrastructure that give AI agents access to the right information at the right time. This role owns how knowledge is ingested, processed, indexed, and retrieved to support intelligent agent behavior.

Reporting to the Sr. Director, Engineering, this engineer designs the data architecture that powers agent intelligence—from extracting knowledge from Rimini Street's 15+ years of support case history to building real-time retrieval systems for customer-specific context. The ideal candidate combines strong data engineering fundamentals with modern AI/ML knowledge, particularly in embeddings, vector search, and retrieval optimization.

Essential Duties & Responsibilities

RAG Pipeline Development

  • Design and build RAG pipelines that retrieve relevant context from knowledge bases to augment AI agent responses.
  • Implement chunking strategies optimized for different content types: support tickets, documentation, policies, transaction records, and email threads.
  • Develop hybrid retrieval approaches combining dense embeddings, sparse search (BM25), and metadata filtering.
  • Build query understanding and reformulation logic to improve retrieval relevance.
  • Implement retrieval evaluation frameworks to measure and optimize precision, recall, and relevance.
  • Design reranking pipelines that prioritize the most relevant results for agent consumption.

Embedding & Vector Infrastructure

  • Implement and manage vector storage using PostgreSQL with pgvector extension, including index optimization for search performance.
  • Evaluate and select embedding models appropriate for enterprise content (technical documentation, business processes, ERP terminology).
  • Build embedding pipelines that process documents at scale with appropriate batching and error handling.
  • Implement incremental indexing strategies for real-time updates without full reprocessing.
  • Design multi-tenant vector architectures that isolate customer data while enabling efficient search.
  • Monitor and optimize vector search performance: latency, accuracy, and resource utilization.

Data Ingestion & Processing

  • Build data pipelines to ingest knowledge from diverse sources: Salesforce support tickets, ServiceNow cases, documentation repositories, email archives, and ERP transaction logs.
  • Implement ETL processes that clean, normalize, and enrich raw data for AI consumption.
  • Develop document processing pipelines: PDF extraction, HTML parsing, structured data normalization.
  • Build connectors to source systems including Salesforce, ServiceNow, SharePoint, and Confluence.
  • Implement data quality monitoring and alerting for ingestion pipelines.
  • Design data lineage tracking to understand how knowledge flows from source to agent consumption.

Knowledge Architecture

  • Design the knowledge architecture that organizes information across the Four-Spoke model: Policy Intelligence, Institutional Memory, Rimini Collective Intelligence, and Intelligent Escalation.
  • Build knowledge graphs and relationship models that capture connections between ERP concepts, processes, and solutions.
  • Implement metadata taxonomies that enable filtered retrieval by ERP system, module, version, customer, and topic.
  • Design versioning strategies for knowledge that evolves over time (policies, procedures, best practices).
  • Build feedback loops that capture which retrieved content was useful vs. ignored, enabling continuous improvement.

Cloud Data Platform Integration

  • Integrate with Snowflake for large-scale data processing, leveraging Cortex AI capabilities where applicable.
  • Build data pipelines that move and transform data between operational systems, Snowflake, and vector stores.
  • Implement data access patterns that respect customer data isolation and security boundaries.
  • Design efficient data synchronization between cloud data warehouse and real-time retrieval systems.
  • Optimize query patterns for cost-effective data processing at scale.

Experience

  • 2-3 years of data engineering experience, with at least 1-2 years focused on AI/ML data pipelines or RAG systems.
  • Hands-on experience building and optimizing RAG pipelines in production environments.
  • Strong experience with vector databases, embeddings, and similarity search.
  • Experience with ETL/ELT pipelines and data integration from diverse source systems.
  • Production experience with PostgreSQL and SQL-based data processing.
  • Background in Python for data processing and pipeline development.
  • Experience with enterprise data platforms (Snowflake, Databricks, or similar) preferred.
  • Exposure to enterprise software, ERP systems, or support/ticketing systems preferred.

Technical Skills

Required

  • Python for data engineering: pandas, data processing pipelines, async programming.
  • PostgreSQL with strong SQL skills; experience with advanced features (JSONB, full-text search, extensions).
  • Vector databases and embeddings: pgvector, or experience with Pinecone, Qdrant, Weaviate, or similar.
  • RAG concepts: chunking strategies, embedding models, retrieval methods, reranking.
  • ETL/ELT patterns and data pipeline orchestration (Airflow, Dagster, Prefect, or similar).
  • Data modeling for both relational and document-oriented use cases.
  • Git version control and CI/CD practices for data pipelines.
  • Understanding of API integration for data extraction (REST, GraphQL).

Preferred

  • Experience with Snowflake, including Cortex AI features for vector search and ML functions.
  • Familiarity with embedding models: OpenAI embeddings, Cohere, or open-source models (BGE, E5).
  • Experience with LlamaIndex, LangChain, or Haystack for RAG pipeline development.
  • Knowledge of document processing: PDF extraction (PyMuPDF, pdfplumber), HTML parsing, OCR.
  • Experience with Salesforce and/or ServiceNow data extraction and APIs.
  • Understanding of knowledge graphs and graph databases (Neo4j, or property graphs in PostgreSQL).
  • Experience with data quality frameworks and monitoring tools.
  • Familiarity with dbt for data transformation.
  • Exposure to enterprise search systems (Elasticsearch, OpenSearch).
  • Understanding of LLM fine-tuning and training data preparation.

Skills & Competencies

  • Strong analytical mindset with ability to understand complex data relationships and design efficient retrieval strategies.
  • Data quality obsession; understands that agent intelligence is only as good as the underlying data.
  • Systems thinker who designs for scale, reliability, and maintainability.
  • Collaborative; works effectively with GenAI Engineers to understand retrieval requirements and optimize for agent consumption.
  • Problem solver who can diagnose and resolve data pipeline issues quickly.
  • Clear communicator; able to explain data architecture decisions to technical and non-technical stakeholders.
  • Self-motivated and effective in a remote environment.
  • Fluent in English (written and verbal).

Desired Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Science, or related field.
  • Experience in enterprise software companies or B2B SaaS platforms.
  • Background in information retrieval, search systems, or NLP.
  • Certifications in Snowflake, AWS Data Engineering, or similar.
  • Contributions to open source data or AI/ML projects.

Location & Travel

Location: Hyderabad, India

Language: Fluent English required (written and verbal)

Why Rimini Street

We are looking for talented, passionate people to help us build our future at Rimini Street. We hire only the best, the most extraordinary professionals and provide compensation, bonuses, and benefits to match the skills of our top-performing team members. Do you thrive in a fast-paced environment, enjoy growing together, and get excited about learning new skills? Are you looking for an opportunity to make a true impact as part of a team of extraordinary professionals? This is the place for you.

Our work is challenging and meaningful. We start and end each day with a sense of achievement and purpose guided by our core values, the Four Cs:

  • Company
    • We dream big and innovate boldly.

  • Colleagues
    • We work with extraordinary people who create a culture of mutual respect and collaboration.

  • Clients
    • We relentlessly pursue solutions that help clients achieve their goals. Our unmatched client care is rooted in our passion for exceptional service.

  • Community
    • We believe in leaving the world a better place than we found it. With the Rimini Street Foundation, we've made positive impacts on six continents for over 425 charities.

Accelerating Company Growth

  • Nasdaq-listed under ticker symbol RMNI since October 2017
  • 6,300+ signed contracts to date, including Fortune 500 and Global 100 companies
  • Over 2,000 team members in 23 countries
  • US and international recognition for industry leadership and philanthropic efforts. See all of our awards and recognitions here: https://www.riministreet.com/company/awards/

Rimini Street is committed to creating a diverse and inclusive environment and is proud to be an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to age, race, color, religion, national origin, sexual orientation, gender or gender identity, disability, protected veteran status, or any other characteristic protected by law.

To learn more about how Rimini Street is redefining the enterprise software support industry, visit http://www.riministreet.com

Please Note: Rimini Street does not accept resumes submitted by recruiting/staffing firms unless specifically requested by Human Resources. Unsolicited resumes will be ineligible for referral fees.

Job ID: 147611741
