Experience: 7+ Years
Location: Remote / Hybrid / Onsite (customize)
Employment Type: Full-Time / Contract
Domain: Financial Data Processing / Contract Reconciliation / Data Platforms
Responsibilities
- Design and develop scalable data ingestion pipelines for PDFs, Excel files, CSVs, and other structured/unstructured financial documents.
- Build normalization and transformation workflows to standardize messy financial and billing data.
- Develop reconciliation and validation engines to compare invoices against contract terms, pricing schedules, SLAs, and billing rules.
- Implement ETL/ELT pipelines using Microsoft Fabric components including Data Factory, Lakehouse, Notebooks, and Pipelines.
- Build data quality checks, exception handling frameworks, and audit mechanisms for financial accuracy.
- Create reusable Python-based processing modules for document parsing, data extraction, validation, and reconciliation.
- Work with OCR and document extraction tools and integrate their outputs into downstream validation systems.
- Optimize data processing performance for high-volume financial records.
- Collaborate with product, business, and engineering teams to understand reconciliation rules and billing logic.
- Ensure that financial data handling follows security, governance, and compliance standards.
- Support deployment, monitoring, troubleshooting, and continuous improvement of data workflows.
Qualifications
- Microsoft Fabric
- Python
- PySpark
- SQL
- OneLake
- Data Factory
- Lakehouse Architecture
- PDF & Spreadsheet Processing
- REST APIs
- Azure Services (preferred)
Essential Skills
- 7+ years of experience in Data Engineering and backend data processing systems.
- At least 1–2 years of hands-on experience with Microsoft Fabric.
- Strong proficiency in Python and data processing libraries such as:
  - Pandas
  - PySpark
  - NumPy
  - OpenPyXL
  - PDF processing libraries (PyPDF2, pdfplumber, Camelot, Tabula, etc.)
- Experience building ETL/ELT pipelines and data transformation workflows.
- Strong experience with Microsoft Fabric services:
  - Fabric Data Factory
  - Lakehouse
  - Notebooks
  - Pipelines
  - OneLake
- Experience handling semi-structured and unstructured data sources.
- Strong SQL skills and data modeling knowledge.
- Experience implementing business rules, reconciliation logic, and data validation frameworks.
- Familiarity with REST APIs and external system integrations.