
We are looking for a Data Engineer to join our team and help build an AI-powered data ingestion platform. The platform collects data from multiple sources and loads it into analytical databases that power our B2B analytics product, BevGenie, where users can interact with their data using natural language.
In this role, you will work closely with a senior data engineer and use AI coding assistants (Claude Code) to build, improve, and maintain production-grade data pipelines. You will be responsible for extracting, cleaning, transforming, and loading data from both structured and unstructured sources.
Key Responsibilities:
Design, build, and maintain scalable data ingestion pipelines
Extract data from websites, APIs, documents (PDFs), emails, and databases
Parse, clean, and transform structured and unstructured data
Implement data validation, quality checks, and error handling
Handle missing, duplicate, and malformed data
Ensure pipelines are reliable and idempotent
Orchestrate workflows using Dagster or similar tools
Load data into PostgreSQL, Supabase, Snowflake, and MongoDB
Collaborate with senior engineers and cross-functional teams
Requirements:
3-5 years of experience in Python-based development or data engineering
Strong Python fundamentals (functions, classes, logging, error handling)
Experience working with APIs and common data formats (JSON, CSV, XML)
Strong SQL skills including joins, aggregations, CTEs, and window functions
Understanding of relational database concepts, schema design, and indexing
Basic knowledge of web scraping and data extraction techniques
Familiarity with handling structured and unstructured data
Understanding of data quality, validation, and pipeline reliability
Good problem-solving and analytical skills
Strong communication and collaboration skills
Willingness to learn new tools and technologies
Nice to Have:
Experience with Dagster, Airflow, or Prefect
Experience with web scraping tools (BeautifulSoup, Scrapy, Playwright)
Experience with PDF extraction tools
Exposure to AWS services (S3, Lambda, Glue)
Knowledge of dimensional modeling or analytics data modeling
Any experience using AI/LLMs for data extraction
What We Offer:
Work closely with and learn from a senior data engineer
Hands-on experience building real production data pipelines
Exposure to modern AI-assisted development workflows
Opportunity to work with diverse data sources and technologies
Remote-first and flexible work environment
Competitive compensation
High-impact role in a growing product-focused team
Job ID: 145334099