We are building a scalable data collection, storage, and distribution platform to consolidate data from vendors, research providers, exchanges, prime brokers, and web-scraped sources. The platform will serve systematic & fundamental portfolio managers (PMs), as well as enterprise teams including Ops, Risk, Trading, and Compliance. The role involves developing internal data products and analytics, as well as optimizing existing data pipelines.
Key Responsibilities:
- Web Scraping & Data Acquisition
  - Utilize scripts, APIs, and web scraping tools to collect market data (see the Python sketch after this list).
- Data Engineering & Pipeline Development
  - Build and maintain a greenfield data platform on Snowflake and AWS.
  - Enhance and optimize existing data pipelines to meet new business needs.
  - Onboard and integrate new data providers efficiently.
- Data Storage & Migration
  - Handle data migration projects, ensuring seamless transitions and data accuracy.
  - Implement efficient storage solutions to support analytical workloads.
- DevOps & Cloud Infrastructure
  - Work with Docker, Kubernetes, and Jenkins for scalable deployment.
  - Optimize cloud infrastructure on AWS for performance and cost-effectiveness.
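To make the acquisition-and-onboarding work above concrete, here is a minimal Python sketch of one such job: it pulls end-of-day prices from a vendor REST API and stages the raw payload to S3 for a downstream Snowflake COPY INTO step. The endpoint, bucket name, key layout, and response shape ("data") are hypothetical, chosen only for illustration; they are not part of the actual platform.

```python
"""Minimal sketch of one acquisition job: pull a vendor's end-of-day prices
over a REST API and stage them to S3 as raw JSON for loading into Snowflake.
All endpoint names, bucket names, and keys below are hypothetical."""

import datetime as dt
import json

import boto3
import requests

VENDOR_URL = "https://api.example-vendor.com/v1/eod_prices"  # hypothetical endpoint
RAW_BUCKET = "example-market-data-raw"                       # hypothetical S3 bucket


def fetch_eod_prices(symbols: list[str], as_of: dt.date) -> list[dict]:
    """Call the vendor API and fail loudly on bad responses."""
    resp = requests.get(
        VENDOR_URL,
        params={"symbols": ",".join(symbols), "date": as_of.isoformat()},
        timeout=30,
    )
    resp.raise_for_status()
    # "data" is an assumed field in the vendor's response payload.
    return resp.json()["data"]


def stage_to_s3(records: list[dict], as_of: dt.date) -> str:
    """Write the raw payload to a date-partitioned key so a Snowflake
    external stage / COPY INTO can pick it up unchanged."""
    key = f"vendor=example/date={as_of.isoformat()}/eod_prices.json"
    boto3.client("s3").put_object(
        Bucket=RAW_BUCKET,
        Key=key,
        Body=json.dumps(records).encode("utf-8"),
    )
    return key


if __name__ == "__main__":
    today = dt.date.today()
    rows = fetch_eod_prices(["AAPL", "MSFT"], today)
    print("staged:", stage_to_s3(rows, today))
```

Keeping the raw payload immutable in S3 and deferring transformation to Snowflake is one common pattern for onboarding new providers quickly; the actual design would depend on the platform's conventions.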
Mandatory Skills & Experience:
- 10+ years of experience as a Data Engineer.
- Strong SQL & Python programming skills.
- Proficiency in Linux environments.
- Experience with AWS-based data solutions.
- Containerization expertise (Docker, Kubernetes).
- DevOps knowledge (Jenkins, CI/CD pipelines).
- Excellent communication and problem-solving skills.
Nice-to-Have Skills:
- Experience with Market Data / Capital Markets projects.
- Knowledge of Apache Airflow for workflow automation.
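Since Apache Airflow is called out as a nice-to-have, below is a minimal sketch of how a daily vendor-ingest workflow might be expressed with the TaskFlow API (Airflow 2.4+ assumed). The DAG name, staged S3 key, and target table are hypothetical placeholders, not the platform's actual pipeline.

```python
"""Minimal Apache Airflow sketch of a daily vendor-ingest workflow.
Task bodies are placeholders; names and paths are hypothetical."""

from datetime import datetime

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["market-data"],
)
def example_vendor_ingest():
    @task
    def extract() -> str:
        # Placeholder: call the vendor API and stage raw files to S3,
        # returning the staged key for the downstream load task.
        return "s3://example-market-data-raw/vendor=example/date=2024-01-01/eod_prices.json"

    @task
    def load(staged_key: str) -> None:
        # Placeholder: COPY the staged file into a Snowflake raw table.
        print(f"COPY INTO raw.eod_prices FROM {staged_key}")

    load(extract())


example_vendor_ingest()
```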