At DeepIQ, we are building the world's best Data + AI Ops platform for multi-modal data. Our platform helps some of the world's largest enterprises generate insights from their telemetric (OT), geospatial and structured data. Our team includes leading experts in cloud and machine learning technologies with deep expertise in commercializing novel software and scaling high-growth companies. If you are passionate about deep learning and AI and enjoy writing production-quality software, then we have the right job for you! We are committed to providing you with an ideal work-life balance and a great place to do high-quality work, learn new skills and grow in your career.
Job Responsibilities
- Design agentic workflows and implement them using suitable LLMs.
- Research the latest trends and technologies in the generative AI space.
- Design and build software for large-scale data extraction and transformation from streaming, structured and unstructured data sources.
- Work closely with data scientists to develop and deploy data engineering and machine learning algorithms.
- Write Python code to cleanse, manipulate and analyse large datasets, including SQL/NoSQL databases and XML, JSON and PDF files.
- Develop ETL workflows, primarily using Python-based software, that satisfy identified business and data requirements.
- Follow set standards and best practices in ETL development. Refactor existing jobs that do not follow set standards.
- Provide guidance on ETL workflow implementation such as scheduling jobs, troubleshooting job errors, identifying issues in unusually long running jobs, etc.
Skills
- Undergraduate or higher degree in Computer Science or related disciplines
- 2+ years of professional programming experience in Python for data processing and analysis
- Expertise in Python data packages (Pandas, Luigi, SQLAlchemy, etc.)
- Strong SQL experience
- Experience with LLMs, OpenAI, LangChain and LangGraph
- Experience with relational databases (Oracle/SQL Server) and NoSQL databases (MongoDB)
- Experience in Apache Spark is a plus
- Experience with cloud-based EDW platforms like Snowflake, Redshift, BigQuery is a plus