Posted Dec 28, 2025

Senior Data Engineer (with AWS)

We are seeking a talented and experienced Data Engineer to join our team at Provectus. As part of our diverse practices, including Data, Machine Learning, DevOps, Application Development, and QA, you will collaborate with a multidisciplinary team of data engineers, machine learning engineers, and application developers. You will encounter numerous technical challenges and have the opportunity to contribute to Provectus’ open source projects, build internal solutions, and engage in R&D activities, providing an excellent environment for professional growth.

Requirements:
Experience in data engineering;
Experience working with Cloud Solutions (preferably AWS, also GCP or Azure);
Experience with Cloud Data Platforms (e.g., Snowflake, Databricks);
Proficiency with Infrastructure as Code (IaC) technologies like Terraform or AWS CloudFormation;
Experience handling real-time and batch data flow and data warehousing with tools and technologies like Airflow, Dagster, Kafka, Apache Druid, Spark, dbt, etc.;
Proficiency in programming languages relevant to data engineering such as Python and SQL;
Experience in building scalable APIs;
Experience in building Generative AI Applications (e.g., chatbots, RAG systems);
Familiarity with Data Governance aspects like Quality, Discovery, Lineage, Security, Business Glossary, Modeling, Master Data, and Cost Optimization;
Advanced or Fluent English skills;
Strong problem-solving skills and the ability to work collaboratively in a fast-paced environment.

Nice to Have:
Relevant AWS, GCP, Azure, Databricks certifications;
Knowledge of BI Tools (Power BI, QuickSight, Looker, Tableau, etc.);
Experience in building Data Solutions in a Data Mesh architecture;
Familiarity with classical Machine Learning tasks and tools (e.g., OCR, AWS SageMaker, MLFlow, etc.).

Responsibilities:
Collaborate closely with clients to deeply understand their existing IT environments, applications, business requirements, and digital transformation goals;
Collect and manage large volumes of varied data sets;
Work directly with Data Scientists and ML Engineers to create robust and resilient data pipelines that feed Data Products;
Define data models that integrate disparate data across the organization;
Design, implement, and maintain ETL/ELT data pipelines;
Perform data transformations using tools such as Spark, Trino, and AWS Athena to handle large volumes of data efficiently;
Develop, continuously test, and deploy Data API Products with Python and frameworks like Flask or FastAPI.

Originally posted on Himalayas