We are currently looking for a remote Big Data Software Engineer with 6+ months of production experience with Spark (PySpark) to join our team.
The customer is a biotechnology company engaged in the discovery, invention, development, manufacture, and commercialization of medicines.
The main goal is to build a solution that consumes and stores data from multiple customer domains.
Responsibilities:
- Implement a data pipeline processing application using PySpark and Airflow
- Design and integrate the required database structures
- Build data marts in Hive and PostgreSQL
- Create analytical SQL scripts in PostgreSQL or other databases
- Communicate with English-speaking colleagues and customer representatives
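To make the data-mart and analytical-SQL responsibilities above concrete, here is a minimal sketch of the kind of aggregation involved. SQLite stands in for PostgreSQL so the snippet is self-contained; the `domain_events` table, its columns, and the sample values are invented purely for illustration.

```python
import sqlite3

# Illustrative only: SQLite in place of PostgreSQL; table and data are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE domain_events (
        domain  TEXT,
        metric  REAL
    );
    INSERT INTO domain_events VALUES
        ('manufacturing', 10.0),
        ('manufacturing', 14.0),
        ('commercial',     7.5);
""")

# Analytical query: per-domain counts and averages, the sort of summary
# a data mart would expose to downstream reporting.
rows = conn.execute("""
    SELECT domain, COUNT(*) AS n, AVG(metric) AS avg_metric
    FROM domain_events
    GROUP BY domain
    ORDER BY domain
""").fetchall()

for domain, n, avg_metric in rows:
    print(domain, n, avg_metric)
```

In the real project the same style of GROUP BY query would run against PostgreSQL (or be expressed as a PySpark aggregation) rather than an in-memory SQLite database.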
Requirements:
- 6+ months of production experience with Spark (PySpark)
- Strong skills in Apache Airflow
- Knowledge of Apache Hive
- English level B2+
- Working experience with AWS services: S3, Athena, EC2
Looking for something else?
Find a vacancy that works for you. Send us your CV to receive a personalized offer.