Middle Big Data Software Engineer for a Biotechnology Company
We are currently looking for a remote Middle Big Data Software Engineer with 2+ years of production experience with Spark (PySpark) to join our team.
The customer is a biotechnology company that engages in the discovery, invention, development, manufacture, and commercialization of medicines.
The main goal is to develop a solution that consumes and stores data from multiple customer domains.
Responsibilities
- Implement a pipeline processing application using PySpark and Airflow
- Integrate the required database structure
- Implement data marts in Hive and PostgreSQL
- Create analytical SQL scripts in PostgreSQL or another database
- Communicate with English-speaking colleagues and customer representatives
Requirements
- 2+ years of production experience with Spark (PySpark)
- Strong skills in Apache Airflow
- Knowledge of Apache Hive
- English level B2+
Nice to have
- Working experience with AWS services: S3, Athena, EC2