Our customer exposes an API that lets other businesses access risks associated with a person, e.g. understand their credit score. Behind the scenes there is a sophisticated decisioning system and a large volume of data. Currently the back-end of this API has a number of legacy versions, serving hundreds of clients, with individual installations for most clients.
We need to create a data lake for one of the biggest data analytics companies working with personal information both domestically and internationally. In a nutshell, this involves replatforming an on-premises Enterprise Data Hub from a Hadoop cluster to GCP. Day-to-day tasks include, but are not limited to: creating Spark applications that manipulate data from different sources, including Oracle, Google Cloud Storage, and BigQuery; creating pipelines via GCP Dataflow; and working with Jenkins and Airflow.
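To illustrate the kind of day-to-day Spark work described above, here is a minimal sketch of a job that reads a table from BigQuery, applies a simple transformation, and writes the result to Google Cloud Storage. It assumes the spark-bigquery connector is available on the classpath; the project, dataset, table, and bucket names are hypothetical placeholders, not actual project resources.

```python
# Minimal sketch: read from BigQuery, transform, write to Google Cloud Storage.
# Assumes the spark-bigquery connector is on the classpath; all resource names
# below (project, dataset, table, bucket) are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("edh-replatform-example")
    .getOrCreate()
)

# Read a source table via the BigQuery connector.
persons = (
    spark.read.format("bigquery")
    .option("table", "example_project.example_dataset.persons")  # hypothetical table
    .load()
)

# Example transformation: keep active records and derive a simple risk bucket.
scored = (
    persons
    .filter(F.col("is_active") == True)
    .withColumn(
        "risk_bucket",
        F.when(F.col("credit_score") >= 700, "low")
         .when(F.col("credit_score") >= 500, "medium")
         .otherwise("high"),
    )
)

# Write the result to Google Cloud Storage as Parquet for downstream pipelines.
scored.write.mode("overwrite").parquet("gs://example-bucket/curated/persons_scored/")
```

In practice a job like this would typically be scheduled from Airflow and parameterized per source system rather than hard-coded as shown here.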
Hours per week: 40 hrs/week
Project length: 12+ months
Locations eligible for the position: Belarus, Russia, Ukraine