Backend Python Data Pipeline Engineer / Data Scientist – Python, Linux, Hadoop / Hive / Spark
We’re looking for a focused, data-driven, ambitious and pragmatic Backend Python Data Pipeline Engineer to join a brilliant team of PhDs, strengthen it and set the bar even higher.
You're likely to have worked at a big-name tech company on projects that are known worldwide, or at a fast-growing start-up, and to hold a postgraduate degree in Computer Science, Engineering or Maths.
This early-stage, well-funded B2B tech start-up is building a predictive investment platform at the top end of the real estate industry. Backed by one of the largest VCs in Europe, they have big dreams and are growing rapidly!
The team turns data into quantifiable insights that help real estate investors make safer, more informed decisions. They analyse large amounts of data from various sources in multiple formats to understand how the real estate market evolves over time, and take a peek into future growth.
The successful Data Pipeline Engineer must have:
- an MSc in Computer Science, Machine Learning, Maths, Statistics or a related field
- experience handling large datasets
- familiarity with Git
- a strong command of the Linux terminal
- regular, hands-on use of Python
- familiarity with Hadoop / Hive / Apache Spark
- experience with MySQL and/or PostgreSQL
- at least 3 years' hands-on experience building large-scale software projects
- experience using Amazon Web Services (EMR, RDS)
- knowledge of GIS (e.g. PostGIS)
- knowledge of pandas, NumPy and/or scikit-learn
- a record of open source contributions
- a passion for real estate
Benefits include flexible working hours, 5 weeks' holiday, a competitive salary, employee stock options and a fast-growing start-up culture.
£65-80k + equity