Big Data Pipeline Engineer / Data Scientist – Python, Linux, Hadoop / Hive / Spark, central London
We’re looking for a focused, data-driven, ambitious and pragmatic Big Data Pipeline Engineer to join a brilliant PhD team that likes to learn, adapt and share knowledge.
This early-stage, well-funded B2B tech start-up is building a predictive investment platform at the top end of the real estate industry. Backed by one of the largest VCs in Europe, they have big dreams and are growing rapidly!
The team turns data into quantifiable insights that help real estate investors make safer, more informed decisions. They analyse large volumes of data from varied sources and formats to understand how the real estate market evolves over time, and to forecast future growth.
The successful Big Data Pipeline Engineer must have:
- an MSc in Computer Science, Machine Learning, Mathematics, Statistics or a related field
- experience handling large datasets
- familiarity with Git
- a strong command of the Linux terminal
- regular, hands-on use of Python
- familiarity with Hadoop / Hive / Apache Spark
- experience with MySQL and/or PostgreSQL
- at least 3 years' hands-on experience building large-scale software projects
- experience using Amazon Web Services (EMR, RDS)
- knowledge of GIS (e.g. PostGIS)
- knowledge of pandas, NumPy and/or scikit-learn
- open-source contributions
- a passion for real estate
Benefits include: flexible working hours; 5 weeks' holiday; a competitive salary; employee stock options; and a fast-growing startup culture.
£45-80k + equity