Data Engineer

  • Permanent
  • Full time
  • £62,000
  • Remote
  • Research Team

At Eedi, we're on a mission to sustainably improve learning outcomes for 1 billion students by 2030.

Imagine every student owning their own unique predictive learning graph—capturing their journey, challenges, and progress over time. That’s the future we’re building at Eedi, enabling personalised, adaptive learning everywhere.

With billions of data points collected over a decade, we’re not just identifying knowledge gaps; we’re understanding why they exist and enabling timely interventions that accelerate learning.

We've secured a $12M grant from a prestigious US foundation, and our first version of the ML model is nearly complete. We’re forging partnerships with major education operators to reach hundreds of millions of students, while positioning Eedi as the leading platform for the world’s best AI tutors.

Why we need you

Our ability to power personalised learning at scale depends on a world-class data infrastructure. That’s where you come in.

As a Data Engineer at Eedi, you’ll design, build, and maintain scalable data pipelines and infrastructure that fuel our ML models, knowledge graph, and analytics systems. Your work will ensure our machine learning, data science, and product teams have reliable, high-quality data to drive innovation and improve student learning outcomes.

What you’ll be doing

First Month:

  • Deep dive into the current data pipeline with a focus on security, defining a roadmap to progressively and pragmatically improve our architecture.
  • Collaborate with Eedi’s data science team to standardise data cleaning, wrangling, and hand-off processes.

First Six Months:

  • Develop a unified way to access disparate data sources (e.g., SQL, table storage, Mixpanel), potentially through a data warehouse.
  • Implement secure ETL processes to limit access to sensitive data while maintaining operational efficiency.
  • Work closely with the ML team to enhance data quality, support improved model performance, and identify new data opportunities.
  • Establish data isolation strategies for B2B solutions (both by company and geography).
  • Support data extraction and processing for external competitions, such as our upcoming Kaggle competition with Vanderbilt University.

First Year:

  • Introduce additional metadata structures to streamline data analysis and modelling.
  • Implement robust validation scripts to evaluate the correctness of new product feature implementations and A/B experiments.
  • Improve performance across the product, reducing reliance on long-running database queries.
  • Investigate streaming data solutions to facilitate regular ML model retraining.
  • Develop and manage vector databases for organising embeddings.

Must-haves…

  • You have 5+ years of experience in data engineering, building and optimising large-scale data pipelines.
  • You have expertise in Python for data processing, along with SQL and NoSQL for querying large datasets.
  • You’re experienced with cloud-based data infrastructure (we use Azure).
  • You have hands-on experience with data orchestration tools (e.g. Apache Airflow, Prefect, or Dagster).
  • You have a deep understanding of distributed data processing frameworks (e.g., Spark, Dask, or Snowflake).
  • You have experience with data lakes and data warehouses (e.g., BigQuery, Redshift, Snowflake), and relational databases.
  • You have strong knowledge of ETL/ELT best practices, data modelling, and pipeline monitoring.
  • You’re passionate about building robust, scalable data solutions that power AI-driven education.

It would be nice if…

  • You have experience working in EdTech or a startup environment.
  • You have familiarity with graph databases and knowledge graphs.
  • You have experience supporting ML teams with feature engineering and data versioning.
  • You have a background in real-time data streaming solutions (e.g., Kafka, Kinesis, Pulsar).
  • You understand MLOps practices and their intersection with data engineering.
  • You have contributed to open-source projects or published technical content.
  • You have experience with Infrastructure as Code (IaC) tools like Terraform.

Benefits at Eedi

  • Competitive salary + stock options + generous pension
  • 30 days holiday + Christmas break
  • Cycle-2-work scheme
  • Home office budget (chairs, screens, etc.)
  • Truly flexible working - we appreciate real work-life balance
  • Quarterly off-sites
  • Learning budget - we're an education business!
  • Four-day work week (following probation)


Please note: The deadline for receiving applications for this role is Friday 18th April.

There's one more, very important thing. We are an equal opportunity employer. We search for amazing people of diverse backgrounds, experiences, abilities, and perspectives. We take care of each other to create an inclusive work environment where we love to come to work every day.

Ready to use your data engineering superpowers to revolutionise maths education? We can't wait to hear from you!

Let's change the future of learning together!