EdTech & UniApplyForMe · 16 May 2026 · 5 min read

There’s no API for South African high schools, so I built my own

One of the features I’ve been building for the UniApplyForMe portal is school selection, where learners can find and pick their high school when signing up. Simple enough on the surface. But before I could build that, I needed the actual data: a complete, reliable list of South African high schools.

So I went looking, and quickly found that no ready-to-use data source exists for developers. No public API, no database you can query, nothing. If you want this data in your product, you have to go and get it yourself.

Here’s how I did it.

The government actually has the data, just not in a convenient format

After some digging, I found the DBE EMIS National Schools Masterlist, published by the Department of Basic Education. EMIS stands for Education Management Information System, and it’s essentially the government’s master record of every school in the country.

The catch is that it’s a spreadsheet, not an API. You download it, you process it yourself, and you figure out what to do with it from there. It covers all 25,527 schools in South Africa across every province and school type, and it includes useful detail on each school: its unique government ID number, address, GPS coordinates, contact details, quintile, whether it’s a no-fee school, and 2025 learner and educator numbers.

Not perfect, but it’s official and it’s thorough.

Cutting it down to what I actually need

The full spreadsheet includes everything from nursery schools to schools for learners with special needs. UniApplyForMe only needs secondary schools and combined schools (schools that run from primary through to matric) that are currently open and operating. So the first job was filtering.

I used a Python library called pandas (think of it as Excel for developers) to read through the spreadsheet and keep only the rows that matched those criteria. That cut the list from 25,527 schools down to 8,815 high schools spread across all nine provinces. Much more useful.
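A minimal sketch of that kind of filter, using a tiny stand-in table so it runs on its own. The column names (`NatEmis`, `Institution_Name`, `Phase`, `Status`), the category values and the sample rows are assumptions for illustration; check them against the headers in the masterlist you actually download.

```python
import pandas as pd

# Tiny stand-in for the masterlist. Column names, values and rows are
# made up for illustration; the real spreadsheet may differ.
raw = pd.DataFrame({
    "NatEmis": [100001, 100002, 100003, 100004],
    "Institution_Name": [
        "Alpha Primary", "Beta Secondary", "Gamma Combined", "Delta Secondary",
    ],
    "Phase": [
        "PRIMARY SCHOOL", "SECONDARY SCHOOL", "COMBINED SCHOOL", "SECONDARY SCHOOL",
    ],
    "Status": ["OPEN", "OPEN", "OPEN", "CLOSED"],
})

WANTED_PHASES = ["SECONDARY SCHOOL", "COMBINED SCHOOL"]


def keep_open_high_schools(df):
    """Keep secondary and combined schools that are currently open."""
    # Normalise case and whitespace so " open " and "OPEN" compare equal.
    phase = df["Phase"].str.strip().str.upper()
    status = df["Status"].str.strip().str.upper()
    mask = phase.isin(WANTED_PHASES) & (status == "OPEN")
    return df.loc[mask].reset_index(drop=True)


high_schools = keep_open_high_schools(raw)
print(high_schools["Institution_Name"].tolist())
```

With the real file, `raw` would come from something like `pd.read_excel(...)` instead of the inline table.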

Cleaning up the messy bits

Government data is rarely clean out of the box, and this was no exception. A few things needed fixing before the data was ready to load into our database.

The main issue was with the learner and educator count columns. Some schools had blank values in those fields, and when a column mixes numbers and blanks, pandas quietly converts the whole column to floating-point, because its default integer type has no way to represent a missing value. So 950 becomes 950.0. That sounds harmless, but our database refuses to store a decimal where it expects a whole number, so every row would have failed on import.

The fix was telling the tool to handle blanks properly so the numbers stayed as whole numbers. I also cleaned up some placeholder garbage in a few columns where missing values had been filled in with things like "99" or "Unknown" instead of just being left blank.
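One way to do both clean-ups in pandas is sketched below. It assumes the fix is pandas' nullable `Int64` type, which may not be exactly what was used here, and the column names (`Learners`, `Quintile`), the placeholder list and the sample rows are hypothetical.

```python
import pandas as pd

# Made-up sample rows; column names are assumptions for illustration.
raw = pd.DataFrame({
    "NatEmis": [100001, 100002, 100003],
    "Learners": [950, None, 1200],       # one blank value
    "Quintile": ["3", "99", "Unknown"],  # two placeholder values
})

# The blank forces the whole column to floating-point: 950 is now 950.0.
print(raw["Learners"].dtype)

PLACEHOLDERS = ["99", "Unknown"]  # hypothetical list of junk values

clean = raw.copy()

# Nullable integer type: whole numbers stay whole, blanks become <NA>.
clean["Learners"] = clean["Learners"].astype("Int64")

# Turn placeholder strings back into real missing values.
clean["Quintile"] = clean["Quintile"].mask(clean["Quintile"].isin(PLACEHOLDERS))

# Convert to plain dicts with None for missing values, ready for an insert.
records = [
    {key: (None if pd.isna(value) else value) for key, value in row.items()}
    for row in clean.to_dict(orient="records")
]
print(records)
```

The final step matters because database clients generally expect `None` for a NULL, not pandas' own missing-value markers.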

Loading it into the portal

Once the data was clean, I loaded it into a high_schools table in our Supabase database. A few decisions I made around how to set it up:

Since this is open government data, there’s no reason to restrict who can read it. I made the table publicly readable so the portal can search it without requiring the learner to be logged in first.

I also set up search so that when a learner starts typing their school name, the results come back quickly and match partial names. Typing “Northcliff” should surface Northcliff High School without needing an exact match.
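As a rough illustration of partial, case-insensitive matching, here is a sketch that uses Python's built-in `sqlite3` as a local stand-in for the Postgres database behind Supabase, where `ILIKE` would play a similar role. The table columns, the `search_schools` helper and every sample row other than Northcliff are made up, and this is not necessarily how the portal's search is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE high_schools (natemis INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# IDs and the last two names are invented sample data.
conn.executemany(
    "INSERT INTO high_schools (natemis, name) VALUES (?, ?)",
    [
        (1, "Northcliff High School"),
        (2, "Southview Secondary School"),
        (3, "Northgate Combined School"),
    ],
)


def search_schools(conn, typed, limit=10):
    """Return school names containing the typed text, ignoring ASCII case."""
    # Escape LIKE wildcards so the learner's input is matched literally.
    escaped = typed.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT name FROM high_schools "
        "WHERE name LIKE ? ESCAPE '\\' ORDER BY name LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return [name for (name,) in rows]


print(search_schools(conn, "northcl"))
print(search_schools(conn, "north"))
```

A leading-wildcard match like this scans every row, which is fine for a few thousand schools; a larger table would likely want a dedicated text index.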

The government ID number for each school (called a NatEMIS number) is stored as the unique identifier. This means when the DBE publishes a new version of the masterlist, I can re-import it and update existing records by matching on that number rather than having to wipe and reload everything.
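The re-import idea can be sketched as an upsert keyed on the NatEMIS number. This example again uses `sqlite3` as a stand-in; Postgres supports a very similar `INSERT ... ON CONFLICT ... DO UPDATE` statement. The column names, the `import_masterlist` helper and the sample schools are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE high_schools (
        natemis  INTEGER PRIMARY KEY,  -- government ID, used as the match key
        name     TEXT NOT NULL,
        learners INTEGER
    )
""")

UPSERT = """
    INSERT INTO high_schools (natemis, name, learners)
    VALUES (:natemis, :name, :learners)
    ON CONFLICT (natemis) DO UPDATE SET
        name = excluded.name,
        learners = excluded.learners
"""


def import_masterlist(conn, rows):
    """Insert new schools and update existing ones in a single transaction."""
    with conn:
        conn.executemany(UPSERT, rows)


# First import: one made-up school.
import_masterlist(conn, [
    {"natemis": 1, "name": "Example High School", "learners": 950},
])

# Later import: the same school with a new count, plus a new school.
import_masterlist(conn, [
    {"natemis": 1, "name": "Example High School", "learners": 1010},
    {"natemis": 2, "name": "Sample Combined School", "learners": None},
])

rows = conn.execute(
    "SELECT natemis, learners FROM high_schools ORDER BY natemis"
).fetchall()
print(rows)
```

Because existing rows are updated in place, anything that references a school by its NatEMIS number keeps pointing at the same record after a refresh.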

Why this matters for the portal

Before this, if a learner wanted to enter their school they’d have to type it in manually. That opens the door to typos, inconsistent naming, and records that are hard to make sense of later. Now they can search and select from a verified list, and each learner record links back to a specific, known school with all its associated details.

It also means we can do things like filter applications by province or district, which is useful for the reps who manage applications on the staff side.

For other builders in this space

South Africa’s public data is better than most people realise. The DBE, Stats SA, and a few other government departments publish genuinely useful datasets. The gap is just that no one has turned most of it into something developers can easily plug into their products.

If you’re building something for SA learners and you need access to this schools data as an API rather than having to process the raw spreadsheet yourself, get in touch. I’m thinking about whether it makes sense to open this up for other builders.

Tags: data import, DBE, EdTech, education data, EMIS, open data, schools API, South Africa, Supabase, UniApplyForMe