If you missed the previous updates, it’s good to read “What should have happened in the Master Deeds leak” and “South African Master Deeds data leaked via MySQL data dump”.
I asked Troy Hunt last night to run some more queries on the leaked database to establish the origin of the data, and he has updated his research. A quick rundown of what happened:
My initial assumption was that the Master Deeds data leak only contained commercial information of people owning homes or having a credit record or having transacted with financial institutions. It is quite common for credit bureaus to share information for credit vetting purposes and consumers in South Africa typically consent in one way or another to share their information with 3rd parties.
When I heard from Troy that the leak contained over 60 million records, and considering that the South African population is only 56m (census 2016), I immediately suspected that the record set contained many duplicates and deceased persons or, at worst, the whole population register, including minors.
Since the data-set contains a South African ID number (comparable with a social security number), it is very easy to determine the age of the person behind each leaked record from the first 6 characters (the date of birth in YYMMDD form).
Troy has now confirmed that out of the 66 million records, 57 million were marked as “alive” and 9.3m as “deceased”. Based on the South African ID number, a minor (younger than 18 years of age) would have an ID number starting with 99 (for 1999) or a later year, so someone born on 24th September 2006 would have an ID number starting with “060924”. Since the year in the ID number is only two digits, “060924” could also mean 24th September 1906, but it is very unlikely that the Department of Home Affairs (DHA) has captured records prior to 1940 (and if so, those records would have a low record count).
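The date-of-birth reasoning above can be sketched in a few lines of Python. This is only an illustration, not the query Troy ran: the function names, the sample ID strings and the reference date are all made up, and the century rule (pick the most recent century that does not place the birth date in the future) is an assumption that mirrors the “1906 is very unlikely” argument.

```python
from datetime import date


def birth_date_from_id(id_number: str, as_of: date) -> date:
    """Read the YYMMDD prefix of an ID number as a date of birth.

    Assumption: the two-digit year belongs to the most recent century
    that does not put the birth date after `as_of`.
    Raises ValueError if the prefix is not a valid calendar date.
    """
    yy = int(id_number[0:2])
    mm = int(id_number[2:4])
    dd = int(id_number[4:6])
    century = (as_of.year // 100) * 100
    candidate = date(century + yy, mm, dd)
    if candidate > as_of:
        # A birth date in the future is impossible, so fall back one century.
        candidate = date(century - 100 + yy, mm, dd)
    return candidate


def age_on(birth: date, as_of: date) -> int:
    """Completed years between `birth` and `as_of`."""
    years = as_of.year - birth.year
    if (as_of.month, as_of.day) < (birth.month, birth.day):
        years -= 1  # birthday has not yet occurred in the as_of year
    return years


def is_minor(id_number: str, as_of: date) -> bool:
    """True if the person is younger than 18 on `as_of`."""
    return age_on(birth_date_from_id(id_number, as_of), as_of) < 18


# Made-up ID strings: only the first six digits matter here.
reference = date(2017, 10, 18)  # hypothetical "today" for the example
print(birth_date_from_id("0609240000000", reference))
print(is_minor("0609240000000", reference))
print(is_minor("8001010000000", reference))
```

With this rule, a prefix of “060924” is read as 2006 and “800101” as 1980. Any genuine record of someone older than 100 would be misread as a child, which is the same small risk the paragraph above accepts.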
Troy then proceeded to group the Master Deeds record set by age group, which shows that 12.4m records are of minors.
Looking at a subset of the data, you will notice records of children with dates of birth from December 2013, i.e. “131218” means 18th December 2013 or 18th December 1913 (the latter being very unlikely to exist in a dump).
Did the Department of Home Affairs leak information?
With the above revelation it now becomes questionable how records of children landed up in a commercial data-set and for what purpose. It is quite common for financial sector companies such as banks to have agreements with Home Affairs to do identity verification for fraud prevention, but there should be no reason for children to appear in those records as they would have no commercial agreements.
The South African Department of Home Affairs is the official custodian of all citizen records and would ultimately be the owner/curator of the ID numbers present in the file. There is no doubt that the data was enriched through other sources such as credit bureaus. Where Dracore should be questioned is how ID records of minors landed up in a commercial dataset which was then on-sold to other parties.
I hope that DHA will investigate this and question Dracore and other parties involved on how details of minors were obtained and who has acquired the data-sets and for what purpose.
South African Identity verification “compromised”
This leak of ID numbers and other PII (personally identifiable information) will have long-term consequences for all exposed individuals. South Africa uses the ID number as the primary means of verification. The Financial Intelligence Centre Act (FICA) relies on it. The RICA (Regulation of Interception of Communications and Provision of Communication-Related Information Act) relies on it to associate the owner of a mobile number to a South African ID. Credit companies need it. So does the South African Revenue Service and literally any other government or commercial entity in this country.
For years to come, the leaked identity information can now be used for phishing attacks, identity theft and other criminal means. Phishing and social engineering attacks will dramatically increase due to the content of the data-set, and the government and commercial entities affected by this will have a difficult time ahead legitimately verifying the owner of the presented identity information.
Comments from Dracore – legal consequences for Mr. Mohapi
So far, Dracore has provided some feedback on their website in “Is Dracore Data Sciences Responsible For South Africa’s Largest Ever Data Leak?”. A more detailed interview describing the background of how Dracore operates and how the story unfolded is in the soundclip below.
Mr. Mohapi originally broke the story, and in the above interview the CEO of Dracore indicated that a court interdict will be issued against Mr. Mohapi.