Microsoft leaks 6.5TB in Bing search data via unsecured Elastic server. *Insert ‘Wow… that much?’ joke here*

Earlier this month, Microsoft exposed a 6.5TB Elastic server to the world. The data included search terms, location coordinates, device ID data, and a partial list of which URLs were visited.

According to a report from security site WizCase, the server was password-protected until around 10 September, when “the authentication was removed”.

WizCase code-prober Ata Hakcil discovered the leak on 12 September. The data appears to be generated by the Bing mobile app, which promises users “Getting rewarded is easy, just search with the Bing,” and has been downloaded more than 10 million times from Google’s Play Store. The data was growing by up to 200GB per day and included searches from people in more than 70 countries, according to WizCase.

Once the data was unsecured, several things happened. The infosec firm reported the problem to Microsoft on 13 September, and the database was secured by the company’s security response centre on 16 September. That left plenty of time for hackers and bots to find the data, and WizCase said the server suffered a Meow attack on two occasions. The name refers to a bot which deletes unsecured databases and replaces them with new ones including the word “meow”. Data continued to be collected regardless. If the Meow bot found that data, it is likely that other interested parties did as well.

In mitigation, the data did not include personal information such as name, address or email address. A critical question, though, is whether there was enough data included that the individuals could be traced.

In 2006, AOL released what it thought was anonymised search data for research purposes, but journalists soon proved this wrong by identifying some of the searchers. One reason this was easy was that each searcher was identified by a numeric key, so it was possible to see all the searches made by a particular individual.

It seems Microsoft’s leaked data may likewise have privacy implications. WizCase screenshots show that the data includes fields called deviceID, deviceHash, AdID and clientID, any of which could help someone collect all the searches from a particular user. There are also coordinates showing location “within 500 meters”, which is not precise enough to get an address, but is helpful to someone trying to identify searchers.
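To see why a persistent identifier matters, consider a minimal sketch in Python. Only the field name deviceID comes from the WizCase screenshots; the records, the "query" field and the function name are invented for illustration, and nothing here reflects the real schema of the exposed server.

```python
from collections import defaultdict

# Invented sample records. Only the "deviceID" field name is taken from
# the WizCase screenshots; the values and the "query" field are assumptions.
records = [
    {"deviceID": "dev-a", "query": "pharmacy near me"},
    {"deviceID": "dev-b", "query": "cheap flights"},
    {"deviceID": "dev-a", "query": "back pain symptoms"},
    {"deviceID": "dev-a", "query": "solicitor opening hours"},
]


def group_by_identifier(rows, key="deviceID"):
    """Collect every query that shares the same identifier value."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row["query"])
    return dict(grouped)


profiles = group_by_identifier(records)
for device, queries in profiles.items():
    print(device, queries)
```

No names or email addresses are involved, yet each identifier ends up with its own search history. That is the same property that made the AOL data traceable, and adding an approximate location to each profile would narrow things further.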

The data also reveals some of the unsavoury things people search for, including illegal content. WizCase suggested that if criminals succeed in deanonymising the data, some individuals could be vulnerable to blackmail or phishing scams as a result.

Statcounter data shows just 2.83 per cent market share for Bing versus Google’s 92.05 per cent. That said, it is a small percentage of a very large market, and Statcounter’s figures may not reflect searches via the Bing app or those integrated into Windows search.

The incident is unfortunate for Microsoft, which advertises “simplified privacy controls” as one of the benefits of the iOS version of Bing Search.

A Microsoft spokesperson told us: “We’ve fixed a misconfiguration that caused a small amount of search query data to be exposed. After analysis, we’ve determined that the exposed data was limited and de-identified.”

Anybody can make a mistake, but there is an implicit deal with search providers like Microsoft and Google that we get personalisation and improved search results in return for allowing them to collect data on our behaviour. A high level of trust is required, and this kind of incident is damaging to that trust. The data was, apparently, not encrypted. ®
