
ENF Data on the European Grid

How I misused a German server to collect and publish the only ENF data available of the European grid since 2019.


Published August 20th, 2024  

Every audio sample contains noise. Mostly, it's the noise we can hear. However, in every audio sample recorded in real life, there's another incredibly important noise at about 50 or 60 Hz: the hum of the power grid. Power grids in the United States run at about 60 Hz, and most others, including Europe's, run at about 50 Hz. The frequency drifts slightly from second to second, in no predictable pattern, as supply and demand on the grid shift. Since every grid has its own unique frequency history, we can, in theory, match an audio sample to the exact time (and general region) it was recorded. We can even verify the authenticity of an audio sample! This approach, electrical network frequency (ENF) analysis, is currently (allegedly) used by only a few government-level actors. Open-source intelligence tooling developers at Bellingcat are working on a public implementation of the technique. However, the approach depends on reference ENF data, and the datasets available are very limited and out of date.
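
To make the matching idea concrete, here is a minimal sketch of how a short ENF trace could be located inside a longer reference log. It assumes both are already lists of frequencies sampled once per second, and it uses a simple sliding mean squared error, which is my choice for illustration and not necessarily what any real ENF tool does. The function name `best_offset` and the numbers are made up. Extracting the trace from audio in the first place is a separate step not shown here.

```python
def best_offset(reference, clip):
    """Return (offset, error) for the position where clip best fits reference.

    Both inputs are lists of frequencies in Hz, one value per second.
    The error is the mean squared difference over the clip length.
    """
    if not clip or len(clip) > len(reference):
        raise ValueError("clip must be non-empty and no longer than reference")
    best = None
    for offset in range(len(reference) - len(clip) + 1):
        window = reference[offset:offset + len(clip)]
        error = sum((r - c) ** 2 for r, c in zip(window, clip)) / len(clip)
        if best is None or error < best[1]:
            best = (offset, error)
    return best


# Toy data, not real measurements: a ten-second reference log around 50 Hz.
reference = [50.00, 50.01, 49.99, 49.98, 50.02, 50.03, 50.00, 49.97, 49.99, 50.01]
clip = [50.02, 50.03, 50.00]  # pretend this came out of a recording
offset, error = best_offset(reference, clip)
print(offset, error)
```

The returned offset is the number of seconds into the reference log where the clip fits best, which is why a continuous, gap-free reference dataset matters so much.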

This is where a high schooler with too much free time can help. At mainsfrequency.com, there is a widget that shows the current ENF of the European grid:

(Image: Pasted image 20240820124637.png)

If this website has this data updating constantly, surely I can get it too, right? Needless to say, an up-to-date European ENF dataset would make ENF analysis a lot more useful. Checking the network tab in dev tools, we see second-by-second requests to a server at netzfrequenzmessung.de:

curl 'https://netzfrequenzmessung.de:9081/frequenz02c.xml?c=1279246' -H 'User-Agent: {redacted}' -H 'Accept: */*' -H 'Accept-Language: en-US,en;q=0.5' -H 'Content-Type: text/plain'

What's this c query parameter? It was pretty simple to find in the source code:

function AjaxAufruf() {
	if( req) {
		if ("withCredentials" in req){  // nur wenn Browser CORS unterstützt gibt es Credentials
			req.open( "GET", url0+"?c="+Math.round(Math.random()*100000)*31, true );  // verschiedene Namen da IE sonst cacht und nix erneuert
			req.onreadystatechange = CallbackFkt;        
			req.setRequestHeader('Content-Type',  "text/plain");			
			try{ req.send( null ); } catch(e) { console.log('Fehler: '+e); }	
		}
	}
}

Code as found originally, with modified indentation.

Now, I don't speak German, but I do know how to use Google Translate! It seems that c is there to work around request caching in IE. After playing with it for a little, I found that c is also used to verify the "authenticity" of the request. Reusing the same number too often returns a 429 Too Many Requests (though this seems to reset every so often). From here, I started trying random c-values. The website's code only generates positive integers, so I tried negative numbers (random negative multiples, down to the negative integer limit) to get the data. I also alternated user agents and generated random IPs for the Forwarded headers to avoid rate limits. In Python, the code is basically:

import requests

def get_enf_data():
    # get_c(), get_ip() and getUA() are helper functions defined elsewhere in the script
    url = "https://netzfrequenzmessung.de:9081/frequenz02c.xml?c=" + str(get_c())
    ip = get_ip()
    headers = {
        "Accept": "*/*",
        "Accept-Language": "en-US,en;q=0.5",
        "Connection": "keep-alive",
        "Host": "www.mainsfrequency.com",
        "Referer": "https://www.mainsfrequency.com/", # trust me bro
        "Sec-Fetch-Dest": "empty",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
        "User-Agent": getUA(),
        "Forwarded": "for=" + ip,
        "X-Forwarded-For": ip,
    }
    # a timeout keeps one stalled request from blocking the once-a-second loop
    response = requests.get(url, headers=headers, timeout=5)
    # parse XML, write to CSV...
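
The "parse XML, write to CSV" step is elided above, so here is a rough sketch of what it might look like, not the code I actually ran. The element name `f` and the sample payload are hypothetical placeholders, so check a real response before relying on them. The helper names `parse_frequency` and `append_row` are made up for this example as well.

```python
import csv
import io
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_frequency(xml_text, tag="f"):
    """Extract a frequency in Hz from an XML payload.

    `tag` is a hypothetical element name, not the server's documented format.
    """
    root = ET.fromstring(xml_text)
    node = root if root.tag == tag else root.find(f".//{tag}")
    if node is None or node.text is None:
        raise ValueError(f"no <{tag}> element in response")
    return float(node.text.strip())


def append_row(csv_file, timestamp, frequency_hz):
    """Append one (ISO timestamp, frequency) row to an open CSV file."""
    csv.writer(csv_file).writerow([timestamp.isoformat(), f"{frequency_hz:.3f}"])


# Made-up payload for illustration only.
sample = "<r><f>49.987</f></r>"
buffer = io.StringIO()  # stands in for a file opened in append mode
when = datetime(2024, 8, 20, 12, 0, 0, tzinfo=timezone.utc)
append_row(buffer, when, parse_frequency(sample))
print(buffer.getvalue())
```

Storing timestamps in UTC keeps the dataset unambiguous across daylight saving changes, which matters when the whole point is matching recordings to an exact time.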

Running this function about once a second on Hack Club's Nest (a free service for high schoolers) gives fewer than ten seconds of data loss per day!* The following graph looks very "this is the ENF signature of a power grid over a timeframe of one hour, plotted with matplotlib"-y:

(Image: Pasted image 20240820160440.png)

* When a rate limit lasts less than three seconds, the skipped value is filled in with the average of the readings around it. I found that averaging across a gap of more than one second could lead to unpredictable results, so I left those longer gaps unfilled. The ten-seconds-of-data-loss figure includes both filled and unfilled gaps.
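
The gap-filling rule in the footnote can be sketched like this. It is an illustration under my own assumptions, not the exact code behind the dataset: missed readings are represented as `None`, only isolated one-second gaps are filled with the mean of both neighbors, and the function name `fill_single_gaps` is made up.

```python
def fill_single_gaps(samples):
    """Fill isolated one-sample gaps with the mean of both neighbors.

    `samples` is a list of frequencies in Hz, one per second, with None
    marking a missed reading. Gaps longer than one sample stay None.
    Returns (filled_list, filled_count, unfilled_count).
    """
    filled = list(samples)
    filled_count = 0
    for i, value in enumerate(samples):
        if value is not None:
            continue
        # Look only at the original readings so filled values never chain.
        before = samples[i - 1] if i > 0 else None
        after = samples[i + 1] if i + 1 < len(samples) else None
        if before is not None and after is not None:
            filled[i] = (before + after) / 2
            filled_count += 1
    unfilled_count = sum(1 for v in filled if v is None)
    return filled, filled_count, unfilled_count


# Toy data: one isolated gap (fillable) and one two-second gap (left alone).
readings = [50.0, None, 50.5, None, None, 49.5]
filled, n_filled, n_unfilled = fill_single_gaps(readings)
print(filled, n_filled, n_unfilled)
```

Counting filled and unfilled gaps separately makes it easy to report a combined data-loss figure like the one above.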

Rendering a full day (it was an accident...) took a pretty long time, but looks very believable: