At work we were focused on moving our build stations to the cloud, because COVID-19 has made everything easier if it can be done remotely.
While working on this task I found our nightly regression tests suddenly failing. One test was producing vastly different output and another was crashing outright. The log files showed an error that I hadn’t seen before. I tracked the error down in the source: we couldn’t open a file that we had just written. It only happened on some parts, but once that file was locked there was no going back.
This release of the software changed how files were written so that we could take advantage of memory-mapped files, and that seemed to be the source of the issue.
Digging further, nothing seemed obviously wrong. I couldn’t understand where things were failing.
Using the Microsoft Sysinternals ProcMon (Process Monitor) tool, I found that the file was being left open as a DLL handle (some googling revealed that MapViewOfFile makes the file show up as a DLL load in ProcMon), and that the unload was failing. Here’s the code. I couldn’t spot anything wrong.
HANDLE source = CreateFile(temp_name, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
That line worked. OK, next:
HANDLE dest = CreateFile(dest_file, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
That was working too.
HANDLE source_mapping = CreateFileMapping(source, NULL, PAGE_READONLY, 0, 0, "source_mapping");
PBYTE source_file = (PBYTE)MapViewOfFile(source_mapping, FILE_MAP_READ, 0, 0, 0);
if (source_file != NULL)
{
    DWORD bytes_to_write = 16000000;
    long long total_bytes_written = 0;
    bool working = true;
    //have to chunk the data because write file can fail on large files especially over the network
    while (total_bytes_written < file_size.QuadPart && working) //this line returns an error…sometimes
    {
        if (file_size.QuadPart - total_bytes_written < bytes_to_write)
            bytes_to_write = (DWORD)(file_size.QuadPart - total_bytes_written); //this is smaller than a DWORD so we're ok
        DWORD bytes_written = 0;
        working = WriteFile(dest, source_file, bytes_to_write, &bytes_written, NULL);
        total_bytes_written += bytes_written;
        source_file = (PBYTE)(source_file + bytes_to_write);
    }
    success = working;
}
UnmapViewOfFile was returning the error “Attempt to access invalid address.”, but only on some files. I have no idea what the cause is, or why it happens only on some files and not for the bulk of them.
I eventually added a check after the WriteFile call to see whether we had written the full file length and, if so, break out of the loop. I know that shouldn’t be necessary, and there’s no reason it SHOULD be necessary: the loop was already exiting properly after advancing the pointer by the amount of memory we had read. But apparently some mapped files didn’t like having the pointer advanced to the end of the file, even with the read|write flags set.
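One possible explanation, offered only as an assumption because the cleanup code isn’t shown above: the loop advances source_file itself, and the documentation for UnmapViewOfFile says its argument must be identical to the value MapViewOfFile returned. If the advanced pointer is what later reaches UnmapViewOfFile, that might account for the error. Below is a minimal sketch of a chunked copy that never moves the base pointer and advances by the bytes actually written. It is portable C++ rather than Win32 so it can be run anywhere; copy_in_chunks and ChunkWriter are hypothetical names, and the callback is a stand-in for WriteFile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for WriteFile: tries to write `len` bytes starting at
// `data`, stores the number actually written in `written`, and returns false
// on failure.
using ChunkWriter =
    std::function<bool(const std::uint8_t* data, std::size_t len, std::size_t& written)>;

// Copies `size` bytes from the view that starts at `base`, in chunks of at
// most `chunk_size` bytes. `base` is never modified, so the caller still holds
// the exact pointer it got from the mapping call when it is time to unmap.
bool copy_in_chunks(const std::uint8_t* const base, std::size_t size,
                    std::size_t chunk_size, const ChunkWriter& write_chunk)
{
    assert(chunk_size > 0);
    std::size_t offset = 0;  // progress is an offset from `base`, not a moved pointer
    while (offset < size) {
        std::size_t to_write = size - offset;
        if (to_write > chunk_size) {
            to_write = chunk_size;
        }
        std::size_t written = 0;
        if (!write_chunk(base + offset, to_write, written)) {
            return false;    // the writer reported a failure
        }
        if (written == 0) {
            return false;    // no progress: stop instead of looping forever
        }
        offset += written;   // advance by what was actually written
    }
    return true;
}
```

In the Win32 version the callback would wrap WriteFile(dest, data, (DWORD)len, &bytes_written, NULL), and the unmap call would receive the untouched pointer returned by MapViewOfFile.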
I am not proud of the change, nor of how long it took me to identify what was happening, but at least the tests are working again.