A short story about a failing request
One of our partners sent me an email recently:
Hello Patryk, We have a failing request on our development server. There is an endpoint that downloads a file of about 140 MB. It crashes every time. There are Out Of Memory exceptions in our logs.
Can you give us a hand and see what is happening and why? We have tried to increase the container's memory limit, but no luck.
It was evening, 10:00 PM when I read this email. "Okay, let's check that out!" I said after I had grabbed my cup of coffee.
My first step was to confirm the app's memory limit on our dev server (the app is hosted as a pod in a K8s cluster):
[
  {
    "op": "replace",
    "path": "/spec/template/spec/containers/0/resources/limits/memory",
    "value": "999Mi"
  },
  (...)
]
Okay, we have 999 MiB. Quite a lot. But the question is: why can't the app handle a 140 MB file? Even if the whole file content is held in memory, that should be more than enough.
So, it's time to dive into the code. After a few minutes I figured out the flow:
Okay, so let's see how it behaves in a local environment. I attached the debugger and...
... 1.5 GB is used to handle one request that downloads a 140 MB file. No wonder OOMs are thrown. Speaking of the exception, here is its stack trace:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.Text.StringBuilder.ToString()
   at RestSharp.Extensions.HttpResponseExtensions.GetResponseString(HttpResponseMessage response, Byte[] bytes, Encoding clientEncoding)
   at RestSharp.RestResponse.<>c__DisplayClass0_0.<<FromHttpResponse>g__GetDefaultResponse|0>d.MoveNext()
As we can see, the process runs out of memory inside RestSharp's internal code that executes the request. We are not even processing the response yet 😔. But wait a minute...
Isn't it strange that ToString() is being called for a response that is supposed to be a binary file?
I decided to take a peek at the RestSharp implementation:
await RestResponse.FromHttpResponse(
        internalResponse.ResponseMessage!,
        request,
        Options.Encoding,
        internalResponse.CookieContainer?.GetCookies(internalResponse.Url),
        Options.CalculateResponseStatus,
        cancellationToken
    )
    .ConfigureAwait(false)
And even deeper:
var bytes = stream == null ? null : await stream.ReadAsBytes(cancellationToken).ConfigureAwait(false);
var content = bytes == null ? null : httpResponse.GetResponseString(bytes, encoding);
Bingo!
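It looks like RestSharp reads the whole response into a byte array (140 MB) and, what's more, also decodes it into a string. The body of an HTTP response arrives as raw bytes, not base64, but .NET strings are UTF-16, so every decoded character takes 2 bytes and the string can need up to about twice the space of the original bytes. That already gives roughly 140 + (140 * 2) = 420 MB for a single request! Add the helper objects (the StringBuilder in the stack trace holds its own copy of the characters until ToString() produces the final string) and intermediate buffers, and it is easy to see how we end up near 1.5 GB.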
Not good.
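One contributor to this growth can be checked in isolation: .NET strings store UTF-16 code units, so decoding single-byte content into a string roughly doubles its footprint. Below is a minimal sketch of that effect, assuming a 1 MB buffer of ASCII bytes as a stand-in for the real file (the buffer and its content are illustrative, not taken from the app):

```csharp
using System;
using System.Text;

// 1 MB of ASCII 'A' bytes as a small stand-in for the 140 MB file (assumption).
byte[] bytes = new byte[1_000_000];
Array.Fill(bytes, (byte)'A');

// Decoding yields one UTF-16 char (2 bytes) per input byte for this content.
string text = Encoding.UTF8.GetString(bytes);

Console.WriteLine($"raw bytes:   {bytes.Length}");                 // 1000000
Console.WriteLine($"string size: {text.Length * sizeof(char)}");   // 2000000
```

The byte array and the string are both alive at the same time, so the request pays for the two of them together, not just the larger one.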
So, now that we know what is happening and what the process looks like, we can work out a solution. Let's sum up 💡:
The solution is waiting to be figured out. An experienced developer will notice that there is no need to load the file into memory at all. The API can act as a proxy and pass the bytes from XYZ straight to the client. Let's try that!
Let's try to set some configuration in RestSharp so that the response is not loaded into memory.
Let's get back to the source code that threw the OOM exception:
async Task<RestResponse> GetDefaultResponse() {
    await using var stream = await httpResponse.ReadResponseStream(request.ResponseWriter, cancellationToken).ConfigureAwait(false);
    var bytes = stream == null ? null : await stream.ReadAsBytes(cancellationToken).ConfigureAwait(false);
    var content = bytes == null ? null : httpResponse.GetResponseString(bytes, encoding);
    (...)
}
(Source)
Unfortunately, there is no flag that blocks reading of the response... Bad for us.
RestSharp uses System.Net.Http.HttpClient under the hood, a lower-level client built into .NET Core. Lower level doesn't mean harder to use, but it does give us more control over the response. Let's try that!
Code for getting the file in the service:
public async Task<HttpResponseMessage> GetFile(string url, string sessionId)
{
    return await _client.SendAsync(
        new HttpRequestMessage(HttpMethod.Get, url)
        {
            Headers =
            {
                {"Cookie", sessionId}
            }
        },
        CancellationToken.None
    );
}
Controller code:
[HttpGet]
[Route("download/{fileId}")]
public async Task GetByFileId(int fileId)
{
    var result = await _mediator.Send(new GetDocumentsByFileIdQuery { FileId = fileId });
    // Dispose result.File once this request has completed.
    Response.RegisterForDispose(result.File);
    Response.StatusCode = 200;
    // Copy the source stream straight into the response body.
    await using (var fileStream = await result.File.GetOriginalStreamAsync())
    {
        await fileStream.CopyToAsync(
            Response.BodyWriter.AsStream()
        );
    }
    await Response.BodyWriter.CompleteAsync();
}
(Code simplified to show only the essential part)
Okay, here are the results:
Not bad 🎆! We are using "only" 570 MB! But wait... Didn't I say that the file should simply be proxied through? 570 MB indicates that something is still being loaded into memory. Much less than before, but still.
The answer to our issue can be found on the MSDN page for HttpClient:
Remarks: This operation will not block. Depending on the value of the completionOption parameter, the returned Task object will complete as soon as a response is available or the entire response including content is read.
So it looks like HttpClient is buffering the response in memory. But there is a magic "or" in that quote. It means the buffering can be disabled! 💓
Indeed, it can be, with a simple flag:
return await _client.SendAsync(
    new HttpRequestMessage(HttpMethod.Get, url)
    {
        Headers =
        {
            {"Cookie", sessionId}
        }
    },
    //HERE 🚀
    HttpCompletionOption.ResponseHeadersRead,
    CancellationToken.None
);
And the results:
We did it! 99 MB (from 1.5 GB)!
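For reference, the same idea can be sketched outside of our service. The helper below is a hypothetical example (the name StreamToAsync and the URL in the usage note are made up for illustration): it asks HttpClient to return as soon as the headers are read and then copies the body to a destination stream chunk by chunk, so the whole file is never held in memory at once.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: streams the body of `url` into `destination`.
static async Task StreamToAsync(
    HttpClient client, string url, Stream destination, CancellationToken ct = default)
{
    // ResponseHeadersRead: the call completes once the headers arrive,
    // instead of buffering the entire body first.
    using var response = await client.GetAsync(
        url, HttpCompletionOption.ResponseHeadersRead, ct);
    response.EnsureSuccessStatusCode();

    // Read the body as a stream and copy it in chunks.
    await using var body = await response.Content.ReadAsStreamAsync(ct);
    await body.CopyToAsync(destination, ct);
}
```

It could be called as `await StreamToAsync(client, "https://example.com/big-file", fileStream);`. With ResponseHeadersRead the response keeps the connection in use until it is disposed, which is why the sketch wraps it in `using`.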
As you can see, choosing the proper solution for the problem we face can be critical in terms of performance. I don't want to blame RestSharp: it is easy to use (this matters less nowadays, but please remember it was created way before HttpClient!), but that convenience comes at the cost of full control over the HTTP request flow.
Hope you enjoyed my debugging journey!