Check file access rights? Check that IIS_IUSRS has write access...
This is not a permanent issue. It occurs once or twice a day and lasts for 30-60 minutes. IIS restarts are not helping... the resizer is creating empty files (0 bytes).
Maybe check that the requested file actually exists?
It does exist - it's stored as a blob in the db. If it didn't exist, I assume the exceptions would be thrown all the time.
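A quick way to see how many of those 0-byte files are sitting in the cache is to walk the folder and list them. This is only a rough sketch: `find_empty_files` and the `cache_root` argument are names made up for illustration, not part of ImageResizer.

```python
import os
from typing import Iterator


def find_empty_files(cache_root: str) -> Iterator[str]:
    """Yield the path of every 0-byte file below cache_root (hypothetical helper)."""
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(full_path) == 0:
                    yield full_path
            except OSError:
                # The file vanished or is inaccessible (possibly locked); skip it.
                continue
```

Running it against each server's cache folder while the problem is active would show whether the empty files line up with the URLs that are failing.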
Seems like something sometimes goes wrong while writing to the file, which leaves the file locked. It will then keep failing until that file lock is released after a while. I noticed that Image Resizer 4 has at least 2 bug fixes in that area: it now writes to a temp file first and then renames it, and it also has improved flushing of the file stream. I guess those bug fixes might be related to your problem. Sure sounds like it at least. Tried upgrading?
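For anyone wondering why writing to a temp file and renaming helps: readers never see a half-written (or empty) file, because the final name only appears once the data is flushed. A minimal sketch of the general pattern, not Image Resizer's actual code (`write_atomically` is a made-up name):

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Write data to a temp file next to path, flush it, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the OS
            os.fsync(tmp.fileno())   # ask the OS to write it to disk
        os.replace(tmp_path, path)   # swap the finished file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the write fails midway, only the temp file is affected and the target either keeps its old content or does not exist yet.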
It is still really strange that it is not related to any CMS events, nor to the traffic on the site. What I also tried was to stop IIS and clean the image cache... it did help - for 15 secs. What is also interesting is that the problem mostly occurs on 2 out of 3 web servers (it will start on one of them and the second will fall over a few minutes later).
Ah yeah, upgrading was the first thing I tried - no luck.
The problem started on Thursday, a few hours after the servers were scaled up.
Not sure if it is related at all or if it is just a coincidence.
Thank you for all the help so far!
Try using different directories for the image cache on different servers if you are using load balancing. That might help.
I have already tried removing the entire image cache folder and recreating it. I will try disabling the disk cache plugin just to make sure this is not a symptom of another issue.
You can try running FileMon there to see what's going on at the file system level. You can also scan through the ETW stream to see who might be blocking the file lock from being acquired.
And what exactly does the "scale-up" look like?
Are you using a shared directory for the cache?
@Valdis - scale up - the VMs are shut down, resources are added (CPU, RAM) and they are booted back up again
@Daniel - No, every server has its own cache.
9 CPUs @ 2.4 GHz each
It is also weird that most of the time it happens on 2 out of 3 servers, and it doesn't start at the same time (5-6 minute delay).
The code in the DiskCache plugin is pretty complex and involves an async queue for writes. I would recommend raising an issue with the IR team. Maybe they can help. It doesn't sound like it's EPi related.
We did try turning async writes on/off. I have already disabled the disk cache (there is a CDN in front of the site). I will keep you posted!
I have some data from New Relic now:
Seems like AuthorizeRequest is holding up the entire request. Any idea what could be causing that? Now all 3 web servers are affected. So I am thinking:
- database connectivity issue (nothing in the logs)
- remote events
- images on the network share (some legacy images are stored on the file server)
Also check cookies. I had a few fat cookies that slowed down images on intranet once.
Previously you were also able to disable the access rights check on the VPP provider. Not sure if that is possible now with blobs. That helped performance a lot before...
I am not using any custom VPP. The section about VPP configuration has been removed from the v9 documentation ;/
Anyway, what are you using for image resizing? Maybe it is worth considering changing...
I really do appreciate the time you are spending in trying to tackle this issue!
@Daniel - Where are you storing your blobs? We have them in SQL (db size ~ 38GB). Considering moving to S3.
@Valdis - Upgrading is obviously an option. However, I do feel that the problem may be somewhere else. InterceptModule does not implement the Authorize event (https://github.com/imazen/resizer/blob/resizer3-4-3.103/Core/InterceptModule.cs). Moreover, the issue kicks off on more than 1 web server at once (2 or 3).
Thank you, M
Single server with blobs on disk.
Start with the upgrade. If you are using the EPiServer plugin (which you should), then wait for version 4.1.1. It should be out soon.
So I did upgrade ImageResizer to 4.0.5, purchased DiskCache (4.0.5) and upgraded ImageResizer.Plugins.EPiServerBlobReader to 4.1.1. For the last couple of weeks it was all good, until today :(
ImageResizer.ImageProcessingException: Failed to acquire a lock on file "E:\wwwroot\imagecache\11e8\7026e2e2484dcbfd15ab25b5b336067698b1c2c8ba97025b74a1a8266cb489e8.jpg" within 15000ms. Caching failed.
   at ImageResizer.Plugins.DiskCache.DiskCache.Process(IResponseArgs e)
   at ImageResizer.Plugins.DiskCache.DiskCache.Process(HttpContext context, IResponseArgs e)
   at ImageResizer.InterceptModule.CheckRequest_PostAuthorizeRequest(Object sender, EventArgs e)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)
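To illustrate what a "failed to acquire a lock within N ms" error generally means: if one request holds a per-file lock while it does slow work, a second request for the same file gives up once its wait exceeds the timeout. The toy simulation below is an assumption about the general pattern, not DiskCache's real implementation; `KeyedLocks` and `simulate` are made-up names and the timings are scaled down to fractions of a second.

```python
import threading
import time
from collections import defaultdict


class KeyedLocks:
    """One lock per cache key, acquired with a timeout (illustrative only)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks = defaultdict(threading.Lock)

    def try_run(self, key: str, timeout_s: float, work) -> bool:
        with self._guard:
            lock = self._locks[key]
        if not lock.acquire(timeout=timeout_s):
            return False  # the caller would report "failed to acquire a lock"
        try:
            work()
            return True
        finally:
            lock.release()


def simulate(fetch_s: float, timeout_s: float) -> dict:
    """Two requests for the same key; the first holds the lock for fetch_s."""
    locks = KeyedLocks()
    results = {}
    holding = threading.Event()

    def slow_fetch() -> None:
        holding.set()        # signal that the lock is now held
        time.sleep(fetch_s)  # stands in for a slow source-image fetch

    def first() -> None:
        results["first"] = locks.try_run("img.jpg", timeout_s, slow_fetch)

    def second() -> None:
        holding.wait()       # arrive only after the first request holds the lock
        results["second"] = locks.try_run("img.jpg", timeout_s, lambda: None)

    threads = [threading.Thread(target=first), threading.Thread(target=second)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With `simulate(0.3, 0.05)` the second request times out while the first is still working; with `simulate(0.05, 0.5)` both succeed, because the lock is released before the wait runs out.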
I have contacted ImageResizer support and that's what they came back with:
This can happen if there are two requests for the same URL happening simultaneously, but if it takes just under 15 seconds to fetch the source image. You might add logging to EPIServerBlobReader so that you can correlate fetch times with URLs that have later/earlier failed requests.
You may also find useful information by looking at response times in the log. If you see that a failing URL has a successful but slow response at another time, then you can work to address the underlying network/storage performance problem.
Timeouts usually indicate a pathological source-fetch performance problem (or a flush-to-disk problem, but that's less likely).
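The correlation suggested in that reply could be done with a small script over parsed log entries. This is a sketch under assumptions: the `Entry` fields, the `slow_ms` threshold and the function name are all hypothetical, and parsing the actual IIS or application log format is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Entry:
    """One parsed log line (hypothetical shape)."""
    url: str
    status: int
    duration_ms: float


def slow_successes_for_failing_urls(
    entries: Iterable[Entry], slow_ms: float = 5000.0
) -> Dict[str, List[float]]:
    """For URLs with at least one 5xx response, list their slow successful durations."""
    entries = list(entries)
    failed_urls = {e.url for e in entries if e.status >= 500}
    slow: Dict[str, List[float]] = defaultdict(list)
    for e in entries:
        if e.url in failed_urls and e.status < 400 and e.duration_ms >= slow_ms:
            slow[e.url].append(e.duration_ms)
    return dict(slow)
```

Any URL that shows up in the result failed at least once and was also served slowly at another time, which would point towards source-fetch performance rather than the cache itself.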
The thing is that the site is not under heavy load atm (up to 100 concurrent users).
I would be very interested to hear your thoughts.
Interesting. Not sure if the package has enough logging. Do you have a reference to the official reply from them? Also, as the package is open source, maybe you could add logging (if it's not there) and make a PR? Happy to merge...
My brain is just shouting bug, kill, kill, die!
Not sure that's helpful though :)
Valdis - it was just an email. Happy to share it with you if you would like.
Another update: we thought that AV software was slowing the file server down, so we uninstalled it a month ago. It was all good until yesterday!
Have any of you tried extracting Image Resizer into another project (one without the full EPiServer, so no additional license has to be purchased)? The idea would be to handle image processing just on the file server.
Would be great to have the reference from them.
Do you need additional logging in the provider? I haven't added anything yet.
Regarding offloading image processing to another server/extracting it to a different project... I don't claim to be an EPiServer license expert, but my gut feeling is that you will still need to license that stuff (as the EPi binaries are needed for the plugin to work). It might be OK if you are running a multisite license, though.
Here you go: https://snag.gy/R6V8IG.jpg
We could try adding some logging. We are already running Wireshark on all the boxes.
I will get in touch with EPi in regards to the licensing.
Thanks for all the help!
Recently I noticed that one of the production sites is throwing loads of ImageResizer.ImageProcessingException errors. At the same time RAM consumption increases dramatically, leading to an app pool recycle.
Any idea what could be causing that?