ImageResizer.ImageProcessingException ?

Vote:
 

Hi Guys,

Recently I noticed that one of the production site is throwing loads of the ImageResizer.ImageProcessingException errors. At the same time RAM consumption increases dramatically leading to app pool recycle.

ImageResizer.ImageProcessingException (0x80004005): Failed to acquire a lock on file "E:\wwwroot\imagecache\1e\d1a022809295f272d4dc7c63377a759c0f336be0a1be8ca696fcf27479892cf1.jpg" within 15000ms. Caching failed.
   at ImageResizer.Plugins.DiskCache.DiskCache.Process(IResponseArgs e)
   at ImageResizer.Plugins.DiskCache.DiskCache.Process(HttpContext context, IResponseArgs e)
   at ImageResizer.InterceptModule.HandleRequest(HttpContext context, String virtualPath, NameValueCollection queryString, IVirtualFile vf)
   at ImageResizer.InterceptModule.CheckRequest_PostAuthorizeRequest(Object sender, EventArgs e)
   at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
   at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

Any idea what could be causing that?

Many thanks,

Maciej

#155170
Sep 10, 2016 22:26
Vote:
 

Which version of IR?

#155171
Sep 11, 2016 8:41
Vote:
 

3.4.2.549

#155172
Sep 11, 2016 9:01
Vote:
 

Check file access rights? Check that IIS_IUSRS has write access...

#155177
Sep 11, 2016 17:15
Vote:
 

 This is not a permanent issues. it does occur once or twice a day and lasts for 30-60 minutes. IIS restarts are not helping.... the rezizer is creating  empty files (0 bytes)

#155179
Sep 11, 2016 17:44
Vote:
 

Maybe check that the requested file actually exist?

#155182
Sep 11, 2016 19:59
Vote:
 

It does exist - it's stored as the blob in the db. If it didn't exist in assume the exceptions would have been thrown all the time.  

#155185
Sep 11, 2016 22:55
Vote:
 

True. 

Seems like something goes wrong while writing to file sometimes which leads to that the file is then locked. Then it will fail until that file lock is released after a while. I noticed that Image Resizer 4 has at least 2 bug fixes in that area. Its first writing to a temp file and then renaming it and it also has improved flushing of the file stream. I guess those bug fixes might be related to your problem. Sure sounds like it at least. Tried upgrading?

#155186
Sep 12, 2016 0:45
Vote:
 

Concur 

It still really strange that it is not related to any CMS events, nor the traffic on the site. What I also tried is to stop IIS and clean the image cache.... it did help - for 15 secs. What is also interesting that the problem mostly occurs on 2 out of 3 web servers (it will start on one of them and the second will go tits up a few minutes later).

Ah yeah, upgrade was the first thing I tried- no luck 

The problem started on Thursday, a few hours after the servers were scaled up.

Not sure if it is related ato all or it is just a coincidence. 

Thank you for all the help so far!

#155188
Sep 12, 2016 7:53
Vote:
 

Try using different directories for image cache on different servers if you are using load balancing. That might help

#155189
Sep 12, 2016 7:56
Vote:
 

I tried removing entire image cache folder and recreating it already. I will try disabling disk cache plugin just to make sure it is not another's issue symptom 

#155190
Sep 12, 2016 8:14
Vote:
 

You can try to run FileMon there and see what's going on on the file system level. Also you can try to scan through ETW stream to see who is maybe blocking the file lock acquire.

And how exactly "scale-up" looks?

#155192
Sep 12, 2016 8:51
Vote:
 

Are you using a shared directory for the cache?

#155193
Sep 12, 2016 9:15
Vote:
 

@Vladis - scale up - the VMs are shut down, resources are being added (CPU, RAM) and they are booted back again

@Daniel - No, every server has its own cache.

#155195
Sep 12, 2016 9:21
Vote:
 

What is processor count now?

#155196
Sep 12, 2016 9:22
Vote:
 

9 CPU @ 2.4Ghz each

#155197
Sep 12, 2016 9:23
Vote:
 

It is also weird that most of the time it does happen on 2 out of 3 servers, and it doesn't start at the same time (5-6 minutes delay).

#155198
Sep 12, 2016 9:25
Vote:
 

Code in the DiskCache plugin is pretty complex and involves async queue for writes. I would recommend to raise maybe issue to IR team. See - maybe they can help. Doesn't sound it's EPi related..

#155199
Sep 12, 2016 10:35
Vote:
 

Hi Vladis,

We did try turning async wrties on/off. I have already disabled disk cache (there is a CDN in front of the site). I will keep you posted!

Thank you.

M

#155202
Sep 12, 2016 10:51
Vote:
 

Hi Guys,

I have some data from Newrelic now: 

Seems like AuthorizeRequest is holding the entire request. Any idea what could be casuing that? Now all 3 web servers are affected. So I am thinking:

- database connectivity issue (nothing in the logs)

- remote events

- images on the network share (some legacy images are stored on the file server)

Any thoughts?

Thank you,

Maciej

#155448
Sep 17, 2016 11:51
Vote:
 

Also check cookies. I had a few fat cookies that slowed down images on intranet once. 

You were also able to disable check access rights before on vpp provider. Not sure if that is possible now with blob . That helped performance a lot before...

#155451
Sep 17, 2016 19:43
Vote:
 

Hi Daniel,

I am not using any custom VPP. The section about VPP configuraiton has been removed from the v9 documentation ;/

Anyway, what are you using for image resizing? Maybe it is worth considring changing...

M

#155452
Edited, Sep 17, 2016 22:18
Vote:
 
<p>Image Resizer :)</p> <p>zero problems...</p>
#155453
Sep 18, 2016 7:27
Vote:
 

Maciej,

Is upgrade to 4.x applicable for your project?

#155454
Sep 18, 2016 7:30
Vote:
 

Hi Guys,

I really do appreciate the time you are spending in trying to tackle this issue!

@Daniel - Where about are you storing your blobs? We do have them in the SQL (db size ~ 38GB). Considering moving to s3.

@Vladis - Upgrading is obviously an option. However I do feel that the problem may be somewhere else. InterceptModule does not implement Authorize event (https://github.com/imazen/resizer/blob/resizer3-4-3.103/Core/InterceptModule.cs). Moreover the issue kicks off at more than 1 web server at once (2 or 3).

Thank you, M

#155457
Sep 18, 2016 9:54
Vote:
 

Single server with blobs on disc. 

#155458
Sep 18, 2016 10:34
Vote:
 

Start with upgrade. If you are using EPiServef plugin (which you should), then wait for 4.1.1 version. Should be soon.

#155461
Sep 18, 2016 16:37
Vote:
 

Hi Guys,

So I did upgrade ImageResizer do 4.0.5, purchased DiskCache (4.0.5) and upgraded ImageResizer.Plugins.EPiServerBlobReader to 4.1.1. For the last couple of weeks it was all good, until today :(

ImageResizer.ImageProcessingException: Failed to acquire a lock on file "E:\wwwroot\imagecache\11e8\7026e2e2484dcbfd15ab25b5b336067698b1c2c8ba97025b74a1a8266cb489e8.jpg" within 15000ms. Caching failed.
at ImageResizer.Plugins.DiskCache.DiskCache.Process(IResponseArgs e)
at ImageResizer.Plugins.DiskCache.DiskCache.Process(HttpContext context, IResponseArgs e)
at ImageResizer.InterceptModule.CheckRequest_PostAuthorizeRequest(Object sender, EventArgs e)
at System.Web.HttpApplication.SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

I have contacted ImageResizer support and that's what they came back with:

This can happen if there are two requests for the same URL happening  simultaneously, but if it takes just under 15 seconds to fetch the source image. You might add logging to EPIServerBlobReader so that you can  correlate fetch times with URLs that have later/earlier failed requests.

You may also find useful information from looking at response times in the log. If you see that a failing URL has a successful but slow response at anothet time, then yoi can work to address the underlying network/storage performance problem.

Timeouts usually indicate a pathological source-fetch performance problem (or a flush-to-disk problem, but that's less likely).

The thing is that the site is not under big load atm (up to 100 concurrent users).

I would be very interested to hear your thoughts.

Maciej

#162242
Oct 13, 2016 18:59
Vote:
 

Interesting. Not sure if package had enough logging. Do you have reference to official reply from them? Also - as package is open sourcce, might be so that you can add logging (if not there) and make PR? Happy to merge...

#162409
Oct 13, 2016 22:12
Vote:
 

My brain is just shouting bug, kill, kill, die! 

Not sure that's helpful though :)

#162411
Oct 13, 2016 23:17
Vote:
 

Valdis - it was just an email. happy to share it with you if you would like to.

More update: We thought that this is AV software slowing the File server down. So we unistalled it a month ago. It was all good until yesterday!

Has anyone of you tried extracting image resizer to another project (don't have the full EPiServer so no additional license has to be purchased)? The idea would be to handle image processing just on the file server.

Thanks,

Maciej

#172433
Dec 02, 2016 14:57
Vote:
 

Maciej,

Would be great to have reference from them.

Do you need additional logging in the provider? I haven't added yet anything..

Regarding offloading image processing to another server/extracting to different project... I don't claim myself to be an episerver license expert, but my got feelings are that you will still need to license that stuff (as epi binaries are needed for plugin to work). might be OK, if running multisite license tho.

#172446
Dec 02, 2016 18:16
Vote:
 

Here you go: https://snag.gy/R6V8IG.jpg

We could try adding some logging. Running wireshark on all the boxes already.

I will get in touch with EPi in regards to the licensing.

Thanks for all the help!

M

#172447
Dec 02, 2016 18:22
* You are NOT allowed to include any hyperlinks in the post because your account hasn't associated to your company. User profile should be updated.