Risks of two scheduled jobs programmatically modifying a single piece of content

Gidday,

We currently have a Scheduled Job which synchronises information from a third-party API into several fields on Pages within our CMS.

I'm looking at synchronising some more data into new fields on the same Page Type, but the new synchronisation will be significantly slower than the existing one, and I would prefer not to slow down the existing Scheduled Job.

I was considering adding a second Scheduled Job to perform the new synchronisation, leaving the existing Job with only its current responsibilities, but I was concerned that there might be edge cases where both Jobs attempted to update the same Page simultaneously.

Are there any consistency guarantees provided when performing a Save on Writeable Content, or would I need to ensure that the two jobs were not running simultaneously?

I couldn't see anything in a quick dive down the codepath, so my assumption is that any changes made in between calling CreateWritableClone and Save by a parallel process would not be reflected, and the second save would overwrite anything done by the first.

Thanks for any assistance.

#282984
Edited, Jul 04, 2022 8:17

Hi,

Try this in both your first and second scheduled jobs:

contentRepository.Save(content, SaveAction.ForceCurrentVersion);

It should save only the changes you set.

The only downside is that it won't create a new version, making it difficult for you to track or roll back the changes... if you even care about that.

It should be easy to create a POC for it.

#283051
Jul 04, 2022 23:13

Thanks Surjit!

I tested this out with the code below. (Note that it includes some of our convenience helpers / extensions to retrieve content, get a writeable version or save; it should be pretty clear from context what they do.)

// Flags used to force the two operations to overlap.
var threadOneHasWriteable = false;
var threadTwoHasSaved = false;

var threadOne = Task.Run(() =>
{
    // Take a writeable clone first, then hold it until thread two has saved.
    var content = ContentHelper.SafeGet<ArticlePage>(269614);
    var writeable = content.GetWriteable();
    threadOneHasWriteable = true;

    while (!threadTwoHasSaved)
        Thread.Sleep(TimeSpan.FromSeconds(1));

    // Save a change to a different field from the (now stale) clone.
    writeable.MetaDescription = "Updated by Thread One";
    writeable.Save(SaveAction.ForceCurrentVersion);
});

var threadTwo = Task.Run(() =>
{
    // Wait until thread one holds its clone, then edit and save immediately.
    while (!threadOneHasWriteable)
        Thread.Sleep(TimeSpan.FromSeconds(1));

    var content = ContentHelper.SafeGet<ArticlePage>(269614);
    var writeable = content.GetWriteable();

    writeable.MetaTitle = "Updated by Thread Two";

    writeable.Save(SaveAction.ForceCurrentVersion);
    threadTwoHasSaved = true;
});

Task.WhenAll(threadOne, threadTwo).GetAwaiter().GetResult();

// Reload the page and print both fields to see which edits survived.
var finalContent = ContentHelper.SafeGet<ArticlePage>(269614);
WriteLine($"Meta Title - {finalContent.MetaTitle}");
WriteLine($"Meta Description - {finalContent.MetaDescription}");

I found the following:

  1. If two threads are accessing the same content but editing different fields, both saves go through successfully and both edits are respected.
  2. If two threads are accessing the same content and editing the same field, both saves go through successfully and the second save performed dictates the content in the current version.
  3. It doesn't seem to matter whether you use SaveAction.ForceCurrentVersion or SaveAction.Publish (those were the only two I tested).
#283153
Edited, Jul 05, 2022 3:52

After you run your code, fire up edit mode and look at the versions it's created. You'll find SaveAction.Publish will create a new version.

But SaveAction.ForceCurrentVersion will save to the current version.

The potential problem with your POC is that task one may have finished before task two started. In other words, the second CreateWritableClone might be called after task one has saved, so you might be observing the MetaDescription value carrying over to the next version. Tasks are scheduled on the thread pool by default.

You can be doubly sure by running the sequence of events without tasks. Call CreateWritableClone twice, one after the other, experiment with both save actions, and you'll see the difference.
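
The sequential experiment described above could be sketched roughly as follows. This is only a sketch under stated assumptions: ArticlePage, MetaTitle and MetaDescription are taken from the POC earlier in the thread, the SequentialSaveExperiment class and its Run method are hypothetical names, and it assumes ArticlePage derives from PageData and that the code runs inside the CMS with IContentRepository injected.

```csharp
using EPiServer;
using EPiServer.Core;
using EPiServer.DataAccess;
using EPiServer.Security;

// Hypothetical helper class, not part of the CMS API.
public class SequentialSaveExperiment
{
    private readonly IContentRepository _contentRepository;

    public SequentialSaveExperiment(IContentRepository contentRepository)
    {
        _contentRepository = contentRepository;
    }

    // Takes two writable clones of the same unmodified page, then saves
    // them one after the other with the given save action.
    public string Run(ContentReference pageLink, SaveAction saveAction)
    {
        var original = _contentRepository.Get<ArticlePage>(pageLink);

        // Both clones are created before either save happens.
        var first = (ArticlePage)original.CreateWritableClone();
        var second = (ArticlePage)original.CreateWritableClone();

        second.MetaTitle = "Updated by second clone";
        _contentRepository.Save(second, saveAction, AccessLevel.NoAccess);

        first.MetaDescription = "Updated by first clone";
        _contentRepository.Save(first, saveAction, AccessLevel.NoAccess);

        // Reload and report which edits are present afterwards.
        var final = _contentRepository.Get<ArticlePage>(pageLink);
        return $"Meta Title - {final.MetaTitle}; Meta Description - {final.MetaDescription}";
    }
}
```

Running it once with SaveAction.Publish and once with SaveAction.ForceCurrentVersion, then checking the version list in edit mode, should show how the two save actions differ.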

#283157
Jul 05, 2022 6:35
"After you run your code, fire up edit mode and look at the versions it's created. You'll find SaveAction.Publish will create a new version."

Apologies for not being clearer there. I only meant that it didn't seem to change the behaviour with regard to the concurrency protection (or lack thereof).

"The potential problem you have with your POC is task one may have finished before task two. In other words, the second CreateWritableClone might be being called after saving of task one."

The second thread waits for the first to have called GetWriteable() before it calls GetWriteable() and Save itself, and the first waits until the second has saved before it attempts to Save. I was trying to replicate a worst-case scenario where the two operations actually overlap: both retrieve the same piece of unmodified content and then perform different edits. In this case, it appears that the writes to the two different fields I used (apologies, the two fields I picked have pretty similar names) both went through, and neither thread removed the changes the other had made.

#283163
Jul 05, 2022 9:59

Makes sense! Apologies, I skimmed over the outer-scoped variables. Hopefully this helps you with your bigger piece of work.

#283164
Jul 05, 2022 10:29

How time-consuming is your first scheduled job, the one that syncs from the API to the CMS? If it is not lengthy and the system can wait, you can always wait for it to finish first. For instance, if you have a second scheduled job performing the new synchronisation, you can check whether the first job is still running and, if so, save your content as scheduled for later publishing. The next time it comes up at the scheduled time, it checks again whether the first job has completed.

var firstSyncJob = _scheduledJobRepository.Get(new Guid("8EC257F9-FF22-40EC-9958-C1C5BA8C2A53"));
if (firstSyncJob.IsRunning)
{
    // The CMS sync job is still running, so make no changes now;
    // schedule the content to be published at a later date instead.
    writeable.StartPublish = futureDate;
    _contentRepository.Save(writeable, SaveAction.CheckIn | SaveAction.DelayedPublish, AccessLevel.NoAccess);
}
else
{
    // The first job is idle, so it is safe to start the second sync job.
    var secondSyncJob = _scheduledJobRepository.Get(new Guid("9EC257F9-FF22-40ED-9958-C1C5BA8C2A53"));
    _scheduledJobExecutor.StartAsync(secondSyncJob);
}

You can also hook into the content events, such as PublishingContent, and perform the above checks there.
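
If the goal is simply to keep the two jobs from overlapping, the same IsRunning check could also sit at the top of the second job itself. The following is a minimal sketch, not a tested implementation: SecondarySyncJob and SynchroniseNewFields are hypothetical names, the GUID is the first job's GUID from the snippet above, and it assumes a CMS version that supports constructor injection in scheduled jobs.

```csharp
using System;
using EPiServer.DataAbstraction;
using EPiServer.PlugIn;
using EPiServer.Scheduler;

// Hypothetical second job that skips its run while the first job is active.
[ScheduledPlugIn(DisplayName = "Secondary API sync (sketch)")]
public class SecondarySyncJob : ScheduledJobBase
{
    // GUID of the existing sync job, as used earlier in this post.
    private static readonly Guid FirstSyncJobId =
        new Guid("8EC257F9-FF22-40EC-9958-C1C5BA8C2A53");

    private readonly IScheduledJobRepository _scheduledJobRepository;

    public SecondarySyncJob(IScheduledJobRepository scheduledJobRepository)
    {
        _scheduledJobRepository = scheduledJobRepository;
    }

    public override string Execute()
    {
        var firstSyncJob = _scheduledJobRepository.Get(FirstSyncJobId);
        if (firstSyncJob != null && firstSyncJob.IsRunning)
        {
            // Do nothing this time; the next scheduled run will check again.
            return "Skipped: the first sync job is still running.";
        }

        SynchroniseNewFields();
        return "Secondary synchronisation completed.";
    }

    // Placeholder for the slower synchronisation described in the question.
    private void SynchroniseNewFields()
    {
    }
}
```

Note that this check is not atomic: the first job could still start just after the check passes, so it reduces the chance of overlap rather than ruling it out.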

#283503
Jul 11, 2022 13:18
Oshi - Jul 12, 2022 9:39
Thanks, I actually didn't know you could programmatically check whether jobs were running, let alone start them!