Saving Pages programmatically

marcus.hansson@cejn.com

I'm programmatically saving lots of pages where one page can exist in several languages. But on the pages where more than one language is present I get the following exception:

Exception: System.Data.SqlClient.SqlException
Message: Violation of PRIMARY KEY constraint 'PK_tblPageKeyword'. Cannot insert duplicate key in object 'dbo.tblPageKeyword'.

I do think this has got something to do with the page and its different languages, but I cannot figure out why or what causes this error.
Is it because I'm trying to save a property on a page where the property itself can only be saved on the MasterlanguageBranch?

Thanks
Marcus

#37026

Feb 17, 2010 12:49

Magnus von Wachenfeldt

Sounds like it. One solution could be to do some additional checks with propertyData.IsLanguageSpecific and pageData.IsMasterLanguageBranch.
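
A minimal sketch of such a check, assuming the page passed in is already a writable clone of the language version being saved. The class and method names (PageSaveHelper, CanWriteProperty, SetIfAllowed) are hypothetical helpers, not part of EPiServer; only IsLanguageSpecific and IsMasterLanguageBranch come from the suggestion above. Whether this guard avoids the tblPageKeyword exception is not confirmed in this thread.

```csharp
using EPiServer.Core;

// Hypothetical helper class, shown only to illustrate the suggested checks.
public static class PageSaveHelper
{
    // A property shared between languages is only written on the
    // master language branch; language-specific properties are
    // written on every language version.
    public static bool CanWriteProperty(PageData page, PropertyData property)
    {
        return property.IsLanguageSpecific || page.IsMasterLanguageBranch;
    }

    // Sets the value only when the guard allows it.
    // Returns true if the value was written.
    public static bool SetIfAllowed(PageData writablePage, string propertyName, object value)
    {
        PropertyData property = writablePage.Property[propertyName];
        if (property == null || !CanWriteProperty(writablePage, property))
        {
            return false; // skip: unknown property, or shared property on a non-master branch
        }

        property.Value = value;
        return true;
    }
}
```

The page would then be saved as before, for example with DataFactory.Instance.Save(page, SaveAction.Publish).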

#37035

Feb 17, 2010 18:04

marcus.hansson@cejn.com

I'm not so sure about this anymore. I've tried changing the code to see exactly where EPiServer inserts the properties that make it fail, but each time the failure occurs at a different line of code, and even when I only write to properties that I know to be safe, I still get the exception.
Though, I did find this link, but I've yet to test it.
http://world.episerver.com/Blogs/Per-Bjurstrom/Archive/2009/6/Performance-How-to-disable-keyword-indexing-in-SP2/

And since it is the keywords that fail, and we earlier had to disable page indexing to get rid of deadlock problems in the database, it might also be the keyword indexing that causes these problems.

I will try this today and see if I get any results. Also, does anyone know whether it is OK to truncate the tables tblKeyword and tblPageKeyword without harming the site?

I guess I should also add that we save and publish around 7000 pages in two hours each time we do a mass publish, which is a lot of pages.

#37069

Edited, Feb 19, 2010 8:56

marcus.hansson@cejn.com

I've tested what Per Bjurström described in the link I provided, and it works. We no longer get any exceptions when we do a mass update of all our pages, or even when we create the tree from scratch.
It's problematic that EPiServer does all its indexing after each page is published. Is there no way to turn this off and instead run a scheduled job that indexes all pages and keywords during the night?

#37091

Feb 19, 2010 14:49

You can set indexingDelayAfterPublish to a very long time and make sure that your app pool recycles before it happens to prevent indexing.
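
As a rough illustration, the setting could look something like the fragment below. This is a sketch under assumptions: the attribute names are the ones mentioned in this thread, but the placement on the siteSettings element, the time-span notation, and the one-day value should all be checked against your own web.config and EPiServer version. All other siteSettings attributes are left out here.

```xml
<!-- Sketch only. Assumes indexingDelayAfterPublish takes a .NET time span;
     "1.00:00:00" would then mean one day. Keep your existing attributes. -->
<siteSettings
    indexingDelayAfterPublish="1.00:00:00"
    indexingTextEnabled="true" />
```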

I have created a tool to reindex the pages that could very easily be turned into a scheduled task.

http://blog.fredrikhaglund.se/blog/2010/02/19/tool-to-re-index-all-episerver-pages/

#37099

Edited, Feb 19, 2010 19:20

marcus.hansson@cejn.com

Now I am unsure how EPiServer handles the default search, but I imagined that it uses the indexed keywords to search in pages, the same index we had to turn off due to the massive number of exceptions. But after using the code you provided I still get no results when using search. Currently I can only get search results from pages that were published before I set indexingTextEnabled="false".

So how does it work?

Oh, and I also get this exception when running the code:
Error while indexing -1
System.UriFormatException: Invalid URI: The hostname could not be parsed.
   at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
   at EPiServer.Url.EnsureUri()
   at EPiServer.DataAbstraction.SoftLink.set_Url(String value)
   at EPiServer.LazyIndexer.IndexPage(Int32 pageID)
   at EPiServerSample.Plugin.FixIndexing.ThreadProc(Object stateInfo) in FixIndexing.aspx.cs:line 74

#37145

Edited, Feb 22, 2010 14:54

Hi!

The routine I wrote calls EPiServer.LazyIndexer.IndexPage(), which is the same method that is triggered during publish, so indexingTextEnabled also affects it.

The parsing and registration of soft links is quite sensitive to garbage in the HTML code and to invalid URLs.

Is this a migrated database from EPiServer 4?

#37176

Feb 23, 2010 12:21

marcus.hansson@cejn.com

Hi, I think I jumped to a hasty conclusion. When I checked the website today, I could actually search all the pages that were previously not indexed. It seems to have taken some time for the indexing to take effect, but I guess it works now.
I also tried indexingDelayAfterPublish, and it works wonders too.
To answer your question: yes, this is a database migrated from 4.62b.

So thank you for your help.

#37180

Feb 23, 2010 13:09