We've done two load-balanced environments using FTS. In one case, Server B read the VPP data from a Windows share on Server A, and in the other we used DFS to sync the two VPPs. In either case, each machine had its own service and its own catalogs.
Worked fine in both cases.
Thanks for the reply! But didn't you get any write-before-write problems when updating the index? Or did you use only one of the servers for updates?
I mean that Server A receives a call to the service to insert an item in the index. It reads the index file into memory and appends the new data. While it is doing so, Server B receives another call to add an item. It reads the index file, which is still in the same state as when Server A read it. Server B does its processing. Now, whichever of A and B writes the file last will overwrite the other server's changes. One of the changes will be lost.
This does not happen in the sync scenario, but there you instead have the problem with conflicting changes when merging, at least theoretically.
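To make the lost-update sequence above concrete, here is a minimal sketch in Python. It assumes the index is a single flat file that each server reads in full and rewrites in full; the JSON file and the helper names (`read_index`, `write_index`, `simulate_lost_update`) are hypothetical stand-ins, not the actual FTS file format or API.

```python
import json
import os
import tempfile


def read_index(path):
    """Load the whole (hypothetical) index file into memory."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def write_index(path, items):
    """Overwrite the whole (hypothetical) index file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f)


def simulate_lost_update():
    """Replay the interleaving: both servers read, then both write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.json")
        write_index(path, ["existing"])

        # Server A and Server B both read the same starting state.
        snapshot_a = read_index(path)
        snapshot_b = read_index(path)

        # Each appends its own item to its private in-memory copy.
        snapshot_a.append("item-from-A")
        snapshot_b.append("item-from-B")

        # A writes first, B writes last and overwrites A's change.
        write_index(path, snapshot_a)
        write_index(path, snapshot_b)

        return read_index(path)


result = simulate_lost_update()
print(result)
```

With this interleaving, the item added by Server A is missing from the final file, which is exactly the concern with two services writing to one shared index file without any locking.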
But the two machines each have their own index files. (At least I think they do. Now you have me nervous.)
But if they have their own index files, then they aren't really load-balanced; it's two separate sites (at least as far as search goes)? Or am I completely misunderstanding the setup?
In both cases, they are two separate machines, behind a hardware load-balancer. Each machine talks to the same database cluster.
In terms of VPP, they differed: one had a primary/slave set-up where one machine had the VPP files and the other accessed them via a Windows share. In the other situation, they both had their own VPP files, but they were synced in real time with DFS.
But, in either case, they each had their own set of Windows services that indexed files in the VPP as if each machine was the only machine in the world. They maintained their own indexes, which I assume were identical, since they were indexing the same data in the end.
When you came in to the load-balancer, you got sent to one of the machines. That machine had no knowledge of the other one, and operated as if it was the only machine in the world.
I think I understand now. The thing I was worried about was adding stuff to the index by calling the REST API. Say a user on my site adds a comment to a page, and I want to index that comment. If this is a single server with the FTS (EFS? what should it be called?) running locally, I would just call the API and add it. If I had two load-balanced servers, I would have to move the FTS to a third server, which would be called by whichever server handles the request that produces the content. One database for both servers, one index for both servers. So far so good.
The problem comes when the search load is too heavy for one server to handle. Ideally I would like a load-balanced setup for the search index where both (search)servers work with the same index. But this raises the concurrency problems since it's file based, not using a database.
To mimic your application though, the web server handling the content production request would simply call BOTH of the (search)servers, each holding its own separate index, to add the item to the index. Then incoming searches are handled by either one of the two servers, whose indexes are functionally identical (even though items may have been added in a different order).
I suppose that works. Though I would have preferred that the two servers could synchronize themselves, so that the add request, like the search request, could go to the load balancer and eventually hit either one of the search servers, which would then propagate the new item to the other search server(s).
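The "call both servers" approach could be sketched roughly as below. This is only an illustration in Python: `InMemorySearchServer` and `add_to_all` are hypothetical names standing in for the real search services and whatever client call adds an item, and a real setup would make an HTTP call to each server instead of updating a dictionary.

```python
class InMemorySearchServer:
    """Hypothetical stand-in for one search server with its own index."""

    def __init__(self, name):
        self.name = name
        self.index = {}  # item id -> indexed text

    def add(self, item_id, text):
        # Re-adding the same id simply replaces the entry.
        self.index[item_id] = text

    def search(self, term):
        term = term.lower()
        return sorted(
            item_id
            for item_id, text in self.index.items()
            if term in text.lower()
        )


def add_to_all(servers, item_id, text):
    """Send the same add request to every server; return names that failed."""
    failed = []
    for server in servers:
        try:
            server.add(item_id, text)
        except Exception:
            # A failed server now lags behind and needs a retry later.
            failed.append(server.name)
    return failed


server_a = InMemorySearchServer("A")
server_b = InMemorySearchServer("B")
failed = add_to_all([server_a, server_b], "comment-1", "Great article on search")
print(failed, server_a.search("search"), server_b.search("search"))
```

The weak spot in this design is the failure case: if one call fails, the indexes drift apart until the item is retried, so the caller has to keep track of which servers missed which items.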
But why would you need to call both machines when you add something? If you add it to one, wouldn't it eventually get to both?
How would it, if they're using their own index files? It would require them to read and write from the same file (on a network share), which raises the write-before-write problem. Either that, or the files from both servers would somehow have to be merged now and then.
Or am I completely missing something? Is perhaps every indexed item stored in its own file? I assumed it was one large flat file for the whole index.
I got this cleared up by EPiServer. Either I have to call the service on both servers to add it to the separate indexes, or I have to use a mechanism to synchronize the data files. The second alternative is recommended and is what is used in EPiServer Everweb in their SAN (I don't know the details of how it's synced, but we're using Everweb so I don't have to care ;) )
How should the FTS be set up in a load-balanced environment? I suppose we can't put the index data files on a fileshare and run the service on each web front and have them update and/or read from the same file? I guess we could run the index on only one of the web fronts but that kind of kills the load-balanced scenario.
Can the FTS be set up to run in its own application, on its own server? Does that require an additional licence? If it can be set up this way, how would you go about load-balancing the "search server"?