
Best practice - how to delete and archive pages

Many of our editors feel uneasy about deleting pages in EPi and choose to unpublish them instead. This is partly because they do not know whether they will reuse the files in the future, and partly because they want to be able to see what they have previously published. Over time this fills the tree with unnecessary old material that clutters it.

I would like the editors to feel completely confident when deleting pages and be sure that they can find their old stuff.

Any ideas how this can be achieved? I'm thinking of something like a connection to some kind of archive that regularly and automatically makes copies of the site.

#190197
Apr 04, 2018 17:27
Possible suggestions

  • Make a copy of the website in the page tree and then hook into the Episerver content events. You could hook into the deleting event and, instead of deleting, cancel the action and move the content instead (see the sketch after this list).
  • Alternatively, create a scheduled job that runs, say, every hour and duplicates your site (that could be intensive if you have a large site structure).
  • Or you could look at actually modifying the tree, as in https://tedgustaf.com/blog/2013/hide-pages-in-the-page-tree-in-episerver-7/, and maybe leave pages unpublished but hide them from the tree menu. You could then add an option to the user settings or the Episerver UI to toggle this, or only show unpublished pages to admins rather than editors, so that unpublishing becomes part of an approval sequence. This means no work in moving things, but it will result in a cleaner UI, as most of the time editors will only see current pages.
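
To make the first suggestion concrete, here is a minimal sketch of an initialization module that hooks the DeletingContent event, cancels the delete and moves the content to an archive node instead. Treat it as a starting point rather than a finished solution: the archive ContentReference (1234) is a placeholder, and it is worth verifying that moving content from inside the Deleting handler behaves as expected in your Episerver version.

```
// Minimal sketch: cancel deletes and move the content to an "Archive" node instead.
// Assumption: an archive container page already exists; 1234 below is a placeholder ID.
using EPiServer;
using EPiServer.Core;
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.Security;
using EPiServer.ServiceLocation;

[InitializableModule]
[ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
public class ArchiveInsteadOfDeleteModule : IInitializableModule
{
    public void Initialize(InitializationEngine context)
    {
        ServiceLocator.Current.GetInstance<IContentEvents>().DeletingContent += OnDeletingContent;
    }

    public void Uninitialize(InitializationEngine context)
    {
        ServiceLocator.Current.GetInstance<IContentEvents>().DeletingContent -= OnDeletingContent;
    }

    private void OnDeletingContent(object sender, ContentEventArgs e)
    {
        // Placeholder: ContentReference of the dedicated archive container page.
        var archiveRoot = new ContentReference(1234);

        // Don't interfere when the archive node itself is being deleted.
        if (ContentReference.IsNullOrEmpty(e.ContentLink) || archiveRoot.CompareToIgnoreWorkID(e.ContentLink))
        {
            return;
        }

        // Cancel the delete and move the content under the archive node instead.
        e.CancelAction = true;
        e.CancelReason = "Content is moved to the archive instead of being deleted.";

        ServiceLocator.Current.GetInstance<IContentRepository>()
            .Move(e.ContentLink, archiveRoot, AccessLevel.NoAccess, AccessLevel.NoAccess);
    }
}
```
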
#190222
Apr 05, 2018 10:24
Thanks for the great answers. You got me thinking - since we have loads and loads of pages, I think we really need to get rid of the unnecessary pages from EPi. So I guess hiding them or moving them to a (hidden) archive folder is not an option in the long run.

Maybe another approach could be to have a URL crawler (outside EPi) save all the HTML pages, or at least capture images of them - like the Wayback Machine or similar. Is there anyone "out there" who faces the same problems?

#190228
Apr 05, 2018 14:07
You could, but I guess it depends on how complex your data structure/block structure is and how easy it would be to restore these. What I would probably do is create another Episerver site off another domain/server and then take daily backups of the databases. Then, if you needed access to a specific version of the content for any given day, you could get it, and this would also allow you to use the admin import/export tool to export those pages and pull them and any associated media/blocks back into the current solution. If you wanted to go further, you could create an admin plugin which lists these backups and allows automatic restoration of a specific one by executing a PowerShell script or something. If you're on the DXC this may be difficult, but if you control the servers it's something you can definitely do.

If you don't have the DXC you could always programmatically hook into the import/export feature of Episerver and just generate an export of the site tree that could be used for re-importing too. I think it really depends on whether you want editors/admins to self-action this feature or whether it's something you want to do yourself.
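
If it helps, here is a rough sketch of what such a scheduled export job could look like, using the EPiServer.Enterprise export API. The backup path and the -1 "all levels" value are assumptions, and the ExportSource/ExportOptions members vary a little between versions, so check them against your installation.

```
// Rough sketch: a scheduled job that writes an export package of the page tree to disk.
// The backup folder is a placeholder, and the -1 "all levels" value for ExportSource
// is an assumption - verify the level constants in your Episerver version.
using System;
using System.Collections.Generic;
using System.IO;
using EPiServer.Core;
using EPiServer.Enterprise;
using EPiServer.PlugIn;
using EPiServer.Scheduler;
using EPiServer.ServiceLocation;

[ScheduledPlugIn(DisplayName = "Export site tree")]
public class ExportSiteTreeJob : ScheduledJobBase
{
    public override string Execute()
    {
        var exporter = ServiceLocator.Current.GetInstance<IDataExporter>();

        // Placeholder location for the generated .episerverdata package.
        var path = Path.Combine(@"C:\Backups",
            $"site-export-{DateTime.UtcNow:yyyyMMdd-HHmm}.episerverdata");

        using (var stream = new FileStream(path, FileMode.Create))
        {
            // Export everything below the start page.
            var sources = new List<ExportSource> { new ExportSource(ContentReference.StartPage, -1) };
            exporter.Export(stream, sources, new ExportOptions());
        }

        return $"Exported site tree to {path}";
    }
}
```

IDataImporter is the counterpart if you want to pull a package back in programmatically, and the same job could be extended to prune old export files.
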

#190230
Apr 05, 2018 14:25
Also, the import/export tool generates a very readable file, so maybe using this and then creating a simple viewer would allow you to programmatically export these files and import them as needed. Lots of ideas :-)

#190231
Apr 05, 2018 14:26
  • Make sure you assign responsibility for content maintenance to someone as a regular part of their job. They should have clear processes for dealing with out-of-date, irrelevant, and poorly performing content, and have the authority to get it done.
  • Get buy-in from senior stakeholders on the definitions of archiving and deleting content for your company. For example, the chief information officer of the Canadian government distinguishes between three types of web content:

    • Current content: Information that is up-to-date, relevant, and required.
    • Archived content: Information that is no longer current but is retained online for reference or to provide a context to current content.
    • Legacy (or deleted) content: Information that has been revised or supplanted and that has been deleted from the site and moved into a “corporate repository”.
  • Be clear on your company’s legal and technology requirements around archiving and deleting website content.
#215935
Edited, Jan 19, 2020 20:19