Exploring the Sitecore PowerShell Extension for Media Cleanup
If it Works, it's good enough...!
Recently, I’ve been delving into the Sitecore PowerShell Extension (SPE). I was tasked with a particular requirement: a content author wanted to periodically clean up the media folder. My job was to create a script that would traverse the parent node, verify whether a media item is being used somewhere, and based on that, decide whether to remove the media.
A few points I understood were:
Media can be anything: an image, PDF, etc.
Some images might be more under developer control than CMS control, so I had to ensure I didn’t touch those media items.
Perfect code is a myth, so backup and logging were the most crucial parts of the script.
To carry out the above idea, I started with a few static variables and helper methods, as shown below:
# Collects the log lines for the whole PowerShell run.
$ProcessLogs = @()
# Collects the log lines for the items that should be removed.
$RemovableItemLogs = @()
# Items we are going to delete; we keep them here so they can be backed up.
$Downloadableitems = @()
# Root node from which we start traversing the tree.
$scanningNode = Get-Item -Path "master:/sitecore/media library"
# Branch to skip while iterating; its items must not be deleted, regardless of usage.
$excludeNode = "master:/sitecore/media library/BH"
$location = Get-Location
# Two-digit day and 24-hour clock, so file names sort correctly and do not collide.
$time = Get-Date -Format "yyyy-MM-dd_HHmmss"
$zipName = "MediaCleanupActivity"
# $SitecoreDataFolder gives you the path to the App_Data folder in your wwwroot.
# App_Data already holds a lot of files, so we keep our output in a dedicated folder.
# -Force keeps the call from failing when the folder already exists.
New-Item -Path "$($SitecoreDataFolder)" -Name "PS_MediaCleanUp" -ItemType "directory" -Force | Out-Null
# Path constants used by the rest of the script.
$basePath = "$($SitecoreDataFolder)\PS_MediaCleanUp\"
$zipPath = "$($basePath)$zipName-$time.zip"
$ProcessLogPath = "$($basePath)SP_ProcessLog-$time.txt"
$RemovableItemLogsPath = "$($basePath)SP_RemovableItemLogs-$time.txt"
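The File-Log helper used throughout the script is not shown in this post. As a minimal sketch, assuming it only needs to return a timestamped line that the caller appends to a log array, it could look like the following. The body and the timestamp format are assumptions, not the original helper.

```powershell
# Hypothetical stand-in for the File-Log helper; the original implementation is not shown.
function File-Log {
    param([string]$Message)
    # Prefix each message with a sortable, 24-hour timestamp.
    $stamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
    return "[$stamp] $Message"
}

# The caller decides which log array receives the line.
$ProcessLogs = @()
$ProcessLogs += File-Log "Scan started"
$ProcessLogs += File-Log "Scan finished"
$ProcessLogs.Count   # 2
```

Returning the line instead of writing it directly keeps the helper free of side effects, which fits the way the script appends to $ProcessLogs and $RemovableItemLogs separately.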
Once we have everything ready, let’s start with the process to get all the children we have for the scanning node.
$NodeRecurse = $scanningNode | Get-ChildItem -Path master:
foreach ($nodeItem in $NodeRecurse) {
    $ProcessLogs += File-Log "Node Item :- $($nodeItem.ItemPath)"
    Search-MediaRefferWithExclusion -scanningItem $nodeItem -excludeItems $excludeNode -Target $Target -Languages $Languages
}
Of course, the loop above only visits the first level of children, not the descendants. The complete iteration down to the leaf nodes happens through recursion inside Search-MediaRefferWithExclusion, shown next.
function Search-MediaRefferWithExclusion ($scanningItem, $excludeItems, $Target, $Languages) {
    # Skip the item, and everything below it, when it sits in an excluded branch.
    foreach ($excludedItem in $excludeItems) {
        $excludedNode = Get-Item -Path $excludedItem
        if (($excludedNode.ID -eq $scanningItem.ID) -or ($scanningItem.ItemPath.StartsWith($excludedNode.ItemPath))) {
            $script:ProcessLogs += File-Log "Skipping branch as it's in exclude list $($excludedNode.ItemPath)."
            return
        }
    }
    if ($scanningItem.TemplateName -eq "Image") {
        # @(...) always yields an array, so .Count is reliable for zero, one, or many referrers.
        $itemLinks = @(Get-ItemReferrer -Item $scanningItem | Select-Object -Property Name)
        $itemcount = $itemLinks.Count
        if ($itemcount -eq 0) {
            # Remember the item for the backup zip, then remove it.
            $script:Downloadableitems += $scanningItem
            $scanningItem | Remove-Item
            $script:ProcessLogs += File-Log "Delete image $($scanningItem.ItemPath)."
            $script:RemovableItemLogs += File-Log "Delete image $($scanningItem.ItemPath)."
        }
        else {
            $script:ProcessLogs += File-Log "Skip image $($scanningItem.ItemPath) as it has $($itemcount) references."
        }
    }
    # Recurse into the children until we reach the leaf nodes.
    $NodeRecurse = $scanningItem | Get-ChildItem -Path master:
    foreach ($nodeItem in $NodeRecurse) {
        $script:ProcessLogs += File-Log "Node Item in recursion :- $($nodeItem.ItemPath)"
        Search-MediaRefferWithExclusion -scanningItem $nodeItem -excludeItems $excludeItems -Target $Target -Languages $Languages
    }
}
The Search-MediaRefferWithExclusion method first checks whether the node or item is part of the exclusion list. If not, it proceeds to check the item's references and, when there are none, marks it for deletion. This function is recursive: the first foreach loop checks the exclusion list, the if block that follows checks the media type and whether the item is being used, and the last foreach loop calls the same method on each child until we reach the leaf nodes.
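The control flow described above can be tried without a Sitecore instance. The sketch below walks a small in-memory tree and only collects paths instead of deleting anything. New-Node, Find-Unused, and the sample paths are hypothetical names made up for illustration, and a node without children stands in for a media item. It also appends "/" before the prefix comparison, an assumption added here so that a sibling such as BHX.png is not treated as part of the BH branch.

```powershell
# Hypothetical in-memory node: a path, a referrer count, and child nodes.
function New-Node ([string]$Path, [int]$Referrers = 0, [object[]]$Children = @()) {
    [pscustomobject]@{ Path = $Path; Referrers = $Referrers; Children = $Children }
}

# Emits the paths of unreferenced leaf nodes, skipping excluded branches.
function Find-Unused ($Node, [string[]]$ExcludePaths) {
    foreach ($excluded in $ExcludePaths) {
        # Match the branch root itself or anything below it.
        if ($Node.Path -eq $excluded -or $Node.Path.StartsWith("$excluded/")) { return }
    }
    # A leaf with no referrers is a removal candidate.
    if ($Node.Children.Count -eq 0 -and $Node.Referrers -eq 0) { $Node.Path }
    # Recurse into the children until the leaves are reached.
    foreach ($child in $Node.Children) { Find-Unused $child $ExcludePaths }
}

$tree = New-Node "/media" 1 @(
    (New-Node "/media/BH" 0 @( (New-Node "/media/BH/logo.png" 0) )),
    (New-Node "/media/BHX.png" 0),
    (New-Node "/media/used.png" 2)
)

$unused = @(Find-Unused $tree @("/media/BH"))
$unused   # /media/BHX.png
```

Collecting candidates first and deleting in a separate step is also a convenient way to do a dry run before the real cleanup.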
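Keep in mind that Get-ItemReferrer, piped through Select-Object here, returns PSCustomObject entries rather than a guaranteed array: a single referrer comes back as one bare object. Coerce the result to an array before reading .Count, so the check behaves the same for zero, one, or many referrers. This is what the script does when it computes $itemcount.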
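To see why the array coercion matters, here is a small sketch in plain PowerShell. Get-FakeReferrer is a hypothetical stand-in for any cmdlet that emits zero, one, or many objects; it is not part of SPE.

```powershell
# Hypothetical stand-in for a cmdlet that emits zero, one, or many objects.
function Get-FakeReferrer ([int]$Count) {
    for ($i = 1; $i -le $Count; $i++) {
        [pscustomobject]@{ Name = "Referrer$i" }
    }
}

# The array subexpression operator @(...) always produces an array.
$none = @(Get-FakeReferrer 0)
$one  = @(Get-FakeReferrer 1)
$many = @(Get-FakeReferrer 3)

"$($none.Count) $($one.Count) $($many.Count)"   # 0 1 3
```

With the wrapper in place, a comparison such as `-eq 0` means "no referrers" in every case.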
Once this method has run, we have an array with the list of items that we have to zip up for backup. The method below takes the items in $Downloadableitems and zips them into the path we defined above in $zipPath.
function prepare-ZipItems( $zipArchive, $sourcedir ) {
    Set-Location $sourcedir
    # System.IO.Packaging lives in the WindowsBase assembly.
    [System.Reflection.Assembly]::Load("WindowsBase,Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35") > $null
    $ZipPackage = [System.IO.Packaging.ZipPackage]::Open($zipArchive, [System.IO.FileMode]::OpenOrCreate, [System.IO.FileAccess]::ReadWrite)
    [byte[]]$buff = New-Object byte[] 40960
    $i = 0
    foreach ($item in $Downloadableitems) {
        $i++
        if ([Sitecore.Resources.Media.MediaManager]::HasMediaContent($item)) {
            $mediaItem = New-Object "Sitecore.Data.Items.MediaItem" $item
            $mediaStream = $mediaItem.GetMediaStream()
            # Build the entry name from the item's path relative to $sourcedir.
            $fileName = Resolve-Path -Path $item.ProviderPath -Relative
            $fileName = "$fileName.$($item.Extension)".Replace("\", "/").Replace("./", "/")
            "Added: $fileName"
            Write-Progress -Activity "Zipping Files " -CurrentOperation "Adding $fileName" -Status "$i out of $($Downloadableitems.Length)" -PercentComplete ($i * 100 / $Downloadableitems.Length)
            $partUri = New-Object System.Uri($fileName, [System.UriKind]::Relative)
            $partUri = [System.IO.Packaging.PackUriHelper]::CreatePartUri($partUri)
            $part = $ZipPackage.CreatePart($partUri, "application/zip", [System.IO.Packaging.CompressionOption]::Maximum)
            $stream = $part.GetStream()
            # Copy the media stream into the zip part in 40 KB chunks.
            do {
                $count = $mediaStream.Read($buff, 0, $buff.Length)
                $stream.Write($buff, 0, $count)
            } while ($count -gt 0)
            $stream.Close()
            $mediaStream.Close()
        }
    }
    $ZipPackage.Close()
}
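As an aside, the same stream-to-zip step can be written with System.IO.Compression instead of System.IO.Packaging. This is a different technique from the one the script uses, shown only as a sketch and assuming .NET Framework 4.5 or later is available. Add-StreamToZip and the temp file name are hypothetical, and an in-memory stream stands in for $mediaItem.GetMediaStream().

```powershell
Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: adds one stream to a zip archive under the given entry name.
function Add-StreamToZip ([string]$ZipPath, [string]$EntryName, [System.IO.Stream]$Source) {
    # Update mode opens an existing archive or creates a new one.
    $zip = [System.IO.Compression.ZipFile]::Open($ZipPath, [System.IO.Compression.ZipArchiveMode]::Update)
    try {
        $entry = $zip.CreateEntry($EntryName, [System.IO.Compression.CompressionLevel]::Optimal)
        $target = $entry.Open()
        # CopyTo replaces the manual read/write buffer loop.
        try { $Source.CopyTo($target) } finally { $target.Dispose() }
    }
    finally {
        # Disposing the archive flushes it to disk even if an entry failed.
        $zip.Dispose()
    }
}

# Usage with fake bytes in place of a real media stream.
$demoZip = Join-Path ([System.IO.Path]::GetTempPath()) ("media-backup-" + [guid]::NewGuid() + ".zip")
$bytes = [System.Text.Encoding]::UTF8.GetBytes("fake image bytes")
$source = [System.IO.MemoryStream]::new($bytes)
Add-StreamToZip $demoZip "images/logo.png" $source
$source.Dispose()
```

The try/finally blocks make sure the archive is closed when a single item fails, which matters for a scheduled job that nobody is watching.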
At the end, we write the collected logs to disk with PowerShell's Out-File cmdlet.
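A minimal sketch of that last step could look like this. A temp folder stands in for $SitecoreDataFolder so the snippet runs anywhere, and the two log lines are made-up samples in the script's format.

```powershell
# A temp folder replaces $SitecoreDataFolder here (assumption for the demo).
$basePath = Join-Path ([System.IO.Path]::GetTempPath()) "PS_MediaCleanUp"
New-Item -Path $basePath -ItemType Directory -Force | Out-Null

$time = Get-Date -Format "yyyy-MM-dd_HHmmss"
$ProcessLogPath = Join-Path $basePath "SP_ProcessLog-$time.txt"

# Sample log lines; in the real script these come from File-Log.
$ProcessLogs = @(
    "Node Item :- /sitecore/media library/Images",
    "Skip image /sitecore/media library/Images/banner as it has 2 references."
)

# Each array element becomes one line in the file.
$ProcessLogs | Out-File -FilePath $ProcessLogPath -Encoding utf8
```

The removable item log would be written the same way, with $RemovableItemLogs and $RemovableItemLogsPath.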
At this stage, we will have three items under “App_Data/PS_MediaCleanUp”: the process logs, the removable item logs, and the backup images in a zip file.
I have scheduled this script with the Sitecore scheduler for automation, so it does not need much user interaction.
However, if we want to run this script remotely and get the data through the browser, we can use the code below.
Download-File -FullName $zipPath > $null
The above command will download the zip file in the browser.
You can find the complete script in GitHub Repo…!