Debugging SharePoint Search with PnP PowerShell and Crawl Logs
Summary
- Investigates missing search results despite crawl entries with no errors.
- Correlates blank SPItemModifiedTime entries with unsearchable files (some items with a blank value are still indexed, so additional checks are required).
- Provides a PnP PowerShell script to detect affected items at scale.
- Shares remediation and prevention guidance.
Table of Contents
- Background
- Symptoms
- Investigation
- Tenant-wide Detection Script
- How the Script Works
- Results
- Root Cause
- Fix
- Prevention Tips
- References
Background
A particular document library stopped returning many files in SharePoint search, even though the crawl log showed entries without errors. The affected files had been moved from one site collection to another where a default sensitivity label was configured at the destination library level.
Symptoms
- Search queries by path (for a specific folder) returned fewer files than expected.
- Example: A folder with ~140 files returned ~84 results; the issue scaled across ~6,000 files with ~1,500 affected.

- Crawl log entries existed, but many items showed a blank SPItemModifiedTime.
- Reindexing the site or the library did not resolve the issue.
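The folder-level discrepancy can be reproduced with a path-scoped query. The following is a minimal sketch, not the exact query used in the original investigation: New-PathQuery is a hypothetical helper (not a PnP cmdlet), the Policies folder URL is a placeholder, and the commented-out search call assumes an existing Connect-PnPOnline session.

```powershell
# Hypothetical helper (not part of PnP PowerShell): builds a quoted KQL Path restriction.
function New-PathQuery {
    param([Parameter(Mandatory)] [string] $Url)
    # Trim a trailing slash so the same folder always yields the same query text
    $clean = $Url.TrimEnd('/')
    return "Path:`"$clean`""
}

# Placeholder folder URL; replace with the folder under investigation
$kql = New-PathQuery -Url "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/Policies/"

# With a live connection, compare the number of results with the number of
# files the folder actually holds:
# $found = Submit-PnPSearchQuery -Query $kql -All -SelectProperties @("Title","Path")
# $found.RowCount
```

If the row count is well below the folder's file count, as in the ~84 of ~140 case above, the folder is a candidate for the deeper checks that follow.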
Investigation
- Scoped queries using Path:<site>/<library>/<folder> revealed discrepancies.
- Correlated unreturned items with crawl log metadata showing that SPItemModifiedTime is blank.
- Needed to determine whether the issue was widespread, which called for tenant-level scanning.
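The correlation step amounts to a set difference between the files a folder should contain and the paths search returns. A minimal sketch, assuming both lists have already been collected as absolute URLs; Get-MissingFromSearch and the sample URLs are hypothetical.

```powershell
# Hypothetical helper: returns expected URLs that are absent from the search results.
function Get-MissingFromSearch {
    param(
        [string[]] $ExpectedUrls,
        [string[]] $ReturnedUrls
    )
    # Case-insensitive set, since SharePoint URLs are not case-sensitive
    $returned = [System.Collections.Generic.HashSet[string]]::new(
        [System.StringComparer]::OrdinalIgnoreCase)
    foreach ($u in $ReturnedUrls) { [void]$returned.Add($u) }

    # Emit every expected URL that search did not return
    $ExpectedUrls | Where-Object { -not $returned.Contains($_) }
}

# Placeholder data: two files expected, one returned (with different casing)
$expected = @(
    "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/a.docx",
    "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/b.docx"
)
$returned = @("https://contoso.sharepoint.com/sites/HR-Policy/Shared Documents/a.docx")

$missing = Get-MissingFromSearch -ExpectedUrls $expected -ReturnedUrls $returned
$missing
```

The URLs left in $missing are the ones worth looking up in the crawl log.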
Tenant-wide Detection Script
Use PnP PowerShell to query crawl logs per library, identify entries with blank SPItemModifiedTime, and verify whether the items are actually absent from search results.
# Sample: narrow to a specific library
Get-PnPSearchCrawlLog -Filter "https://contoso.sharepoint.com/sites/hr-policy/Shared%20Documents" -RowLimit 10000 -RawFormat
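Before scanning further, it can help to see how many entries for a library carry a blank SPItemModifiedTime. The sketch below is illustrative only: Get-BlankModifiedSummary is a hypothetical helper, and the sample objects stand in for real crawl log entries, mimicking just the two properties used here.

```powershell
# Hypothetical helper: counts crawl log entries with and without SPItemModifiedTime.
function Get-BlankModifiedSummary {
    param([object[]] $Entries)
    # Drop null elements so an empty crawl log yields zero counts
    $all   = @($Entries | Where-Object { $null -ne $_ })
    # Treat null, empty, and whitespace-only values as blank
    $blank = @($all | Where-Object { [string]::IsNullOrWhiteSpace($_.SPItemModifiedTime) })
    [pscustomobject]@{
        Total     = $all.Count
        Blank     = $blank.Count
        Populated = $all.Count - $blank.Count
    }
}

# Stand-in objects; real entries come from Get-PnPSearchCrawlLog ... -RawFormat
$sample = @(
    [pscustomobject]@{ FullUrl = "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/a.docx"; SPItemModifiedTime = $null },
    [pscustomobject]@{ FullUrl = "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/b.docx"; SPItemModifiedTime = "2024-01-15T09:30:00Z" },
    [pscustomobject]@{ FullUrl = "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/c.docx"; SPItemModifiedTime = "" }
)

$summary = Get-BlankModifiedSummary -Entries $sample
$summary
```

A high Blank count is only a signal. Because some blank entries are still indexed, each one needs a follow-up search query before it is treated as affected.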
The following script scales the approach across multiple sites. It reads site URLs from Sites.csv (a CSV file with a SiteUrl column, stored next to the script) and exports a report of potentially unsearchable items.
Clear-Host
# ===== Settings =====
$clientId = "xxxxxxxx"   # client ID of the app registration used to sign in
$dateTime = Get-Date -Format "yyyy-MM-dd-HH-mm-ss"
$directoryPath = $PSScriptRoot   # folder containing this script (run it from a .ps1 file)
$csvPath = Join-Path $directoryPath "Sites.csv" # CSV must have a column 'SiteUrl'
# Ensure output folder exists
$outputFolder = Join-Path $directoryPath "output_files"
if (-not (Test-Path $outputFolder)) { New-Item -ItemType Directory -Path $outputFolder | Out-Null }
$outputCsv = Join-Path $outputFolder ("CrawlLog-SPItemModifiedTime-Null-" + $dateTime + ".csv")
# System/ignored lists
$ExcludedLists = @(
    "Access Requests","App Packages","appdata","appfiles","Apps in Testing","Cache Profiles","Composed Looks",
    "Content and Structure Reports","Content type publishing error log","Converted Forms","Device Channels",
    "Form Templates","fpdatasources","Get started with Apps for Office and SharePoint","List Template Gallery",
    "Long Running Operation Status","Maintenance Log Library","Images","site collection images","Master Docs",
    "Master Page Gallery","MicroFeed","NintexFormXml","Quick Deploy Items","Relationships List","Reusable Content",
    "Reporting Metadata","Reporting Templates","Search Config List","Site Assets","Preservation Hold Library",
    "Site Pages","Solution Gallery","Style Library","Suggested Content Browser Locations","Theme Gallery",
    "TaxonomyHiddenList","User Information List","Web Part Gallery","wfpub","wfsvc","Workflow History",
    "Workflow Tasks","Pages"
)
# ===== Collect results =====
$results = New-Object System.Collections.Generic.List[object]
$sites = Import-Csv -Path $csvPath # expects column "SiteUrl"
foreach ($s in $sites) {
    $siteUrl = $s.SiteUrl
    Write-Host "Connecting to site: $siteUrl" -ForegroundColor Cyan
    # Connect interactively with the client ID (adjust auth as needed for your tenant)
    Connect-PnPOnline -ClientId $clientId -Url $siteUrl -Interactive
    # Get only visible document libraries that are not on the exclusion list
    $lists = Get-PnPList -Includes BaseType, BaseTemplate, Hidden, Title, ItemCount, RootFolder |
        Where-Object {
            $_.Hidden -eq $false -and
            $_.BaseType -eq "DocumentLibrary" -and
            $_.Title -notin $ExcludedLists
        }
    foreach ($library in $lists) {
        # Build library URL: e.g. https://tenant/sites/site/Shared Documents
        $libraryUrl = $siteUrl.TrimEnd('/') + '/' + $library.RootFolder.Name
        Write-Host "Querying library: $($library.Title)" -ForegroundColor Yellow
        # Request one crawl log row per library item; an empty library has nothing to check
        $rowLimit = $library.ItemCount
        if ($rowLimit -lt 1) { continue }
        # Pull crawl log entries; keep those with a null or empty SPItemModifiedTime
        $entries = Get-PnPSearchCrawlLog -Filter $libraryUrl -RowLimit $rowLimit -RawFormat |
            Where-Object { [string]::IsNullOrWhiteSpace($_.SPItemModifiedTime) }
        # Drop the library root itself, .aspx pages (including Forms/Default.aspx), and OneNote files
        $candidates = $entries | Where-Object {
            $_.FullUrl -ne $libraryUrl -and
            $_.FullUrl -notlike "*.aspx*" -and
            $_.FullUrl -notlike "*.one*"
        }
        foreach ($entry in $candidates) {
            try {
                # Query search for this exact item URL; no rows means the item is not searchable
                $kql = "Path:`"$($entry.FullUrl)`""
                $searchResult = Submit-PnPSearchQuery -Query $kql -All -SelectProperties @("Title","Path")
                if ($searchResult.RowCount -lt 1) {
                    # Record the item for the report
                    $results.Add([pscustomobject]@{
                        FullUrl            = $entry.FullUrl
                        DocumentUrl        = $libraryUrl   # URL of the library that contains the item
                        SPItemModifiedTime = $entry.SPItemModifiedTime
                        ErrorCode          = $entry.ErrorCode
                    })
                }
            }
            catch {
                Write-Error "$($_.Exception.Message) for $($entry.FullUrl)"
            }
        }
    }
    # Disconnect-PnPOnline
}
# ===== Export =====
$results | Export-Csv -Path $outputCsv -NoTypeInformation -Encoding UTF8
Write-Host "Export complete: $outputCsv" -ForegroundColor Green
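For reference, the input file is a plain CSV with a single SiteUrl header. The sketch below writes a sample file to the temp folder and reads it back with Import-Csv; the two site URLs and the temp location are placeholders chosen for illustration.

```powershell
# Write a two-site sample Sites.csv (placeholder URLs) to the temp folder
$sampleCsv = Join-Path ([System.IO.Path]::GetTempPath()) "Sites.csv"
@(
    [pscustomobject]@{ SiteUrl = "https://contoso.sharepoint.com/sites/hr-policy" },
    [pscustomobject]@{ SiteUrl = "https://contoso.sharepoint.com/sites/finance" }
) | Export-Csv -Path $sampleCsv -NoTypeInformation -Encoding UTF8

# Read it back with Import-Csv and list the site URLs
$sampleSites = Import-Csv -Path $sampleCsv
$sampleSites | ForEach-Object { $_.SiteUrl }
```

For a real run, the file would sit in the same folder as the script.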
How the Script Works
- Reads site URLs from Sites.csv and connects using Connect-PnPOnline -Interactive.
- Iterates visible document libraries, skipping common system libraries.
- Retrieves crawl log entries via Get-PnPSearchCrawlLog and filters items where SPItemModifiedTime is blank.
- Uses Submit-PnPSearchQuery with Path:"<item-url>" to verify if the item is missing from results.
- Exports a CSV of affected items for remediation.
Results
- Confirmed widespread presence of items with blank SPItemModifiedTime that were not returned by search.
- Reindexing was attempted at site and library levels but did not resolve the issue.
Root Cause
- During file moves from a library without a default sensitivity label to a library with a default sensitivity label, a conflict occurred.
- According to the Microsoft support engineer, the sensitivity label surfaced as a duplicated managed property on certain documents.
- Duplicate or failed application of label metadata prevented proper indexing, leading to blank SPItemModifiedTime in crawl logs and missing search results.
Fix
- Remove the existing sensitivity label on affected files and allow the destination library’s default sensitivity label to apply.
- After remediation, verify indexing via crawl logs and confirm searchability with KQL path queries.
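The searchability check can be rerun against the exported report once labels have been corrected. This is a rough sketch rather than a tested remediation tool: Get-StillUnsearchable is a hypothetical helper, the rows are placeholders, and the search call is passed in as a script block so the logic can be tried with a stub before pointing it at a tenant.

```powershell
# Hypothetical helper: returns the report rows whose FullUrl still yields no search results.
# $CountResults is a script block that takes a KQL string and returns a row count.
function Get-StillUnsearchable {
    param(
        [object[]] $ReportRows,
        [Parameter(Mandatory)] [scriptblock] $CountResults
    )
    foreach ($row in $ReportRows) {
        # Same per-item path query the detection script uses
        $kql = "Path:`"$($row.FullUrl)`""
        $count = & $CountResults $kql
        if ($count -lt 1) { $row }
    }
}

# Stub run with placeholder rows: pretend only a.docx is searchable again
$rows = @(
    [pscustomobject]@{ FullUrl = "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/a.docx" },
    [pscustomobject]@{ FullUrl = "https://contoso.sharepoint.com/sites/hr-policy/Shared Documents/b.docx" }
)
$stillMissing = Get-StillUnsearchable -ReportRows $rows -CountResults {
    param($kql)
    if ($kql -like '*a.docx"') { 1 } else { 0 }
}
$stillMissing

# Against a tenant (assumes an active Connect-PnPOnline session), load the exported
# report and count real results instead:
# $rows = Import-Csv -Path <path to the exported report>
# Get-StillUnsearchable -ReportRows $rows -CountResults {
#     param($kql) (Submit-PnPSearchQuery -Query $kql -All).RowCount
# }
```

Items may not become searchable immediately after the label change, so rows that remain in the output are worth rechecking once the next crawl has processed them.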