Tuesday, December 20, 2016

SharePoint 2013: An unrecognized HTTP response was received when attempting to crawl this item

I got these two error messages:
The start address http://somesite cannot be crawled.
Context: Application 'Search_Service_Application', Catalog 'Portal_Content'
Details:
An unrecognized HTTP response was received when attempting to crawl this item. Verify whether the item can be accessed using your browser.   (0x80041204)
  
The start address https://somesite cannot be crawled.
Context: Application 'Search_Service_Application', Catalog 'Portal_Content'
Details:
Item not crawled due to one of the following reasons: Preventive crawl rule; Specified content source hops/depth exceeded; URL has query string parameter; Required protocol handler not found; Preventive robots directive. (0x80040d07)

The customer had already recreated the search, tried various changes to the content sources, and fixed the typical missing permissions, but was still stuck with a broken search. After taking a look at the settings and the web.config, I found this:
<httpProtocol>
  <customHeaders>
    <add name="X-Content-Type-Options" value="nosniff" />
    <add name="X-MS-InvokeApp" value="1; RequireReadOnly" />
  </customHeaders>
</httpProtocol>

What do these lines do?
1. <add name="X-Content-Type-Options" value="nosniff" />
Every file is served with a declared MIME type (the Content-Type header), which can differ from what the file actually contains. Without this header, Internet Explorer inspects ("sniffs") the content to decide whether the file should be handled differently, and may pick a different application or handling than the declared type suggests. This is a security issue. For example, if you upload a modified JPEG with a script embedded, IE may detect the script, conclude that the content does not match the declared image type, and run the script. The value "nosniff" switches this sniffing off, so IE sticks to the declared MIME type.

2. <add name="X-MS-InvokeApp" value="1; RequireReadOnly" />
With InvokeApp, Internet Explorer can start an application (such as Office) and hand the URL over to that application. With "RequireReadOnly" set in this line, the file only opens in a read-only state.

These lines can also cause problems with the search, and removing them fixed my issue. Removing them is fine as long as you are not running an external website in your SharePoint environment. As soon as you allow anonymous access, you should put them back in.
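
If you would rather script the change than edit the web.config by hand, something like the following Python sketch could work. It is only an illustration: the function name strip_custom_headers and the sample configuration are my own assumptions, not something from the customer's environment. Note that ElementTree drops XML comments when parsing, so keep a backup of the original web.config and compare the result before using it.

```python
import xml.etree.ElementTree as ET

# Header names taken from the post; adjust the set if your web.config differs.
HEADERS_TO_REMOVE = {"X-Content-Type-Options", "X-MS-InvokeApp"}


def strip_custom_headers(web_config_xml, names=HEADERS_TO_REMOVE):
    """Return (new_xml, removed_names) with matching <add> entries removed
    from every <customHeaders> element. Hypothetical helper, not a SharePoint API."""
    root = ET.fromstring(web_config_xml)
    wanted = {n.lower() for n in names}  # HTTP header names are case-insensitive
    removed = []
    for custom_headers in root.iter("customHeaders"):
        # Copy the child list first so removing entries does not disturb iteration.
        for add in list(custom_headers.findall("add")):
            name = add.get("name", "")
            if name.lower() in wanted:
                custom_headers.remove(add)
                removed.append(name)
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    # Sample input only; a real web.config contains far more than this.
    sample = """<configuration>
  <system.webServer>
    <httpProtocol>
      <customHeaders>
        <add name="X-Content-Type-Options" value="nosniff" />
        <add name="X-MS-InvokeApp" value="1; RequireReadOnly" />
      </customHeaders>
    </httpProtocol>
  </system.webServer>
</configuration>"""
    new_xml, removed = strip_custom_headers(sample)
    print("Removed:", removed)
    print(new_xml)
```

The function returns the names it removed, so you have a record of what to put back in if the site is later opened for anonymous access.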