Advanced Crawler Settings
Depending on your website, you might have some special requirements that can only be fulfilled by accessing our advanced settings, which you will find under Website Crawling or Sitemap Indexing.
How do I index secure or password-protected content?
If you have password-protected content that you'd like to include in your search results, you need to authenticate our crawler so we're able to access the secure pages. There are several options:
If you use HTTP Basic Authentication, simply fill out a username and password.
If you have a custom login page, use the Custom Login Screen settings instead.
Set a cookie to authenticate our crawler.
Whitelist our crawler's IP addresses so it can access all pages without a login (under Firewall > Tools).
Provide a special sitemap.xml with deep links to the hidden content.
Detect our crawler by one of the following User Agent strings in the HTTP header:
Mozilla/5.0 (compatible; SiteSearch360/1.0; +https://sitesearch360.com/)
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36
Push your content to our HTTP REST API.
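The User Agent option above can be sketched as a small server-side check. This is a minimal illustration rather than a definitive implementation: the function name and the dict-of-headers input are assumptions, and it matches only the SiteSearch360 token from the first string, because the second string looks like an ordinary Chrome browser.

```python
# Token taken from the crawler's User Agent string quoted above.
CRAWLER_TOKEN = "SiteSearch360"


def is_sitesearch360_crawler(headers):
    """Return True if the request headers carry the SiteSearch360 token.

    `headers` is a plain dict of HTTP header names to values (a hypothetical
    input shape; adapt it to your web framework's request object).
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    user_agent = normalised.get("user-agent", "")
    return CRAWLER_TOKEN.lower() in user_agent.lower()


crawler = {"User-Agent": "Mozilla/5.0 (compatible; SiteSearch360/1.0; +https://sitesearch360.com/)"}
browser = {"user-agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/118.0"}
print(is_sitesearch360_crawler(crawler))  # True
print(is_sitesearch360_crawler(browser))  # False
```

Any client can send this header, so a check like this is best combined with one of the other options, such as whitelisting the crawler's IP addresses.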
How do I crawl content behind a custom login page?
Under Advanced Settings > Custom Login Screen, check the box called "Active."
Provide the URL of your login page, e.g. https://yoursite.com/login
Provide the login form XPath:
On your login page, right-click the login form element, press Inspect, and find its id in the markup. For example, you might see something like:
<form name="loginform" id="loginform" action="https://yoursite.com/login.php" method="post">
So you'd take id="loginform" and address it with the following XPath: //form[@id="loginform"]
Define the authentication parameter names and map them to the credentials the crawler should use to access the content.
Let's find out what parameter name is used for your login field first. Right-click the field and press Inspect. For example, you'll have:
<input type="text" name="log" id="user_login" class="input">
So you'd take log and use it as the Parameter Name. The login (username, email, etc.) would be the Parameter Value. Click Add and repeat the same process for the password field.
Save and go to the Index section where you can test your setup on a single URL and re-index the entire site to add the password-protected pages to your search results.
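If you'd like to double-check the form XPath and the parameter names before saving, the lookup from the steps above can be sketched locally. This is only an illustration under stated assumptions: the markup is a simplified, well-formed copy of a login form, the XPath //form[@id='loginform'] is the one implied by the example id, and the password field name "pwd" is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed copy of a login form. The password field name
# "pwd" is a hypothetical example; your page may use a different name.
LOGIN_PAGE = """
<html><body>
  <form name="loginform" id="loginform" action="https://yoursite.com/login.php" method="post">
    <input type="text" name="log" id="user_login" class="input" />
    <input type="password" name="pwd" id="user_pass" class="input" />
  </form>
</body></html>
"""

root = ET.fromstring(LOGIN_PAGE)

# ElementTree needs a relative path, hence the leading "." before //form.
form = root.find(".//form[@id='loginform']")

# The name attributes are what the settings call Parameter Names.
parameter_names = [field.get("name") for field in form.findall(".//input")]
print(parameter_names)  # ['log', 'pwd']
```

Real login pages are often not well-formed XML, so this parser may reject them; the sketch is meant to show what the XPath selects, not to replace inspecting the page in your browser.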
Some login screens have a single field, usually for the password (e.g. in Weebly), in which case you'd only need one parameter name-value pair.
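To make the name-value pairs more concrete, here is a rough sketch of what a login form submission typically looks like once the pairs are encoded. It assumes a standard form-encoded POST, as the method="post" attribute in the example suggests; how our crawler submits the form internally is not described here, and the field names "pwd" and "password" as well as all values are placeholders.

```python
from urllib.parse import urlencode


def login_form_body(parameters):
    """Encode Parameter Name / Parameter Value pairs as a form body."""
    return urlencode(parameters)


# Two-field login: "log" comes from the example markup above, "pwd" is a
# hypothetical password field name; both values are placeholders.
two_fields = login_form_body({"log": "crawler@yoursite.com", "pwd": "s3cret pass"})
print(two_fields)  # log=crawler%40yoursite.com&pwd=s3cret+pass

# Single-field login (password only): one name-value pair is enough.
one_field = login_form_body({"password": "s3cret pass"})
print(one_field)  # password=s3cret+pass
```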
Sometimes it can be useful to tell our crawler to set a specific cookie when accessing your website.
For example, if you have a location cookie that determines which language your search results are in, you can set this cookie to "us" for your English-language project or to "de" for your German-language project.
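As an illustration of the language example, the sketch below shows the Cookie request header your server would receive in each project. The cookie name "location" is a hypothetical choice; use whatever name your site actually reads.

```python
def cookie_header(cookies):
    """Join cookie name/value pairs into a single Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# "location" is a hypothetical cookie name; "us" and "de" follow the example above.
english_project = cookie_header({"location": "us"})
german_project = cookie_header({"location": "de"})
print("Cookie:", english_project)  # Cookie: location=us
print("Cookie:", german_project)   # Cookie: location=de
```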
What is indexing intensity?
Indexing intensity influences how quickly our crawler moves through your website. You can set the intensity anywhere from 1 (slowest indexing, little stress on your server) to 5 (fastest indexing, higher stress on your server).
If you are looking for other ways to increase crawling speed, consider switching to sitemap indexing and using the Optimize Indexing setting.
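If your site doesn't have a sitemap yet, a minimal one can be generated with a few lines of code. The sketch below follows the standard sitemaps.org format; the function name and the URLs are placeholders, and it leaves out optional fields such as lastmod.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs; list the pages you want indexed, including deep links.
sitemap_xml = build_sitemap([
    "https://yoursite.com/",
    "https://yoursite.com/members/guide",
])
print(sitemap_xml)
```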
Note: JS crawling isn't enabled for free trial accounts by default. Please reach out if you need to test it before signing up for a paid plan.