SqwidgeBot works automatically: it crawls your site, looking for content to add to our index. We are currently in beta development, so the results aren't published to the general public just yet ... but soon!
That is perfectly fine. We don't want to upset anybody, and we obey the Robots Exclusion standard. To prevent a file or directory on your site from being crawled, add a file named robots.txt to the root directory (example: http://www.domain.com/robots.txt). In this file you can specify the disallowed paths of your website in the following format:
User-agent: SqwidgeBot
Disallow: /private
Disallow: /everything
In the above example we have told SqwidgeBot not to access any files located in the directories named /private and /everything.
For more information on the Robots Exclusion standard, please visit www.robotstxt.org.
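If you would like to check what a set of Disallow rules blocks before publishing it, the following is a minimal sketch using the robots.txt parser from Python's standard library. It is not how SqwidgeBot itself reads the file; the helper name is_allowed, the page names report.html and index.html, and the agent name OtherBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
ROBOTS_TXT = """\
User-agent: SqwidgeBot
Disallow: /private
Disallow: /everything
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on the example domain.
blocked = is_allowed(ROBOTS_TXT, "SqwidgeBot", "http://www.domain.com/private/report.html")
open_page = is_allowed(ROBOTS_TXT, "SqwidgeBot", "http://www.domain.com/index.html")
other_bot = is_allowed(ROBOTS_TXT, "OtherBot", "http://www.domain.com/private/report.html")

print(blocked, open_page, other_bot)
```

Note that this parser matches Disallow paths by prefix, so /private covers everything whose path starts with /private. The rules apply only to the named user agent, which is why the hypothetical OtherBot is not restricted here.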
Definitely! Sometimes the speed at which SqwidgeBot browses websites can be a little too fast for you, so we have made every effort to support the Crawl-Delay robots directive.
User-agent: SqwidgeBot
Crawl-Delay: 5
The robots.txt example above limits SqwidgeBot to downloading one page every five seconds. You can change this number to however many seconds you like. Remember that the value is always read as seconds.
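To confirm that a Crawl-Delay value reads the way you expect, here is a small sketch, again using Python's standard library parser rather than SqwidgeBot's own code. The helpers crawl_delay_seconds and fetch_times, and the agent name OtherBot, are hypothetical and exist only for this illustration.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
ROBOTS_TXT = """\
User-agent: SqwidgeBot
Crawl-Delay: 5
"""


def crawl_delay_seconds(robots_txt: str, user_agent: str, default: int = 0) -> int:
    """Return the Crawl-Delay for user_agent in seconds, or default if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return default if delay is None else delay


def fetch_times(start: int, delay: int, pages: int) -> list:
    """Return the earliest second at which each of `pages` downloads may begin."""
    return [start + i * delay for i in range(pages)]


delay = crawl_delay_seconds(ROBOTS_TXT, "SqwidgeBot")
no_rule = crawl_delay_seconds(ROBOTS_TXT, "OtherBot")
schedule = fetch_times(0, delay, 3)

print(delay, no_rule, schedule)
```

With a delay of 5, three pages would be requested no sooner than seconds 0, 5, and 10. An agent with no matching entry gets the default of 0, meaning no delay was requested for it.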
In short, yes, of course! We make every effort to index as many web pages as possible. However, given the ever-changing nature of dynamic pages, we limit the number of pages we download during a visit.