Google Lets Spiders Automatically Fill Out Forms to Crawl More Web Pages

by eeyye on 2008-04-13 23:31:18

According to foreign media reports on April 12 (Beijing time), US search giant Google has recently begun using a new technique in its web crawlers. The crawlers can automatically fill out forms on some web pages, submit them to the server, and then crawl the pages the server returns, obtaining more detailed information about the website.

Media analysts suggest that this could pose a threat to website information security.

Generally speaking, forms are a way for websites to collect information from users. For example, a user applying to become a registered member must submit identity information. The form sends this data to the server, and the server responds with a page giving instructions for the next step.

In the past, Google's crawlers did not fill out forms, because they had no way of knowing what the page returned after submission would contain.

Recently, Google upgraded its crawling system. The crawler now fills in data automatically based on the names of the fields in a form and submits it to the server. The page the server returns in response is then crawled as well, giving Google more information about the website.
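To make the idea concrete, here is a minimal Python sketch of filling in a form by field name. It is not Google's implementation: the `GUESSES` table, the class `FormFieldParser`, the function `build_submission_url`, and the restriction to GET forms are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical values keyed by field name; a real crawler would choose
# its inputs in some other, unknown way.
GUESSES = {"q": "example", "city": "Beijing"}


class FormFieldParser(HTMLParser):
    """Collects the action, method, and named input fields of a form."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            input_type = (attrs.get("type") or "text").lower()
            if input_type in ("text", "search", "hidden"):
                # Keep any preset value; empty fields are filled in later.
                self.fields[attrs["name"]] = attrs.get("value") or ""


def build_submission_url(page_url, html):
    """Return the URL a GET submission would request, or None otherwise."""
    parser = FormFieldParser()
    parser.feed(html)
    if parser.method != "get":
        return None  # assumption: this sketch never submits POST forms
    data = {
        name: value or GUESSES.get(name, "")
        for name, value in parser.fields.items()
    }
    return urljoin(page_url, parser.action) + "?" + urlencode(data)
```

Given a page containing a search form with a text field named `q`, the function returns a URL such as `http://example.com/search?q=example`, which the crawler could then fetch like any other link.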

Google stated in a blog post that it would take a cautious approach to this feature. For example, automatic form filling would at first be applied only to a small number of particularly useful websites. In addition, website administrators can specify in the robots.txt file whether they allow Google to submit forms, and Google will respect the administrator's wishes.
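As an illustration of the robots.txt check, the following sketch uses Python's standard `urllib.robotparser` module to decide whether a form-submission URL may be fetched. The robots.txt content, the helper name `may_submit`, and the example URLs are hypothetical, and the sketch assumes an ordinary `Disallow` rule is what keeps the crawler away from a form's target.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps Googlebot away from the search form's target.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search
"""


def may_submit(robots_txt, user_agent, url):
    """Return True if robots.txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The form's result page is blocked, while other pages stay crawlable.
blocked = may_submit(ROBOTS_TXT, "Googlebot", "http://example.com/search?q=example")
allowed = may_submit(ROBOTS_TXT, "Googlebot", "http://example.com/about.html")
```

A crawler that honors the file would skip the submission whenever `may_submit` returns `False`.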

Some search industry experts have said that this feature could pose a threat to the information security of corporate websites.