How can you make search engines understand web pages more accurately?

by wgyuvs8a on 2012-02-29 18:02:37

The first article in this series argued that SEO should be data-driven and covered some preliminary data preparation. Data is crucial, but its role is auxiliary: it identifies problems, summarizes improvements, and serves as a reference for decisions. It cannot exist independently of established SEO methods. SEO methods fall into two categories, or four if black hat techniques are counted. The two legitimate ones are making the website friendly to search engines and making it friendly to the users of search engines. The two black hat ones are making search engines mistakenly believe the website is friendly to them, and making search engines mistakenly believe it is friendly to their users. Any experienced SEO professional is welcome to check whether some SEO method falls outside these four categories. I, at least, have never seen one.

Black hat SEO is outside the scope of this series, so I will use two articles to briefly describe how to make a website friendly to search engines and to their users. This article covers the first of these, making a website friendly to search engines, which is a very broad topic. After several revisions, I decided to give just one example. The technology behind search engines is extensive, and so is the website technology needed to accommodate it, so no single article can cover more than the tip of the iceberg. It is better to pick a relatively representative example and leave the rest for readers to expand upon.

How can search engines understand web pages more accurately?

Search engines are, after all, just programs, and they cannot perfectly judge every situation across the enormous variety of web pages on the internet.

Segmentation

Generally, each important section of a web page should be wrapped in its own tags in the HTML code, and each of those sections should contain one or more heading tags that clearly state its theme. This makes the content of each part of the page clearer. For search engines in particular, it shows how to segment the page and, through the subheadings, what each segment is about, which helps them decide how to weigh and process the information.
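To make the idea concrete, here is a minimal sketch of how a program could segment a page by its headings. It is only an illustration under stated assumptions: the sample markup, the `SectionSplitter` class, and the `split_sections` function are hypothetical, and real search engines do not publish how they segment pages.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionSplitter(HTMLParser):
    """Group visible text under the most recent heading."""

    def __init__(self):
        super().__init__()
        self.sections = []  # list of [heading, body_text] pairs
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            # Every heading opens a new segment.
            self._in_heading = True
            self.sections.append(["", ""])

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if not self.sections:
            # Text that appears before any heading has no label.
            self.sections.append(["(no heading)", ""])
        index = 0 if self._in_heading else 1
        current = self.sections[-1]
        current[index] = (current[index] + " " + text).strip()


def split_sections(html):
    """Return a list of (heading, body_text) tuples in page order."""
    parser = SectionSplitter()
    parser.feed(html)
    parser.close()
    return [tuple(section) for section in parser.sections]


# Hypothetical product page with two clearly labeled sections.
PAGE = """
<div id="technical-details">
  <h2>Technical Details</h2>
  <p>Weight: 1.2 kg</p>
</div>
<div id="product-details">
  <h2>Product Details</h2>
  <p>Shipping weight: 1.5 kg</p>
</div>
"""

for heading, body in split_sections(PAGE):
    print(heading, "->", body)
```

When every block carries its own heading, each piece of text ends up under a meaningful label. On a page that piles parameters, descriptions, and images together with no headings, the same function would return a single unlabeled block.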

A typical example is Amazon's product information page:

As the figure above shows, the page is clearly divided into three sections, labeled as related purchases, technical details, and product details. In contrast, many e-commerce websites place the product image and price at the top and then, from the second block onward, pile product parameters, product descriptions, and a large number of possibly unnecessary product images together, an arrangement that undoubtedly performs much worse. (The Product Details section in the figure consists mostly of automatically generated content. It is valuable for both users and SEO, yet most e-commerce websites overlook it.)

Amazon's product page SEO is the best in the e-commerce field, far surpassing websites like eBay. One of the main reasons is segmentation.

Semanticization

Here, semanticization means giving meaning to the HTML code that users never see. This makes no difference to users, but it lets search engines and other programs understand the content more easily. (It also makes the code easier to maintain, but that is a technical matter.)

Microdata and microformats are attracting more and more attention, and they can clearly identify the meaning of elements on a web page. I won't go into detail here, but you can refer to:

However, many websites do not yet use HTML5 for various reasons (though from an SEO perspective its adoption should be encouraged), so the new semantic tags cannot be used and generic container tags must still be used. In such cases, pay attention to ID naming: for search engines, an ID that describes the role of a section is easier to understand than one that does not. Moreover, wherever an ID can be used, a class generally should not be; many designers use the two interchangeably without considering the difference. The HTML specification requires each ID to be unique within a document, whereas a class can be shared by many elements, so unique elements are best marked with IDs. For search engines, a unique element has an identifiable position on the page, which makes it easier to determine the role that section plays.
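The point about ID naming can be illustrated with a small sketch. It collects the `id` attributes on a page, guesses a role for each from its name, and reports IDs that are used more than once. Everything here is an assumption made for illustration: the `ROLE_HINTS` list, the `audit_ids` function, and the sample markup are hypothetical, and nothing is known publicly about the rules a search engine actually applies.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical role hints; a real search engine's rules are not public.
ROLE_HINTS = ("header", "nav", "content", "sidebar", "footer")


class IdCollector(HTMLParser):
    """Record the value of every id attribute in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def guess_role(element_id):
    """Return the first role hint contained in the ID, or 'unknown'."""
    lowered = element_id.lower()
    for hint in ROLE_HINTS:
        if hint in lowered:
            return hint
    return "unknown"


def audit_ids(html):
    """Return ({id: guessed_role}, [ids that appear more than once])."""
    parser = IdCollector()
    parser.feed(html)
    parser.close()
    counts = Counter(parser.ids)
    roles = {element_id: guess_role(element_id) for element_id in counts}
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    return roles, duplicates


# Hypothetical markup: two descriptive IDs and one opaque, repeated ID.
PAGE = (
    '<div id="header">Shop</div>'
    '<div id="box2">Reviews</div>'
    '<div id="page-footer">Contact</div>'
    '<div id="box2">More</div>'
)

roles, duplicates = audit_ids(PAGE)
print(roles)
print(duplicates)
```

In this sample, the descriptive IDs map to a role, while the opaque ID tells the program nothing and, because it is repeated, cannot identify a single position on the page either.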

For instance, our company once had a PPC landing page on which the relevant keywords appeared, yet the quality score for those keywords remained extremely low. Analysis showed that the keywords were written in sections that caused the text to be treated as irrelevant content at the bottom of the page. The search engine therefore analyzed the page incorrectly, and the quality score suffered.

Popularization

Here, popularization means avoiding hard-to-understand labels on the web page, such as using "find" instead of the common "search." Such labels confuse users to some degree, and confuse search engines even more.
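A simple check along these lines could be run over a site's interface labels. The sketch below is only an illustration: the `PREFERRED_LABELS` table and the `suggest_labels` function are hypothetical, and the only pair taken from this article is "find" versus "search"; any further entries would have to come from your own data.

```python
# Only the "find" -> "search" pair comes from the article.
# Extend this table with pairs drawn from your own site data.
PREFERRED_LABELS = {"find": "search"}


def suggest_labels(labels):
    """Map each uncommon label to the more common wording, if one is known."""
    suggestions = {}
    for label in labels:
        preferred = PREFERRED_LABELS.get(label.strip().lower())
        if preferred is not None:
            suggestions[label] = preferred
    return suggestions


# Hypothetical labels taken from a page's buttons and links.
print(suggest_labels(["Find", "Cart", "Search"]))
```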