Project settings
Online project – the site address for analysis is specified here. Pages of the site will be downloaded from this address and then analyzed. You should specify the domain and the analyzer will then automatically find other pages on the site by following the links.
Offline project (local) – a list of local .html files for analysis can be specified here.
Number of search threads – specifies the number of pages to be downloaded simultaneously.
Analyze – start analyzing the pages according to the above procedure.
Stop analyze – stop analyzing the pages.
Settings -> Analyze saved pages (for offline projects only) – when a remote (Internet-hosted) project is analyzed, the downloaded pages are saved, and these saved copies are analyzed when the project is launched. For local pages, however, there are two options when launching the project:
· Repeated analysis of pages included in the project. In this case, all changes to the pages since the last analysis will be taken into account and the updated results will be displayed. This is the default behavior of the analyzer.
· Show the results of the previous analysis. When the project is launched, the data from the previous analysis will be displayed. Any changes to the local pages will be ignored until a new analysis is performed.
Charset - the sites being analyzed can be in various languages and use various encodings. You may need to specify the encoding of a particular site for it to be displayed correctly. You can also allow the program to determine encoding automatically (Autodetect).
Advanced settings: a number of tabs with advanced settings are located below.
Files - downloaded and analyzed files are displayed here.
Analysis rules – it often happens that a keyword has several forms, there are words with the same root, synonyms, etc. in the text. In this case it may be useful to regard a group of words as one word while analyzing pages.
General analysis rules - enables the specified rules for the analysis of pages
Word length - defines the length of words to apply this rule to
Cut on the left - defines the number of characters that are not taken into account (skipped) at the beginning of the word
Cut on the right - defines the number of characters that are not taken into account (skipped) at the end of the word
Example: suppose we specify the following rule: "word length - 8, cut on the left - 0, cut on the right - 2". Take the word "analysis". It has 8 letters, so the rule applies and the last 2 characters are ignored, leaving "analys". The form "analyses" is reduced to the same "analys", so both forms (analysis, analyses) will be regarded as one word during the analysis, which makes the results easier to read.
Note: as a rule, this option should be used to analyze non-English sites (i.e. in those languages where words may take various inflections, prefixes, etc.).
The Add, Remove, Clear list buttons allow you to manage the list of rules.
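The cut rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's actual code: the function name `apply_cut_rule` is hypothetical, and it assumes that a rule applies only to words whose length equals the specified word length exactly.

```python
def apply_cut_rule(word, length, cut_left, cut_right):
    """Return the key under which `word` is counted for one cut rule."""
    # Assumption: the rule applies only to words of exactly `length` characters.
    if len(word) != length:
        return word
    # Skip `cut_left` characters at the start and `cut_right` at the end.
    return word[cut_left:len(word) - cut_right]


# The rule from the example: word length 8, cut left 0, cut right 2.
rule = dict(length=8, cut_left=0, cut_right=2)
keys = {w: apply_cut_rule(w, **rule) for w in ("analysis", "analyses", "analyze")}
print(keys)
```

Here "analysis" and "analyses" both map to "analys", while the 7-letter "analyze" is left unchanged because the rule does not match its length.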
Word groups - enables the word grouping mode. Suppose we have a group of keywords that have the same root (e.g. run, running, runner, etc.). It can be useful to regard these words as one word while analyzing pages. In this case you can create a group named "run group" and specify the list of words included in this group. During the analysis, the program will regard all words of this group as one keyword, and all parameters (weight, density, etc.) will be displayed for the group as a whole.
The Create group, Add word, Edit, Delete group, Remove word, Sort, Check duplicates, Undo current changes, Collapse, Expand, Save, Load buttons allow you to manage the list of groups.
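As a rough illustration of word grouping, the sketch below counts every member of a group under the group name. The data structure and the function name `count_with_groups` are assumptions made for this example and say nothing about how the program stores its groups.

```python
from collections import Counter

# Hypothetical group definition: group name -> words that belong to it.
groups = {"run group": ["run", "running", "runner"]}

# Invert the mapping so each word points to its group name.
word_to_group = {word: name for name, words in groups.items() for word in words}


def count_with_groups(words):
    """Count words, replacing each grouped word by the name of its group."""
    return Counter(word_to_group.get(w, w) for w in words)


counts = count_with_groups("the runner was running a long run today".split())
print(counts["run group"])
```

The three forms "runner", "running" and "run" are counted together under "run group", while ungrouped words keep their own counts.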
Include/Exclude paths - sometimes it may be convenient to analyze only a part of your site. For example, the site can have a forum with a lot of pages that should be skipped during the analysis. Or, conversely, you may need to analyze only one part of the site and skip the rest of the pages. In both cases, you should let the analyzer know which paths on the site should be checked.
"Include paths" mode - the program will analyze only the pages that are stored at the specified location. For example, to analyze the section www.site.com/reports/ on the site, you should specify this path. The rest of the pages on the site will be ignored.
"Exclude paths" mode - the program will analyze all pages on the site except the specified ones. For example, to avoid analyzing the forum of the site, you should specify the path www.site.com/forum.php?. In this case all pages except those of the forum will be analyzed.
The Add, Delete, Load, Save buttons allow you to manage the list of specified paths.
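The two modes can be pictured as a simple filter over page addresses. The sketch below assumes that a path matches when the page address starts with it (prefix matching); the manual does not state the exact matching rule, and the function name `should_analyze` is hypothetical.

```python
def should_analyze(url, paths, mode):
    """Decide whether a page is analyzed; `mode` is "include" or "exclude"."""
    # Assumption: a path matches when the page address starts with it.
    matched = any(url.startswith(path) for path in paths)
    return matched if mode == "include" else not matched


# "Include paths" mode: only the reports section is analyzed.
print(should_analyze("www.site.com/reports/q1.html", ["www.site.com/reports/"], "include"))
# "Exclude paths" mode: forum pages are skipped, everything else is analyzed.
print(should_analyze("www.site.com/forum.php?topic=5", ["www.site.com/forum.php?"], "exclude"))
```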
Stop words - any text contains a lot of common words (prepositions, interjections, articles, etc.) that search engines do not take into account while analyzing pages. It is convenient to skip such words during the analysis. This section allows you to edit the lists of stop words that the program will skip.
The Add, Delete, Load, Save, Load defaults buttons allow you to manage the list of stop words.
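Stop word filtering amounts to dropping listed words before anything is counted. The short list and the function name below are made up for this example; the program's default list is not reproduced here.

```python
# Hypothetical, very short stop word list for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in"}


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Return the words that are not stop words (comparison ignores case)."""
    return [w for w in words if w.lower() not in stop_words]


print(remove_stop_words("The density of a keyword".split()))
```

Only "density" and "keyword" remain, so they are the only words that would appear in the reports.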
Keywords by pages
The left hand list enumerates all the analyzed pages. The table displays all the words contained in the selected page together with their parameters.
Count – the number of times the word appears in the page. In addition to the page text, the count includes the TITLE tag, the Keywords and Description meta-tags, and the Alt tags of images.
Weight. Words contained in the page are analyzed by the search engines using a specific algorithm. Here are some of the factors taken into account by the search engines:
· The total number of keywords.
· The text attribute tags. For example, title, bold type and so on.
· The distance from the beginning of the text.
Various other parameters are also used. Using similar methods, HTML analyzer assigns a number, the weight, to each word, which allows you to estimate the word's significance.
Density shows the percentage ratio of the number of specified keyword occurrences to the total number of words in a page. The density of keywords is an important parameter that is to be taken into account when optimizing your site. A low density will cause the search engine to consider the page irrelevant for a query on that keyword. An extremely high keyword density will cause the search engine to switch on a search spam-filter and as a result, the page will be excluded from search results.
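The density figure follows directly from the definition above: occurrences of the keyword divided by the total number of words, expressed as a percentage. The function name is hypothetical, and how the program itself splits a page into words is not specified here.

```python
def keyword_density(words, keyword):
    """Percentage of `words` that are equal to `keyword`."""
    if not words:
        return 0.0  # avoid dividing by zero on an empty page
    return 100.0 * words.count(keyword) / len(words)


words = "seo tools help seo experts".split()
print(keyword_density(words, "seo"))  # 2 occurrences out of 5 words
```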
Text. This column shows the total number of occurrences of a specified keyword in the readable text, that is, the text visible to your site visitors. Keywords in TITLE tags, meta-tags and Alt tags are not included in this count. Keywords located inside text formatting tags (bold type, italics, etc.) or in anchor text are included.
Title – shows the number of times a specified keyword appears in the TITLE tag, the title of the text.
Bold – shows the number of times a keyword is in bold type.
Italic – shows the number of times a keyword is in italics.
Header – shows if a keyword appears in the headers, i.e. in text enclosed in the tags "hx" and "/hx", where x is the header level.
Anchor text. The page may include links to other resources. Usually, a link has some descriptive text (the text between the tags "a href=…" and "/a") which is called Anchor or Anchor text. This column reflects the number of keywords found in anchor text.
Alt (Alt tags for images). Any image in a page may have some alternative text that is shown when the picture cannot be downloaded. This column shows if a keyword was included in the picture's Alt tag.
MetaK (meta-tag Keywords). The primary purpose of the Keywords meta-tag is to list the keywords relevant to the page content. This list was intended to be used by search engines to determine what the page is about. Currently, this meta-tag is ignored by most search engines, but we recommend that you still use it on your pages.
MetaD (meta-tag Description). When a search engine builds a results page for a search query, it displays a brief description as well as a link to each web resource. Generally, the description is based on the text contained in the page. The purpose of the Description meta-tag is to allow the search engine to use a description given by the author of the page instead. This meta-tag is not used by all search engines, but its presence on web pages is recommended.
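To make the columns described above more concrete, here is a minimal sketch that counts one keyword per column using Python's standard `html.parser` module. It is not the program's implementation: the class name `ZoneCounter` and the tag-to-column mapping (for example, treating "strong" as bold and "em" as italic) are assumptions, and meta-tags, scripts and malformed markup are not handled.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumed mapping from HTML tags to report columns.
ZONES = {"title": "Title", "b": "Bold", "strong": "Bold",
         "i": "Italic", "em": "Italic", "a": "Anchor"}
ZONES.update({f"h{n}": "Header" for n in range(1, 7)})


class ZoneCounter(HTMLParser):
    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.open_zones = []      # columns whose tags are currently open
        self.counts = Counter()

    def _count(self, text):
        # Number of whole-word, case-insensitive matches of the keyword.
        return re.findall(r"\w+", text.lower()).count(self.keyword)

    def handle_starttag(self, tag, attrs):
        if tag in ZONES:
            self.open_zones.append(ZONES[tag])
        elif tag == "img":
            self.counts["Alt"] += self._count(dict(attrs).get("alt") or "")

    def handle_endtag(self, tag):
        if tag in ZONES and ZONES[tag] in self.open_zones:
            self.open_zones.remove(ZONES[tag])

    def handle_data(self, data):
        n = self._count(data)
        if "Title" not in self.open_zones:
            self.counts["Text"] += n  # visible text only, TITLE excluded
        for zone in set(self.open_zones):
            self.counts[zone] += n


html = ('<html><head><title>Red shoes</title></head><body><h1>Shoes</h1>'
        '<p>Buy <b>shoes</b> at <a href="/s">our shoes shop</a>.</p>'
        '<img src="s.png" alt="red shoes"></body></html>')
counter = ZoneCounter("shoes")
counter.feed(html)
print(dict(counter.counts))
```

For this sample page the keyword "shoes" is found three times in the visible text (header, bold and anchor text all count as text) and once each in the title, a header, bold type, anchor text and an Alt tag.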
Pages by keywords
This report is similar to the "Keywords by pages" report. The difference is that you choose a keyword in the left hand list and the program will then list all of the pages containing the keyword.
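The relation between the two reports is that of a table and its inverse. The sketch below assumes the per-page keyword lists are already available as a dictionary; the sample data and the function name `pages_by_keyword` are invented for illustration.

```python
from collections import defaultdict


def pages_by_keyword(keywords_by_page):
    """Invert {page: keywords} into {keyword: set of pages containing it}."""
    index = defaultdict(set)
    for page, keywords in keywords_by_page.items():
        for keyword in keywords:
            index[keyword].add(page)
    return index


# Hypothetical result of a "Keywords by pages" analysis.
keywords_by_page = {
    "index.html": ["seo", "tools"],
    "about.html": ["seo", "team"],
}
index = pages_by_keyword(keywords_by_page)
print(sorted(index["seo"]))
```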
Phrases by pages
The left hand list enumerates all of the analyzed pages. The table shows the list of phrases contained in a page together with their parameters.
From the point of view of a search engine, a phrase is a group of words contained in a page. Any combination of words, even meaningless ones, is a phrase to a search engine. However, when searching by phrases, preference is given to those pages containing specific keywords in a sequence that is likely to make sense. HTML analyzer checks all the word combinations in the page and attributes a weight to them. This allows you to estimate the importance of each phrase to the search engines.
The weight of a phrase is made up of three components. The first is based on the weight of the individual words that make up the phrase, as explained above. The number of words, their appearance, their presence in TITLE tags, meta-tags, etc. are all taken into account.
There may be several occurrences of the exact phrase in the page text. The second component of the phrase weight is based on the number of times the exact phrase appears in the text.
The third weight component is an estimate of the concentration of key phrase words in the text. If the words comprising a key phrase are widely scattered through the text then this component will be low. If the words are concentrated together, such as in the same paragraph, the weight will be higher.
The total weight displayed in the table is based on the three components described above.
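The second and third components can be illustrated with two small measurements. The manual does not give the actual formulas, so this sketch only shows one plausible way to obtain the raw numbers: the count of exact phrase matches, and the length of the shortest run of words that contains every word of the phrase (a smaller run means the words are more concentrated). Both function names are hypothetical.

```python
def count_exact_phrase(words, phrase):
    """Number of positions where `phrase` occurs as consecutive words."""
    n = len(phrase)
    return sum(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def smallest_window(words, phrase):
    """Length of the shortest run of words containing all phrase words, or None."""
    needed = set(phrase)
    last_seen = {}   # phrase word -> index of its most recent occurrence
    best = None
    for i, word in enumerate(words):
        if word in needed:
            last_seen[word] = i
            if len(last_seen) == len(needed):
                # Shortest window ending at i starts at the oldest "last seen" index.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


words = "cheap flights to paris and cheap hotels near paris flights".split()
print(count_exact_phrase(words, ["cheap", "flights"]))
print(smallest_window(words, ["paris", "hotels"]))
```

In the sample text "cheap flights" occurs once as an exact phrase, while "paris" and "hotels" never appear side by side but do occur within a run of three words.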
Count – the number of times a phrase appears in the page, based on exact matching.
Weight – is the weight of a phrase as described above.
Show words – this option allows you to view detailed information on each word of the phrase.
Pages by phrases
This report is similar to the previous one. The difference is that you select the key phrase in the left hand list and the table will show you a list of all the pages containing this phrase.
Site keywords statistics
All the words in every page of the site are displayed in this report. The total number of occurrences and the total weight of each word are shown.
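The site-wide totals can be thought of as the sum of the per-page figures. The sketch below adds up occurrence counts only and assumes they are available as a dictionary per page; the sample data and function name are invented, and how the program totals the weights is not described here.

```python
from collections import Counter


def site_keyword_totals(counts_by_page):
    """Sum per-page word counts into one site-wide Counter."""
    totals = Counter()
    for counts in counts_by_page.values():
        totals.update(counts)  # adds the counts word by word
    return totals


# Hypothetical per-page counts.
counts_by_page = {
    "index.html": {"seo": 3, "tools": 1},
    "about.html": {"seo": 2},
}
totals = site_keyword_totals(counts_by_page)
print(totals["seo"])
```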
Site information
This report displays general information on all of the analyzed pages. The total number of words in each page (including the text, service tags, TITLE tag, meta-tags, etc.) and the number of readable words (visible text words) is displayed for each page. The size of a page, in Kilobytes, is also displayed.