'''CMS Detectors''' help Yioop find the most important content on a web page. <br /><br /> The '''Name''' is required. The Header Regex and Important Content XPath are optional, but the detector will have no effect if they are not entered. <br /> The '''Header Regex''' is used to detect the CMS. Sites created with the same CMS tend to have very similar page headers, so a carefully crafted regular expression can detect the CMS you are looking for. The regex is checked against the href value in a rel='stylesheet' tag or the src value in a type='text/javascript' tag. <br /><br /> The '''Important Content XPath''' targets the content that matters most for summarizing. The first entry selects the important content. Each subsequent entry removes content from within that selection. Append each removal XPath to the end of the value, separated by three pound signs (###). <br /> '''Example:''' <br /><br /> <table border='1'> <tr><th>Setting</th> <th>Value</th></tr> <tr><td>Name</td><td>Wordpress</td></tr> <tr><td>Header Regex </td><td>wp-(?:content|includes)</td></tr> <tr><td>Important Content XPath</td><td>//div[@id="content"]###<br />//div[@id="comments"]###<br />//div[@id="respond"]</td></tr> </table> <br />
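The behaviour described above can be illustrated with a small Python sketch. This is not Yioop's own code. It is an approximation built on the standard library, and the function names (split_xpath_setting, detect_cms, important_text) and the sample page are hypothetical. It assumes the page is well-formed XHTML, and it prefixes each XPath with "." because ElementTree accepts only relative paths.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page; real pages are rarely well-formed XML.
PAGE = """<html><head>
<link rel="stylesheet" href="/wp-content/themes/demo/style.css" />
<script type="text/javascript" src="/wp-includes/js/jquery.js"></script>
</head><body>
<div id="content"><p>Main article text.</p>
<div id="comments"><p>A reader comment.</p></div>
<div id="respond"><p>Leave a reply.</p></div>
</div></body></html>"""

HEADER_REGEX = r"wp-(?:content|includes)"
SETTING = '//div[@id="content"]###//div[@id="comments"]###//div[@id="respond"]'


def split_xpath_setting(value):
    """Split 'target###removal###removal' into (target, [removals])."""
    parts = [p.strip() for p in value.split("###") if p.strip()]
    return parts[0], parts[1:]


def detect_cms(root, header_regex):
    """Return True if a stylesheet href or a script src matches the regex."""
    pattern = re.compile(header_regex)
    for el in root.iter():
        if el.tag == "link" and el.get("rel") == "stylesheet":
            url = el.get("href", "")
        elif el.tag == "script" and el.get("type") == "text/javascript":
            url = el.get("src", "")
        else:
            continue
        if pattern.search(url):
            return True
    return False


def important_text(root, xpath_setting):
    """Select the target node, drop the removal nodes, return its text."""
    target_xpath, removal_xpaths = split_xpath_setting(xpath_setting)
    # ElementTree needs relative paths, so "//div" becomes ".//div".
    target = root.find("." + target_xpath)
    if target is None:
        return ""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in target.iter() for child in parent}
    for xpath in removal_xpaths:
        for node in target.findall("." + xpath):
            # Note: remove() also discards any text that follows the node.
            parent_of[node].remove(node)
    # Collapse whitespace in the remaining text.
    return " ".join("".join(target.itertext()).split())


if __name__ == "__main__":
    root = ET.fromstring(PAGE)
    print(detect_cms(root, HEADER_REGEX))
    print(important_text(root, SETTING))
```

In this sketch the first XPath keeps only the content div, and the two removal XPaths strip the comments and reply form from it before the text is extracted. A real crawler would need a tolerant HTML parser and a full XPath engine rather than ElementTree's limited subset.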