Filtering sitemap pages and images
Filtering lets you control precisely which pages and images are excluded from your sitemap in our online and Windows sitemap generators. This guide walks you through creating filtering rules with expressions, from simple wildcard expressions to more advanced techniques. Our filtering expressions follow the ANSI-92 standard.
Simple Wildcard Filters
Wildcard expressions provide a flexible way to match URLs against patterns. You can use wildcards to filter on partial filenames or filename patterns, and you can use regular expressions for more complex matching.
“%” Wildcard
The “%” wildcard matches any number of characters. It can be used as the first or last character of the expression.
Example:
wh%
matches “what,” “white,” and “why,” but not “awhile” or “watch.”
“_” Wildcard
The “_” wildcard matches any single alphabetic character.
Example:
B_ll
matches “ball,” “bell,” and “bill.”
“[]” Wildcard
The “[]” wildcard allows you to match any single character within the brackets.
Example:
B[ae]ll
matches “ball” and “bell,” but not “bill.”
“^” Wildcard
The “^” wildcard, placed immediately after the opening bracket, matches any single character not listed in the brackets.
Example:
b[^ae]ll
matches “bill” and “bull,” but not “ball” or “bell.”
“-” Wildcard
The “-” wildcard, used inside brackets, matches any single character within a specified range. The range must be defined in ascending order (e.g., A to Z, not Z to A).
Example:
b[a-c]d
matches “bad,” “bbd,” and “bcd.”
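As a rough illustration of how the wildcards above behave, the following Python sketch translates a wildcard expression into a regular expression and checks it against the examples given in this section. The function names like_to_regex and like_match are hypothetical. The sketch assumes whole-string, case-insensitive matching (suggested by “B_ll” matching “ball”), and the generator's actual implementation may differ.

```python
import re


def like_to_regex(pattern):
    """Translate a wildcard expression (%, _, [], [^], -) into a compiled regex.

    Assumption: matching is case-insensitive, as the examples above suggest.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            parts.append(".*")  # any number of characters
        elif ch == "_":
            parts.append("[A-Za-z]")  # one alphabetic character, as described above
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                parts.append(re.escape(ch))  # unmatched bracket: treat it literally
            else:
                # Sets such as [ae], [^ae] and [a-c] are already valid regex syntax.
                parts.append(pattern[i:end + 1])
                i = end
        else:
            parts.append(re.escape(ch))  # every other character matches itself
        i += 1
    return re.compile("".join(parts), re.IGNORECASE)


def like_match(pattern, text):
    """Return True if the whole text matches the wildcard expression."""
    return like_to_regex(pattern).fullmatch(text) is not None


if __name__ == "__main__":
    for pattern, word in [("wh%", "white"), ("wh%", "awhile"),
                          ("B_ll", "bell"), ("b[^ae]ll", "bull"),
                          ("b[a-c]d", "bbd")]:
        print(pattern, word, like_match(pattern, word))
```

Reusing the bracket expressions unchanged keeps the sketch short, because “[]”, “^” and “-” mean the same thing inside a regular-expression character class.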
Advanced Filtering Examples
Now that we’ve covered the basics, let’s explore some more advanced filtering examples to demonstrate the versatility of URL filtering expressions.
Complex Patterns
You can combine wildcards and regular expressions to create complex filters. For example:
.*blog.*
matches any URL containing the text “blog” anywhere, including inside longer words such as “weblog.”
Parameter-Based Filtering
You can use expressions to filter URLs based on query parameters. For example:
^.*\?utm_source=facebook$
matches URLs whose query string is exactly “utm_source=facebook.” URLs that carry additional parameters before or after it are not matched.
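The two regular expressions above can be tried out with Python's re module. This is only a sketch: the URLs are made-up examples, the constant names are hypothetical, and it assumes the expression is tested against the whole URL, which may not be how the generator applies it.

```python
import re

# The two expressions from the examples above, compiled as-is.
CONTAINS_BLOG = re.compile(r".*blog.*")
FACEBOOK_ONLY = re.compile(r"^.*\?utm_source=facebook$")

# Hypothetical URLs, used only for illustration.
urls = [
    "https://example.com/blog/post-1",
    "https://example.com/about",
    "https://example.com/?utm_source=facebook",
    "https://example.com/?utm_source=facebook&utm_medium=cpc",
]

for url in urls:
    blog = CONTAINS_BLOG.fullmatch(url) is not None
    facebook = FACEBOOK_ONLY.fullmatch(url) is not None
    print(f"{url}: blog={blog}, facebook={facebook}")
```

The last URL is not matched by the second expression because the trailing “$” requires the URL to end right after “facebook.”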
Exclusion Filters
To exclude specific URLs or patterns, use the negation operator “!”. For example:
!/private/*
excludes all URLs under the “/private/” directory (for example, “example.com/private/”).
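One way to picture an exclusion rule like the one above is the following Python sketch. It assumes that a leading “!” marks a rule as an exclusion and that “*” in such a rule behaves like a shell-style wildcard applied to the URL path. Both are assumptions made for illustration, and the names parse_rule and is_excluded are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def parse_rule(rule):
    """Split a rule into (is_exclusion, pattern); a leading '!' marks an exclusion."""
    if rule.startswith("!"):
        return True, rule[1:]
    return False, rule


def is_excluded(url, rules):
    """Return True if any exclusion rule matches the path of the URL."""
    path = urlsplit(url).path
    for rule in rules:
        negated, pattern = parse_rule(rule)
        # Assumption: '*' matches any run of characters, including '/'.
        if negated and fnmatchcase(path, pattern):
            return True
    return False


if __name__ == "__main__":
    rules = ["!/private/*"]
    for url in ["https://example.com/private/report.html",
                "https://example.com/blog/post-1"]:
        print(url, is_excluded(url, rules))
```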
Case-Insensitive Matching
To perform case-insensitive matching, use the “i” flag. For example:
/products/i
matches URLs containing “products” in any case (e.g., “Products,” “products,” “PrOdUcTs”).
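The “/products/i” notation can be sketched in Python as follows. The helper compile_slash_pattern is hypothetical and handles only the “i” flag. It assumes the text between the slashes is a regular expression that may match anywhere in the URL, which is an interpretation of the example rather than documented behavior.

```python
import re


def compile_slash_pattern(expr):
    """Compile an expression written as '/body/flags' into a regex.

    Only the 'i' (case-insensitive) flag is recognised in this sketch.
    """
    if not expr.startswith("/"):
        raise ValueError("expression must start with '/'")
    body, sep, flags = expr[1:].rpartition("/")
    if not sep:
        raise ValueError("expression must end with '/' plus optional flags")
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.compile(body, re_flags)


if __name__ == "__main__":
    pattern = compile_slash_pattern("/products/i")
    for url in ["https://example.com/products/42",
                "https://example.com/PrOdUcTs/42",
                "https://example.com/about"]:
        print(url, pattern.search(url) is not None)
```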
With these advanced filtering examples, you can tailor your URL filtering rules to meet your specific needs, ensuring that your sitemap includes only the content that matters most to you.