Merged
69 commits
0cfb986
feat: initial commit of re-write
moonmeister Jul 10, 2020
a6c5869
fix: change to excludes from exclude cause it makes more sense, other…
moonmeister Jul 17, 2020
b68afaa
fix: lint errors
moonmeister Jul 17, 2020
db9f278
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Jul 29, 2020
439829f
misc updates
moonmeister Aug 6, 2020
fc9f1c8
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Aug 7, 2020
a4b1dd8
fix SSR bugs an validation
moonmeister Aug 7, 2020
3005301
wip: docs
moonmeister Aug 7, 2020
40af1dc
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Aug 12, 2020
35e874e
wip
moonmeister Sep 3, 2020
28f7a1f
wip
moonmeister Sep 3, 2020
3fcdd8c
feat: finalize sriting sitemap and update default filter to runseprat…
moonmeister Sep 5, 2020
b47a0e9
feat: finalize sriting sitemap and update default filter to runseprat…
moonmeister Sep 5, 2020
11a893d
docs updates and better naming for sitemap size limit
moonmeister Sep 5, 2020
a83e10e
refactor: cleanup
moonmeister Sep 6, 2020
c70a33a
api docs
moonmeister Sep 6, 2020
7067352
misc docs fixes
moonmeister Sep 6, 2020
8b5d807
remove we
moonmeister Sep 6, 2020
647f626
add more links
moonmeister Sep 6, 2020
3bd5dc8
typos and formating
moonmeister Sep 6, 2020
7216ea8
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Sep 6, 2020
bc6b9f1
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Sep 15, 2020
9377d91
test: fix ssr tests
moonmeister Sep 15, 2020
1cb320c
fix node tests and switch to sitemap writer now that it has fixes.
moonmeister Sep 15, 2020
f3b47fb
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Sep 15, 2020
d3737d4
docs: fix typos, formatting, and language
moonmeister Sep 16, 2020
e947941
tests: internal test and other cleanup
moonmeister Sep 16, 2020
ff28649
fix: forgot to update sitemap plugin to use new fixes in the simple s…
moonmeister Sep 16, 2020
6be31d8
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Sep 19, 2020
5fa8d0b
Update packages/gatsby-plugin-sitemap/src/gatsby-node.js
moonmeister Sep 19, 2020
58bd083
Update packages/gatsby-plugin-sitemap/src/gatsby-node.js
moonmeister Sep 19, 2020
11b5dc4
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-node.js
moonmeister Sep 19, 2020
dc4acc5
Update packages/gatsby-plugin-sitemap/src/internals.js
moonmeister Sep 19, 2020
38ad29e
Merge branch 'moonmeister/feat/sitemap-rewrite' of github.com:gatsbyj…
moonmeister Sep 19, 2020
d114e09
fix: better error handling, remove global reporter, allow things to b…
moonmeister Sep 19, 2020
1e3316a
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-ssr.js
moonmeister Sep 19, 2020
e2c115c
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-node.js
moonmeister Sep 19, 2020
03a2283
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-node.js
moonmeister Sep 19, 2020
9588c56
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-node.js
moonmeister Sep 19, 2020
ba6103b
Update packages/gatsby-plugin-sitemap/src/__tests__/gatsby-ssr.js
moonmeister Sep 19, 2020
18aafe1
Apply suggestions from code review
moonmeister Sep 19, 2020
05947d6
improve filter performance
moonmeister Sep 19, 2020
a0702ef
clean yarn/lock
moonmeister Sep 19, 2020
5d0784e
fix: test snapshots
moonmeister Sep 20, 2020
8f8d969
fix: misc test improvements
moonmeister Sep 21, 2020
bd890e7
fix: update e2e node version only need 10.17 but 10.16 was the newest…
moonmeister Sep 21, 2020
70cd713
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Sep 27, 2020
c06a376
tests: add options-validation tests
moonmeister Sep 27, 2020
b585a94
feat: remove reporter from internal function and buble up verbose mes…
moonmeister Sep 27, 2020
5694c61
fix serialize signature in API docs
moonmeister Sep 28, 2020
155c7b4
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Oct 26, 2020
364d0a1
fix: modify merged pluginOptionSchema API to use new stuff. HAndle if…
moonmeister Oct 26, 2020
13d397b
chore: cleanup and add wards suggestions
moonmeister Oct 26, 2020
4de597d
fix missplaced bracket
moonmeister Oct 26, 2020
6ee6e9c
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Nov 30, 2020
13844c9
fix: remove logic for optional schema validation now that it has ship…
moonmeister Dec 2, 2020
6d451e3
allow async serialize to match added current funcitonality
moonmeister Dec 2, 2020
54f8952
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Dec 17, 2020
3f01216
Update Packages
moonmeister Dec 17, 2020
e096556
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Dec 29, 2020
6945e66
clean lock
moonmeister Dec 29, 2020
aeeceb6
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Jan 21, 2021
3100df2
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Feb 8, 2021
d9e50c8
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Mar 9, 2021
01f214c
lint
moonmeister Mar 9, 2021
039fe4c
update
wardpeet Mar 9, 2021
31727e0
Merge branch 'master' into moonmeister/feat/sitemap-rewrite
moonmeister Apr 14, 2021
ab89897
fix: change gatsby utils version
moonmeister Apr 14, 2021
f8ca507
fix: dirty lock
moonmeister Apr 14, 2021
223 changes: 163 additions & 60 deletions packages/gatsby-plugin-sitemap/README.md
Expand Up @@ -21,89 +21,192 @@ plugins: [`gatsby-plugin-sitemap`]
Above is the minimal configuration required to have it work. By default, the
generated sitemap will include all of your site's pages, except the ones you exclude.

## Recommended usage

You probably do not want to use the defaults in this plugin. Here's an example of the default output:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://example.net/blog/</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>
<url>
<loc>https://example.net/</loc>
<changefreq>daily</changefreq>
<priority>0.7</priority>
</url>
</urlset>
```

See the `changefreq` and `priority` fields? Those will be the same for every page, no matter how important it is or how often it gets updated, so they will most likely be wrong. On top of that, Google says in its [docs](https://support.google.com/webmasters/answer/183668?hl=en):

> - Google ignores `<priority>` and `<changefreq>` values, so don't bother adding them.
> - Google reads the `<lastmod>` value, but if you misrepresent this value, we will stop reading it.

You really want to customize this plugin config to include an accurate `lastmod` date. Check out the [example](#example) to see how to do this.

## Options

The `defaultOptions` [here](https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-plugin-sitemap/src/internals.js#L71) can be overridden.
The [`default config`](https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby-plugin-sitemap/src/options-validation.js) can be overridden.

The options are as follows:

- `query` (GraphQL Query) The query for the data you need to generate the sitemap. It's required to get the site's URL, if you are not fetching it from `site.siteMetadata.siteUrl`, you will need to set a custom `resolveSiteUrl` function. If you override the query, you probably will also need to set a `serializer` to return the correct data for the sitemap. Due to how this plugin was built it is currently expected/required to fetch the page paths from `allSitePage`, but you may use the `allSitePage.edges.node` or `allSitePage.nodes` query structure.
- `output` (string) The filepath and name. Defaults to `/sitemap.xml`.
- `exclude` (array of strings) An array of paths to exclude from the sitemap.
- `createLinkInHead` (boolean) Whether to populate the `<head>` of your site with a link to the sitemap.
- `serialize` (function) Takes the output of the data query and lets you return an array of sitemap entries.
- `resolveSiteUrl` (function) Takes the output of the data query and lets you return the site URL.
- `output` (string = `/sitemap`) Folder path where sitemaps are stored.
- `createLinkInHead` (boolean = true) Whether to populate the `<head>` of your site with a link to the sitemap.
- `entryLimit` (number = 45000) Number of entries per sitemap file. A sitemap index and multiple sitemaps are created if you have more entries.
- `exclude` (string[] = []) An array of paths to exclude from the sitemap. While this is usually an array of strings, it is possible to enter other data types into this array for custom filtering. Doing so will require customization of the [`filterPages`](#filterPages) function.
- `query` (GraphQL Query) The query for the data you need to generate the sitemap. It's required to get the site's URL; if you are not fetching it from `site.siteMetadata.siteUrl`, you will need to set a custom [`resolveSiteUrl`](#resolveSiteUrl) function. If you override the query, you may need to pass in a custom [`resolvePagePath`](#resolvePagePath) or [`resolvePages`](#resolvePages) function to keep everything working. If you fetch pages without using the `allSitePage.nodes` query structure, you will definitely need to customize the [`resolvePages`](#resolvePages) function.
- [`resolveSiteUrl`](#resolveSiteUrl) (function) Takes the output of the data query and lets you return the site URL. Sync or async functions allowed.
- [`resolvePagePath`](#resolvePagePath) (function) Takes a page object and returns the URI of the page (no domain or protocol).
- [`resolvePages`](#resolvePages) (function) Takes the output of the data query and expects an array of page objects to be returned. Sync or async functions allowed.
- [`filterPages`](#filterPages) (function) Takes the current page and a string (or other object) from the `exclude` array and expects a boolean to be returned. `true` excludes the path, `false` keeps it.
- [`serialize`](#serialize) (function) Takes the output of `filterPages` and lets you return a sitemap entry. Sync or async functions allowed.

We _ALWAYS_ exclude the following pages: `/dev-404-page`,`/404` &`/offline-plugin-app-shell-fallback`, this cannot be changed.
The following pages are **always** excluded: `/dev-404-page`, `/404`, and `/offline-plugin-app-shell-fallback`. This cannot be changed, even by customizing the [`filterPages`](#filterPages) function.
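
To make the `entryLimit` arithmetic concrete, here is a minimal sketch of how a list of entries divides into per-file chunks. The `chunkEntries` helper and the sample entries are hypothetical and only illustrate the counting. This is not the plugin's internal implementation.

```javascript
// Default from the options list above.
const entryLimit = 45000

// Hypothetical helper: split entries into groups of at most `limit` items.
const chunkEntries = (entries, limit) => {
  const chunks = []
  for (let i = 0; i < entries.length; i += limit) {
    chunks.push(entries.slice(i, i + limit))
  }
  return chunks
}

// 100,000 made-up entries need three sitemap files at the default limit.
const entries = Array.from({ length: 100000 }, (_, i) => ({ url: `/page-${i}` }))
const sitemaps = chunkEntries(entries, entryLimit)

console.log(sitemaps.map(chunk => chunk.length)) // [ 45000, 45000, 10000 ]
```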

Example:
## Example:

```javascript
const siteUrl = process.env.URL || `https://fallback.net`

// In your gatsby-config.js
siteMetadata: {
siteUrl: `https://www.example.com`,
},
plugins: [
{
resolve: `gatsby-plugin-sitemap`,
options: {
output: `/some-other-sitemap.xml`,
// Exclude specific pages or groups of pages using glob parameters
// See: https://github.com/isaacs/minimatch
// The example below will exclude the single `path/to/page` and all routes beginning with `category`
exclude: [`/category/*`, `/path/to/page`],
query: `
module.exports = {
plugins: [
{
resolve: "gatsby-plugin-sitemap",
options: {
query: `
{
wp {
generalSettings {
siteUrl
}
}

allSitePage {
nodes {
path
}
}
}`,
resolveSiteUrl: ({site, allSitePage}) => {
//Alternatively, you may also pass in an environment variable (or any location) at the beginning of your `gatsby-config.js`.
return site.wp.generalSettings.siteUrl
},
serialize: ({ site, allSitePage }) =>
allSitePage.nodes.map(node => {
allWpContentNode(filter: {nodeType: {in: ["Post", "Page"]}}) {
nodes {
... on WpPost {
uri
modifiedGmt
}
... on WpPage {
uri
modifiedGmt
}
}
}
}
`,
resolveSiteUrl: () => siteUrl,
resolvePages: ({
allSitePage: { nodes: allPages },
allWpContentNode: { nodes: allWpNodes },
}) => {
const wpNodeMap = allWpNodes.reduce((acc, node) => {
const { uri } = node
acc[uri] = node

return acc
}, {})

return allPages.map(page => {
return { ...page, ...wpNodeMap[page.path] }
})
},
serialize: ({ path, modifiedGmt }) => {
return {
url: `${site.wp.generalSettings.siteUrl}${node.path}`,
changefreq: `daily`,
priority: 0.7,
url: path,
lastmod: modifiedGmt,
}
})
}
}
]
},
},
},
],
}
```

## Sitemap Index
## API Reference

<a id=resolveSiteUrl></a>

## resolveSiteUrl ⇒ <code>string</code>

Sync or async functions allowed.

**Returns**: <code>string</code> - - site URL, this can come from the graphql query or another scope.

| Param | Type | Description |
| ----- | ------------------- | ---------------------------- |
| data | <code>object</code> | Results of the GraphQL query |
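
A minimal sketch of a custom `resolveSiteUrl`, assuming the query returns the URL under a `wp.generalSettings.siteUrl` field. That data shape and the fallback URL are assumptions for illustration, so adjust them to match your own query.

```javascript
// Sample query result; the `wp.generalSettings.siteUrl` shape is an assumption.
const data = { wp: { generalSettings: { siteUrl: "https://www.example.com" } } }

// Prefer the URL from the query result and fall back to a fixed default.
const resolveSiteUrl = data =>
  data?.wp?.generalSettings?.siteUrl ?? "https://fallback.net"

const siteUrl = resolveSiteUrl(data)
const fallbackUrl = resolveSiteUrl({})

console.log(siteUrl, fallbackUrl)
```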

<a id=resolvePagePath></a>

## resolvePagePath ⇒ <code>string</code>

If you don't want to place the URI in `path` then `resolvePagePath`
is needed.

We also support generating `sitemap index`.
**Returns**: <code>string</code> - - uri of the page without domain or protocol

- [Split up your large sitemaps](https://support.google.com/webmasters/answer/75712?hl=en)
- [Using Sitemap index files (to group multiple sitemap files)](https://www.sitemaps.org/protocol.html#index)
| Param | Type | Description |
| ----- | ------------------- | ------------------- |
| page | <code>object</code> \| <code>string</code> | Array item returned from `resolvePages` |
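
As a sketch, a `resolvePagePath` for pages that keep their URI in a `uri` field instead of `path` could look like the following. The `uri` field name and the sample pages are assumptions.

```javascript
// Hypothetical page items: objects carrying `uri`, or plain strings.
const pages = [{ uri: "/blog/hello/" }, "/about/"]

// Return the URI without domain or protocol, whichever form the item takes.
const resolvePagePath = page => (typeof page === "string" ? page : page.uri)

const paths = pages.map(resolvePagePath)
console.log(paths) // [ '/blog/hello/', '/about/' ]
```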

<a id=resolvePages></a>

## resolvePages ⇒ <code>Array</code>

This allows custom resolution of the array of pages.
This is also where users could merge multiple sources into
a single array if needed. Sync or async functions allowed.

**Returns**: <code>object[]</code> - - Array of objects representing each page

| Param | Type | Description |
| ----- | ------------------- | ---------------------------- |
| data | <code>object</code> | results of the GraphQL query |
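
The merge described above can be sketched with sample data. The `allWpContentNode` source and its `uri` and `modifiedGmt` fields follow the example earlier in this README. The data itself is made up.

```javascript
// Made-up query result with two sources keyed by the same URI.
const data = {
  allSitePage: { nodes: [{ path: "/blog/hello/" }, { path: "/about/" }] },
  allWpContentNode: {
    nodes: [{ uri: "/blog/hello/", modifiedGmt: "2021-03-09T10:00:00" }],
  },
}

// Index the secondary source by URI, then merge it into each page.
const resolvePages = ({ allSitePage, allWpContentNode }) => {
  const byUri = new Map(allWpContentNode.nodes.map(node => [node.uri, node]))
  // Spreading `undefined` is a no-op, so unmatched pages pass through as-is.
  return allSitePage.nodes.map(page => ({ ...page, ...byUri.get(page.path) }))
}

const pages = resolvePages(data)
console.log(pages)
```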

<a id="filterPages"></a>

## filterPages ⇒ <code>boolean</code>

This allows filtering any data in any way.

This function is executed via:

```javascript
// In your gatsby-config.js
siteMetadata: {
siteUrl: `https://www.example.com`,
},
plugins: [
{
resolve: `gatsby-plugin-sitemap`,
options: {
sitemapSize: 5000
}
}
]
allPages.filter(
  page => !excludes.some(excludedRoute => thisFunc(page, excludedRoute, tools))
)
```

`allPages` is the results of the [`resolvePages`](#resolvePages) function.

**Returns**: <code>Boolean</code> - - `true` excludes the path, `false` keeps it.

| Param | Type | Description |
| ------------- | ------------------- | ----------------------------------------------------------------------------------- |
| page | <code>object</code> | |
| excludedRoute | <code>string</code> | Element from `exclude` Array in plugin config. |
| tools | <code>object</code> | contains tools for filtering `{ minimatch, withoutTrailingSlash, resolvePagePath }` |
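
Here is a hedged sketch of a custom `filterPages` that accepts both glob strings and `RegExp` entries in `exclude`. The `tools` object below is a simplified stand-in so the snippet runs on its own. In particular, its `minimatch` only handles exact paths and a trailing `/*`, and is not the real library. The plugin passes the real tools at build time.

```javascript
// Hypothetical custom filter: true excludes the page, false keeps it.
const filterPages = (page, excludedRoute, tools) => {
  const pagePath = tools.withoutTrailingSlash(tools.resolvePagePath(page))
  if (excludedRoute instanceof RegExp) {
    return excludedRoute.test(pagePath)
  }
  return tools.minimatch(pagePath, tools.withoutTrailingSlash(excludedRoute))
}

// Stand-in tools (assumed behavior), used only to make this sketch runnable.
const tools = {
  minimatch: (path, pattern) =>
    pattern.endsWith("/*")
      ? path.startsWith(pattern.slice(0, -1))
      : path === pattern,
  withoutTrailingSlash: path => (path === "/" ? path : path.replace(/\/$/, "")),
  resolvePagePath: page => page.path,
}

const allPages = [
  { path: "/blog/hello/" },
  { path: "/category/news" },
  { path: "/drafts/wip" },
]
const excludes = ["/category/*", /^\/drafts\//]

// Same execution shape as the snippet above.
const kept = allPages.filter(
  page => !excludes.some(excludedRoute => filterPages(page, excludedRoute, tools))
)
console.log(kept) // [ { path: '/blog/hello/' } ]
```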

<a id="serialize"></a>

## serialize ⇒ <code>object</code>

This function is executed by:

```javascript
allPages.map(page => thisFunc(page, tools))
```

Above is the minimal configuration to split a large sitemap.
When the number of URLs in a sitemap is more than 5000, the plugin will create sitemap (e.g. `sitemap-0.xml`, `sitemap-1.xml`) and index (e.g. `sitemap.xml`) files.
`allPages` is the result of the [`filterPages`](#filterPages) function. Sync or async functions allowed.

**Kind**: global variable

| Param | Type | Description |
| ----- | ------------------- | ---------------------------------------------------------------- |
| page | <code>object</code> | A single element from the results of the `resolvePages` function |
| tools | <code>object</code> | contains tools for serializing `{ resolvePagePath }` |
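
A minimal `serialize` sketch that emits `lastmod` only when a page has a modification date. The `modifiedGmt` field follows the example earlier in this README. The `tools` object here is a stand-in for the one the plugin passes in.

```javascript
// Stand-in for the plugin-provided tools (assumed shape).
const tools = { resolvePagePath: page => page.path }

// Build one sitemap entry per page; skip `lastmod` when the date is unknown.
const serialize = (page, tools) => {
  const entry = { url: tools.resolvePagePath(page) }
  if (page.modifiedGmt) {
    entry.lastmod = page.modifiedGmt
  }
  return entry
}

const allPages = [
  { path: "/blog/hello/", modifiedGmt: "2021-03-09T10:00:00" },
  { path: "/about/" },
]

// Same execution shape as the snippet above.
const entries = allPages.map(page => serialize(page, tools))
console.log(entries)
```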
10 changes: 6 additions & 4 deletions packages/gatsby-plugin-sitemap/package.json
Expand Up @@ -10,14 +10,14 @@
"@babel/runtime": "^7.12.5",
"common-tags": "^1.8.0",
"minimatch": "^3.0.4",
"pify": "^3.0.0",
"sitemap": "^1.13.0"
"sitemap": "^6.3.0"
},
"devDependencies": {
"@babel/cli": "^7.12.1",
"@babel/core": "^7.12.3",
"babel-preset-gatsby-package": "^1.4.0-next.0",
"cross-env": "^7.0.3"
"cross-env": "^7.0.3",
"gatsby-plugin-utils": "1.4.0-next.0"
},
"homepage": "https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby-plugin-sitemap#readme",
"keywords": [
Expand All @@ -39,7 +39,9 @@
"scripts": {
"build": "babel src --out-dir . --ignore \"**/__tests__\"",
"prepare": "cross-env NODE_ENV=production npm run build",
"watch": "babel -w src --out-dir . --ignore \"**/__tests__\""
"watch": "babel -w src --out-dir . --ignore \"**/__tests__\"",
"test": "jest",
"test:watch": "jest --watch"
},
"engines": {
"node": ">=12.13.0"
@@ -1,25 +1,26 @@
// Jest Snapshot v1, https://goo.gl/fbAQLP

exports[`Test plugin sitemap custom query runs 1`] = `
"<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>
<urlset xmlns=\\"http://www.sitemaps.org/schemas/sitemap/0.9\\" xmlns:news=\\"http://www.google.com/schemas/sitemap-news/0.9\\" xmlns:xhtml=\\"http://www.w3.org/1999/xhtml\\" xmlns:mobile=\\"http://www.google.com/schemas/sitemap-mobile/1.0\\" xmlns:image=\\"http://www.google.com/schemas/sitemap-image/1.1\\" xmlns:video=\\"http://www.google.com/schemas/sitemap-video/1.1\\">
<url> <loc>http://dummy.url/post/page-1</loc> <changefreq>weekly</changefreq> <priority>0.8</priority> </url>
</urlset>"
exports[`gatsby-plugin-sitemap Node API should accept a custom query 1`] = `
Array [
Object {
"changefreq": "weekly",
"priority": 0.8,
"url": "http://dummy.url/page-1",
},
]
`;

exports[`Test plugin sitemap default settings work properly 1`] = `
"<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>
<urlset xmlns=\\"http://www.sitemaps.org/schemas/sitemap/0.9\\" xmlns:news=\\"http://www.google.com/schemas/sitemap-news/0.9\\" xmlns:xhtml=\\"http://www.w3.org/1999/xhtml\\" xmlns:mobile=\\"http://www.google.com/schemas/sitemap-mobile/1.0\\" xmlns:image=\\"http://www.google.com/schemas/sitemap-image/1.1\\" xmlns:video=\\"http://www.google.com/schemas/sitemap-video/1.1\\">
<url> <loc>http://dummy.url/page-1</loc> <changefreq>daily</changefreq> <priority>0.7</priority> </url>
<url> <loc>http://dummy.url/page-2</loc> <changefreq>daily</changefreq> <priority>0.7</priority> </url>
</urlset>"
exports[`gatsby-plugin-sitemap Node API should succeed with default options 1`] = `
Array [
Object {
"changefreq": "daily",
"priority": 0.7,
"url": "http://dummy.url/page-1",
},
Object {
"changefreq": "daily",
"priority": 0.7,
"url": "http://dummy.url/page-2",
},
]
`;

exports[`Test plugin sitemap sitemap index set sitemap size and urls are less than it. 1`] = `
"<?xml version=\\"1.0\\" encoding=\\"UTF-8\\"?>
<urlset xmlns=\\"http://www.sitemaps.org/schemas/sitemap/0.9\\" xmlns:news=\\"http://www.google.com/schemas/sitemap-news/0.9\\" xmlns:xhtml=\\"http://www.w3.org/1999/xhtml\\" xmlns:mobile=\\"http://www.google.com/schemas/sitemap-mobile/1.0\\" xmlns:image=\\"http://www.google.com/schemas/sitemap-image/1.1\\" xmlns:video=\\"http://www.google.com/schemas/sitemap-video/1.1\\">
<url> <loc>http://dummy.url/page-1</loc> <changefreq>daily</changefreq> <priority>0.7</priority> </url>
<url> <loc>http://dummy.url/page-2</loc> <changefreq>daily</changefreq> <priority>0.7</priority> </url>
</urlset>"
`;

@@ -1,12 +1,12 @@
// Jest Snapshot v1, https://goo.gl/fbAQLP

exports[`Adds <Link> for site to head creates Link href with path prefix when __PATH_PREFIX__ sets 1`] = `
exports[`gatsby-plugin-sitemap SSR API creates Link href with path prefix when __PATH_PREFIX__ sets 1`] = `
[MockFunction] {
"calls": Array [
Array [
Array [
<link
href="/hogwarts/sitemap.xml"
href="/hogwarts/test-folder/sitemap-index.xml"
rel="sitemap"
type="application/xml"
/>,
Expand All @@ -22,13 +22,13 @@ exports[`Adds <Link> for site to head creates Link href with path prefix when __
}
`;

exports[`Adds <Link> for site to head creates Link if createLinkInHead is true 1`] = `
exports[`gatsby-plugin-sitemap SSR API should create a Link if createLinkInHead is true 1`] = `
[MockFunction] {
"calls": Array [
Array [
Array [
<link
href="/sitemap.xml"
href="/test-folder/sitemap-index.xml"
rel="sitemap"
type="application/xml"
/>,
Expand All @@ -44,4 +44,4 @@ exports[`Adds <Link> for site to head creates Link if createLinkInHead is true 1
}
`;

exports[`Adds <Link> for site to head does not create Link if createLinkInHead is false 1`] = `[MockFunction]`;
exports[`gatsby-plugin-sitemap SSR API should not create Link if createLinkInHead is false 1`] = `[MockFunction]`;
@@ -0,0 +1,12 @@
// Jest Snapshot v1, https://goo.gl/fbAQLP

exports[`gatsby-plugin-sitemap internals tests pageFilter should filter correctly 1`] = `
Array [
Object {
"path": "/to/keep/1",
},
Object {
"path": "/to/keep/2",
},
]
`;