Merged
Changes from 2 commits
5 changes: 5 additions & 0 deletions packages/gatsby-plugin-screenshot/.gitignore
@@ -0,0 +1,5 @@
/*.js
!index.js
yarn.lock
lambda
lambda-package.zip
41 changes: 41 additions & 0 deletions packages/gatsby-plugin-screenshot/.npmignore
@@ -0,0 +1,41 @@
# Logs
logs
*.log

# Runtime data
pids
*.pid
*.seed

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage

# Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# node-waf configuration
.lock-wscript

# Compiled binary addons (http://nodejs.org/api/addons.html)
build/Release

# Dependency directory
# https://www.npmjs.org/doc/misc/npm-faq.html#should-i-check-my-node_modules-folder-into-git
node_modules
*.un~
yarn.lock
src
flow-typed
coverage
decls
examples

# Lambda-related
src/lambda
lambda
lambda-package.json
chrome
lambda-package.zip
59 changes: 59 additions & 0 deletions packages/gatsby-plugin-screenshot/README.md
@@ -0,0 +1,59 @@
# gatsby-plugin-screenshot
Contributor

This plugin should be named gatsby-transformer-screenshot


Plugin for creating screenshots of website URLs using an AWS Lambda
Function. This plugin looks for `SitesYaml` nodes with a `url`
property, and creates `Screenshot` nodes with an `imageFile` field.

[Live demo](https://thatotherperson.github.io/gatsby-screenshot-demo/)
([source](https://github.com/ThatOtherPerson/gatsby-screenshot-demo))

## Install

`npm install gatsby-plugin-screenshot`

## How to use

```javascript
// in your gatsby-config.js
plugins: [
  {
    resolve: `gatsby-plugin-screenshot`,
    options: {
      lambdaName: `gatsby-screenshot-lambda`,
      region: 'us-west-2',
      credentials: { // optional
        accessKeyId: 'xxxx',
Contributor

Adding these here isn't ideal as this plugin will be used in public repos (e.g. gatsbyjs.org). Don't we just need the URL for the lambda?

Contributor Author

Look at https://github.com/gatsbyjs/gatsby/pull/3526/files/b3797101f312b54492b021d874518542afb5c28c#diff-9d836096b5349fa77536e46b0dfb13f4R34 - providing credentials here is optional. You can provide credentials through the other methods AWS provides - for example, on my computer I have them in ~/.aws/credentials.

Contributor

Sure but I mean anyone trying to run gatsbyjs.org locally now won't be able to see screenshots unless we distribute the access key, etc. It'd be better if the lambda had an associated API gateway instead of invoking the lambda directly.

        secretAccessKey: 'xxxx',
        sessionToken: 'xxxx' // optional
      }
    }
  }
]
```

AWS provides several ways to configure credentials; see the [AWS SDK for JavaScript documentation](https://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/setting-credentials-node.html) for details. If you set `credentials` in this plugin's options, they override all the other methods.

## How to query

You can query for screenshot files as shown below:

```graphql
{
  allSitesYaml {
    edges {
      node {
        url
        childScreenshot {
          imageFile
        }
      }
    }
  }
}
```

`imageFile` is a PNG file like any other loaded from your filesystem, so you can use this plugin in combination with `gatsby-image`.

## Lambda setup

To build the Lambda package, run `npm run lambda-package` in this directory. This generates a file called `lambda-package.zip`; upload it as the source of your AWS Lambda function. You will also need to create an S3 bucket; screenshots are saved in the root of this bucket. Set the cache expiration time by creating a [Lifecycle Policy](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/create-lifecycle.html) for the bucket that marks objects for expiration after your desired number of days. Finally, set `S3_BUCKET` as an environment variable for the Lambda function, and set the `lambdaName` property in `gatsby-config.js` to the function's name.
Contributor

@KyleAMathews, Jan 15, 2018

How about naming the script npm run build-lambda-package?

Contributor

Also it'd be good to fill out the description here as many people don't know what Lambda is or why they'd need to do something special in conjunction with setting up this plugin.

Binary file not shown.
35 changes: 35 additions & 0 deletions packages/gatsby-plugin-screenshot/chrome/buildChrome.sh
@@ -0,0 +1,35 @@
# build headless chrome on EC2
# https://github.com/adieuadieu/serverless-chrome/blob/master/chrome/README.md

# sudo su

yum install -y git redhat-lsb python bzip2 tar pkgconfig atk-devel alsa-lib-devel bison binutils brlapi-devel bluez-libs-devel bzip2-devel cairo-devel cups-devel dbus-devel dbus-glib-devel expat-devel fontconfig-devel freetype-devel gcc-c++ GConf2-devel glib2-devel glibc.i686 gperf glib2-devel gtk2-devel gtk3-devel java-1.*.0-openjdk-devel libatomic libcap-devel libffi-devel libgcc.i686 libgnome-keyring-devel libjpeg-devel libstdc++.i686 libX11-devel libXScrnSaver-devel libXtst-devel libxkbcommon-x11-devel ncurses-compat-libs nspr-devel nss-devel pam-devel pango-devel pciutils-devel pulseaudio-libs-devel zlib.i686 httpd mod_ssl php php-cli python-psutil wdiff --enablerepo=epel
Contributor

Who uses this script?

Contributor Author

This script is for building a new version of Chrome that's compatible with AWS Lambda; see here for more information: https://github.com/sambaiz/puppeteer-lambda-starter-kit#build-headless-chrome-optional. At the moment a precompiled binary is included in this PR.


cd ~
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
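# single quotes so $PATH and $HOME are expanded when the profile is sourced, not when this line is written
echo 'export PATH=$PATH:$HOME/depot_tools' >> ~/.bash_profile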
source ~/.bash_profile

mkdir Chromium
cd Chromium
fetch --no-history chromium
cd src

# use /tmp instead of /dev/shm
# https://groups.google.com/a/chromium.org/forum/#!msg/headless-dev/qqbZVZ2IwEw/CPInd55OBgAJ
sed -i -e "s/use_dev_shm = true;/use_dev_shm = false;/g" base/files/file_util_posix.cc

mkdir -p out/Headless
echo 'import("//build/args/headless.gn")' > out/Headless/args.gn
echo 'is_debug = false' >> out/Headless/args.gn
echo 'symbol_level = 0' >> out/Headless/args.gn
echo 'is_component_build = false' >> out/Headless/args.gn
echo 'remove_webcore_debug_symbols = true' >> out/Headless/args.gn
echo 'enable_nacl = false' >> out/Headless/args.gn
gn gen out/Headless
ninja -C out/Headless headless_shell

cd out/Headless
tar -zcvf /home/ec2-user/headless_shell.tar.gz headless_shell

# scp [email protected]:~/headless_shell.tar.gz .
Empty file.
10 changes: 10 additions & 0 deletions packages/gatsby-plugin-screenshot/lambda-package.json
@@ -0,0 +1,10 @@
{
  "dependencies": {
    "puppeteer": "^1.0.0",
    "tar": "^4.2.0",
    "tmp": "0.0.33"
  },
  "devDependencies": {
    "aws-sdk": "^2.181.0"
  }
}
27 changes: 27 additions & 0 deletions packages/gatsby-plugin-screenshot/package.json
@@ -0,0 +1,27 @@
{
"name": "gatsby-plugin-screenshot",
"version": "1.0.0",
"description": "(TODO: edit) Uses AWS Lambda to take screenshots of websites",
Contributor

Finish TODO

"main": "index.js",
"dependencies": {
"aws-sdk": "^2.181.0"
},
"devDependencies": {
"babel-cli": "^6.26.0",
"cross-env": "^5.1.3"
},
"scripts": {
"build": "babel src --out-dir . --ignore __tests__",
"watch": "babel -w src --out-dir . --ignore __tests__",
"prepublish": "cross-env NODE_ENV=production npm run build",
"lambda-package": "npm run lambda-package-prepare && cp chrome/headless_shell.tar.gz lambda && cd lambda && zip -rq ../lambda-package.zip .",
"lambda-package-prepare": "npm run build && cp lambda-package.json lambda/package.json && cd lambda && PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=1 npm install --production"
},
"keywords": [
"gatsby",
"gatsby-plugin",
"screenshot"
],
"author": "David Beckley <[email protected]>",
"license": "MIT"
}
146 changes: 146 additions & 0 deletions packages/gatsby-plugin-screenshot/src/gatsby-node.js
@@ -0,0 +1,146 @@
const crypto = require(`crypto`)
const AWS = require(`aws-sdk`)
const _ = require(`lodash`)
const { createRemoteFileNode } = require(`gatsby-source-filesystem`)

// Lambda service object, initialized in onPreBootstrap
let lambda

const createContentDigest = obj =>
  crypto
    .createHash(`md5`)
    .update(JSON.stringify(obj))
    .digest(`hex`)

exports.onPreBootstrap = (
  { store, cache, boundActionCreators },
  pluginOptions
) => {
  const { createNode, touchNode } = boundActionCreators

  // Set up the lambda service object based on configuration options

  if (!pluginOptions.lambdaName) {
    console.log(`
      gatsby-plugin-screenshot requires a lambdaName option. Please specify
      the name of the AWS Lambda function to invoke.
    `)
    process.exit(1)
  }

  const options = {
    params: { FunctionName: pluginOptions.lambdaName },
    apiVersion: `2015-03-31`,
  }

  if (pluginOptions.region) {
    options.region = pluginOptions.region
  }

  if (pluginOptions.credentials) {
    options.credentials = pluginOptions.credentials
  }

  lambda = new AWS.Lambda(options)

  // Check for updated screenshots
  // and prevent Gatsby from garbage collecting remote file nodes
  return Promise.all(
    _.values(store.getState().nodes)
      .filter(n => n.internal.type === `Screenshot`)
      .map(async n => {
        if (n.expires && new Date() >= new Date(n.expires)) {
          // Screenshot expired, re-run Lambda
          await createScreenshotNode({
            url: n.url,
            parent: n.parent,
            store,
            cache,
            createNode,
          })
        } else {
          // Screenshot hasn't yet expired, touch the image node
          // to prevent garbage collection
          touchNode(n.imageFile___NODE)
        }
      })
  )
}
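
The expiry decision in `onPreBootstrap` can be isolated as a small pure function, which makes it easy to check without invoking the Lambda. This is only a sketch: `isExpired` and `partitionByExpiry` are hypothetical names not present in this PR, and the `now` parameter is added here purely so the check can be tested with a fixed date.

```javascript
// Hypothetical helpers mirroring the check in onPreBootstrap.
// A node without an `expires` value is treated as not expired.
const isExpired = (node, now = new Date()) =>
  Boolean(node.expires) && now >= new Date(node.expires)

// Split cached Screenshot nodes into those that need a new Lambda run
// and those whose image node only needs to be touched
const partitionByExpiry = (nodes, now = new Date()) => ({
  expired: nodes.filter(n => isExpired(n, now)),
  fresh: nodes.filter(n => !isExpired(n, now)),
})
```
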

exports.onCreateNode = async ({ node, boundActionCreators, store, cache }) => {
  const { createNode, createParentChildLink } = boundActionCreators

  // We only care about parsed sites.yaml files
  if (node.internal.type !== `SitesYaml`) {
    return
  }

  const screenshotNode = await createScreenshotNode({
    url: node.url,
    parent: node.id,
    store,
    cache,
    createNode,
  })

  createParentChildLink({
    parent: node,
    child: screenshotNode,
  })
}

const getScreenshot = url => {
  const params = {
    Payload: JSON.stringify({ url }),
  }

  return new Promise((resolve, reject) => {
    lambda.invoke(params, (err, data) => {
      if (err) {
        reject(err)
        return
      }

      const payload = JSON.parse(data.Payload)

      // FunctionError is set when the function itself failed; the payload
      // then describes the error rather than a screenshot
      if (
        typeof data.FunctionError === `string` &&
        data.FunctionError.length > 0
      ) {
        reject(payload)
        return
      }

      resolve(payload)
    })
  })
}
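
The response handling above can also be written as a standalone function that takes the `data` object passed to the `lambda.invoke` callback. This is a sketch rather than the code in this PR: `parseInvokeResult` is a hypothetical name, and reading `errorMessage` from the error payload assumes the function reports failures in the usual Node.js Lambda error shape.

```javascript
// Hypothetical helper: turns the `data` object from lambda.invoke into
// either the parsed payload or a thrown Error.
const parseInvokeResult = data => {
  const payload = JSON.parse(data.Payload)

  // A non-empty FunctionError means the function failed and the payload
  // describes the error instead of a screenshot
  if (typeof data.FunctionError === `string` && data.FunctionError.length > 0) {
    // `errorMessage` is an assumption about the error payload's shape
    const error = new Error(payload.errorMessage || `Lambda function failed`)
    error.payload = payload
    throw error
  }

  return payload
}
```

Throwing an `Error` instead of rejecting with the raw payload keeps a stack trace and a readable message when the failure surfaces during a build.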

const createScreenshotNode = async ({
  url,
  parent,
  store,
  cache,
  createNode,
}) => {
  const screenshotResponse = await getScreenshot(url)

  const fileNode = await createRemoteFileNode({
    url: screenshotResponse.url,
    store,
    cache,
    createNode,
  })

  const screenshotNode = {
    id: `${parent} >>> Screenshot`,
    url,
    expires: screenshotResponse.expires,
    parent,
    children: [],
    internal: {
      type: `Screenshot`,
    },
    imageFile___NODE: fileNode.id,
Contributor

maybe name this screenshotFile instead? image is a bit generic.

  }

  screenshotNode.internal.contentDigest = createContentDigest(screenshotNode)

  createNode(screenshotNode)

  return screenshotNode
}
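
A short, self-contained illustration of how `createContentDigest` behaves may help when reasoning about when the `Screenshot` node counts as changed. The sample objects are made up for this sketch. Because the digest hashes the `JSON.stringify` output, it depends on property order as well as on the values.

```javascript
const crypto = require(`crypto`)

// Same definition as in gatsby-node.js above
const createContentDigest = obj =>
  crypto
    .createHash(`md5`)
    .update(JSON.stringify(obj))
    .digest(`hex`)

// Made-up sample data for illustration
const a = createContentDigest({ url: `https://example.com`, expires: null })
const b = createContentDigest({ url: `https://example.com`, expires: null })
const c = createContentDigest({ expires: null, url: `https://example.com` })

console.log(a === b) // true: identical content gives an identical digest
console.log(a === c) // false: same fields in a different order hash differently
```

Since `screenshotNode` is always built with its fields in the same order, the order sensitivity should not matter here; the digest changes when `url`, `expires`, `parent`, or the linked file node id changes.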