Use Nightmare.js to Automate Headless Browsing
Nightmare.js is a high-level browser automation library, designed to automate browsing tasks for sites that don’t have APIs. The library itself is a wrapper around Electron, which Nightmare.js uses as a browser to interact with web sites. This guide helps you install Nightmare.js on Ubuntu 16.04 and run automation scripts without the need for a graphical user interface.
Before You Begin
Familiarize yourself with our Getting Started guide and complete the steps for setting your Linode’s hostname and timezone.
This guide uses sudo wherever possible. Complete the sections of our Securing Your Server guide to create a standard user account, harden SSH access, and remove unnecessary network services.
Update your system:
sudo apt-get update && sudo apt-get upgrade
Note
This guide is written for a non-root user. Commands that require elevated privileges are prefixed with sudo. If you’re not familiar with the sudo command, see the Users and Groups guide.
Install Node.js
The default Ubuntu 16.04 repositories are slow to receive recent versions of Node.js. Install a more recent version through the NodeSource PPA (formerly Chris Lea’s Launchpad PPA).
Install the NodeSource PPA:
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
Note
This command fetches the latest version of Node.js 6. To install a different major version, replace the 6.x in this example.
Install Node.js and NPM with the following command:
sudo apt-get install -y nodejs
Confirm that Node.js is successfully installed:
node --version
Check that the NPM command-line tool is successfully installed as well:
npm --version
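The same version check can also be made from inside a script, which may be useful as a guard before an automation job starts. The following is a minimal sketch: the helper name meetsMinimumMajor is hypothetical, and the minimum of 6 is an assumption chosen only to mirror the setup_6.x script used above.

```javascript
// Hypothetical guard: confirm the running Node.js version is recent enough.
// The minimum major version of 6 is an assumption that mirrors setup_6.x.
function meetsMinimumMajor(versionString, minimumMajor) {
  // process.version looks like "v6.11.2"; strip the leading "v" if present.
  const major = parseInt(versionString.replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

if (!meetsMinimumMajor(process.version, 6)) {
  console.error('Node.js 6 or later is expected, found ' + process.version);
  process.exitCode = 1;
}
```

Placing a guard like this at the top of a script makes a version mismatch visible immediately instead of surfacing later as a syntax error.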
Install Nightmare.js
To avoid installing Node packages globally, install Nightmare.js in a project directory. This example creates an automation directory within the current user’s home directory as the base of the project.
Create and switch to the automation directory:
mkdir ~/automation && cd ~/automation
Initialize an NPM project. NPM prompts you to provide a name, repository, and other details for the project. Accept the default values or assign whatever names you want. To accept the defaults automatically, add the -f (force) flag to this example:
npm init
Install Nightmare.js:
npm install --save nightmare
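The --save flag records the package under dependencies in package.json. One way to confirm this is a short script run from the project directory. This is a sketch only: dependencyRange is a hypothetical helper name, and the script assumes it is started from ~/automation.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the version range recorded for `name`
// in a parsed package.json object, or null if it is not listed.
function dependencyRange(pkg, name) {
  const deps = pkg.dependencies || {};
  return Object.prototype.hasOwnProperty.call(deps, name) ? deps[name] : null;
}

// Assumes the current working directory is the project directory.
const manifestPath = path.join(process.cwd(), 'package.json');
if (fs.existsSync(manifestPath)) {
  const pkg = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
  console.log('nightmare: ' + (dependencyRange(pkg, 'nightmare') || 'not listed'));
}
```

If the script reports that nightmare is not listed, rerun the install command from inside the project directory.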
Create and Run the Automation Script
Nightmare.js is an NPM module, so it can be imported from within a Node.js script. Use these examples to write a simple script that will search Linode’s documentation for guides about Ubuntu.
Nightmare.js uses the Electron browser and requires an X server. Install xvfb and its dependencies so that you can run graphical applications without display hardware:
sudo apt-get install -y xvfb x11-xkb-utils xfonts-100dpi xfonts-75dpi xfonts-scalable xfonts-cyrillic x11-apps clang libdbus-1-dev libgtk2.0-dev libnotify-dev libgnome-keyring-dev libgconf2-dev libasound2-dev libcap-dev libcups2-dev libxtst-dev libxss1 libnss3-dev gcc-multilib g++-multilib
Create linode.js inside the automation directory and add the following:
- File: ~/automation/linode.js
const Nightmare = require('nightmare');
const nightmare = Nightmare({show: true});

nightmare
  // Load the docs page and submit a search for "ubuntu".
  .goto('https://www.linode.com/docs')
  .insert('.ais-SearchBox-input', 'ubuntu')
  .click('.ais-SearchBox-submit')
  // Wait until the results list is present in the page.
  .wait('.ais-Hits-list')
  // This function runs inside the page and returns plain data to Node.js.
  .evaluate(function() {
    let searchResults = [];

    const results = document.querySelectorAll('a.c-search__result__link');
    results.forEach(function(result) {
      let row = {
        'title': result.innerText,
        'url': result.href
      };
      searchResults.push(row);
    });
    return searchResults;
  })
  .end()
  // Print the title and URL of each result.
  .then(function(result) {
    result.forEach(function(r) {
      console.log('Title: ' + r.title);
      console.log('URL: ' + r.url);
    });
  })
  .catch(function(e) {
    console.log(e);
  });
Run the script:
xvfb-run node linode.js
The script visits the Linode docs page, enters ‘ubuntu’ into the search box, and clicks the submit button. It then waits for the results to load and prints the title and URL of each entry on the first page of results.
The output will resemble the following:
Title: How to Install a LAMP Stack on Ubuntu 16.04
URL: https://www.linode.com/docs/web-servers/lamp/install-lamp-stack-on-ubuntu-16-04
Title: Install and Configure MySQL Workbench on Ubuntu 16.04
URL: https://www.linode.com/docs/databases/mysql/install-and-configure-mysql-workbench-on-ubuntu
Title: Install MongoDB on Ubuntu 16.04 (Xenial)
URL: https://www.linode.com/docs/databases/mongodb/install-mongodb-on-ubuntu-16-04
...
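The printing step can be separated from the browser automation so that the output format can be checked without launching Electron. The following is a sketch, assuming rows with the same title and url keys that the evaluate callback in linode.js returns. The helper name formatResults and the sample row are hypothetical.

```javascript
// Hypothetical helper: turn rows shaped like { title, url } into the
// "Title: ..." and "URL: ..." lines that linode.js prints.
function formatResults(rows) {
  const lines = [];
  rows.forEach(function (r) {
    lines.push('Title: ' + r.title);
    lines.push('URL: ' + r.url);
  });
  return lines.join('\n');
}

// Sample data for illustration only; real rows come from the evaluate() call.
const sample = [
  { title: 'Example Guide', url: 'https://www.example.com/docs/example-guide' }
];
console.log(formatResults(sample));
```

With a helper like this, the then() handler in linode.js could be reduced to a single console.log(formatResults(result)) call.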
Add a Cron Job to Run the Automation Script
This example schedules the script to run once every hour. The cron job changes to the ~/automation/ directory, runs the scraping script, and saves the output to a file whose name includes the date and time of the run.
For more information about using Cron, see our Schedule Tasks with Cron guide.
Open the crontab file:
crontab -e
Add the following line to the end of the file:
- File: crontab
0 * * * * cd ~/automation && xvfb-run node linode.js >> data_$(date +\%Y_\%m_\%d_\%I_\%M_\%p).txt
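To see what filename the date format in the crontab line produces, the same pattern can be reproduced in Node.js. This is a sketch rather than part of the cron job: outputFileName and pad2 are hypothetical helper names, and the AM/PM text assumes an English or C locale for the shell's %p.

```javascript
// Zero-pad a number to two digits.
function pad2(n) {
  return (n < 10 ? '0' : '') + n;
}

// Mirrors the shell format +%Y_%m_%d_%I_%M_%p using the local time zone.
// Assumes %p renders as AM or PM, as it does in English and C locales.
function outputFileName(date) {
  const hour24 = date.getHours();
  const hour12 = hour24 % 12 === 0 ? 12 : hour24 % 12; // %I: 01 to 12
  const meridiem = hour24 < 12 ? 'AM' : 'PM';          // %p
  return 'data_' + [
    date.getFullYear(),
    pad2(date.getMonth() + 1),
    pad2(date.getDate()),
    pad2(hour12),
    pad2(date.getMinutes()),
    meridiem
  ].join('_') + '.txt';
}

// A run at 3:00 PM local time on 5 January 2017:
console.log(outputFileName(new Date(2017, 0, 5, 15, 0)));
// prints "data_2017_01_05_03_00_PM.txt"
```

Because the hour is written in 12-hour form, the AM or PM suffix is what keeps the morning and afternoon files distinct.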