Ferrum is a pure-Ruby API to Chrome that talks to the browser directly over the Chrome DevTools Protocol, so it doesn't require Puppeteer or Node. Thus, it could potentially collapse our Grover (Ruby) → Puppeteer (Node) → Chrome stack to just Ferrum (Ruby) → Chrome.
(It's not documented yet, but Ferrum recently added the call we need, set_content, for passing in an HTML string rather than a URL to scrape.)
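To make the proposed Ferrum (Ruby) → Chrome path concrete, here is a minimal sketch of rendering an HTML string to PDF. render_pdf is a hypothetical helper, not part of Ferrum or Grover. It assumes the page object responds to content= (the setter that set_content is understood to alias) and to pdf with encoding: and format: options; these should be checked against whichever Ferrum version we install. The browser is passed in so the flow can be exercised without Chrome.

```ruby
# Hypothetical helper (not part of Ferrum or Grover): render an HTML string
# to PDF bytes using a Ferrum-style browser object supplied by the caller.
def render_pdf(html, browser:)
  page = browser.create_page
  # Assumption: content= is the setter form of set_content; verify against
  # the installed Ferrum version.
  page.content = html
  # Assumption: encoding: :binary returns raw PDF bytes instead of Base64.
  page.pdf(encoding: :binary, format: :A4)
ensure
  # Close the page even if rendering raised; the caller owns the browser.
  page&.close
end

# Usage with the real gem (assumes `gem "ferrum"` and a local Chrome):
#   require "ferrum"
#   browser = Ferrum::Browser.new
#   File.binwrite("out.pdf", render_pdf("<h1>Hello</h1>", browser: browser))
#   browser.quit
```

Keeping the browser outside the helper means one Chrome process could be reused across many renders, which is where most of the hoped-for performance gain over spawning Node would come from.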
We hope this would be a drop-in replacement that is quick to implement, reduces system complexity, and improves performance. A significant open question is whether Ferrum will run properly on Heroku; getting Grover + Puppeteer working there required a few Heroku-specific tweaks (like the puppeteer buildpack and the GROVER_NO_SANDBOX config var). Community reports say it does and point to a working example, though we still have to verify this ourselves.
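If Ferrum needs the same sandbox workaround on Heroku, the configuration might look something like the sketch below. ferrum_options and the FERRUM_NO_SANDBOX variable are hypothetical, named by analogy with GROVER_NO_SANDBOX. The browser_options: { "no-sandbox": nil } form is how Ferrum is understood to pass extra command-line flags to Chrome; whether Heroku actually requires it is part of what we have to verify.

```ruby
# Hypothetical helper: build the options hash for Ferrum::Browser.new,
# disabling Chrome's sandbox only when the environment asks for it.
# FERRUM_NO_SANDBOX is an invented variable name, not something Ferrum reads.
def ferrum_options(env = ENV)
  options = { headless: true }
  if env["FERRUM_NO_SANDBOX"] == "true"
    # Assumption: a nil value passes the flag to Chrome without an argument.
    options[:browser_options] = { "no-sandbox": nil }
  end
  options
end

# Usage with the real gem (assumes `gem "ferrum"`):
#   require "ferrum"
#   browser = Ferrum::Browser.new(**ferrum_options)
```

Gating the flag on an environment variable keeps the sandbox enabled in local development and turns it off only on the dynos that need it.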