PhantomJS: Archiving the project, suspending development

(github.com)

586 points | gowan | 8y ago | 132 comments

emilsedgh8y ago

Chrome and Firefox gaining headless modes is the ultimate goal Phantom could've achieved.

So I consider it a complete success.

Kudos to all contributors.

tyingq8y ago

I feel the same. No failure here. It served its purpose very well, while it was needed. It’s retiring while it’s still better than its replacement in a few areas...like proxy support.

overdrivetg8y ago

This is the main blocker for me right now for sure.

BinaryIdiot8y ago

I really hope Microsoft offers a headless IE / Edge at some point. It would be amazing to be able to use all 3 major browsers like this. Heck, get Safari in there too (though I feel like it should be doable with WebKit already).

poizan428y ago

Huh? They have had the WebBrowser ActiveX control since IE3 I think. That works as headless as you get on Windows.

luisrudge8y ago

they have webdriver support. Is that enough? I'm not sure what's the difference between headless and webdriver (if any) https://blogs.windows.com/msedgedev/2015/07/23/bringing-auto...

tnolet8y ago

Agree 100%. Headless & programmable browsing is a hard nut to crack. PhantomJs paved the way.

kodablah8y ago

Sadly, Chrome won't support extensions: https://bugs.chromium.org/p/chromium/issues/detail?id=706008

rnnr8y ago

but... i think you can attach it to an existing chrome with remote debugging enabled, so not that big a deal!

ricardobeat8y ago

The main advantage of PhantomJS at this point is the way it's built and distributed as a single binary, unlike Chrome. Much easier to maintain.

ris8y ago

Correction: much easier to maintain irresponsibly. That binary you just downloaded and stuck somewhere some time ago - when was the last time it was updated? What about all of its underlying bundled static libraries? How certain were you ever of their up-to-dateness? Did you ever check what version of zlib it was using?

Distributions attempting to package phantomjs properly had one hell of a time trying to reproduce its builds reliably. Most gave up.

Distribution from author as binaries is a whole bundle of fail from the get-go.

fareesh8y ago

Does headless chrome do PDF generation? That's the only thing I'm using phantom for at the moment.

madeofpalk8y ago

Yes.

In fact, there's a command line switch for it https://developers.google.com/web/updates/2017/04/headless-c...

The Chrome team also makes Puppeteer, a Node library for interfacing with headless Chrome, which has methods for making PDFs as well https://github.com/GoogleChrome/puppeteer

dewey8y ago

First result:

--print-to-pdf

https://developers.google.com/web/updates/2017/04/headless-c...

itslennysfault8y ago

Yes.

It's really easy to do using [puppeteer](https://github.com/GoogleChrome/puppeteer). The 2nd or 3rd example is PDF.

wgjordan8y ago

This project has been effectively dead since April 2017, when Vitallium stepped down as maintainer as soon as Headless Chrome was announced [1]:

> Headless Chrome is coming [...] I think people will switch to it, eventually. Chrome is faster and more stable than PhantomJS. And it doesn't eat memory like crazy. [...] I don't see any future in developing PhantomJS. Developing PhantomJS 2 and 2.5 as a single developer is a bloody hell.

One potential path forward could have been to have PhantomJS support Headless Chrome as a runtime [2], which Paul Irish (of Google Chrome team) reached out to PhantomJS about. However, it seems there hasn't been enough interest/resources to ever make this happen.

[1] https://groups.google.com/d/msg/phantomjs/9aI5d-LDuNE/5Z3SMZ...

[2] https://github.com/ariya/phantomjs/issues/14954

micimize8y ago

Timeline of what led to this, from what I could gather:

• phantomjs is 7 years old, @pixiuPL has been contributing for about 2 months

• @ariya didn't respond to his requests for owner level permissions

• @pixiuPL published an open letter to the main page of phantomjs.org https://github.com/ariya/phantomjs/issues/15345

• the stress leads @ariya to close the repo.

• @pixiuPL intends to continue development on a fork

This is a good reminder of why non-technical skills are so important in open source and in general.

rilut8y ago

I don't know man, but look at @pixiuPL's commits https://github.com/ariya/phantomjs/commits?author=pixiuPL

Especially his own commits (non-merge commits)

enitihas8y ago

Exactly. I mean look at his commits. Look at this file: https://github.com/ariya/phantomjs/blob/master/package.json (He created a commit only to add his own name as a contributor to package.json, which is the only name in package.json).

How did his changes even make it to the repo? There are commits adding and deleting whitespace with the disguised commit message of "Refactoring Code". I have no doubt why ariya couldn't work with him.

ricardobeat8y ago

Changing default indentation from 2 spaces to tabs (without updating existing files?): https://github.com/ariya/phantomjs/commit/6485f5466110fc8d4b...

I couldn't find a single one containing any meaningful code changes. The closest one is a81a38f[1] which seems to introduce bugs - removing open file check, plus a hanging if clause.

Sounds like it's either an elaborate prank, or the guy has no grounding in reality.

[1] https://github.com/ariya/phantomjs/commit/a81a38ffabe2cea715...

bcherny8y ago

What about them?

watwut8y ago

Who is lacking non-technical skills in that scenario? I just don't see it.

enitihas8y ago

I think looking at @pixiuPL's commits (https://github.com/ariya/phantomjs/commits?author=pixiuPL) and his tall, nonsensical claims makes it clear who lacks the basic human communication skills. Just have a look for yourself.

Looking at some issues filed by him (https://github.com/composer/composer/issues/7016) makes the entire thing more clear.

TheAceOfHearts8y ago

Some people are mentioning headless Chromium, so I wanna mention another tool I've used to replace some of phantomjs' functionality: jsdom [0].

It's much more lightweight than a real browser, and it doesn't require large extra binaries.

I don't do any complex scraping, but occasionally I want to pull down and aggregate a site's data. For most pages, it's as simple as making a request and passing the response into a new jsdom instance. You can then query the DOM using the same built-in browser APIs you're already familiar with.

I've previously used jsdom to run a large web app's tests on node, which provided a huge performance boost and drastically lowered our build times. As long as you maintain a good architecture (i.e. isolating browser specific bits from your business logic) you're unlikely to encounter any pitfalls. Our testing strategy was to use node and jsdom during local testing and on each commit. IMO, you should generally only need to run tests on an actual browser before each release (as a safety net), and possibly on a regular schedule (if your release cycle is long).

[0] https://www.npmjs.com/package/jsdom

AlphaWeaver8y ago

Cheerio [0] is fantastic for this as well...

[0]: https://www.npmjs.com/package/cheerio

TheAceOfHearts8y ago

I've tried Cheerio as well, but I prefer JSDOM since it exposes the DOM APIs. What I'll normally do is interactively test things out in the browser's console, and then transfer em over to my script. Browser dev tools are just super amazing.

Rapzid8y ago

Using Cheerio with TypeScript types to parse data from tables in downloaded HTML files. Great tool.

madeofpalk8y ago

One question I've had recently is how to scrape a JavaScript object out of HTML source. With server-side react + redux, I've wanted to be able to scrape out the serialised var __STATE__ = {...} object to JSON, from nodejs. Best solution I cobbled together was to basically eval() the JS source, which I know is far from ideal.

seeekr8y ago

You could use a parser like esprima or its equivalent from the babeljs ecosystem on the JS source instead and just find the global variable with name `__STATE__` and just eval its init expression. Cheaper, more secure, more direct than actually running the JS.

TheAceOfHearts8y ago

You can use the vm module [0] to securely execute the code.

[0] https://nodejs.org/api/vm.html
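
For anyone landing here with the same `__STATE__` problem, here is a minimal sketch of the "pull out only the initializer" idea using nothing but Node's standard library. Assumptions: the page declares the state as `var __STATE__ = {...}`, the helper names `extractObjectLiteral` and `readState` are made up for illustration, and the brace matcher is a crude stand-in for a real parser such as esprima (it will be confused by comments or regex literals that contain braces). Also note that the Node documentation says the `vm` module is not a security mechanism, so the fallback path should only ever see input you trust.

```javascript
const vm = require('vm');

// Return the text of the object literal assigned to `name`, or null.
// Braces inside string literals are skipped so they do not affect nesting.
function extractObjectLiteral(source, name) {
  // Assumes `name` contains no regex metacharacters (true for __STATE__).
  const decl = new RegExp('(?:var|let|const)\\s+' + name + '\\s*=\\s*');
  const match = decl.exec(source);
  if (!match) return null;
  const start = match.index + match[0].length;
  if (source[start] !== '{') return null;

  let depth = 0;
  let quote = null; // the quote character of the string we are inside, if any
  for (let i = start; i < source.length; i++) {
    const ch = source[i];
    if (quote) {
      if (ch === '\\') i++; // skip the escaped character
      else if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'" || ch === '`') {
      quote = ch;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) return source.slice(start, i + 1);
    }
  }
  return null; // unbalanced braces
}

// Turn the extracted literal into plain data.
function readState(html, name = '__STATE__') {
  const literal = extractObjectLiteral(html, name);
  if (literal === null) return null;
  try {
    // Preferred path: the literal is already valid JSON, nothing is executed.
    return JSON.parse(literal);
  } catch (err) {
    // Fallback: evaluate only the literal (not the whole script) in a fresh
    // context with a timeout. This is NOT a sandbox for untrusted input.
    const value = vm.runInNewContext('(' + literal + ')', {}, { timeout: 100 });
    return JSON.parse(JSON.stringify(value));
  }
}
```

Calling `readState(html)` on the downloaded page source returns the state as a plain object, or `null` if no matching declaration is found. Most server-rendered state is emitted as JSON anyway, in which case the `vm` fallback never runs.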

tonto8y ago

Right before release is a bad time to realize there are problems with your build

chrisweekly8y ago

Better than right after release.

tokenizerrr8y ago

Depends on how often you release and who sets the schedule.

h1d8y ago

jsdom is pretty unforgiving and won't load broken HTML. That's where I stopped using it. Could use some tidy tool maybe.

draw_down8y ago

jsdom is an impressive achievement, but it may not be what you want depending on what you’re trying to do. It doesn’t mimic the behavior of browsers well in a number of regards, so it will let you do things that real browsers don’t allow. If you’re doing integration-type testing that can lead to tests that pass but functionality that fails in real browsers.

enitihas8y ago

For those who haven't looked at some of the commits by @pixiuPL, the list is here : https://github.com/ariya/phantomjs/commits?author=pixiuPL.

To summarize: It does not look like the guy has done a single commit with any meaning. His commits are basically the following:

1. Adding his own name in package.json.
2. Adding and deleting whitespace.
3. Deleting the entire project and committing.
4. Adding the entire project back again and committing.

Just out of curiosity: how likely is it that someone could use a large number of such non-functional commits (adding and removing whitespace) to a popular open source repository to advance their career? (E.g., claiming that they made 50 commits to a popular project might sound impressive in an interview.)

captain_murdock8y ago

Grab some popcorn and give this a read: https://github.com/ariya/phantomjs/issues/15345.

@pixiuPL thinks he's king of the world, but gets rightfully put in his place.

enitihas8y ago

I think an interesting project may be to look at popular GitHub repositories and search for such 'stat builders', i.e., people who make commits of no utility just to boost their GitHub stats.

nolok8y ago

Given that he hides those behind fake commit messages (unless one counts removing a comment or whitespace as "code refactoring"), I would say rather likely.

enitihas8y ago

It seems there are no limits to this madness, e.g. https://github.com/ariya/phantomjs/commit/970edb9b683175a6b1...

In this commit the guy deletes two spaces from a file, and adds copyright for his name at the top. Going through his commits has left me extremely shocked. I mean, how did such low quality commits make it into the master branch of the repo? It is like these commits were invisible to all the visitors and users of the repo.

michaelermer8y ago

Also look at stuff like this... https://github.com/SeleniumHQ/selenium/pull/5468

enitihas8y ago

The main question which is coming to my mind right now is how on earth did this guy get contributor access to the repository. And here I used to think that being maintainer of a large open source project must take a lot of talent and hard work.

gulperxcx8y ago

Sounds like you're getting ideas.

petercooper8y ago

Two alternatives:

Headless Chrome with Puppeteer: https://github.com/GoogleChrome/puppeteer

Firefox-based Slimer.js: https://github.com/laurentj/slimerjs (same API as Phantom which is useful if using a higher level library like http://casperjs.org/)

mrskitch8y ago

I maintain a puppeteer-as-a-service repo here: https://github.com/joelgriffith/browserless. It’s pretty feature rich at this point, allowing you to specify concurrency and session timeouts, and it comes with a robust IDE (which you can play with here: https://chrome.browserless.io).

I’m working on building out a serverless model, which is the holy grail of headless workflows, but it’s a bit more challenging to operationalize than one would think.

I’m hoping that these efforts will lower the bar for folks wanting to get started with puppeteer and headless Chrome!

skinnymuch8y ago

Browserless seems awesome. Thanks for sharing your project!

lukebennett8y ago

As has been said, this point was somewhat inevitable with the advent of Chrome and Firefox's headless modes. However, as the project slips into the mists of history, let's not forget the vital stepping stone it provided in having access to a real headless browser environment vs a simulated one. I for one will remain grateful to Ariya, Vitallium and all the team for their efforts.

tnolet8y ago

I’m super biased in this, having spent considerable time programming against PhantomJs, Selenium and now Headless Chrome / Puppeteer for my startup https://checklyhq.com. This whole area of automating browser interactions is an extremely hard thing to get stable. In my experience, the recent Puppeteer library takes the cake but PhantomJs is the spiritual father here. I will not talk about Selenium for blood pressure reasons.

iaml8y ago

Having dabbled with both selenium and phantom, I can vouch for both being PITA to work with.

mrskitch8y ago

Have you seen _my_ startup (https://browserless.io/). The stability part is something I’m trying to solve once and for all with this project.

rumblefrog8y ago

Within the issue @pixiuPL created, I listed some of the things that he has shown incompetence on: https://github.com/ariya/phantomjs/issues/15345#issuecomment...

mkarnicki8y ago

Nicely put github comment, well done. Thank you. I feel sick in my mouth seeing PL in his username, which clearly indicates my home country. I am beyond baffled.

hrasyid8y ago

Ariya wrote a bit about his reasoning here: https://mobile.twitter.com/AriyaHidayat/status/9701730017013... also mentioning an old post in https://github.com/ariya/phantomjs/issues/14541

hartator8y ago

I still think it's premature. There are still a couple of areas where PhantomJS is better than Headless Chrome. Notably proxy support and API availability.

ComputerGuru8y ago

Yes, but what was in it for Vitallium? Continue working thanklessly on a project to serve others’ needs, who as a whole will leave en masse as soon as headless chrome gets to parity with proxy support?

transreal8y ago

That's not really true. You can use proxies with Headless Chrome using the --proxy-server command line parameter. And the API is richer than PhantomJS's. See the underlying API documentation here: https://chromedevtools.github.io/debugger-protocol-viewer/to....

hartator8y ago

It's only for proxy without auth. So mainly local ones. There is no way to use username and a password right now for proxy with headless chrome.

redka8y ago

Well with Chrome going headless there isn't a whole lot of place for PhantomJS anyway. Or is there? What is it still good for?

apocalyptic0n38y ago

Legacy systems for one. The Cooperative Patent Classification group releases their classifications en masse as HTML (single zip download, which is great). I built a parser for a PHP project that could parse all several hundred thousand records from the HTML in a few minutes. In 2017, they switched to a system that loads in the data from JSON stored in Javascript in the HTML (it is every bit as terrible as you imagine). Obviously loading in the HTML and trying to use regex to match the JSON was a terrible idea (especially since it was encoded to boot...), so I instead used Phantom to load each file, render it, and save it to a temporary file which I then parse using the original pre-2017 parser. Like 10 lines of code in Phantom to do it.

Obviously with my situation, this is not the end of the world. I use the parser twice a year and Phantom will continue to handle that task just fine. But I also know that the switch to using headless Chrome would be an expensive one if necessary; we have to research it, we have to update local dev environments, we have to implement it, we have to write new tests for it, we have to test it, we have to update our deployment strategy, update our server deployment configuration, and, worst of all, get all of these changes and new software installations approved by the USPTO which is a nightmare. My situation is simple, but would take several weeks to several months to actually deploy to production. As it stands, I will likely have to explain why we have a now-unmaintained piece of software on the server and may be forced to switch regardless.

I can easily imagine how this project sunsetting, even though there is a clear alternative and successor, could be a nightmare to a lot of people. It's not the end of the world, but it's definitely unfortunate

feelin_googley8y ago

Is this the data you were trying to parse?

https://www.cooperativepatentclassification.org/Archive.html

redka8y ago

Why would you need PhantomJS for that? Can't you just parse the HTML files with Nokogiri and be done with it? That would be orders of magnitude faster anyway

minitoar8y ago

Maintaining systems already built on top of PhantomJS.

toomuchtodo8y ago

A bit concerning, as youtube-dl relies on PhantomJS currently.

paulie_a8y ago

I am curious about this aspect and probably should do some research, but how will highcharts to PDF work?

Phantomjs was generally great for that type of rendering

epx8y ago

Not sure whether it is as easy to use as PhantomJS.

nkozyra8y ago

I'd say Puppeteer is on-par with Phantom for ease of basic use. It has a richer, deeper API, of course, but at its core it's modern Javascript.

redka8y ago

Well that depends if you're stuck with Javascript. There isn't anything simpler (that I'm aware of, and I have done web scraping/automation professionally for about 6 years) than watir[0]. PhantomJS doesn't even come remotely close.

[0] http://watir.com/

Analemma_8y ago

There is one thing about this that saddens me: PhantomJS still starts up much faster than headless Firefox or Chrome, at least for me, which makes some of our integration tests take a lot longer than they should.

Has anyone here figured out any tricks to get headless Chrome booted fast?

vaviloff8y ago

Also PhantomJS was a single statically linked binary with no dependencies that you could literally drop into a server and run scripts at once.

oelmekki8y ago

For those who may struggle with using headless Chrome on a server, here is a Dockerfile example to get you started: https://github.com/oelmekki/chromessr/blob/master/Dockerfile

godet is the lib I use for chrome piloting, replace with your favorite one.

cowkingdeluxe8y ago

Running it as a pooled web server via generic-pool makes it run a bit more efficiently. Using the pooling method, it can do 512x512 images every 400 ms, add in Optimize, WebP & S3 for a total 1000 ms.

I based the pool off of https://github.com/latesh/puppeteer-pool/blob/master/src/ind... .
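
The pooling idea can be sketched without any dependency. To be clear, this is not generic-pool's API and not the linked puppeteer-pool code, just a hand-rolled stand-in showing why pooling helps with startup cost: expensive instances are created at most `max` times and then reused. The `TinyPool` class and the `create` factory are hypothetical names; in practice the factory might launch a headless browser.

```javascript
// A tiny promise-based pool: at most `max` resources are ever created,
// and callers beyond that wait until one is released.
class TinyPool {
  constructor(create, max) {
    this.create = create; // async factory for an expensive resource
    this.max = max;
    this.size = 0;        // resources created so far
    this.idle = [];       // resources ready for reuse
    this.waiters = [];    // resolve callbacks of callers waiting for a resource
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    if (this.size < this.max) {
      this.size += 1; // reserve the slot before awaiting the factory
      try {
        return await this.create();
      } catch (err) {
        this.size -= 1; // creation failed, free the slot again
        throw err;
      }
    }
    // Pool exhausted: wait for release() to hand a resource over.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(resource) {
    const next = this.waiters.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }

  // Run `task` with a pooled resource and always give the resource back.
  async use(task) {
    const resource = await this.acquire();
    try {
      return await task(resource);
    } finally {
      this.release(resource);
    }
  }
}
```

With something like `new TinyPool(launchBrowser, 2)`, where `launchBrowser` is whatever starts your browser, each request would call `pool.use(async (browser) => ...)` and pay the boot cost only for the first two instances. A real pool library adds things this sketch leaves out, such as health checks, eviction of dead instances, and acquire timeouts.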

pbiggar8y ago

I have the same problem, so it's not just you.

gowanOP8y ago

this is one of the reasons i created chromedriver-proxy[0]

[0] https://github.com/ZipRecruiter/chromedriver-proxy

sergiotapia8y ago

End of an era! Congratulations to the team for all their hard work and excellent contribution to help teams build better software.

All the best to everybody!

pknerd8y ago

Somehow I am having issues using both headless Firefox and Chrome. Unlike PhantomJS, where all I had to do was drop in the binary and set the path, neither FF nor Chrome works that way, so I am happy to keep using PhantomJS for a while.

isuckatcoding8y ago

I would think PhantomJS is still quite heavily used so having some kind of migrator to puppeteer would be useful. I’m sure people would pay $$$ for it.

skrebbel8y ago

Thank you, PhantomJS contributors. You built a life saver.

chx8y ago

Drupal dropped PhantomJS too https://www.drupal.org/project/drupal/issues/2775653

kschiller8y ago

Does anyone here know if there's a way to set SSL client certs with Headless Chrome? With PhantomJS I could use

  --ssl-client-certificate-file and --ssl-client-key-file

Changu8y ago

I do lightweight web automation via Chromium's "Snippets". It is super nice to work that way because you see on screen what happens and can check everything realtime in the console. Only problem is that they don't survive page loads. So when my snippet navigates to a new url I have to trigger it again manually. What would be a good way to progress from here so I can automate across pages?

icebraining8y ago

Greasemonkey and its descendants (e.g. Violentmonkey) can run user scripts which work across pages.

Changu8y ago

Maybe it is even easier to write a Chrome extension?

moondev8y ago

I remember taking full page screenshots with phantom back in the day. Really cool project. Nightmarejs is another alt with a friendly api.

rutierut8y ago

One of the guys working on P-JS just linked from a GH issue to his open letter... He isn't very happy with the owner blah blah blah and is going to fork the master branch to make phantom great again, I'll just put this here:

"Will do as advised, as I really think PhantomJS is good project, it just needs good, devoted leader."

enitihas8y ago

It does not look like the guy has done a single commit with any meaning. His commits are basically the following: 1. Adding his own name in package.json 2. Adding and deleting whitespace. 3. Deleting the entire project and committing. 4. Adding the entire project back again and committing.

paulie_a8y ago

That sounds slightly ambiguous, is that person going to be that leader, or are they looking for one?

chirag648y ago

Shoot, I was just planning to use this for generating PDFs out of a URL on nodejs. Does anyone know of any other library / module out there that is good at this?

randlet8y ago

You can generate pdfs with headless Chromium/Chrome pretty easily.

    chromium-browser --headless --disable-gpu --print-to-pdf=output_file_name.pdf file:///path/to/your/html
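
Since the question was about doing this from nodejs, the same command can be wrapped with the standard `child_process` module. This is only a sketch: the flags are exactly the ones in the command above, while the helper names `buildPdfArgs` and `printToPdf` are invented here, and the binary name `chromium-browser` is an assumption that varies by platform and install.

```javascript
const { execFile } = require('child_process');

// Build the argument list for the headless print-to-PDF invocation.
function buildPdfArgs(url, outputPath) {
  return ['--headless', '--disable-gpu', `--print-to-pdf=${outputPath}`, url];
}

// Run the browser and resolve with the output path once it exits cleanly.
// `binary` is an assumption; adjust it to wherever Chrome/Chromium lives.
function printToPdf(url, outputPath, binary = 'chromium-browser') {
  return new Promise((resolve, reject) => {
    execFile(binary, buildPdfArgs(url, outputPath), (err) => {
      if (err) reject(err);
      else resolve(outputPath);
    });
  });
}
```

Usage would look like `printToPdf('file:///path/to/your/html', 'output_file_name.pdf')`. Using `execFile` with an argument array rather than a shell string means the URL and output path are not interpreted by a shell.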

bluehatbrit8y ago

Sadly you get 0 control over headers and footers of the output PDF, meaning you get lovely crappy page numbers around the place with no way to turn them off. This is why, sadly, I have to keep my command line markdown -> pdf converter (https://www.npmjs.com/package/mdpdf) using Phantomjs.

So this does work for very basic pdf printouts, but so far phantom is the only tool that offers full control over the PDF output. Even down to things like margins, paper size, etc.

runarberg8y ago

I think you can just use headless firefox[1] or headless chrome[2].

[1]: https://developer.mozilla.org/en-US/Firefox/Headless_mode

[2]: https://developers.google.com/web/updates/2017/04/headless-c...

laktek8y ago

Check pdf.cool (hosted API)

wnevets8y ago

is headless chrome's API just as easy to work with? Taking a screenshot or saving a page as pdf is stupid simple with phantomjs

andrewguenther8y ago

yep, just as easy

wxyyxc19928y ago

Thanks & Goodbye
