Saturday, April 2, 2016

Various Web Scraping Methods


Headless browsers

1. PhantomJS
Very good.

2. Selenium with Ghostdriver/PhantomJS

3. Selenium with HtmlUnitDriver
The Selenium-plus-driver approach (options 2 and 3) is not that good.

4. Ghost - Python only. WebKit-based. Full JavaScript support. Open source.

5. ZombieJS - Node.js. Custom browser engine. JavaScript support/emulated DOM. Open source. Based on jsdom.

6. EnvJS - JavaScript via Java/Rhino. Custom browser engine. JavaScript support/emulated DOM. Open source.

7. Spynner - Python only. PyQt and WebKit.

8. jsdom - Node.js. Custom browser engine. Supports JS via emulated DOM. Open source.

9. ui4j - Pure Java 8 solution. A wrapper library around the JavaFX WebKit engine, including headless modes.

To conclude, as a Java developer, I find ui4j the best option!
