Back Up a Complete Restricted Website with wget

Friday 15th, Jul, 2016 | #PHP #Playground

GNU Wget (or just Wget, formerly Geturl) is a computer program that retrieves content from web servers. It is part of the GNU Project. Its name is derived from World Wide Web and get.

It supports downloading via the HTTP, HTTPS, and FTP protocols. Its features include recursive download, conversion of links for offline viewing of local HTML, and support for proxies. It appeared in 1996, coinciding with the boom of popularity of the Web, causing its wide use among Unix users and distribution with most major Linux distributions. Written in portable C, Wget can be easily installed on any Unix-like system and has been ported to many environments, including Microsoft Windows, Mac OS X, OpenVMS, HP-UX, MorphOS and AmigaOS. Since version 1.14 Wget has been able to save its output in the web archiving standard WARC format.[3]
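As a rough illustration of the features listed above (recursive download, link conversion for offline viewing, and WARC output), the sketch below assembles one possible command line. The URL and archive name are placeholders and are not taken from this post. The script only prints the command so it can be reviewed before anything is downloaded.

```shell
#!/bin/sh
# Placeholder values for illustration only; replace with your own.
url="https://example.com/"
archive_name="example-backup"

# --recursive        follow links and download the pages they point to
# --convert-links    rewrite links in the saved HTML so they work offline
# --page-requisites  also fetch the images, CSS and scripts each page needs
# --warc-file        additionally write a WARC archive (needs Wget 1.14 or later)
set -- --recursive --convert-links --page-requisites \
       --warc-file="$archive_name" "$url"

# Print the command for review; remove "echo" to actually run it.
echo wget "$@"
```

Keeping the options in the positional parameters means each argument stays a single word when passed to wget, even if a value contains spaces.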

Wget for Downloading Restricted Content

Wget can be used for downloading content from sites that sit behind a login screen, or that check the HTTP referer and the User-Agent string of the client to prevent screen scraping.

wget --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" -rHm --no-parent -e robots=off

Append the URL of the site you want to mirror to the end of this command.
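The command sets a browser-like User-Agent and Accept header but does not itself send a referer or any login cookies, which the scenario described earlier may require. The following is a minimal sketch of a wrapper that adds them when needed. The function name `mirror_site`, the `DRY_RUN` switch, and the example URLs are hypothetical, and the cookie file is assumed to be one exported from a browser session that is already logged in. `--referer` and `--load-cookies` are standard Wget options.

```shell
#!/bin/sh
# mirror_site URL [REFERER] [COOKIE_FILE]
# Hypothetical helper around the command from this post.
mirror_site() {
    url=$1        # site to mirror
    referer=$2    # optional Referer value; leave empty to skip
    cookies=$3    # optional cookie file from a logged-in browser; leave empty to skip

    # Options from the original command:
    #   -r recurse, -H allow other hosts, -m mirror mode,
    #   --no-parent stay below the starting directory,
    #   -e robots=off ignore robots.txt
    set -- \
        --header="Accept: text/html" \
        --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" \
        -rHm --no-parent -e robots=off

    if [ -n "$referer" ]; then
        set -- "$@" --referer="$referer"
    fi
    if [ -n "$cookies" ]; then
        set -- "$@" --load-cookies="$cookies"
    fi
    set -- "$@" "$url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print one argument per line instead of contacting the server.
        printf '%s\n' "$@"
    else
        wget "$@"
    fi
}

# Dry run with placeholder URLs: shows the arguments that would be passed.
DRY_RUN=1
mirror_site "https://example.com/docs/" "https://example.com/" ""
```

Because `-H` lets the recursion follow links onto other hosts, a real run can grow well beyond the original site; dropping `-H` is worth considering if only one host needs to be backed up.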