If you can access a website in a browser but not with curl, you may need to set the User-Agent header for the requests curl makes.
Configure User-Agent for curl
It is common for web servers to block access from unidentified clients. When we make an HTTP request to a website, the server can inspect the User-Agent header to determine what kind of client the request comes from.
Browsers usually send a User-Agent value similar to this:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36
Let’s try to make a request using curl and show all request headers, to see the value:
$ curl -s -v https://google.com
....<omitted>
> GET / HTTP/2
> Host: google.com
> User-Agent: curl/7.61.1
> Accept: */*
I omitted the unnecessary output; you can see that curl sends User-Agent: curl/7.61.1.
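To check just this header without reading the whole verbose output, the verbose stream can be filtered. This is a minimal sketch, assuming a POSIX-style shell with curl and grep available; the function name show_sent_user_agent is hypothetical and chosen only for illustration.

```shell
# Hypothetical helper: print only the User-Agent request header curl sends.
show_sent_user_agent() {
  # -v writes the request headers to stderr, each prefixed with "> ".
  # -o /dev/null discards the response body; 2>&1 lets grep read stderr.
  curl -s -v -o /dev/null "$@" 2>&1 | grep -i '^> user-agent:'
}
```

Calling show_sent_user_agent https://google.com should print a single "> User-Agent: ..." line whose value depends on the installed curl version.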
For many websites, a User-Agent value that does not belong to a browser looks suspicious and may get blocked. With so much bot and automated traffic on the web, such blocks are common in practice.
To get around this, we only need to set the User-Agent for curl so that the request looks like it comes from a real browser.
To do that, we use the -A option (long form: --user-agent) to provide a User-Agent string for the curl request:
$ curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36" -s -v https://google.com
....<omitted>
> GET / HTTP/2
> Host: google.com
> User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36
> Accept: */*
As you can see, the User-Agent request header has been changed, so the server should return the same result as it does for a browser, provided it filters on this header alone.
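For repeated use, the -A option can be wrapped in a small shell function so the long string does not have to be typed each time. This is only a sketch, assuming a POSIX-style shell; the names BROWSER_UA and curl_as_browser are hypothetical and not part of curl.

```shell
# Hypothetical variable holding the browser User-Agent string from the example above.
BROWSER_UA="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"

# Hypothetical wrapper: run curl with the browser User-Agent,
# passing every other argument through unchanged.
curl_as_browser() {
  curl -A "$BROWSER_UA" "$@"
}
```

With this defined, curl_as_browser -s -v https://google.com is equivalent to the full command shown earlier. Keeping the string in one variable also makes it easy to update when the browser version in it becomes outdated.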