I'm creating a bash script to check what HTTP code a given URL returns. I have a file with about 50k URLs in JSON format; this is the beginning of the file (head of the file):
"responseHeader":{
"status":0,
"QTime":7336},
"response":{"numFound":50032,"start":0,"maxScore":1.0,"docs":[
{
"documentURL":"http....."},
and so on
I need to loop over this file, check what HTTP code every URL returns, and save the result to another file in the format HTTP code + URL. So far I only have this curl command to check the HTTP code:
curl -s -o /dev/null -I -w "%{http_code}\n" URL >> httpCodeFile
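I think the -w format can also print the URL itself, which would give me the code + URL format directly in the output file. Something like this, though I haven't tested it:
curl -s -o /dev/null -I -w "%{http_code} %{url_effective}\n" URL >> httpCodeFile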
I would appreciate any help and advice on what tools/approach (grep, awk, sed) I should use.
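For example, I was thinking the URLs could be extracted into a plain list first with grep and sed, assuming every documentURL sits on its own line and the URLs contain no escaped quotes (just_urls.txt is only an example name):
grep -o '"documentURL":"[^"]*"' URL_list | sed 's/"documentURL":"//; s/"$//' > just_urls.txt
But I'm not sure this is the right way to handle JSON.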
I've created this function to get the URLs from the file, but I'm not sure about the syntax:
function checkHTTP(){
    cat URL_list | while read line
    do
        var = $(grep documentURL) URL_list
        curl -s -o /dev/null -I -w "%{http_code}\n" ${var} + " TEST " >> httpCodeFile
    done
}
I'm getting only 000, despite the fact that many of the URLs should return 404.
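My guess is that whatever ends up in ${var} still has the JSON quotes and trailing }, around it (plus the extra + " TEST " arguments), so curl can't connect to anything and reports 000. If I first extract the bare URLs as in the grep/sed line above, would a loop like this be closer to correct? This is only an untested sketch:
checkHTTP(){
    # read one plain URL per line from the extracted list
    while read -r url
    do
        code=$(curl -s -o /dev/null -I -w "%{http_code}" "$url")
        echo "$code $url" >> httpCodeFile
    done < just_urls.txt
}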