I'm creating a bash script to check what HTTP code a given URL returns. I have a file with about 50k URLs in JSON format; this is the beginning of the file (head of the file):
"responseHeader":{
"status":0,
"QTime":7336},
"response":{"numFound":50032,"start":0,"maxScore":1.0,"docs":[
{
"documentURL":"http....."},
and so on
I need to loop over this file, check what HTTP code every URL returns, and save the result to another file in the format HTTP code + URL. So far I only have this curl command to check the HTTP code:
curl -s -o /dev/null -I -w "%{http_code}\n" URL >> httpCodeFile
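I think the -w format can also print the URL itself, which would give me the code + URL format directly in the output file. Something like this, though I haven't tested it:
curl -s -o /dev/null -I -w "%{http_code} %{url_effective}\n" URL >> httpCodeFile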
I would appreciate any help and advice on what tools/approach (grep, awk, sed) I should use.
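For example, I was thinking the URLs could be extracted into a plain list first with grep and sed, assuming every documentURL sits on its own line and the URLs contain no escaped quotes (just_urls.txt is only an example name):
grep -o '"documentURL":"[^"]*"' URL_list | sed 's/"documentURL":"//; s/"$//' > just_urls.txt
But I'm not sure this is the right way to handle JSON.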
I've created this function to get the URLs from the file, but I'm not sure about the syntax:
function checkHTTP(){
    cat URL_list | while read line
    do
        var = $(grep documentURL) URL_list
        curl -s -o /dev/null -I -w "%{http_code}\n" ${var} + " TEST " >> httpCodeFile
    done
}
I'm getting only 000, despite the fact that many of the URLs should return 404.
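My guess is that whatever ends up in ${var} still has the JSON quotes and trailing }, around it (plus the extra + " TEST " arguments), so curl can't connect to anything and reports 000. If I first extract the bare URLs as in the grep/sed line above, would a loop like this be closer to correct? This is only an untested sketch:
checkHTTP(){
    # read one plain URL per line from the extracted list
    while read -r url
    do
        code=$(curl -s -o /dev/null -I -w "%{http_code}" "$url")
        echo "$code $url" >> httpCodeFile
    done < just_urls.txt
}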