How to Use the Government Open Source API for Real-Time Air Quality Index Data from Various Locations and Fetch It with Python.


If data is open source, it can be used by anyone under certain terms and conditions without violating any policy.

Warning! Do not use this data in any way that violates those terms.

In our day-to-day life, data creates a big challenge. Governments collect and process huge volumes of data every day, and some of it is made accessible to citizens and civil society. Data that is not sensitive can be used by the public for social, economic, and developmental purposes. Unfortunately, if you want to use this open source data, you need some programming skills.

Before starting, you should first know what an API is.

Basically, API stands for Application Programming Interface. It acts like a broker or intersection between pieces of software, handling the chain of requests that lets one program fetch data from another. An API usually requires an API key to fetch such data.
Today we are using an API that contains information about the air quality and humidity of states and cities in India; you can fetch data from a different country's API in the same way.
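For instance, the resource used in this article can be queried with a single HTTP GET request. Here is a minimal sketch (the API key is the public sample key used later in this article, and the offset/limit values are purely illustrative):

import requests

# data.gov.in resource for real-time Air Quality Index (used throughout this article)
URL = "https://api.data.gov.in/resource/3b01bcb8-0b14-4abf-b6f2-c1bfd384ba69"

params = {
    "api-key": "579b464db66ec23bdd000001cdd3946e44ce4aad7209ff7b23ac571b",  # sample key
    "format": "json",  # request JSON output
    "offset": 0,       # start from the first record
    "limit": 10,       # number of records to return (illustrative)
}

response = requests.get(URL, params=params)
print(response.json()["records"][0])  # print the first record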


Open source real-time data in JSON format:

   
 
"records": [
			{
				"id": "11",
				"country": "India",
				"state": "Andhra_Pradesh",
				"city": "Rajamahendravaram",
				"station": "Anand Kala Kshetram, Rajamahendravaram - APPCB",
				"last_update": "08-11-2020 04:00:00",
				"pollutant_id": "NH3",
				"pollutant_min": "2",
				"pollutant_max": "5",
				"pollutant_avg": "4",
				"pollutant_unit": "NA"
			}
		]


As you can see above, the JSON response contains a dictionary-style list named "records". Each record contains the following fields (a short example of reading these fields follows the list):

id: unique ID that identifies the data entry.
country: the country the data belongs to.
state: the state within the country.
city: the city whose station reports the real-time data.
station: the monitoring station that collects the real-time data.
last_update: time of the last update.
pollutant_id: pollutant identifier; it can be NH3, NO2, SO2, PM2.5, or PM10.
pollutant_min: minimum value of the current pollutant reading.
pollutant_max: maximum value of the current pollutant reading.
pollutant_avg: average value of the current pollutant reading.
pollutant_unit: the unit of measurement; in this dataset it is typically "NA" (not added).
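As a quick illustration of how these fields are read once a record has been parsed, here is a short sketch using the sample record shown above:

import json

# One record copied from the JSON response shown above
sample = '''{
    "id": "11",
    "country": "India",
    "state": "Andhra_Pradesh",
    "city": "Rajamahendravaram",
    "station": "Anand Kala Kshetram, Rajamahendravaram - APPCB",
    "last_update": "08-11-2020 04:00:00",
    "pollutant_id": "NH3",
    "pollutant_min": "2",
    "pollutant_max": "5",
    "pollutant_avg": "4",
    "pollutant_unit": "NA"
}'''

record = json.loads(sample)
print(record["city"], record["pollutant_id"], record["pollutant_avg"])
# Rajamahendravaram NH3 4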

Import the required libraries for fetching the JSON data from the API:

Now we need to import the libraries used in this program. The first is the json module, which helps with handling JSON; the second is the requests module, which lets us send HTTP requests and returns a response object; and the last is pandas, an open source data analysis and manipulation tool that helps us build structured data.

   
 
"""
Created on Mon Oct 19 22:24:55 2020
@author: littleboy8506
"""
import json
import requests
import pandas as pd
					

Set the offset and the limit of the data entries; these values can also be found in the JSON response, as shown below:

To get the complete data we need to set the number of JSON pages to request, which we call dataoffset; in our example it is around 1387, the number of JSON pages accessible via the API key. The limit defines the number of entries in each JSON response.

   
 
dataoffset = 1378  # number of JSON pages to request
limit = 10         # number of records per request
record = 0         # number of records in the current response
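If you are not sure how many records are available, one option is to read it from the response itself. This is only a sketch, assuming the response carries a "total" field alongside "records" (as data.gov.in responses typically do):

# Ask for a single record just to inspect the response metadata
url = ('https://api.data.gov.in/resource/3b01bcb8-0b14-4abf-b6f2-c1bfd384ba69'
       '?api-key=579b464db66ec23bdd000001cdd3946e44ce4aad7209ff7b23ac571b'
       '&format=json&offset=0&limit=1')
first_page = requests.get(url).json()
print(first_page.get('total'))  # total number of records reported by the API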
				

Create an empty DataFrame matching the data you want to fetch:

Now create an empty DataFrame with column names matching the fields of the JSON records, as shown below, plus one extra column named "offset", which we will use later to help retrieve data from the exported file.

   
 
columns_data=["id","country","state","city","station","last_update","pollutant_id","pollutant_min","pollutant_max","pollutant_avg",
				"pollutant_unit","offset"]
dataframe= pd.DataFrame(columns=columns_data)  	

The for loop that uses the API key to fetch all the data you want:

Make a loop up to dataoffset that fetches every JSON page and appends all of the data into a single DataFrame. There are three steps, given below:
1. Get the data (using the requests module).
2. Count the length of the records.
3. Fetch and store the data into the DataFrame.

   
 
for offset in range(0, dataoffset):
    solditems = requests.get('https://api.data.gov.in/resource/3b01bcb8-0b14-4abf-b6f2-c1bfd384ba69?api-key=579b464db66ec23bdd000001cdd3946e44ce4aad7209ff7b23ac571b'
                             '&format=json&offset=' + str(offset) + '&limit=' + str(limit))  # (your url)
    data = solditems.json()
    record = len(data['records'])
    for dataentry in range(0, record):
        updated = data['records'][dataentry]
        updated["offset"] = offset
        dataframe = dataframe.append(updated, ignore_index=True)
    # print('***********' + str(record) + '*' + str(offset) + '***********')
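Note that DataFrame.append was removed in pandas 2.0. If you are running a newer pandas, one equivalent approach is to collect the records in a plain list and build the DataFrame at the end with pd.concat; a sketch under that assumption:

# Alternative for newer pandas versions: collect rows first, then concatenate once
rows = []
for offset in range(0, dataoffset):
    solditems = requests.get('https://api.data.gov.in/resource/3b01bcb8-0b14-4abf-b6f2-c1bfd384ba69?api-key=579b464db66ec23bdd000001cdd3946e44ce4aad7209ff7b23ac571b'
                             '&format=json&offset=' + str(offset) + '&limit=' + str(limit))
    for updated in solditems.json()['records']:
        updated["offset"] = offset
        rows.append(updated)
dataframe = pd.concat([dataframe, pd.DataFrame(rows, columns=columns_data)], ignore_index=True)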

Export the fetched data into an Excel file:

All the data is now stored in the DataFrame object. To export it to an Excel file we use the .to_excel() function as shown below:

   
 
dataframe.to_excel("output.xlsx",index=False)
print("done")	

Output:

A sample of the exported Excel file is shown below:

[Figure: API Real time Air Quality Index output spreadsheet]
