Working with JSON data in Python
JSON is a very popular data format, probably because of its simplicity, flexibility and ease of use. Python provides powerful tools that make it very easy to work with JSON data.
Python has a built-in module called json which contains all the features you need to deal with exporting, importing and validating JSON data.
What is JSON?
JSON stands for JavaScript Object Notation. It comes from JavaScript, but can be used in any programming language. It can be used to transfer or store data in a simple human-readable format.
It is a subset of JavaScript (so it is executable with eval, but you should never do that, as it can lead to very serious security issues).
It is important to note that JSON is not a concrete technology; it is just a standard for describing data. So it does not define things like maximal string length, biggest available integer or floating point accuracy. However, the underlying language or a particular implementation of a JSON parser will certainly have these kinds of limitations.
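To make the point about implementation limits concrete, here is a small sketch of how Python's own json module behaves. It assumes the standard CPython implementation, where integers are arbitrary-precision and other numbers are parsed as floats; other parsers may behave differently.

```python
import json

# Python integers are arbitrary-precision, so this large value is parsed exactly.
big = json.loads("12345678901234567890123")
print(type(big).__name__, big)

# Numbers with a fractional part become Python floats (double precision),
# so digits beyond what a float can hold are lost.
precise = json.loads("0.12345678901234567890123")
print(type(precise).__name__, precise)
```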
Why is JSON so popular?
- Generating and parsing JSON is easy for machines
- JSON is a human-readable data format
- It is extremely simple
- Despite its simplicity, it’s still quite powerful and flexible
What does JSON data look like?
As I mentioned above, JSON is a subset of JavaScript, but it has some restrictions. Basically, you can define JSON objects the way you would define objects in JavaScript.
An example of a piece of JSON data:
{
"exampleString": "hello",
"exampleObject": {"field": "value"},
"exampleNumber": 1234,
"exampleArray": ["aString", 1234, {"field2": "value2"}]
}
Note that the syntax is a bit stricter than in JavaScript:
- JSON objects cannot have field names without the surrounding double quotes ({field: "value"} is invalid)
- JSON strings must be enclosed in double quotes; single quotes are not allowed ({"field": 'value'} is invalid)
- Trailing commas after the last field are not allowed in JSON objects ({"field": "value",} is invalid)
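As a quick check of these rules, the sketch below feeds one example of each violation to Python's json.loads and records which ones are rejected. The variable names are only illustrative.

```python
import json

invalid_examples = [
    '{field: "value"}',       # unquoted field name
    "{\"field\": 'value'}",   # single-quoted string
    '{"field": "value",}',    # trailing comma
]

rejected = []
for text in invalid_examples:
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        # The parser reports why the input is not valid JSON.
        rejected.append(text)
        print(f"rejected {text!r}: {error.msg}")
```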
JSON data types
JSON defines four data types: string, number, object and array, plus the special values true, false and null. That’s all. Of course arrays and objects can contain strings, numbers or nested arrays and objects, so you can build arbitrarily complex data structures.
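For readers who will use these types from Python, the following sketch shows which Python type the standard json module produces for each JSON type. The one-letter keys are made up for the example.

```python
import json

document = '{"s": "text", "i": 1, "f": 1.5, "o": {}, "a": [], "t": true, "n": null}'
parsed = json.loads(document)

# Map each key to the name of the Python type the parser produced.
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)
```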
JSON strings
JSON strings consist of zero or more characters enclosed in double quotes.
Examples: "", "hello world"
JSON number
JSON numbers can be integers or decimals; scientific notation is also allowed.
Examples: 123, -10, 3.14, 1.23e-14
JSON object
Objects are collections of key-value pairs. Keys must be strings enclosed in double quotes. Keys and values are separated by colons and the pairs are separated by commas. Values can be of any valid JSON type. The object is enclosed in curly braces.
Example:
{"hello": "world", "numberField": 123}
JSON array
JSON arrays can contain zero or more items separated by commas. Items can be of any valid type.
Examples: [], ["a"], [1, 2, 3], ["abc", 1234, {"field": "value"}, ["nested", "list"]]
Where is JSON used?
JSON can be used to both transfer and store data.
JSON web APIs - JSON data transfer in HTTP REST APIs
JSON is commonly used in REST APIs in both the request and the response body. A client marks a JSON request body with the Content-Type: application/json header. An HTTP client can also indicate that it expects a JSON response by using the Accept header.
Example HTTP request:
POST /hello HTTP/1.1
Content-Type: application/json
Accept: application/json

{"exampleData": "hello world"}
Example HTTP response:
HTTP/1.1 200 OK
Content-Type: application/json

{"exampleResponse": "hello"}
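A request like the one above could be prepared in Python roughly as follows. This is only a sketch using the standard urllib.request module; the host example.com is a placeholder, and the request is built but deliberately not sent.

```python
import json
import urllib.request

payload = {"exampleData": "hello world"}
# HTTP bodies are bytes, so the JSON string is encoded as UTF-8.
body = json.dumps(payload).encode("utf-8")

# example.com is a placeholder host, not a real API endpoint.
request = urllib.request.Request(
    "http://example.com/hello",
    data=body,
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    method="POST",
)
print(request.get_method(), request.full_url)
print(request.data)
```

Passing the prepared object to urllib.request.urlopen would actually perform the call.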
NoSQL databases
JSON is commonly used for communicating with non-relational databases (such as MongoDB). NoSQL databases let you dynamically define the structure of your data, and JSON is perfect for the task because of its simplicity and flexibility.
JSON in Python - The JSON module
Working with JSON in Python is rather simple, as Python has a built-in module that does all the heavy lifting for you. With the help of the json module you can parse and generate JSON-encoded strings and also read or write JSON-encoded files directly.
Working with JSON strings
Exporting data to JSON format
You can turn basic Python data types into a JSON-encoded string with the help of json.dumps. The usage is pretty simple:
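import json

# Basic Python types that map directly to JSON types
data = {
    "list": ["hello", "world"],
    "integer": 1234,
    "float": 3.14,
    "dir": {"a": "b"},
    "bool": False,
    "null": None
}

# Serialize the dictionary to a JSON-encoded string
json_encoded_data = json.dumps(data)
print(json_encoded_data)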
Output:
{"list": ["hello", "world"], "integer": 1234, "float": 3.14, "dir": {"a": "b"}, "bool": false, "null": null}
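By default json.dumps writes everything on one line. If you want output that is easier to read or more compact, the indent, sort_keys and separators parameters of json.dumps can help. Here is a brief sketch with a shortened version of the data:

```python
import json

data = {"list": ["hello", "world"], "integer": 1234, "bool": False}

# Human-friendly output: two-space indentation, keys in alphabetical order.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# Compact output: no spaces after the commas and colons.
compact = json.dumps(data, separators=(",", ":"))
print(compact)
```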
Parsing a JSON string
The reverse, parsing a JSON-encoded string into Python objects, can be done with the json.loads function, like so:
import json

json_encoded_data = '''{
    "float": 3.14,
    "list": ["hello", "world"],
    "bool": false,
    "integer": 1234,
    "null": null,
    "dir": {"a": "b"}
}'''

# Parse the JSON string into Python objects
data = json.loads(json_encoded_data)
print(data)
Output:
{'float': 3.14, 'list': ['hello', 'world'], 'bool': False, 'integer': 1234, 'null': None, 'dir': {'a': 'b'}}
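One thing worth keeping in mind is that exporting and then parsing does not always give back exactly the same Python object, because JSON has fewer types than Python. The sketch below, with made-up sample data, shows two common cases: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

original = {"point": (1, 2), 1: "integer key"}

# Round trip: Python object -> JSON string -> Python object
restored = json.loads(json.dumps(original))
print(restored)
```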
Validating a JSON string
The Python json module does not have a dedicated way to validate a piece of JSON data; however, you can use json.loads to do that. json.loads raises a JSONDecodeError exception when its input is not valid JSON, so you can use that to determine whether or not a string contains properly formatted JSON.
For example, you can define the following function to validate JSON strings:
import json

def is_valid_json(data: str) -> bool:
    try:
        json.loads(data)
    except json.JSONDecodeError:
        return False
    return True
This function accepts a string as its single argument and returns a boolean. It tries to load the string and, if it is not valid JSON, catches the raised exception and returns False. If the JSON is valid, no exception is raised, so the return value is True.
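If you also want to know why a string is invalid, the exception carries that information. The following is a sketch of a hypothetical helper, describe_json_error, that reports the position and message from JSONDecodeError (its msg, lineno and colno attributes):

```python
import json

def describe_json_error(text: str) -> str:
    """Return a short, human-readable verdict about a JSON string."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        # lineno and colno are 1-based positions of the problem in the input.
        return f"invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    return "valid JSON"

print(describe_json_error('{"field": "value"}'))
print(describe_json_error('{"field": }'))
```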
Working with JSON files in Python
The json module also makes it possible for you to work with JSON files directly. Instead of loads and dumps you can use the load and dump functions. These functions work directly on files: they take a file object as an argument, and instead of reading or writing strings in memory they let you import or export JSON data from or to the files you pass.
Exporting data to a JSON file
Exporting JSON data to a file can be done with the json.dump function. It takes two arguments: the first is the Python object that you’d like to export, and the second is the file where you want to write the encoded data.
Example usage:
import json

data = {
    "list": ["hello", "world"],
    "integer": 1234,
    "float": 3.14,
    "dir": {"a": "b"},
    "bool": False,
    "null": None
}

# json.dump writes straight to the file and returns None
with open('output.json', 'w') as output_file:
    json.dump(data, output_file)
First we opened the file for writing and passed the file handle to json.dump as its second argument.
output.json will contain something like this (whitespace added for readability):
{
    "list": ["hello", "world"],
    "integer": 1234,
    "float": 3.14,
    "dir": {"a": "b"},
    "bool": false,
    "null": null
}
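If you want the file itself to contain the readable layout, json.dump accepts the same indent parameter as json.dumps. Here is a small sketch that writes to a temporary directory (an assumption made only so the example cleans up after itself) and reads the text back:

```python
import json
import tempfile
from pathlib import Path

data = {"list": ["hello", "world"], "integer": 1234}

with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "output.json"
    with open(path, "w", encoding="utf-8") as output_file:
        # indent=4 makes json.dump write one field per line.
        json.dump(data, output_file, indent=4)
    written = path.read_text(encoding="utf-8")

print(written)
```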
Parsing a JSON file
Reading JSON data from a file into an in-memory Python object can be done very similarly, with the help of the json.load function.
This function takes a file as its argument: the file that you’d like to read from.
For example, to parse the file that we created in the previous example, we can write:
import json

# Open the file for reading ('r'); opening it with 'w' would erase its contents
with open('output.json', 'r') as input_file:
    data = json.load(input_file)

print(data)
First we open the file for reading, and then pass the file handle to json.load.
Expected output:
{'float': 3.14, 'list': ['hello', 'world'], 'bool': False, 'integer': 1234, 'null': None, 'dir': {'a': 'b'}}
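Writing and reading can be combined into a quick round-trip check. The sketch below uses a temporary directory and a shortened version of the data, both chosen only for illustration, and confirms that what comes back equals what was written:

```python
import json
import tempfile
from pathlib import Path

data = {"float": 3.14, "list": ["hello", "world"], "bool": False, "null": None}

with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "output.json"
    # Export the data to a file...
    with open(path, "w", encoding="utf-8") as output_file:
        json.dump(data, output_file)
    # ...and parse the same file again.
    with open(path, "r", encoding="utf-8") as input_file:
        loaded = json.load(input_file)

print(loaded == data)
```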
Validating a JSON file
To validate that a file contains valid JSON data, we can use the json.load function and try to load the JSON contained in the file. On failure we can catch the JSONDecodeError raised by json.load. If no exception occurs, the file contains valid JSON.
import json

def is_valid_json_file(input_file: str) -> bool:
    try:
        with open(input_file, 'r') as f:
            json.load(f)
    except json.JSONDecodeError:
        return False
    return True
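Note that the function above still raises an exception if the file is missing or cannot be read. Whether that is desirable depends on your use case. As one possible variant, the hypothetical is_valid_json_file_safe below treats unreadable files as invalid too; the temporary files exist only to demonstrate it.

```python
import json
import tempfile
from pathlib import Path

def is_valid_json_file_safe(input_file: str) -> bool:
    """Return True only if the file can be opened and contains valid JSON."""
    try:
        with open(input_file, "r", encoding="utf-8") as f:
            json.load(f)
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        # Missing or unreadable file, wrong encoding, or malformed JSON.
        return False
    return True

with tempfile.TemporaryDirectory() as directory:
    good = Path(directory) / "good.json"
    good.write_text('{"field": "value"}', encoding="utf-8")
    bad = Path(directory) / "bad.json"
    bad.write_text('{"field": "value",}', encoding="utf-8")
    missing = Path(directory) / "missing.json"  # never created
    results = [is_valid_json_file_safe(str(p)) for p in (good, bad, missing)]

print(results)
```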