Shuffling Data as JSON

What is JSON?

JSON is pretty cool. It's JavaScript Object Notation, JavaScript's standard for storing data in a structured way. To make a long story short, data in JSON format is just a nested stack of objects (groups of name/value pairs) and arrays (ordered lists). Objects are wrapped in curly brackets and their values have names (name, color, price), while arrays are held in square brackets and their values have an implied order (0, 1, 2). Each can be nested inside the other. Here's an example of a neighborhood described with JSON:

{
    "North Street":[
      {
          "Address":"12345",
          "Family Name":"Jones",
          "Members": ["Tom", "Sue", "Timmy"]
      },
      {
          "Address":"12349",
          "Family Name":"Marlin",
          "Members": ["Carl", "Nancy"]
      }
    ],
    "South Street":[
      {
          "Address":"8824",
          "Family Name":"Ross",
          "Members": ["Jim", "Barb", "James", "Peter", "Mary", "Thomas", "Martha" ]
      },
      {
          "Address":"8832",
          "Family Name":"Mack",
          "Members": ["Debbie", "Fran", "Mike"]
      },
      {
          "Address":"8836",
          "Family Name":"Earl",
          "Members": ["Sam", "Nancy", "Eric", "Beth"]
      }
    ]
}

Come to find out, JSON's simple but robust design makes it useful elsewhere too. JSON makes it very easy to define data that gets passed back and forth between different tools. That's why most modern scripting languages have tools for reading and writing JSON.
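
As a small illustration of that back-and-forth, here is a minimal Python sketch (Python 3 assumed) that parses a trimmed-down copy of the neighborhood above with the standard library's json module, reads a couple of values out of it, and serializes it again. The variable names are made up for the example.

```python
import json

# A trimmed-down copy of the neighborhood JSON from above, as plain text.
raw = '''
{
    "North Street": [
        {
            "Address": "12345",
            "Family Name": "Jones",
            "Members": ["Tom", "Sue", "Timmy"]
        }
    ]
}
'''

# Parse the text: JSON objects become dicts, JSON arrays become lists.
neighborhood = json.loads(raw)
first_home = neighborhood["North Street"][0]
print(first_home["Family Name"])   # Jones
print(len(first_home["Members"]))  # 3

# Serialize back to a JSON string, ready to hand to another tool.
round_trip = json.dumps(neighborhood)
print(json.loads(round_trip) == neighborhood)  # True
```

Nothing is lost on the way out and back in, which is what makes JSON handy as a go-between.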

PHP and JSON

Imagine that we're building a REST-ish API in PHP that answers with JSON. The client has requested our neighborhood description and is expecting JSON in return. We have an array (PHP uses the term array for both objects and arrays as we think of them in JSON, though they're sometimes referred to as associative and indexed arrays respectively). You'll notice that it looks a lot like our JSON structure above.

$neighborhood = array(
  "North Street" => array(
    array(
      "Address" => "12345",
      "Family Name" => "Jones",
      "Members" => array("Tom", "Sue", "Timmy")
    ),
    array(
      "Address" => "12349",
      "Family Name" => "Marlin",
      "Members" => array("Carl", "Nancy")
    )
  ),
  "South Street" => array(
    array(
      "Address" => "8824",
      "Family Name" => "Ross",
      "Members" => array("Jim", "Barb", "James", "Peter", "Mary", "Thomas", "Martha")
    ),
    array(
      "Address" => "8836",
      "Family Name" => "Earl",
      "Members" => array("Sam", "Nancy", "Eric", "Beth")
    )
  )
);

This is a PHP data structure. It's a multidimensional array that you can manipulate and interact with. If, say, we wanted to get all the family names on North Street, we could write:

foreach ($neighborhood['North Street'] as $home){
  echo $home['Family Name'];
  echo "\n<br>";
}

We'll get something like:

Jones
Marlin

We can add the remaining family on South Street as well:

array_push( $neighborhood['South Street'],
  array(
    "Address" => "8832",
    "Family Name" => "Mack",
    "Members" => array("Debbie", "Fran", "Mike")
  )
);

We need to return something though and it needs to be JSON. Luckily PHP is ready to go:

echo json_encode($neighborhood);

Done.

Oh, and we can read it too. Suppose that we have a script that needs to call that endpoint from another server. No sweat, PHP's json_decode() will handle that. Just use it with the API endpoint that you created:

$neighborhood = json_decode(file_get_contents('http://your-new-api.com/api/neighborhood'), true);

That's it! $neighborhood is once again a multidimensional array within your PHP script.

Python and JSON

In Python, our objects and arrays are called dictionaries and lists respectively. Let's pull that same JSON with Python, only this time we'll print a list of families on all streets with a sub-list of members, replace any instance of Timmy with Bobby, and finally write the result to a file as JSON:

import requests  # be sure to install requests with 'pip install requests'
import json

## import and convert

r = requests.get('http://your-new-api.com/api/neighborhood')
neighborhood = r.json()

## make use of

for street in neighborhood:
  for i, home in enumerate(neighborhood[street]):
    print(home['Family Name'])
    for k, member in enumerate(home['Members']):
      print("  - {}".format(member))
      if member == "Timmy":
        # swap the name in place inside the nested structure
        neighborhood[street][i]['Members'][k] = "Bobby"

## convert and write to file

# the with block closes the file for us, even if the write fails
with open('neighborhood.json', 'w') as outfile:
  outfile.write(json.dumps(neighborhood, indent=4, sort_keys=True))

This is a great example of how simple an ETL process can be using JSON to structure our data during transport.
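
If you want to try those three steps without a live endpoint, here's a rough sketch along the same lines (Python 3 assumed). A hard-coded sample stands in for the API response, the output goes to a temporary file, and rename_member is a hypothetical helper made up for this example, not part of any library.

```python
import json
import os
import tempfile


def rename_member(neighborhood, old, new):
    """Return a copy of the neighborhood with every `old` member renamed to `new`."""
    result = {}
    for street, homes in neighborhood.items():
        result[street] = [
            dict(home, Members=[new if m == old else m for m in home["Members"]])
            for home in homes
        ]
    return result


# Extract: a hard-coded sample stands in for the API response.
sample = {
    "North Street": [
        {"Address": "12345", "Family Name": "Jones",
         "Members": ["Tom", "Sue", "Timmy"]}
    ]
}

# Transform: swap Timmy for Bobby without touching the original data.
updated = rename_member(sample, "Timmy", "Bobby")

# Load: write JSON to a temporary file, then read it back to check it.
path = os.path.join(tempfile.mkdtemp(), "neighborhood.json")
with open(path, "w") as outfile:
    json.dump(updated, outfile, indent=4, sort_keys=True)

with open(path) as infile:
    reloaded = json.load(infile)

print(reloaded["North Street"][0]["Members"])  # ['Tom', 'Sue', 'Bobby']
```

Building a new structure instead of editing in place is just one way to go; the in-place loop above works fine too when you don't need the original afterwards.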
