If my site design isn’t to your liking, you can also read this post over at The Gopher Academy where it is included in their 2014 Advent series.
There are definitely problems with using a json file as a database, but sometimes the simplicity of no extra dependencies makes it an attractive option. The two biggest problems are performance and managing concurrent reads and writes.
We can’t do much about performance, but with Go, managing concurrent reads and writes is a breeze!
Below is a walkthrough of a method for managing file access so that a json file can safely be used as a database.
The general pattern I’ll be implementing is: set up a channel and push read/write jobs onto it. Meanwhile, run a goroutine which will consume those jobs, ensuring exclusive access to the json file. (Remember, Go’s philosophy towards concurrency is: Don’t communicate by sharing memory; share memory by communicating. This is why we’re going to pipe requests to the component responsible for accessing the json db file rather than screwing around with lock files or synchronization.)
Follow along below where I will put it all together, or jump here for the finished product: a full REST service for managing your todos, backed by a json file database.
Here’s what my database looks like. It’s just a map of id => Todos. (I’m using a uuid for the Id.)
{
  "1a6e9148-ebe5-4bf0-9675-b76f9fab7b72": {
    "id": "1a6e9148-ebe5-4bf0-9675-b76f9fab7b72",
    "value": "Hello World"
  },
  "3e39df85-9851-4ce9-af0c-0dd831e3b970": {
    "id": "3e39df85-9851-4ce9-af0c-0dd831e3b970",
    "value": "Hello World2"
  }
}

### Todo

And here's the api model we'll be marshalling / unmarshalling it with:
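A minimal sketch of such a model, with the field names and json tags inferred from the sample data above. The `decodeTodos` helper is a hypothetical name added only to show the whole file being unmarshalled into a map keyed by id:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Todo mirrors one entry in the json file. Field names and tags are
// assumptions based on the sample database shown above.
type Todo struct {
	ID    string `json:"id"`
	Value string `json:"value"`
}

// decodeTodos (hypothetical helper) parses the entire database content
// into a map of id => Todo.
func decodeTodos(data []byte) (map[string]Todo, error) {
	todos := make(map[string]Todo)
	if err := json.Unmarshal(data, &todos); err != nil {
		return nil, err
	}
	return todos, nil
}

func main() {
	raw := []byte(`{
	  "1a6e9148-ebe5-4bf0-9675-b76f9fab7b72": {
	    "id": "1a6e9148-ebe5-4bf0-9675-b76f9fab7b72",
	    "value": "Hello World"
	  }
	}`)
	todos, err := decodeTodos(raw)
	if err != nil {
		panic(err)
	}
	for id, t := range todos {
		fmt.Println(id, t.Value)
	}
}
```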
In the entry point we set up our job channel, start our job processor so we’re ready when the jobs start rolling in, and then initialize a `TodoClient` which insulates us from the details of the job channel.
This is the hub of our database. `ProcessJobs` is run as a goroutine, so it just hangs out in an infinite for loop waiting for work in the form of a `Job`. A job’s `Run` method is where the work happens: it takes in the database data (all of it! remember, this is never going to be performant with a ton of data, so let’s just make things easy on ourselves and only operate on our database in its entirety) and returns the updated database data. The job processor then writes the modified database model back to disk before moving on to the next job. (There’s also a shortcut in place: if `Run` returns nil, nothing was modified, so we can skip the write.)
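The processor described above can be sketched roughly as follows. This is a sketch under assumptions, not the finished product: the `Job` interface shape, the `load` and `runJob` helpers, and the `addJob` write job are hypothetical names introduced to make the example runnable against a throwaway temp file.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

type Todo struct {
	ID    string `json:"id"`
	Value string `json:"value"`
}

// Job is one unit of work against the whole database (assumed shape).
type Job interface {
	ExitChan() chan error                               // receives the job's final error
	Run(todos map[string]Todo) (map[string]Todo, error) // nil map means "unchanged"
}

// load reads and decodes the entire database file.
func load(path string) (map[string]Todo, error) {
	content, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	todos := make(map[string]Todo)
	if err := json.Unmarshal(content, &todos); err != nil {
		return nil, err
	}
	return todos, nil
}

// runJob loads the file, runs the job, and writes back only on modification.
func runJob(j Job, path string) error {
	todos, err := load(path)
	if err != nil {
		return err
	}
	modified, err := j.Run(todos)
	if err != nil || modified == nil {
		return err // error, or nothing changed: skip the write
	}
	out, err := json.Marshal(modified)
	if err != nil {
		return err
	}
	return os.WriteFile(path, out, 0o644)
}

// ProcessJobs is the only goroutine that touches the file; jobs run one at a
// time, which is what guarantees exclusive access.
func ProcessJobs(jobs <-chan Job, path string) {
	for j := range jobs {
		j.ExitChan() <- runJob(j, path)
	}
}

// addJob is a hypothetical write job used only to exercise the processor.
type addJob struct {
	todo Todo
	exit chan error
}

func (j *addJob) ExitChan() chan error { return j.exit }

func (j *addJob) Run(todos map[string]Todo) (map[string]Todo, error) {
	todos[j.todo.ID] = j.todo
	return todos, nil
}

// demo seeds a throwaway file, pushes one write job, and reloads the result.
func demo() (map[string]Todo, error) {
	dir, err := os.MkdirTemp("", "jsondb")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "db.json")
	if err := os.WriteFile(path, []byte("{}"), 0o644); err != nil {
		return nil, err
	}
	jobs := make(chan Job)
	go ProcessJobs(jobs, path)
	defer close(jobs)

	job := &addJob{todo: Todo{ID: "1", Value: "Hello World"}, exit: make(chan error, 1)}
	jobs <- job
	if err := <-job.exit; err != nil {
		return nil, err
	}
	return load(path)
}

func main() {
	todos, err := demo()
	fmt.Println(todos, err)
}
```

The exit channel is buffered with a capacity of one so the processor can never get stuck on a caller that has stopped listening.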
Here’s one of our jobs for interacting with the database. This job simply implements the `Job` interface and adds in a “todos” channel so we can also return data. Since the job processor is in charge of accessing the db file, all the `Run` function does is pass the todos map to the todo response channel and return nil since there were no modifications.
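A read job along those lines might look like the sketch below. The struct, its field names, and the `NewReadTodosJob` constructor are assumptions for illustration; `main` calls `Run` directly with in-memory data as a stand-in for the job processor.

```go
package main

import "fmt"

type Todo struct {
	ID    string `json:"id"`
	Value string `json:"value"`
}

// ReadTodosJob is a sketch of a read-only job; the names are assumptions.
type ReadTodosJob struct {
	todos chan map[string]Todo // carries the data back to the caller
	exit  chan error           // carries the final status back to the caller
}

// NewReadTodosJob (hypothetical constructor) buffers both channels by one so
// neither Run nor the processor blocks on a slow caller.
func NewReadTodosJob() *ReadTodosJob {
	return &ReadTodosJob{
		todos: make(chan map[string]Todo, 1),
		exit:  make(chan error, 1),
	}
}

func (j *ReadTodosJob) ExitChan() chan error { return j.exit }

// Run hands the whole map to the caller and returns a nil map to signal
// "nothing modified", so the processor can skip the write.
func (j *ReadTodosJob) Run(todos map[string]Todo) (map[string]Todo, error) {
	j.todos <- todos
	return nil, nil
}

func main() {
	// Stand-in for the job processor: call Run directly with in-memory data.
	db := map[string]Todo{"1": {ID: "1", Value: "Hello World"}}
	job := NewReadTodosJob()
	modified, err := job.Run(db)
	fmt.Println(modified == nil, err, (<-job.todos)["1"].Value)
}
```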
This is the piece which the rest of your application will interact with. It encapsulates the mess associated with pushing jobs, waiting for a response, and waiting for the completion signal to come through on the error channel. It also maps the raw database model into a more reasonable result (a slice, in this case).
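Here is one way such a client could be sketched. The `Jobs` field, the `readJob` type, and the in-memory `memoryProcessor` (standing in for the file-backed processor so the example needs no file) are all hypothetical. Sorting the slice by id is my addition, since Go map iteration order is random.

```go
package main

import (
	"fmt"
	"sort"
)

type Todo struct {
	ID    string `json:"id"`
	Value string `json:"value"`
}

// Job is the assumed contract between the client and the processor.
type Job interface {
	ExitChan() chan error
	Run(todos map[string]Todo) (map[string]Todo, error)
}

// readJob is a minimal read-only job (hypothetical name).
type readJob struct {
	todos chan map[string]Todo
	exit  chan error
}

func (j *readJob) ExitChan() chan error { return j.exit }

func (j *readJob) Run(todos map[string]Todo) (map[string]Todo, error) {
	j.todos <- todos
	return nil, nil
}

// TodoClient hides the job channel from the rest of the application.
type TodoClient struct {
	Jobs chan<- Job
}

// GetTodos pushes a read job, waits for the exit signal, and flattens the
// raw map into a slice sorted by id.
func (c *TodoClient) GetTodos() ([]Todo, error) {
	job := &readJob{todos: make(chan map[string]Todo, 1), exit: make(chan error, 1)}
	c.Jobs <- job
	if err := <-job.exit; err != nil {
		return nil, err
	}
	raw := <-job.todos
	list := make([]Todo, 0, len(raw))
	for _, t := range raw {
		list = append(list, t)
	}
	sort.Slice(list, func(a, b int) bool { return list[a].ID < list[b].ID })
	return list, nil
}

// memoryProcessor is an in-memory stand-in for the file-backed job processor.
func memoryProcessor(jobs <-chan Job, db map[string]Todo) {
	for j := range jobs {
		_, err := j.Run(db)
		j.ExitChan() <- err
	}
}

func main() {
	jobs := make(chan Job)
	go memoryProcessor(jobs, map[string]Todo{
		"b": {ID: "b", Value: "Hello World2"},
		"a": {ID: "a", Value: "Hello World"},
	})
	client := &TodoClient{Jobs: jobs}
	todos, err := client.GetTodos()
	fmt.Println(todos, err)
}
```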
At this point, you might be thinking: “The only reason we have to worry about concurrent writes is because you put the read/write operations in a goroutine. A single routine would provide safe reads and writes too.”
But as soon as we turn this into a web service, all bets are off. Below I layer in an http server (using the Gin framework) to utilize our `TodoClient` and illustrate the example.
Same as before, but now there’s a `/todo` endpoint for getting all todos. The full example is more built out, with a POST, GET by id, PUT, and DELETE.
And last but not least, we leverage the `TodoClient` to get some data… safely!
Why is this better than managing lock files, synchronizing access, and more generally sharing the database across goroutines? Because those techniques impose limitations on your design. If instead you compose your app such that modifications to the shared resource are communicated to a single component (the `ProcessJobs` goroutine) responsible for modifying the resource, you’ve eliminated the need for sharing direct access to the database. Again, this is the Go concurrency philosophy: Don’t communicate by sharing memory; share memory by communicating.
Thanks for following along! Check out the full example.
p.s. What led me down this path? We use Composer for package management in our PHP stack and need to run a private package repo for tracking internal code. Satis is an excellent tool for building private Composer package repositories, but it is just a static site generator that gets its repository list from a json config file. I used the techniques above to create Satis-Go, which exposes the config file as a REST api and performs the static generation of the package repository index when modifications are made. From there, adding in an admin ui and incorporating web-hooks for triggering re-indexing was straightforward.