Parsable Blog

Using Human Readable JSON Endpoints with Thrift (for Free)

Parsable Team

Shameless plug: I work at WI (http://wi.co/). We are changing the way people outside Silicon Valley work. If you are interested, have a look at our careers page (http://wi.co/careers) or send me an email at devansh@wi.co.

About Thrift: If you are not familiar with RPC or Thrift, you might want to have a look at https://thrift.apache.org/

So, one problem I have heard is, “Hey, so we are publishing our Thrift APIs to third party vendors, and we need it to be easy to use.” Now, you either:

  1. Expose the Thrift endpoints, coach your API consumers on Thrift, and help them debug any issues.
  2. Maintain a library that wraps the different Thrift services for each language/platform you want to support.
  3. Create a way to manually convert between human readable JSON and Thrift, and maintain it for every single struct, service & function.

If these options sound unappealing to you, you’re right — they can be a lot of work. But is there another way?

I will be using the following simple Thrift file in my article below:

Now, Thrift does support a JSONProtocol which transmits data as JSON between the client and server (this is what we use for our web front end). However, one look at the data being sent will tell you it’s not going to be possible to expose this to third parties:

And this is for a very simple call: login(email, password).

The Thrift JSON carries a whole bunch of jargon that it needs to parse out the exact call: the types of each argument/field, the length of arrays/sets, the key number of each field in a struct, etc.
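To give a rough sense of that jargon, here is a small Ruby sketch that hand-builds an envelope in the general shape the Thrift JSONProtocol uses for a call: a protocol version, the method name, a message type, a sequence id, and then the arguments keyed by field id with a type tag wrapped around each value. The field ids 1 and 2 and the sample values are assumptions for illustration, and the helper name is made up. This is not the payload captured from our server.

```ruby
require 'json'

# Message type code Thrift uses for a request (CALL).
CALL = 1

# Build a JSONProtocol-style envelope by hand for login(email, password).
# Field ids (1, 2) are assumed; the real ids come from the Thrift definition.
def thrift_json_call(method_name, seq_id, email, password)
  args = {
    '1' => { 'str' => email },    # field id => { type tag => value }
    '2' => { 'str' => password }
  }
  [1, method_name, CALL, seq_id, args]
end

puts JSON.generate(thrift_json_call('login', 0, 'user@example.com', 'secret'))
```

Even for two string arguments, the caller has to know the field ids, the type tags and the envelope layout, which is a lot to ask of a third party.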

So we came up with a way to convert human readable JSON to Thrift without:

  1. Adding a whole bunch of jargon
  2. Doing any maintenance. Any changes to the Thrift definitions should automatically be usable.

An example of the same call using the new way:

Much more readable!

The format is simple. The value of the method key should be the exact name of the method in the Thrift definition. The arguments object holds the arguments of the call. The keys inside the arguments object are the same as the names of the function parameters in the Thrift definition.
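As a minimal sketch of that format, the Ruby helper below builds a request body from a method name and a hash of named arguments. The helper name and the sample values are made up for illustration. Only the method and arguments keys and the email and password parameter names come from the description above.

```ruby
require 'json'

# Build a human readable request body: the method name plus an arguments
# object keyed by parameter name.
def human_readable_call(method_name, arguments)
  JSON.generate('method' => method_name, 'arguments' => arguments)
end

puts human_readable_call('login', { 'email' => 'user@example.com', 'password' => 'secret' })
```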

Here is a sample response:

An example of an API defined Exception which was thrown:

Note that it can easily handle structs (and structs embedded inside structs) too. Above, currentUser is a struct. In JSON it is simply a JSON object whose keys are the same as the field names inside the Thrift struct definition.

Above, err (the exception) is also a struct, since in Thrift exceptions are simply a special kind of struct.

Furthermore, note that the result object itself is a struct. It is an auto-generated struct whose name is usually <MethodName>_result (in this case login_result). You can see it if you ever peek into the generated code for any other language. It has two fields: success for a successful call and err for any API defined error that was thrown.

Similarly, the arguments object is also an auto-generated struct, usually called <MethodName>_args (in this case login_args). Its fields are simply the function arguments. Thus, this format closely follows how actual Thrift works.

There is a JSON generator for Thrift, which generates all the metadata for the:

  1. Services: name, list of functions
  2. Functions: name, argument list
  3. Structs: name, fields along with their names and types

Below is an example of the generated metadata for the Thrift file above.
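As a rough illustration of how such metadata might be used, the Ruby sketch below embeds a hand-written stand-in for it and looks up a function by name. The service name UserService and the key names services, functions and type are assumptions. The exact keys emitted by the Thrift JSON generator may differ, so treat this as a sketch of the idea rather than the real file.

```ruby
require 'json'

# Hypothetical metadata, shaped like the description above. The exact key
# names emitted by the Thrift JSON generator may differ.
METADATA_JSON = <<~METADATA
  {
    "services": [
      {
        "name": "UserService",
        "functions": [
          {
            "name": "login",
            "arguments": [
              { "name": "email", "type": "string" },
              { "name": "password", "type": "string" }
            ]
          }
        ]
      }
    ]
  }
METADATA

# Find a function's metadata by name across all services.
def find_function(metadata, method_name)
  metadata.fetch('services', []).each do |service|
    service.fetch('functions', []).each do |function|
      return function if function['name'] == method_name
    end
  end
  nil
end

login = find_function(JSON.parse(METADATA_JSON), 'login')
puts login['arguments'].map { |arg| "#{arg['name']}: #{arg['type']}" }
```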

I used this metadata inside a new Protocol. A Protocol is the way Thrift marshals and un-marshals data. The problem is that Protocols in Thrift are strongly typed. A Protocol needs to be able to pass up the exact type of each field (or the element type of an array, or the key and value types of a map), along with other details like the size of the map or array. That’s why the Thrift JSONProtocol has so much jargon inside it: it carries exactly this information.

This new Protocol, the HumanReadableJsonProtocol, uses the JSON that the client sends up, along with the metadata from the Thrift JSON generator, to figure out exactly what function was called, what arguments were sent and what their exact types are.

For example, in the above call it will first look up the metadata for the login method. Then for each key inside the arguments object in the client JSON it will find the associated key in the arguments array in the metadata. Depending on the type it will parse it out (if it is a struct or list or map, it will keep on recursing till it reaches a base type — int, string, boolean, double, etc). For example, in this case it will look for an object whose “name” value is “email” in the metadata, see it has a type of String and then verify that the value of “email” in the arguments object in the client JSON is also a String. Similarly for “password.”
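The lookup-and-recurse step can be sketched in Ruby roughly as follows. This is not the code from HumanReadableJsonProtocol. The type description format (a base type name, or a hash with a list or struct key), the TypeMismatch class and the helper names are all assumptions made to keep the example self-contained.

```ruby
# Base Thrift types mapped to a check on the parsed JSON value.
BASE_CHECKS = {
  'string' => ->(v) { v.is_a?(String) },
  'bool'   => ->(v) { v == true || v == false },
  'i32'    => ->(v) { v.is_a?(Integer) },
  'double' => ->(v) { v.is_a?(Numeric) }
}.freeze

class TypeMismatch < StandardError; end

# Recursively check `value` against a (hypothetical) type description:
# a base type name, { 'list' => elem_type }, or { 'struct' => [fields] }.
def check_type!(value, type, path)
  if type.is_a?(String)
    check = BASE_CHECKS.fetch(type) { raise ArgumentError, "unknown type #{type}" }
    raise TypeMismatch, "#{path}: expected #{type}" unless check.call(value)
  elsif type.key?('list')
    raise TypeMismatch, "#{path}: expected list" unless value.is_a?(Array)
    value.each_with_index { |v, i| check_type!(v, type['list'], "#{path}[#{i}]") }
  elsif type.key?('struct')
    raise TypeMismatch, "#{path}: expected struct" unless value.is_a?(Hash)
    check_fields!(value, type['struct'], path)
  else
    raise ArgumentError, "unsupported type description at #{path}"
  end
end

# Match each key sent by the client to a metadata field with the same name.
def check_fields!(object, fields, path)
  object.each do |key, value|
    field = fields.find { |f| f['name'] == key }
    raise TypeMismatch, "#{path}: unknown field #{key}" if field.nil?
    check_type!(value, field['type'], "#{path}.#{key}")
  end
end

LOGIN_ARGS = [
  { 'name' => 'email', 'type' => 'string' },
  { 'name' => 'password', 'type' => 'string' }
].freeze

check_fields!({ 'email' => 'user@example.com', 'password' => 'secret' }, LOGIN_ARGS, 'login')
puts 'arguments match the metadata'
```

Passing a number for password to the same call would raise TypeMismatch in this sketch, since the recursion only stops successfully when every leaf value matches its base type.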

This new HumanReadableJsonProtocol is still strongly typed. If you send up a key whose value doesn’t map correctly to the type defined in the metadata, it will throw an error.

Here is an error thrown in such a case, where I tried to pass a number instead of a string to the password field:

Note that this JSON has an exception key, whereas the earlier ones had a result key containing either success or an error. This mirrors how Thrift treats the different response types: you get back either a Result or an Exception. Result means that the call succeeded and no error other than an API defined error (one declared in the method signature) was thrown. Thus, Result has two fields: one for success, which holds the successful result of the call, and one for the API defined error that was thrown. Exception holds any other error (including Thrift errors like corrupted data, unparseable data, wrong types, etc.).
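On the consuming side, a client could tell the three cases apart with something like the Ruby sketch below. The helper name and the sample bodies are hypothetical. Only the exception, result, success and err keys are taken from the description above.

```ruby
require 'json'

# Classify a response body by its top-level keys: "exception", or "result"
# holding either "success" or an API defined error such as "err".
def classify_response(body)
  response = JSON.parse(body)
  return [:exception, response['exception']] if response.key?('exception')

  result = response.fetch('result')
  return [:success, result['success']] if result.key?('success')

  # Any other field in the result is an API defined error.
  [:api_error, result]
end

kind, payload = classify_response('{"result":{"err":{"message":"bad password"}}}')
puts "#{kind}: #{payload}"
```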

Since it is just a protocol (a way to marshal/un-marshal data), the Thrift generated code using it will automatically take care of things like required or optional fields.

So, for example, if the request has a content-type of application/json, all you need to do is use the HumanReadableJsonProtocol to parse it; otherwise, use whatever standard protocol you normally use. You will also need to pass in the directory where the generated JSON metadata lives.
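A minimal sketch of that dispatch in Ruby might look like this. The two factory structs are hypothetical stand-ins so the example runs on its own, and the gen-json directory name is an assumption. In a real server you would return your actual protocol factories instead.

```ruby
# Stand-ins for real protocol factories; both are hypothetical here so the
# sketch stays self-contained.
BinaryFactory = Struct.new(:name)
HumanReadableJsonFactory = Struct.new(:name, :metadata_dir)

# Pick a protocol factory from the request's Content-Type header.
def protocol_factory_for(content_type, metadata_dir)
  # Drop parameters such as "; charset=utf-8" before comparing.
  media_type = content_type.to_s.split(';').first.to_s.strip.downcase
  if media_type == 'application/json'
    HumanReadableJsonFactory.new('human_readable_json', metadata_dir)
  else
    BinaryFactory.new('binary')
  end
end

factory = protocol_factory_for('application/json; charset=utf-8', 'gen-json')
puts factory.name
```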

Finally, it automatically uses any new changes to the Thrift definitions (as long as you also generate the JSON using the JSON generator).

Here is the source code to the project: HumanReadableJsonProtocol

If you would like the code for this or would like to help expand the code for more languages and help write tests for it, please contact me.

HumanReadableJsonProtocol Snippet in Ruby

Here is a snippet of the code that does the parsing in Ruby:


Shameless plug again: I work at WI (http://wi.co/). We are changing the way people outside Silicon Valley work. If you are interested, have a look at our careers page (http://wi.co/careers) or send me an email at devansh@wi.co.

Thanks to Adam Neary.