Building a GraphQL Gateway/Proxy to a GRPC Server

GRPC and GraphQL are both fantastic technologies. Each has its advantages and disadvantages. I’ve long wondered if a server could easily allow the client to choose either. In this post, I’ll discuss how I built a C# server which serves GRPC calls over HTTP/2, and then added a Ruby on Rails gateway so that web (or other) clients could fall back on GraphQL over HTTP 1.1. Importantly, the GraphQL gateway is very thin; it is informed entirely by the GRPC service’s IDL. This means it’s a “code-once-and-forget” solution, with the GraphQL interface always kept up-to-date by the protobufs/IDL.

The Problem

As many have observed, GraphQL is basically RPC already. In this GraphQL query…

query {
  GetUser(id: 1) {
    name
  }
}

We’re just calling a function named GetUser with a single parameter. In theory, it’s actually quite simple for a GraphQL gateway to proxy this request to a GRPC server which can serve this function.

The problem lies with one of GraphQL’s greatest features: selections. In this example, only the value of name is meant to be returned to the client; the rest is omitted.

In the GRPC world, as with Thrift, the object returned from a function call conforms to a predefined structure. Optional versus required attributes in protobufs have long been a point of contention, but proto3 seems to have settled upon all values being optional. This is useful in our case, because it means any value may be excluded from the response: the protobuf needs only to define all of the possible fields.

Of course, the gateway could simply assume that the GRPC server responds with all available data, and then filter the response to match the selections, throwing away whatever was not requested. But this is the definition of over-fetching, and while it might save a little bandwidth on the HTTP 1.1 round trip, the RPC server is still stuck with 100% of the load no matter what the request. Instead, the RPC server itself needs to understand selections.
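
To make the over-fetching concrete, here is a minimal Ruby sketch of that naive gateway-side filtering. The filter_by_selections helper and the hash shapes are hypothetical (they are not part of the gateway code in this post); selections are modelled as a nested hash mirroring the query.

```ruby
# Hypothetical helper: prune a fully-populated response down to the
# requested selections. Selections are a nested hash whose keys are field
# names and whose values are sub-selections (empty for scalar fields).
def filter_by_selections(data, selections)
  case data
  when Array
    data.map { |item| filter_by_selections(item, selections) }
  when Hash
    selections.each_with_object({}) do |(field, sub), out|
      next unless data.key?(field)
      value = data[field]
      out[field] = sub.empty? ? value : filter_by_selections(value, sub)
    end
  else
    data
  end
end

# The server still did all of the work to produce `full`...
full = {
  'name' => 'Ada',
  'age' => 36,
  'friends' => [{ 'name' => 'Grace', 'age' => 45 }]
}

# ...but the client only asked for name and friends.name.
selections = { 'name' => {}, 'friends' => { 'name' => {} } }
puts filter_by_selections(full, selections).inspect
```

The filtered result contains only the requested fields, yet every discarded value was already fetched and serialized by the server, which is exactly the cost this approach fails to avoid.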

RPC with Selections

The only approach I’ve found that works is to somehow inform the GRPC server of the selections. This means adding a new message type to the protobuf file…

// Generic GraphQL field selection.
message GraphSelection {
  string name = 1;
  repeated GraphSelection selections = 2;
}

Now any request message can simply include a repeated GraphSelection field…

message LoginReq {
  string username = 1;
  string password = 2;
  repeated GraphSelection selections = 3;
}

The GRPC server can use this to decide what data to fetch before responding (more about this in the Hydration section, below).
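
On the gateway side, the nested GraphQL selection set has to be turned into this recursive message. A minimal sketch, assuming the selection set has already been reduced to a nested hash: the Struct below only stands in for the class that protoc would generate from the GraphSelection message, and to_graph_selections is a hypothetical helper name.

```ruby
# Stand-in for the generated GraphSelection class; a plain Struct keeps
# this sketch free of dependencies.
GraphSelection = Struct.new(:name, :selections, keyword_init: true)

# Hypothetical helper: turn a nested hash of field names into a list of
# GraphSelection messages, recursing into sub-selections.
def to_graph_selections(tree)
  tree.map do |field, children|
    GraphSelection.new(name: field, selections: to_graph_selections(children))
  end
end

# Selections for: GetUser(id: 1) { name friends { name } }
tree = { 'name' => {}, 'friends' => { 'name' => {} } }
selections = to_graph_selections(tree)

selections.each do |sel|
  puts "#{sel.name}: #{sel.selections.map(&:name).inspect}"
end
```

Because the message is recursive, arbitrarily deep queries map onto the same two fields (name and selections) with no change to the protobuf file.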

Transforming GraphQL to GRPC

The gateway needs to:

  1. Accept a GraphQL request and map its RPC calls against GRPC functions.
  2. For each RPC call, transform the variables (input) into messages and dispatch the request to the GRPC server.
  3. Transform the response object back into JSON to be sent over-the-wire.
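
The three steps above can be sketched in plain Ruby. This is only an illustration under stated assumptions: FakeStub stands in for a generated GRPC client stub, RPC_METHODS is a hypothetical lookup table, and the GraphQL field is assumed to be already parsed into a simple hash.

```ruby
require 'json'

# Stand-in for a GRPC client stub. A real stub would be generated from the
# protobuf service definition; this one just returns canned data.
class FakeStub
  def get_user(request)
    user = { 'id' => request[:id], 'name' => 'Ada', 'age' => 36 }
    # Honour the selections so only the requested fields come back.
    user.slice(*request[:selections])
  end
end

# Hypothetical mapping from GraphQL field names to stub methods.
RPC_METHODS = { 'GetUser' => :get_user }.freeze

# Execute one already-parsed top-level GraphQL field.
def execute_field(stub, field)
  method = RPC_METHODS.fetch(field[:name])                           # step 1
  request = field[:arguments].merge(selections: field[:selections])
  response = stub.public_send(method, request)                       # step 2
  { field[:name] => response }                                       # step 3
end

field = { name: 'GetUser', arguments: { id: 1 }, selections: ['name'] }
result = execute_field(FakeStub.new, field)
puts JSON.generate({ 'data' => result })
```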

This could theoretically be done in any language. You could even do it on the same GRPC server, creating a thin transformation layer. It made more sense for me to implement the gateway in Ruby on Rails because of a unique feature: I wanted to add additional GraphQL-only functionality, provided by pre-existing Ruby code. For any RPC which could not be served by a GRPC server, the Ruby server falls back on looking for some mapping within its own system.

The key parts involved:

  • Gateway::Proxy represents one or more connections to GRPC servers and can route method calls appropriately.
  • Gateway::Function is just a wrapper for making function calls to a GRPC service.
  • Gateway::Graphql handles GraphQL<->GRPC transcoding.
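
The routing role described for Gateway::Proxy can be pictured with a small method_missing sketch. This is not the actual Gateway::Proxy, whose source is not shown in this post; RoutingProxy and the two fake services are hypothetical, and the fake services expose RPC names directly as Ruby methods purely for illustration.

```ruby
# Hypothetical sketch of name-based routing across several services.
class RoutingProxy
  # services: { service_name => object whose public methods are RPCs }
  def initialize(services)
    @routes = {}
    services.each_value do |stub|
      stub.class.public_instance_methods(false).each do |rpc|
        # Function names must be unique between services.
        raise ArgumentError, "duplicate RPC name: #{rpc}" if @routes.key?(rpc)
        @routes[rpc] = stub
      end
    end
  end

  # Forward any known RPC name to the service that owns it.
  def method_missing(name, params = {}, metadata = {})
    stub = @routes[name]
    return super unless stub
    stub.public_send(name, params, metadata)
  end

  def respond_to_missing?(name, include_private = false)
    @routes.key?(name) || super
  end
end

# Two fake services standing in for GRPC client stubs.
class UserService
  def GetUser(params, _metadata)
    { 'id' => params[:id], 'name' => 'Ada' }
  end
end

class ChatService
  def SendMessage(params, _metadata)
    { 'delivered' => true, 'text' => params[:text] }
  end
end

proxy = RoutingProxy.new(main: UserService.new, chat: ChatService.new)
puts proxy.GetUser({ id: 1 }).inspect
puts proxy.SendMessage({ text: 'hi' }).inspect
```

Building the route table once at construction time is what makes a name collision detectable up front rather than at call time.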

To use the gateway, you simply need to instantiate a Proxy object with one or more GRPC client stubs and a lambda which will handle errors:

services = {
  # See: https://grpc.io/docs/tutorials/basic/ruby.html#creating-the-client
  :main => Moongate::Gateway::Stub.new(gateway_url, creds)
}

# Braces are required: a bare hash can't precede another positional argument.
proxy = Gateway::Proxy.new(services, &lambda do |error|
  render_json({ :error => error }, 500)
end)

The proxy can connect to multiple GRPC servers; it inspects the server’s protobufs and keeps track of where each function should be routed (notably, this means that function names must be unique between services). You can now invoke this proxy object in a familiar way: by invoking the RPC method name, with the first parameter as a hash of input parameters and the second as a hash of metadata/headers. To continue with the GetUser example…

render_json proxy.GetUser({ :id => 1 })

This will render the result of the RPC as JSON. However, we’ve not yet handled the actual GraphQL translation. For this, we’ll need a GraphQL parser (like graphql-ruby). My controller, then, looks something like this…

# The action which is exposed by the controller
def graphql
  document = GraphQL::Language::Parser.parse(payload['query'] || '')
  render_json @proxy.graphql.execute(document, graphql_variables)
end

private

# Extract GraphQL variables from the payload (memoized after parsing)
def graphql_variables
  return @variables if @variables
  raw = payload['variables']
  raw = nil if raw == 'null'
  @variables = raw.is_a?(String) ? JSON.parse(raw) : (raw || {})
rescue JSON::ParserError
  @variables = {}
end

Note that the @proxy.graphql.execute function could also take a third parameter including the metadata. This is how I would forward headers and authorization data.
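
As a rough sketch of what building that metadata might look like: the helper name and the allow-list below are hypothetical, and the incoming headers are modelled as a plain hash rather than a Rails request object.

```ruby
# Hypothetical allow-list of HTTP headers worth forwarding to the GRPC server.
FORWARDED_HEADERS = %w[authorization x-request-id].freeze

# Build a metadata hash from incoming HTTP headers. Keys are lowercased
# because HTTP/2 (and therefore GRPC metadata) uses lowercase header names.
def metadata_from_headers(headers)
  headers.each_with_object({}) do |(key, value), metadata|
    name = key.to_s.downcase
    metadata[name] = value.to_s if FORWARDED_HEADERS.include?(name)
  end
end

headers = {
  'Authorization' => 'Bearer abc123',
  'X-Request-Id' => '42',
  'Cookie' => 'session=secret'
}
metadata = metadata_from_headers(headers)
puts metadata.inspect
```

An explicit allow-list keeps browser-only headers such as cookies from leaking through to the backend by accident.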

That’s all it takes to forward the request to the GRPC server! At this point, you should be able to actually serve GraphQL calls from your HTTP server, proxied through to the GRPC service. However, there’s still a big problem…

Hydration

Hydration is the process of filling out the data within the response object. Ideally, our GRPC server should not over-fetch data. For example, when the User is looked up from the database, performance would be improved if the other properties (say, age) are not actually selected at all. This may seem like a relatively minor optimization for scalar values that belong to the User object, but let’s imagine users have friends. Now consider the following GraphQL request…

query {
  GetUser(id: 1) {
    name
    friends {
      name
      friends {
        name
        friends {
          name
          # ... you get the idea.

Being able to query deeply into nested/relational objects is an extremely powerful feature of GraphQL. Trying to accomplish it on an RPC server requires a very sophisticated hydration layer, especially once you start thinking about the N+1 query problem. In the above GraphQL query, we could easily end up making dozens of database queries, each for the friends of a single user. Ideally, there should be only four database queries in total: one for the initial user, and then one for each of the three nested friends levels.

This is precisely why my project involves building an ORM/data-layer as a native part of my networking layer. Because I define my data models with code, I can use the power of C# Attributes to decorate them with hints about how they might be hydrated. Let’s take a look at my AccountData class…

[Serializable]
[Table]
public class AccountData : BaseData {
  [Column]
  [NotNull]
  [MaxLength(64)]
  [Unique("FindByUsername")]
  public string username { get; set; }

  [Column]
  [NotNull]
  [MaxLength(255)]
  public string password { get; set; }

  [Column]
  [NotNull]
  [MaxLength(255)]
  public string salt { get; set; }

  [Column]
  [NotNull]
  public DateTime createdAt { get; set; }

  [Column]
  [NotNull]
  public DateTime updatedAt { get; set; }

  [Hydrate(typeof(Has), "parentID")]
  public List<PlayerData> players { get; set; } = new List<PlayerData>();
}

The first five properties (username, password, salt, createdAt, updatedAt) are columns in the MySQL table. The sixth, players, is a list of PlayerData objects which is hydrated by a query against another table. Fans of Ruby will notice how I modeled my C# Attributes on ActiveRecord associations. Specifically, I built Has, BelongsTo, and HasThrough. When it’s time for the GRPC server to hydrate the response objects, it is informed by these attributes. In this case, the PlayerData class has a parentID property which links it to the AccountData, so the Has Attribute is able to find all children.

This requires far too much code to paste here, so I’ll do a little hand-waving around the rest, and you can leave a comment if you want more details. First, the C# GRPC server transforms the array of selections into a ResponsePlan. It accepts the data object (e.g., AccountData) and returns the output type (e.g., the GRPC message, presumably AccountMessage). It goes something like this:

  1. Iterate over the selections looking for un-hydrated properties and add them to the ResponsePlan.
  2. Use the hydration techniques to fill in any missing data, so that all queries are batched (i.e., one query for several users’ friends) to avoid N+1 queries.
  3. Repeat from #1 with all of the new objects which have been hydrated, until there are no objects left to hydrate.
  4. Copy the data object’s values into the response object.

A pretty cool trait of this approach is that queries are batched regardless of their level of nesting. That is: if a Foo object has both parent and children, which are of the same type, the system will attempt to make a single query for both parent and children.
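
The batching loop in the steps above can be illustrated with a small Ruby sketch. It is not the C# ResponsePlan implementation: the in-memory USERS table, QUERY_LOG, find_users and hydrate are all hypothetical, and a simple depth count stands in for the nested friends selections.

```ruby
# In-memory stand-in for the database; QUERY_LOG records each "query".
USERS = {
  1 => { id: 1, name: 'Ada',   friend_ids: [2, 3] },
  2 => { id: 2, name: 'Grace', friend_ids: [1, 3] },
  3 => { id: 3, name: 'Linus', friend_ids: [1] }
}.freeze
QUERY_LOG = []

# One batched lookup, comparable to SELECT ... WHERE id IN (...).
def find_users(ids)
  QUERY_LOG << ids.uniq.sort
  USERS.values_at(*ids.uniq).compact.to_h { |row| [row[:id], row] }
end

# Hydrate level by level: every object at the same depth shares one query.
def hydrate(root_id, depth)
  root = find_users([root_id]).fetch(root_id).dup
  level = [root]
  depth.times do
    rows = find_users(level.flat_map { |user| user[:friend_ids] })
    next_level = []
    level.each do |user|
      user[:friends] = user[:friend_ids].map { |id| rows.fetch(id).dup }
      next_level.concat(user[:friends])
    end
    level = next_level
  end
  root
end

# Three nested levels of friends, as in the GraphQL query earlier.
user = hydrate(1, 3)
puts "queries: #{QUERY_LOG.length}"
puts user[:friends].map { |friend| friend[:name] }.inspect
```

With three nested levels the log records four lookups (one for the root user and one per level), however many users appear at each level.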

Speed Test

Check out my full speed test. tl;dr: the gateway is 1/4 faster than pure Ruby, and the GRPC-only approach was 2/3 faster.

The difference between the Ruby-only and the gateway approach was within the margin of error. Which is to say: the overhead of going through the Ruby server is nontrivial, but the gateway does not add any meaningful overhead. On the other hand, the GRPC-only approach was about 66% faster.

This approach successfully allows a single C# implementation of a GRPC server to also serve GraphQL requests via a gateway. The fact that the ORM/data-layer is tightly integrated with the networking layer means that hydration strategies can efficiently populate the data in a way that serves GraphQL’s selection needs.

There are a few more tricks I have already built, or am planning to build. For example: it’s quite simple to generate a GraphQL schema. I also want to add security checks / access controls for users, parameter validation, and many other features to the GRPC server.

Written by
(zane) / Technically Wizardry
Join the discussion

2 comments
    • Ah, it’s been a while since I wrote this, so please correct me if I’ve missed anything—my approach remains to use C# and GRPC in a way that serves both GRPC and GraphQL.

      My understanding of field *Masks* is that they are an operation performed by GRPC as a way to scrub/shrink the amount of data being returned in the pipe (post-query). Contrast this against a **Selection** which actually informs the compilation of the SQL queries (I’m describing an entire execution engine, here).

      That aside, field masks are merely field selections. This is a small subset of GQL. In the current form of the library I have built for my C# data communication, I leverage other aspects of the Selection-based GQL-style syntax… like nested RPCs, aliasing, directives, etc.