Posted on 21 Mar 2016
For the last few months I’ve been working on the API for Artifact. I started the project using ActiveModel::Serializers and chose Collection+JSON as the media type. I wanted to give Collection+JSON a try after using it at REST Fest 2015. This series of posts will show not that ActiveModel::Serializers is bad, but that it was a bad fit for what I tried to use it for.
AMS + JSON API
ActiveModel::Serializer is heavily geared towards JSON API. Because of this it feels like I’ve had to bend over backwards to get it to do Collection+JSON. You can see what I did in my Collection+JSON with ActiveModel::Serializers post.
I use and abuse the meta key in that post. Based on what I saw in the source code for ActiveModel::Serializer I don’t think I used the key as intended. While I misuse it, it was the only way I could find to get variables into the serializer from the render method.
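As a rough sketch of the kind of hack I mean (the href value here is hypothetical), the meta key ends up being the channel from the controller into the serializer:

# In the controller: smuggle values in through :meta.
render({
  :json => books,
  :meta => { :href => books_url },
})

# In the serializer: read them back out of the render options.
def href
  @instance_options[:meta][:href]
end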
Serializer Context
Another issue I ran across was trying to get general context inside of a serializer. ActiveModel::Serializer handles current_user for you, but extending this to include extra methods resulted in a fairly hacky monkey patch.
class ApplicationController < ActionController::Base
  # Extend what the Serializer can see by injecting the
  # current application.
  def _render_with_renderer_json(resource, options)
    options[:current_application] = current_application
    super(resource, options)
  end
end
This is what I added in order to push the current OAuth application inside of the serializer. I found this by browsing the source of ActiveModel::Serializer enough to finally see them do it for current_user. Inside of the serializer this is required:
def current_application
  @instance_options[:current_application]
end
It took a good deal of poking around in order to figure out how to inject this extra context into the serializer.
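With the accessor in place, the serializer can use the current OAuth application like any other method. A minimal sketch, with a hypothetical attribute:

class BookSerializer < ActiveModel::Serializer
  attributes :title, :application_name

  # Expose a value derived from the injected context.
  def application_name
    current_application.name
  end

  def current_application
    @instance_options[:current_application]
  end
end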
Collection Serializers
Collection serializers are second-class citizens in ActiveModel::Serializers. With Collection+JSON, everything is a collection, so every resource I have in Artifact has both an item serializer and a collection serializer. While writing the collection serializers you start to notice things that are missing in ActiveModel::Serializer::ArraySerializer. A lot of the context isn’t present and you see missing method errors more than you would expect.
Implicit Resources
One thing that seems like a nice feature of ActiveModel::Serializers is automatically detecting the serializer class. In practice I rarely use this feature. I have started treating API endpoints as resources and typically create a special serializer for the one endpoint. I found that I don’t create just a BookSerializer and use it everywhere.
For example, in Artifact you can create a collection of books. For this I would create a CollectionBookSerializer, because the URLs contained within will be different than those a plain BookSerializer would render. I would also need the associated collection serializers.
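A sketch of what that split looks like; the attributes are made up, and it assumes Rails URL helpers are available to the serializers:

class BookSerializer < ActiveModel::Serializer
  attributes :title, :href

  def href
    book_url(object) # the book on its own
  end
end

class CollectionBookSerializer < ActiveModel::Serializer
  attributes :title, :href

  def href
    collection_book_url(object.collection, object) # nested under its collection
  end
end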
Given this, here is how a typical controller method renders:
def index
  render({
    :json => books,
    :each_serializer => CollectionBookSerializer,
    :serializer => CollectionBooksSerializer,
    :meta => { }, # "..."
  })
end
I personally now prefer explicitly referring to a serializer I want to use on an endpoint.
Next Time
These are all fairly minor pain points, but they have recently gotten me thinking about how I could do better. In the next post I’ll go over what I came up with.
Posted on 29 Jan 2016
For my side project, Worfcam, I use Google Cloud Storage to host files. I have files that I want to be private most of the time, but allow anonymous access to when I choose. S3 supports this by letting you sign a URL so that it has an expiration. Luckily for me, Google Cloud Storage does this as well; the only problem was that the ruby SDK didn’t do it for you like the S3 SDK does.
After a lot of searching I finally figured out how to generate signed URLs that expire for Google Cloud Storage.
signer.rb
require "openssl"
require "base64"
require "cgi"

private_key = "..."
client_email = "..."
bucket = "..."
path = "..."

full_path = "/#{bucket}/#{path}"
expiration = 5.minutes.from_now.to_i # ActiveSupport; Time.now.to_i + 300 outside of Rails

# The string to sign: HTTP verb, Content-MD5, Content-Type,
# expiration timestamp, and the canonicalized resource.
signature_string = [
  "GET",
  "",
  "",
  expiration,
  full_path,
].join("\n")

digest = OpenSSL::Digest::SHA256.new
signer = OpenSSL::PKey::RSA.new(private_key)
signature = Base64.strict_encode64(signer.sign(digest, signature_string))
signature = CGI.escape(signature)

"https://storage.googleapis.com#{full_path}?GoogleAccessId=#{client_email}&Expires=#{expiration}&Signature=#{signature}"
In the end this is pretty simple, but it took a long time to figure out exactly what I needed. This uses the private key that Google lets you download from the console, not the interoperability keys you can also download. The process is well explained on a Cloud Storage support page, but it only lists Java, Python, Go, and C# as the languages.
I couldn’t find anything for ruby, so hopefully this helps someone else when they try to implement signed URLs with ruby. This is also available as a gist.
Posted on 19 Jan 2016
In my side project, Worfcam, I host photos on Google Cloud Storage. When viewing pages that had multiple thumbnails on them I noticed loading photos took a while. Each photo took 400-500ms, which isn’t bad but comes with a noticeable flicker as they load. My application serves photos out; users do not go directly to GCS to load them. Since users come through my application, I was able to add a simple caching layer in front of the ruby GCS service.
This is a cache layer on top of the google-api-ruby-client gem.
gcs_service_cacher.rb
class GcsServiceCacher
  SIXTY_MINUTES = 60 * 60

  def initialize(service, redis_pool)
    @service = service
    @redis_pool = redis_pool # This is a ConnectionPool
  end

  # Delegate anything we don't override down to the real service.
  def method_missing(method, *args)
    if @service.respond_to?(method)
      @service.send(method, *args)
    else
      super
    end
  end

  def respond_to_missing?(method, include_private = false)
    @service.respond_to?(method) || super
  end

  def insert_object(bucket, name:, upload_source:, cache: false)
    if cache
      @redis_pool.with do |redis|
        redis.setex("photo-cache:#{name}:binary", SIXTY_MINUTES, upload_source.read)
      end
      upload_source.rewind
    end

    @service.insert_object(bucket, name: name, upload_source: upload_source)
  end

  def get_object(bucket, path, download_dest:, cache: false)
    if cache
      data = @redis_pool.with do |redis|
        redis.get("photo-cache:#{path}:binary")
      end

      return StringIO.new(data) unless data.nil?
    end

    body = @service.get_object(bucket, path, download_dest: download_dest)
    body.rewind

    if cache
      @redis_pool.with do |redis|
        redis.setex("photo-cache:#{path}:binary", SIXTY_MINUTES, body.read)
      end

      body.rewind
    end

    body
  end
end
This is a pretty simple class. The two important methods are #insert_object and #get_object; everything else is delegated down to the actual service class via #method_missing. The redis_pool object is a ConnectionPool instance loaded with Redis.
The #insert_object method just writes the upload_source binary data to a redis key and then delegates down. I have photos expire after 60 minutes so old photos don’t clog up redis.
It should be noted that these two methods keep the same method signature as the service class itself, except for the cache keyword. I needed a way to let objects above this one hint that a file should not be cached. I only wanted to cache small thumbnails in redis so as not to completely blow away the memory; thumbnails are also the most accessed files.
The #get_object method is a little more complicated. If caching is enabled, it will check redis to see if the key exists and return the cached data if it does. Otherwise it will pull the file down from the service and cache it for a quicker read next time.
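Wiring it up looks something like this; a sketch, where gcs_service stands in for the underlying google-api-ruby-client service object and the bucket and file names are made up:

require "connection_pool"
require "redis"

redis_pool = ConnectionPool.new(size: 5) { Redis.new }
service = GcsServiceCacher.new(gcs_service, redis_pool)

# Thumbnails get cached; full-size photos skip the cache.
thumb = service.get_object("my-bucket", "photos/thumb.jpg", download_dest: StringIO.new, cache: true)
photo = service.get_object("my-bucket", "photos/full.jpg", download_dest: StringIO.new, cache: false)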
Final results
With the caching layer in place, I have noticed extremely quick reads compared to loading from GCS on each page load. Photo loading dropped to about 100-200ms total. Adding in correct caching headers removes the load entirely for repeat views as well.
A nice side benefit is potentially reducing GCS bandwidth. Worfcam is hosted entirely inside of Google Cloud so I don’t have bandwidth fees between Compute Engine and Cloud Storage, but if you weren’t hosted inside of Compute Engine you could read from your caching layer to prevent huge bandwidth bills. Worfcam pulls down each file at least once and rolls every hour into a gif, so this would help me out if I ever decide to host elsewhere.
Posted on 11 Jan 2016
I recently switched from using a regular LoadBalancer service in kubernetes to using a NodePort service. I switched because I wanted to get away from the fairly expensive network load balancer in Google Cloud for a personal project. I did this by creating a new service for nginx inside of the cluster and setting it to the NodePort type, then creating a micro instance that hosts another nginx which proxies to the private IPs of the cluster nodes.
nginx-service
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  type: NodePort
  ports:
    - port: 80
      nodePort: 30080
      name: http
    - port: 443
      nodePort: 30443
      name: https
  selector:
    name: nginx
The important pieces here are type: NodePort and the nodePort lines inside of ports. This will cause kubernetes to listen on ports 30080 and 30443 on every node. These ports have to be in the range 30000-32767.
Once the service is defined as a NodePort service, you need to set up the micro instance.
/etc/nginx/sites-enabled/example
upstream example-https {
  server 10.0.0.1:30443; # private node ip
  server 10.0.0.2:30443; # private node ip
}

server {
  # ... ssl setup ...

  location / {
    # ... proxy_pass configuration ...
    proxy_pass https://example-https;
  }

  # ... other nginx setup ...
}
Here we have a stripped down sample file sites-enabled/example that nginx will load and start serving. I have SSL set up and proxy_pass pointing to the private IP of each kubernetes node. I do keep SSL between this front nginx and the nginx inside of the cluster.
There isn’t anything special about this file so I’ve stripped out the unimportant pieces. You can see a more complete example of my nginx files in the post hosting nginx in docker.
Drawbacks
There are a few drawbacks to this approach. At least once the private IPs changed (I think the nodes rebooted themselves after a crash) and I had to update this frontend nginx. You also have to manually add in any new nodes. The normal load balancer would have handled both cases for me.
Overall I’ve been happy with this setup. It was nice to see that kubernetes let me handle my own external load balancing.
Posted on 30 Nov 2015
At REST Fest this year I wrote a small Collection+JSON ruby client to work against a hypermedia server. The full source code is hosted on github.
Client classes
require "faraday"
require "json"
require "uri"
require "active_support/core_ext/object/blank" # for #present? / #blank? below

class CollectionJSONMiddleware < Faraday::Middleware
  def initialize(app)
    @app = app
  end

  # Ask for Collection+JSON on every request.
  def call(env)
    env[:request_headers]["Accept"] = "application/vnd.collection+json"
    @app.call(env)
  end
end
A small middleware class to shove the correct Accept header in.
class Client
  def initialize(url)
    @url = url
  end

  def get(url)
    response = connection.get(url)
    json = JSON.parse(response.body)
    Collection.new(json)
  end

  def delete(item)
    response = connection.delete(item.href)
    json = JSON.parse(response.body)
    Collection.new(json)
  end

  def update(item, template)
    response = connection.put(item.href) do |req|
      req.headers["Content-Type"] = "application/vnd.collection+json"
      req.body = template.to_json
    end

    json = JSON.parse(response.body)
    Collection.new(json)
  end

  def add_template(collection, template)
    response = connection.post(collection.href) do |req|
      req.headers["Content-Type"] = "application/vnd.collection+json"
      req.body = template.to_json
    end

    json = JSON.parse(response.body)
    Collection.new(json)
  end

  private

  def connection
    @connection ||= Faraday.new(@url) do |conn|
      conn.use CollectionJSONMiddleware
      conn.adapter Faraday.default_adapter
    end
  end
end
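Using the client is only a couple of lines; the URL here is just an example:

client = Client.new("http://example.com/")
collection = client.get("/")

collection.items.each do |item|
  puts item.attribute("title").value
end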
Model Classes
These are Collection+JSON classes that are not specific to the underlying data at all.
class Collection
  attr_reader :version, :href, :links, :items, :queries, :template

  def initialize(json)
    collection = json.fetch("collection")

    @version = collection.fetch("version", nil)
    @href = collection.fetch("href", nil)
    @links = collection.fetch("links", []).map { |link| Link.new(link) }
    @items = collection.fetch("items", []).map { |item| Item.new(item) }
    @queries = collection.fetch("queries", []).map do |query|
      Query.new(query)
    end

    template = collection.fetch("template", nil)
    @template = Template.new(template) if template
  end

  def query(rel)
    queries.detect do |query|
      query.rel == rel
    end
  end
end
class Query
  attr_reader :rel, :href, :prompt, :data

  def initialize(attributes)
    @rel = attributes.fetch("rel")
    @href = attributes.fetch("href")
    @prompt = attributes.fetch("prompt")
    @data = attributes.fetch("data", []).map do |data|
      DataItem.new(data)
    end
  end

  def set(data, value)
    @values ||= Faraday::Utils::ParamsHash.new
    @values[data.name] = value
  end

  def to_url
    uri = URI.parse(href)
    uri.query = @values.to_query
    uri.to_s
  end
end
class Template
  attr_reader :prompt, :data

  def initialize(attributes)
    @prompt = attributes.fetch("prompt", nil)
    @data = attributes.fetch("data", []).map do |data|
      DataItem.new(data)
    end
  end

  def set(data, value)
    @values ||= {}
    @values[data.name] = value
  end

  def to_json
    values = @values || {}

    {
      :template => {
        :data => data.map do |item|
          {
            :name => item.name,
            :value => values[item.name],
          }
        end
      },
    }.to_json
  end
end
class Link
  def initialize(attributes)
    @attributes = attributes
  end

  def to_s
    "#{rel}: #{href}"
  end

  def [](key)
    @attributes[key]
  end

  def rel
    @attributes.fetch("rel")
  end

  def href
    @attributes.fetch("href")
  end
end
class Item
  attr_reader :href, :data, :links

  def initialize(attributes)
    @href = attributes.fetch("href")
    @data = attributes.fetch("data", []).map do |data|
      DataItem.new(data)
    end
    @links = attributes.fetch("links", []).map { |link| Link.new(link) }
  end

  def attribute(name)
    data.detect { |attribute| attribute.name == name }
  end
end
class DataItem
  def initialize(attributes)
    @attributes = attributes
  end

  def to_s
    "#{name}: #{value}"
  end

  def [](key)
    @attributes[key]
  end

  def name
    @attributes.fetch("name")
  end

  def value
    @attributes.fetch("value", nil)
  end

  def prompt
    @attributes.fetch("prompt", nil)
  end
end
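To make the shape concrete, here is a minimal Collection+JSON document run through these classes; the data itself is made up:

json = JSON.parse(<<-JSON)
  {
    "collection": {
      "version": "1.0",
      "href": "http://example.com/todos",
      "items": [
        {
          "href": "http://example.com/todos/1",
          "data": [
            { "name": "title", "value": "fishing", "prompt": "Title" },
            { "name": "completed", "value": "false", "prompt": "Completed" }
          ]
        }
      ]
    }
  }
JSON

collection = Collection.new(json)
collection.items.first.attribute("title").value # => "fishing"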
Putting it together
Here is a class that lets you perform general Collection+JSON actions on the command line. It is slightly specific to the app at REST Fest, but not by much.
class CommandLine
  attr_reader :collection

  def run
    @collection ||= client.get("/")

    puts "What do you want to do?"
    puts " - Refresh Items (refresh)"
    puts " - Items (items)"
    puts " - Queries (queries)"
    puts " - Template '#{collection.template.prompt}' (template)" if collection.template
    puts " - Delete (delete)"
    puts " - Edit (edit)"
    puts " - Exit (exit)"

    choice = gets.chomp
    puts

    case choice
    when "refresh"
      refresh
    when "items"
      print_items(collection.items)
    when "queries"
      queries
    when "template"
      template
    when "delete"
      delete
    when "edit"
      edit
    when "exit"
      return
    end

    run
  end

  def refresh
    @collection = client.get(collection.href)
  end

  def queries
    queries = collection.queries.map do |query|
      "#{query.prompt} (#{query.rel})"
    end
    puts queries

    puts "Choose a query"
    query = gets.chomp

    search_query = collection.query(query)
    unless search_query
      puts "Query not found"
      return
    end
    puts

    # blank? also handles data items that have no value at all
    if search_query.data.all? { |data| data.value.blank? }
      edit = "yes"
    else
      search_query.data.each do |data|
        puts "#{data.prompt}: #{data.value}"
      end

      puts "Edit? (yes/no)"
      edit = gets.chomp
      puts
    end

    case edit
    when "yes"
      search_query.data.each do |data|
        if data.value.present?
          puts "#{data.prompt} (#{data.value}):"
        else
          puts "#{data.prompt}:"
        end

        search_query.set(data, gets.chomp)
      end
      puts
    when "no"
      search_query.data.each do |data|
        search_query.set(data, data.value)
      end
    end

    # A local on purpose: query results are printed without
    # replacing the main @collection.
    collection = client.get(search_query.to_url)
    print_items(collection.items)
  end

  def template
    template = collection.template

    puts template.prompt
    template.data.each do |data|
      if data.value.present?
        puts "#{data.prompt} (#{data.value}):"
      else
        puts "#{data.prompt}:"
      end

      template.set(data, gets.chomp)
    end
    puts

    @collection = client.add_template(collection, template)
  end

  def delete
    item = choose_item
    @collection = client.delete(item)
  end

  def edit
    item = choose_item

    @collection = client.get(item.href)
    item = collection.items.first
    template = collection.template

    item.data.each do |data|
      next unless template.data.any? { |td| td.name == data.name }

      if data.value.present?
        puts "#{data.prompt} (#{data.value}):"
      else
        puts "#{data.prompt}:"
      end

      template.set(data, gets.chomp)
    end

    @collection = client.update(item, template)
  end

  private

  def client
    @client ||= Client.new("http://hyper-hackery.herokuapp.com/")
  end

  def choose_item
    print_items(collection.items, true)

    puts "Choose an item"
    item_number = gets.chomp.to_i
    collection.items[item_number]
  end

  def print_items(items, include_numbers = false)
    puts "Returned items"

    items.each_with_index do |item, index|
      value = item.attribute("completed").value
      finished = value == "true" || value == true ? "(X)" : "( )"
      number = " (#{index})" if include_numbers
      puts " - #{finished}#{number} #{item.attribute("title").value}"
    end
    puts
  end
end

CommandLine.new.run
Here is a sample run:
ruby perform_search.rb
What do you want to do?
- Refresh Items (refresh)
- Items (items)
- Queries (queries)
- Template 'Add ToDo' (template)
- Delete (delete)
- Edit (edit)
- Exit (exit)
items
Returned items
- (X) one more test again
- ( ) danny boy
- ( ) fishing
- ( ) goofing around
- ( ) one more simple test
- (X) one more minor test
- ( ) update these docs
- ( ) wheee
- ( ) asdasd
- ( ) something to test
- (X) update cj parser
- ( ) trying to test
- ( ) one more simple test
- (X) additional stuff
- ( ) ssssss
- (X) more stuff
- ( ) one more simple test
What do you want to do?
- Refresh Items (refresh)
- Items (items)
- Queries (queries)
- Template 'Add ToDo' (template)
- Delete (delete)
- Edit (edit)
- Exit (exit)
queries
Active ToDos (active collection)
Completed ToDos (completed collection)
Search ToDos (search)
Choose a query
search
Title:
one
Returned items
- (X) one more test again
- ( ) one more simple test
- (X) one more minor test
- ( ) one more simple test
- ( ) one more simple test
What do you want to do?
- Refresh Items (refresh)
- Items (items)
- Queries (queries)
- Template 'Add ToDo' (template)
- Delete (delete)
- Edit (edit)
- Exit (exit)
exit