Caching

Strategy for caching information on the Rails back-end of Empirical-Core

Overall

We are using this strategy for implementing caches:
https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works

This strategy is possible in our system because our Redis To Go Heroku add-on is configured to expire keys on a “Least Recently Used” basis (just as memcached does, as mentioned in the blog post above).

This strategy will allow us to implement caching with less code, because we won’t have to manually invalidate keys every time a relevant model is updated (instead, we just need to make sure the correct ‘touch’ associations are declared, as specified in the above blog post).
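As a toy illustration of why no manual invalidation is needed (plain Ruby, with invented names, not our actual code): the record's updated_at is baked into the cache key, so a 'touch' produces a new key rather than deleting the old entry.

```ruby
# Toy illustration: the record's updated_at is part of the cache key,
# so a 'touch' yields a *new* key instead of deleting the old entry.
Record = Struct.new(:id, :updated_at) do
  def cache_key
    "students/#{id}-#{updated_at.to_i}"
  end
end

store = {}
student = Record.new(42, Time.at(1_000))

# First read: miss, so compute and store.
store[student.cache_key] ||= "rendered profile v1"

# The record is "touched": updated_at changes ...
student.updated_at = Time.at(2_000)

# ... so the next read uses a fresh key; the stale entry is simply
# left behind for the cache's LRU eviction to clean up.
store[student.cache_key] ||= "rendered profile v2"
```

The old entry is never explicitly deleted; it just stops being read and eventually falls out of the cache.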

Here are some excerpts from an email thread that discuss the past and future caching strategies:

Excerpt 1 from email exchange:

marcello to james

Problem

The problem is that I recently re-wrote the controller action for a student's profile (ProfilesController#student), as well as the view for a student profile (app/views/profiles/student.html.slim, which is now erb).

In my re-write I did not include any caching of information in the controller action or in the view.

You had originally included caching in both of those places. Now we are seeing on New Relic that this controller action (and the rendering of the associated view) are extremely expensive. So it looks like caching was a good idea ;)

I am now trying to 're-insert' the caching on this action and view.

Original Cache

Your original code for caching involves these parts:

  1. a StudentProfileCache module in app/models that has an invalidate method.

  2. after_save callbacks in activity_session.rb, classroom_activity.rb, and classroom.rb which invoke StudentProfileCache.invalidate (passing as input
    either a student or set of students).

  3. a line in ProfilesController#update which invokes StudentProfileCache.invalidate if the user whose profile is being updated is a student.

  4. a line in ProfilesController#student which establishes a cache on a set of instance variables (the information which is used in the view, namely @classroom, @activity_names, @activity_table, @section, and @topics).
    This line looks like this:

@classroom, @activity_names, @activity_table, @section, @topics = cache('student-profile-vars-' + current_user.id.to_s, skip_digest: true) do
(code for computing the relevant values)
end

  5. a line at the top of app/views/profiles/student.html.slim that establishes a cache on the entire contents of the view.
    This line looks like this:
  - cache('student-profile-' + current_user.id.to_s, skip_digest: true) do
    (slim code specifying contents of view)
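A hypothetical plain-Ruby reconstruction of how parts 1–3 fit together (the key names are assumed from part 4's code; the real StudentProfileCache deletes from the Rails cache store, while this sketch uses an in-memory hash):

```ruby
# Hypothetical reconstruction of the invalidation flow in parts 1-3.
# Key names assumed from the cache calls shown in parts 4 and 5.
module StudentProfileCache
  STORE = {}

  # Accepts a student id or a collection of them, as the callbacks do.
  def self.invalidate(student_or_students)
    Array(student_or_students).each do |student_id|
      STORE.delete("student-profile-vars-#{student_id}")
      STORE.delete("student-profile-#{student_id}")
    end
  end
end

StudentProfileCache::STORE["student-profile-vars-7"] = [:classroom, :activity_names]
StudentProfileCache::STORE["student-profile-7"]      = "<html>...</html>"

# An after_save callback on, e.g., ActivitySession would then call:
StudentProfileCache.invalidate(7)
```

This is the "fixed-keys, explicit-invalidation" style that the fluid-keys pattern later replaces.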

Re-inserting Cache

'Re-inserting' the caching is not too involved, because nothing has to be done for 1, 2, and 3. All of the code that implements 1, 2, and 3 has been left untouched since you wrote it.

As for 4, I am establishing an analogous cache. The only thing that has changed is the values which are being computed, and the code inside the block which computes those values. That code looks like this:

@units, @next_activity_session, @next_activity = cache('student-profile-vars-' + current_user.id.to_s, skip_digest: true) do
(code for computing the relevant values)
end

As for 5, I am also establishing an analogous cache. The only thing that has changed is the html specified beneath it (and the fact that everything is erb instead of slim). That code looks like this:

<%= cache('student-profile-' + current_user.id.to_s, skip_digest: true) do %>
(erb code specifying contents of view)
<% end %>

Expiration

I am considering adding an "expires_in: 20.days" argument to the cache method. For example, this would make the view cache code look like this:
<%= cache('student-profile-' + current_user.id.to_s, skip_digest: true, expires_in: 20.days) do %>

This would be to save memory. But perhaps there is already a mechanism that deals with that (?)

Going Forward

I hope there isn't anything too controversial about the 're-inserting' of the old cache. If it is all cool, then if you have any time it would be great to get your opinion on an alternative caching process (one where keys change instead of values, though I can go into detail once the above is all given your blessings).

Excerpt 2 from email exchange:

marcello to james
I'm now going to implement a cache on the scorebook, but using a slightly different pattern (where updates don't directly invalidate caches, but cause keys to change, since the updated_at value becomes part of the key). This pattern involves a bit less code, so I believe it will be preferable (it is also the approach recommended by the Rails docs). The implementation seems pretty straightforward, but if I run into issues perhaps I'll reach out again.
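A minimal sketch of that key-change pattern (the key format here is an assumption, chosen to mirror Rails' "model/id-timestamp" cache keys; 'scorebook' is just an illustrative prefix):

```ruby
# Sketch of the fluid-keys pattern: updated_at is embedded in the key,
# so saving the record changes the key and the old entry goes stale.
def scorebook_key(classroom_id, updated_at)
  "scorebook/classrooms/#{classroom_id}-#{updated_at.utc.strftime('%Y%m%d%H%M%S')}"
end

before = scorebook_key(3, Time.utc(2014, 1, 1))
after  = scorebook_key(3, Time.utc(2014, 1, 2))

# An update changes updated_at, so the key changes too; no explicit
# invalidation code is needed anywhere.
before == after # => false
```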

Excerpt 3 from email exchange:

marcello to james

Duly noted. The Rails post mentions that memcached deals with the cache garbage by evicting the oldest key. However, I don't believe we have memcached provisioned, and I'm not sure if that behavior is possible through Redis.

Indeed, I'll move that stuff into a model. Would caching those nuggets be more useful because they are used in other contexts (so caching helps not only this context but others)?

I was going to cache the view because there are some non-trivially expensive queries going on in there. But, inspired by your 'nuggets' (lol) suggestion, I realize that many of those queries are handled by (or could be handled by) methods in a helper, so perhaps it is better to do that caching in those helper methods rather than caching the view itself.
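A sketch of what that helper-level caching could look like, using a plain-Ruby stand-in for the cache store (the helper name, key, and query are all invented for illustration):

```ruby
# Plain-Ruby stand-in for the cache store, to show caching the expensive
# query inside a helper method rather than around the whole view.
CACHE = {}

def cache_fetch(key)
  CACHE.key?(key) ? CACHE[key] : (CACHE[key] = yield)
end

calls = 0
expensive_query = -> { calls += 1; [:scores] } # pretend this hits the DB

# Hypothetical helper: only the expensive part is wrapped, so the rest
# of the view can still render fresh markup.
def activity_scores(student_id, query)
  cache_fetch("activity-scores-#{student_id}") { query.call }
end

activity_scores(7, expensive_query) # miss: runs the query
activity_scores(7, expensive_query) # hit: served from CACHE
calls # => 1
```

Caching at this level also means the cached nuggets can be reused by any other view or helper that needs the same data.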

As for the 'redis backed structure', I'm not familiar with that kind of thing, but perhaps I should look into Redis caches? Would that be the right Google search?

Excerpt 4 from email exchange:

marcello to james

Hey James,
I looked into the details of this stuff a bit more. Here's what I found out:

We use Redis

Apparently our cache is indeed a Redis-backed structure: we've provisioned Redis To Go on Heroku, and in config/production.rb we have:

config.cache_store = :redis_store, ENV["REDISTOGO_URL"], { expires_in: 90.minutes }

This config setting determines what service will be used for the 'cache' commands within the rails app.

So it seems we're good to go on using redis.

Non-expiration of keys

You mentioned that I should look into whether or not it's ok to use the 'fluid-keys, fixed-values' design pattern that Rails (DHH) recommended (https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works), given that it doesn't explicitly invalidate cache values and can leave behind a potentially overflowing number of stale keys.

Memcache Evicts Least Recently Used (LRU)

DHH based his justification for this pattern on the fact that memcached (the cache service commonly used with Rails), upon reaching its memory limit, automatically deletes the "Least Recently Used" keys (and their values), removing the possibility of the cache overflowing.

Redis - Evicts LRU Too?

There was some concern about this, because I wasn't sure whether our app uses memcached. It turns out that we don't - we use Redis. However, the aspect of memcached required to justify the 'fluid-keys, fixed-values' design pattern is present in our configuration of Redis To Go (the Heroku add-on providing Redis caching to our production app).

The specific configuration lines are:

# =Limits===================================================================
maxmemory 104857600
maxmemory-policy volatile-lru

I would link to the configuration file, but I believe the link only works when one is signed into Heroku on my account :(

The maxmemory-policy, 'volatile-lru', is what specifies that Redis should evict the "Least Recently Used" keys upon hitting its max memory usage.
More precisely, this policy only evicts "Least Recently Used" keys that have an expiration set on them. That is not a worry, however, because all of our keys have an expiration set, given the "expires_in: 90.minutes" option specified in config/production.rb.
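A toy model of the volatile-lru policy (plain Ruby, and a simplification - real Redis approximates LRU by sampling keys rather than tracking exact order) showing why stale keys fall out once maxmemory is hit:

```ruby
# Toy model of volatile-lru: when over capacity, evict the least
# recently used key *among keys that have an expiry set*.
class VolatileLru
  def initialize(max_keys)
    @max = max_keys
    @data = {}       # key => value
    @volatile = []   # keys with an expiry, oldest-used first
  end

  def set(key, value, expires: true)
    @data[key] = value
    if expires
      @volatile.delete(key)
      @volatile << key
    end
    evict while @data.size > @max
  end

  def get(key)
    @volatile << key if @volatile.delete(key) # mark as recently used
    @data[key]
  end

  private

  def evict
    victim = @volatile.shift or raise "no evictable keys"
    @data.delete(victim)
  end
end

cache = VolatileLru.new(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")    # touch "a", so "b" is now least recently used
cache.set("c", 3) # over capacity: "b" is evicted
cache.get("b") # => nil
```

Because every key our app writes carries an expiry (via expires_in), every key is a candidate for this eviction, so abandoned fluid keys can't accumulate indefinitely.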

Given that Redis has the requisite configuration, I'm going to go ahead with the "fluid-keys, fixed-values" design pattern for caching, unless any other red flags pop up for you.

In other news

The value of your suggestion to normalize caching into the models dawned on me last night, and I realized it was basically what Peter was expressing to me in a conversation about caching 'parts' of the profile data instead of all of it in one chunk - basically, that way the cache doesn't become entirely obsolete each time an incremental activity_session is added.

I hope the above makes sense. Thanks for pointing me towards the details to pay careful attention to in these matters.