Enumerable Magic – Boolean Methods

Learning over 50 methods in Ruby's Enumerable module can be a little daunting. It doesn't help that the official docs (which are awesome, by the way) sort the methods alphabetically, not by how they might be used. So for this tutorial, we'll start by looking at just those methods with a boolean (true or false) outcome. All of the source code for the following examples can be found on GitHub:

https://github.com/rubycuts/enumerables

First, you should know that all examples here and in other Enumerable tutorials on this site use one common set of examples: a PetInventory class, which manages a collection of Pet objects; a file that contains a list of "pokey things"; and a LogData class, which iterates through Heroku log files in a RAM-friendly way.

all?

The all? method takes an enumerable collection and tests whether the given condition is true for every item in the collection. It returns true only if every element meets the condition.

Pet Inventory

The PetInventory class keeps track of how many pets our imaginary pet store has in stock at any given time. We also track the number of legs of each animal, in case somebody comes in not knowing what pet they want, but knowing how many legs it should have. Here's our initial pet inventory:

pet        legs   # in stock
dog        4      100
cat        4      50
fish       0      1000
scorpion   8      1
beetle     6      10,000
monkey     2      2
rock       0      0

Now, some code:

# view full source at https://github.com/rubycuts/enumerables/blob/master/lib/pet_inventory.rb

inventory = PetInventory.new

# do all pets have legs?
inventory.all?{|pet| pet.legs > 0}    #=> false

# are all pets in stock?
inventory.all?(&:in_stock?)    #=> false

# all pets have a name? 
inventory.all?{|pet| !pet.name.nil?}    #=> true

# if list is empty, is everything true? 
[].all?{|pet| pet.legs > 1000}    #=> true

Let's walk through these examples. Do all pets have legs? No, fish and rocks don't pass the legs > 0 test. Are all pets in stock? No, we're out of pet rocks. Do all pets have a name? Yes, no pet names are nil.
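
To try these examples without cloning the repository, here is a minimal stand-in. It is only a sketch: the Pet struct, its field names (name, legs, stock) and the in_stock? rule are assumptions for illustration, and the real classes in the repository may look different. It does show the one thing Enumerable needs from a class, which is an each method.

```ruby
# Hypothetical stand-ins for the tutorial's classes; the real ones live in
# the repository and may differ. Field names here are assumptions.
Pet = Struct.new(:name, :legs, :stock) do
  # Assumed rule: a pet is "in stock" when at least one is available.
  def in_stock?
    stock > 0
  end
end

class PetInventory
  include Enumerable # supplies all?, any?, none? and friends on top of #each

  def initialize
    # Data copied from the inventory table above.
    @pets = [
      Pet.new("dog", 4, 100),
      Pet.new("cat", 4, 50),
      Pet.new("fish", 0, 1000),
      Pet.new("scorpion", 8, 1),
      Pet.new("beetle", 6, 10_000),
      Pet.new("monkey", 2, 2),
      Pet.new("rock", 0, 0)
    ]
  end

  # Enumerable only requires that #each yield every element in turn.
  def each(&block)
    @pets.each(&block)
  end
end

inventory = PetInventory.new

all_have_legs = inventory.all? { |pet| pet.legs > 0 }     # false: fish and rock
all_in_stock  = inventory.all?(&:in_stock?)               # false: no rocks left
all_named     = inventory.all? { |pet| !pet.name.nil? }   # true

puts all_have_legs, all_in_stock, all_named
```

Because PetInventory only defines each, every other Enumerable method comes along for free.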

This last example is something I found interesting while playing with this method. If your list is empty, what does the all? method return? By default, it returns true because no items failed the comparison test. Technically, no items passed either, which is a bit confusing, but I guess the authors picked the default that would work as you expect most of the time.
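
One way to make sense of the empty-list result is to read all? as "there is no counterexample". The short sketch below uses plain arrays of numbers (not the pet classes) to compare all? with its siblings any? and none? on an empty list.

```ruby
empty = []

# No element can fail the test, so all? is vacuously true.
empty_all = empty.all? { |n| n > 1000 }    # true

# No element can pass the test either, so any? is false...
empty_any = empty.any? { |n| n > 1000 }    # false

# ...and none? is true for the same reason.
empty_none = empty.none? { |n| n > 1000 }  # true

puts empty_all, empty_any, empty_none

# On any list, all? agrees with "no element fails the test".
numbers = [1, 2, 3]
everything_passes = numbers.all? { |n| n > 0 }
nothing_fails     = !numbers.any? { |n| n <= 0 }
puts everything_passes == nothing_fails    # true
```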

any?
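
The any? method is the counterpart to all?: it returns true as soon as at least one element meets the condition, and false if none do. Here is a minimal sketch using a hypothetical Struct-based Pet (its fields are assumptions, standing in for the repository's classes) and a few rows from the pet inventory.

```ruby
# Hypothetical stand-in for the tutorial's Pet objects (assumed fields).
Pet = Struct.new(:name, :legs, :stock)

pets = [
  Pet.new("dog", 4, 100),
  Pet.new("fish", 0, 1000),
  Pet.new("scorpion", 8, 1),
  Pet.new("rock", 0, 0)
]

# do any pets have more than six legs?
many_legs = pets.any? { |pet| pet.legs > 6 }         # true: the scorpion

# are any pets out of stock?
out_of_stock = pets.any? { |pet| pet.stock.zero? }   # true: the rock

# do any pets have more than 1000 legs?
thousand_legs = pets.any? { |pet| pet.legs > 1000 }  # false

# if the list is empty, nothing can pass the test
empty_any = [].any? { |pet| pet.legs > 0 }           # false

puts many_legs, out_of_stock, thousand_legs, empty_any
```

Note the contrast with all? on an empty list: all? returns true there, while any? returns false.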


Become a Code Ninja with Codewars

I am not a ninja, in the traditional sense. My chubby frame and fuzzy beard can attest to that. But Codewars has given me the ability to be a ninja in one very special area of my life: programming.

Kata are moves that one practices and perfects as they attempt to master a martial art. In this spirit, Codewars presents you with hundreds of kata (coding challenges) in the programming language of your choice. As you earn honor points for completing kata, you advance through the “kyu” (ranks).

The concept of programming kata has been around in other forms, but Codewars is an entire platform of crowd-sourced programming challenges. Users are encouraged to create kata of their own that are curated and used by others. It’s a self-perpetuating system (with a lot of hard work from the Codewars staff).

Each kata has a description, a browser-based code editor, and even a small testing library you can use to verify your own solutions. Once you’re confident you’ve solved the kata, you can submit your code to see if it passes the kata author’s own set of tests. Instant gratification!

Why Train?

As I said earlier, programming kata is not a new concept. Years ago, Chad Fowler published a book that stressed the importance of practicing your programming craft. Chad posed an interesting question: you wouldn’t expect to become a great musician with all performance and no practice, so why do we think we can become great developers without challenging ourselves outside of work?

Malcolm Gladwell explored the concept of “10,000 hours” in his book Outliers. The theory is that it takes about 10,000 hours of dedicated practice to become an expert at something. If you’re like most developers, 90% of what you do at work doesn’t count.

Becoming an expert requires constant challenge, and that’s what Codewars provides. They gamify the act of self-improvement, and they do it well – it’s downright addictive!

Automatic Summary Tables in PostgreSQL

Has your app begun to slow down over time, whether you're seeing a large increase in traffic or not? It might be that those complex database queries that were snappy when you first launched are now taking longer and longer to execute. The first step is proper indexing, but you likely did this when you created your tables. It might be time for summary tables!

What are Summary Tables?

Summary tables are a simplified view of your complicated data that can be queried more quickly and easily. Are they magic? Absolutely, but not how you might think. Real-life magic requires a real-life magician pulling strings behind the scenes, and that's what you'll have to do. You will set up the trick, and your app's users will enjoy the show. But with just a little extra knowledge, you can be magic and lazy at the same time.

The Process

The trick follows this formula:

  • Create a table with the summary data as you'd like to see it.
  • Build a stored procedure that updates the summary table.
  • Set up a trigger that will run when your original tables are updated.

We're going to begin with a simple data model: a table for species of animals, including how many legs each species has, and a table for pets, who belong to a specific species. We want to easily query the database for the number of pets who have, say, four legs. Here's the structure for our animal tables, minus any keys:

CREATE TABLE species (
    id integer NOT NULL,
    name text,
    leg_count integer
);

CREATE TABLE pets (
    id integer NOT NULL,
    name text,
    species_id integer
);

And here's the query that gives us a tally of how many pets have each number of legs:

SELECT species.leg_count AS leg_count, count(*) AS pet_count
FROM pets
JOIN species ON pets.species_id = species.id
GROUP BY species.leg_count;

Now, to the magic.

Create the Summary Table

Our summary table is easy; it has the same format as the output of our complex join query:

CREATE TABLE leg_counts (
    id serial NOT NULL,  -- serial, so an INSERT that omits id still gets a value
    leg_count integer,
    pet_count integer
);

Each record stores a leg count, and the number of pets that have that many legs.

Build the Summary Table

Here's the code:

CREATE FUNCTION update_leg_counts() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
    DECLARE
     pet_leg_count INTEGER;
     pet_count INTEGER;
    BEGIN
      SELECT INTO pet_leg_count species.leg_count
        FROM species
        WHERE species.id = NEW.species_id;
    
      SELECT INTO pet_count count(*)
        FROM pets
        JOIN species ON species.id = pets.species_id
        WHERE species.leg_count = pet_leg_count;
    
      DELETE FROM leg_counts where leg_counts.leg_count = pet_leg_count;
      INSERT INTO leg_counts (leg_count, pet_count) values (pet_leg_count, pet_count);
  
      RETURN NULL;
    END;
    $$;

Now that's a mouthful! We start by creating the function, with a return type of "trigger". This particular stored procedure has no parameters - it will be triggered by the creation of a new "pets" record, and that new record will be available within the function as "NEW". We set the language of the stored procedure as plpgsql, and declare a couple of variables that will be used in our code.

We set pet_leg_count to the number of legs that our new pet has. This is important, because we don't want to update ALL the records in our summary table, just the one that keeps track of our pet's same-legged friends. Next, we set pet_count to the number of pets with this many legs. Finally, we delete any summary table record with the same number of legs, and recreate it with the updated pet count. We do it this way so our code works whether or not a record for that many legs previously existed.
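
If the PL/pgSQL is hard to follow, the same steps can be modelled in plain Ruby. This is only an illustration with in-memory arrays and hashes standing in for the three tables. It is not database code, and the sample species and pet names are made up.

```ruby
# In-memory stand-ins for the tables (illustration only; sample data is made up).
SPECIES = {
  1 => { name: "dog",  leg_count: 4 },
  2 => { name: "fish", leg_count: 0 }
}
PETS       = [] # each pet: { name:, species_id: }
LEG_COUNTS = [] # each row: { leg_count:, pet_count: }

# Mirrors update_leg_counts(): refresh only the row for the new pet's leg count.
def update_leg_counts(new_pet)
  # First SELECT: how many legs does the new pet's species have?
  pet_leg_count = SPECIES.fetch(new_pet[:species_id])[:leg_count]

  # Second SELECT: how many pets share that leg count?
  pet_count = PETS.count do |pet|
    SPECIES.fetch(pet[:species_id])[:leg_count] == pet_leg_count
  end

  # DELETE the old summary row (if any), then INSERT the fresh one.
  LEG_COUNTS.reject! { |row| row[:leg_count] == pet_leg_count }
  LEG_COUNTS << { leg_count: pet_leg_count, pet_count: pet_count }
end

# Plays the role of an INSERT on pets followed by the AFTER INSERT trigger.
def insert_pet(name, species_id)
  pet = { name: name, species_id: species_id }
  PETS << pet
  update_leg_counts(pet)
end

insert_pet("Fido", 1)
insert_pet("Rex", 1)
insert_pet("Bubbles", 2)

summary = LEG_COUNTS.sort_by { |row| row[:leg_count] }
summary.each { |row| puts "#{row[:leg_count]} legs: #{row[:pet_count]} pets" }
```

After the three inserts the summary holds one row per leg count, and each insert only touched the row for its own leg count.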

Set Up the Trigger

Now for probably the easiest part of this process: telling Postgres to execute the stored procedure whenever a new pet is added:

    CREATE TRIGGER update_leg_counts
    AFTER INSERT ON pets
    FOR EACH ROW EXECUTE PROCEDURE update_leg_counts();

I gave the trigger the same name as the stored procedure. Sorry if that's confusing; I didn't see a good reason to give them distinct names.

The Results

The difference is...magic? As we add pet records, we can measure the speed of querying pet counts by number of legs, both before and after the summary table. These were my actual performance numbers, which you can recreate with the code in the git repository I provided at the top of this article.

Number of Pets   Query Time w/o Summary   Query Time with Summary   Speed Improvement
100              0.0006                   0.0002                    65.98%
500              0.0007                   0.0002                    71.71%
1,000            0.0013                   0.0002                    79.86%
2,000            0.0019                   0.0002                    88.52%
3,000            0.0026                   0.0002                    91.04%
4,000            0.0033                   0.0002                    92.66%
5,000            0.0040                   0.0002                    93.41%
6,000            0.0049                   0.0002                    94.91%
7,000            0.0055                   0.0002                    95.77%
8,000            0.0062                   0.0002                    96.34%
9,000            0.0069                   0.0002                    96.46%
10,000           0.0094                   0.0002                    97.12%

The speed of querying the summary table remains constant no matter how many pet records we add to the database, while performing the complex join query gets increasingly sluggish. Even at just a hundred pet records, there is marked benefit. By the time we're storing 10,000 pets, we're saving over 97% of the time it takes to find out how many same-legged friends Fido has. Magic!
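
For reference, a speed improvement like the ones in the table is the fraction of query time saved. The sketch below shows that arithmetic; the improvement helper is a hypothetical name. The table's times are shown to only four decimal places, so plugging them in approximates the published percentages (presumably calculated from unrounded timings) rather than reproducing them exactly.

```ruby
# Percentage of query time saved by using the summary table.
# (Hypothetical helper name, for illustration.)
def improvement(before, after)
  (before - after) / before * 100.0
end

# Rounded times from the 5,000-pet row of the table.
saved = improvement(0.0040, 0.0002).round(1)
puts saved # prints 95.0
```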