Sharing Code Between Firebase Hosting and Functions

One issue my team had on the Glamis Recovery project was sharing code and logic between our frontend (Firebase Hosting) and our backend (Firebase Functions). If you’re not familiar with Firebase, Hosting and Functions have certain assumptions and requirements about your project’s directory structure when you deploy to these services:

  • Your frontend code will most likely be bundled and placed in a distribution folder for the Firebase CLI to pick up. Each page of your web app needs an HTML file with its JavaScript dependencies linked to it.
  • All of the backend code needs to live in the functions directory to be deployed, and the index.js file must export every function you want Firebase to deploy.
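For reference, a minimal functions/index.js might look something like the sketch below; the function name is made up, and it uses the CommonJS style that most of the documentation examples show.

// functions/index.js
const functions = require("firebase-functions");

// Only functions exported from this file get picked up and deployed.
exports.helloWorld = functions.https.onRequest((req, res) => {
  res.send("Hello from the backend!");
});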

Depending on how you set up your project’s directory structure and DevOps processes, something new web developers and new Firebase users may find challenging (because I certainly did), you may run into a problem: it’s difficult to share code between Hosting and Functions.

This blog post is about the dangers of keeping separate copies of logic for your frontend and backend, and what to do about it to improve the maintainability of your software project.

Divergence of the Codebase

Our codebases diverged for the following reasons. Don’t let these happen to your project.

  • The frontend was using ES6 modules but the backend was using CommonJS modules. Though the Firebase Functions documentation features examples with CommonJS require statements, you can just as easily use ES import statements (see the sketch after this list). VS Code will even give you a hint about it and do the conversion for you.
  • Letting the two different directories be a “knowledge gap” for your codebase. There are ways around this that we’ll talk about below.
  • Poor adherence to software engineering principles. Focus on modularity, the Single Responsibility Principle, and testability from the beginning, and refactor your code when it falls out of compliance with these principles.
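For what it’s worth, switching the backend to ES modules is mostly a matter of telling Node how to interpret the files. Here’s a hedged sketch, assuming your Node runtime and Firebase CLI version support ESM (the function name is again made up):

// functions/package.json needs: "type": "module"

// functions/index.js
import functions from "firebase-functions";

export const helloWorld = functions.https.onRequest((req, res) => {
  res.send("Hello from an ES module!");
});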

Keeping the Codebase DRY (Don’t Repeat Yourself)

Our codebase truly wasn’t very big, so we decided that we would just duplicate any logic that was needed for both the frontend and backend. Maybe that meant writing duplicate functions or maybe that meant copy-pasting entire directories of class files.

We thought, “if we change it in one place, we’ll just make sure to change it in the other.” BIG MISTAKE FOLKS. BIG MISTAKE.

This was just a two person team. Sure, maybe this would work out because we both knew everything about the codebase…at the time. Maybe we could keep each other accountable…for a while. Maybe we could be careful and diligent…until that one time.

Just think back to a time when you stepped away from a piece of code for a month. After coming back to it, it’s basically like you weren’t the original author at all. Our brains are the epitome of an LRU cache. Attempting to manage the continuous upkeep of this much duplicated code went south pretty fast.

It wasn’t maintainable. The code diverged. There were bugs.

Solutions that Didn’t Really Work

There were a couple of things that we tried that didn’t pan out as solutions. Keep the following directory tree in mind; it’s how our project is generally structured. Most of the code we wanted to share originated in the src directory alongside our frontend code.

project-root/
├─ functions/
│  ├─ node_modules/
│  ├─ .gitignore
│  ├─ index.js
│  ├─ package.json
├─ node_modules/
├─ hosting_distr/
│  ├─ bundled_frontend_code
├─ package.json
├─ src/
│  ├─ shared_resources/
│  ├─ frontend_code.js

Symlinks

In Linux, symlinks are special files that live in one directory and “point to” a file somewhere else. We thought we could keep all of our code in the src directory and have symlinks in the functions directory pointing to the shared files in src. This works when you’re running the Firebase emulators locally on your computer, but as soon as you go to deploy said functions directory, expect errors: ALL of the code needed by the functions has to physically exist in that directory, and the CLI can’t resolve the symlinks for you.
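The attempt looked roughly like this (adjust the paths to your own layout):

# create functions/shared as a symlink pointing back at the shared code in src
ln -s ../src/shared_resources ./functions/shared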

Local Node.js Modules

The Functions docs say that you can use local Node.js modules as part of your function. The local module is really just another pointer, this time managed by npm instead of the OS, and npm copies the files over to where they get used. You need to run npm install (from within the functions directory) to actually get these local modules from src copied into the functions/node_modules directory.
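Concretely, functions/package.json declares the shared code as a file: dependency. The module name below is made up and the version numbers are only illustrative:

{
  "name": "functions",
  "dependencies": {
    "firebase-admin": "^9.0.0",
    "firebase-functions": "^3.0.0",
    "shared-resources": "file:../src/shared_resources"
  }
}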

This seemed promising, until you read the starred note: “Note: The Firebase CLI ignores the local node_modules folder when deploying your function.” In essence, this means that Firebase will call npm install for you on their backend, and if that local module isn’t inside of the functions directory when they do that, npm will be very unhappy.

Private Modules

What about taking local modules a step further and putting the shared code into a private package on the npm registry? That way Functions can have the code when it calls npm install.
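In functions/package.json, the shared code would then just be another scoped dependency pulled down at install time (the organization and package names here are made up, and the versions are illustrative):

"dependencies": {
  "@my-org/shared-resources": "^1.0.0",
  "firebase-functions": "^3.0.0"
}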

If you want to pay $7 a month for a paid npm organization, be my guest. I pay Firebase $0.22 each month for the entirety of my backend services; I wasn’t going to pay $7 to share code with myself.

Current Solution

Identify and Organize Shared Resources

Instead of actively maintaining two copies of the shared resources by hand and in version control, we keep the authoritative copy in the src directory, and we deleted all of the duplicate code out of the functions directory. Anything that may need to be shared between the frontend and backend is organized in the shared_resources directory:

shared_resources/
├─ classes/
├─ constants/
├─ enums/
├─ utils/

Have Backend Code Import from Shared Directory

Let the backend code import from the shared directory as if that shared directory existed inside of the functions folder. I promise that it will be there when the code runs, even though I just said that we removed it all.

import Vehicle from "./shared/classes/Vehicle.js";
import PERMISSIONS from "./shared/enums/Permissions.js";
import { MIN_COVERAGE_LENGTH } from "./shared/constants/coverage.js";

Using npm scripts to Copy the Shared Directory

We identified only a few moments when the shared resources actually need to be copied over to the functions directory, and we automated that copy operation into our existing development and build practices.

  • Before we launch the Firebase emulator suite, the shared directory should exist in the functions directory.
  • Before we deploy any code, we always do a build. The build step is really for the frontend code, but now we just get the backend up-to-date at the same time.

We use npm pre-scripts to do the copy automatically so we don’t have to think about it at all. Here are the relevant scripts:

"scripts": {

    "watch": "webpack --watch --config ./webpack.dev.cjs",
    "prefirebase:emulators": "npm run copy-shared",
    "firebase:emulators": "firebase emulators:start",



    "prebuild:prod": "npm run copy-shared",
    "build:prod": "webpack --config ./webpack.prod.cjs",
    "copy-shared": "rm -rf ./functions/shared && cp -R ./src/shared ./functions/shared"
  },
Have Git Ignore the Duplicate Directory

Because we only want a single authoritative location for these shared resources, we have git ignore the functions/shared directory that the copy-shared script creates. There’s no reason to worry about including it in version control.
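The relevant .gitignore entry is just the copied directory:

# ephemeral copy created by the copy-shared script; never commit it
functions/shared/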

Result

All of these pieces work together to make sure the functions directory has its own copy of the shared resources when it’s needed. In reality, that copy is ephemeral: it isn’t tracked by version control, and it gets deleted and recopied frequently by our npm scripts.

We no longer have to worry about our frontend and backend diverging because there is only one authoritative copy.

Shortcomings and Future Work

Two things come to mind when I think about this setup.

First, the entire time I was writing this post, all I could think about was how we hadn’t tried just putting the authoritative shared directory into the functions directory instead of src. src isn’t needed by Firebase Hosting; we only care about the distribution folder that gets made after everything is bundled together. Webpack could certainly handle reaching into the functions directory for any dependencies. I think we discounted this idea because of how much more code there is for the frontend. There was almost a psychological barrier there for us to write code for the frontend that looked like this:

import { SharedClass } from "../functions/shared/class.js";

Maybe it’s worth trying out instead.
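If we do try it, a webpack alias could keep those frontend imports from feeling so awkward. Here’s a rough sketch for webpack.prod.cjs (the alias name is made up):

const path = require("path");

module.exports = {
  // ...the rest of the existing config stays the same
  resolve: {
    alias: {
      // lets frontend code write: import { SharedClass } from "@shared/class.js";
      "@shared": path.resolve(__dirname, "functions/shared"),
    },
  },
};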

Deploying Functions

Second, and more importantly, depending on how and when you deploy your functions, they may be running stale code instead of the newest source. Every time you modify the shared directory, there is a chance that code your functions depend on has changed, and that should trigger a new deployment of those functions.

With Firebase Functions, you have the ability to deploy single functions, categories (groups) of functions, or all of the functions. We leaned on deploying single functions for a while, but we recently reorganized so that we deploy categories. Even so, we’ve been doing it manually instead of automatically in a CI pipeline.
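A group is just an object of functions exported from index.js, and the --only flag selects what gets deployed. Roughly (the function and group names here are made up):

firebase deploy --only functions                # everything
firebase deploy --only functions:renewCoverage  # a single function
firebase deploy --only functions:coverage       # a group exported as one object from index.js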

Our hubris is begging for another important lesson. There are two implicit assumptions baked into our setup:

  1. We can remember all of the dependencies inside of our functions code.
  2. We will remember to deploy those functions when the dependencies get updated.

I’ll be working on this problem soon.


Testing Sorely Needed

I’m working on an implementation of the dgemm matrix multiply function from the BLAS interface [1] for a Parallel Computation course. While trying to introduce a feature to take advantage of the processor’s cache hierarchy, I ran into a recurring issue in my software development career. This time it bothered me more than usual.

In order to complete the feature, four components would need to be added or changed for the program to function correctly. For someone who isn’t in the habit of testing their code after incremental changes, four changes is basically nothing. “Watch me perform dependency inversion while thrashing every class in the codebase,” says the programmer who had it coming. [2] But for someone who now expects more from himself as a software developer, this was uncomfortable.

“If I think this through, map it out on paper, and act cautiously, I can probably get it on my first try,” is what I used to say. I knew better this time, and I was correct: it didn’t work the first time.

The problem wasn’t that it didn’t work the first time. The problem was that I had made too many changes at once, and it was now impossible to pinpoint which piece was failing just by looking at the output of my Frankenstein-memory-access-matrix-multiply.

Three of the four components were mostly self-contained and should have been unit tested. These were also the most complex changes, so they DEFINITELY should have been unit tested. What were the two obstacles that prevented me from unit testing this code?

First was inexperience. I should own up to my own shortcomings before criticizing the code of other people. Even though I’ve been using C/C++ longer than any other language, there’s a lot I still need to learn. Just in this codebase alone: makefiles, extern "C" declarations, and static inline functions [3] all showed me how little I truly know. Furthermore, I’ve never properly tested C/C++ software before with unit and integration testing. So there was no way I was going to slap a testing framework onto this codebase without first receiving many well-deserved battle scars.

Second was code structure. Two things seem almost inescapable in software.

  1. If you don’t start a program with the intention of testing, as the program grows, it will probably get too difficult to ever test.
  2. Even when you start out testing a codebase, as the project grows and becomes more complex, it will eventually also get too difficult to test.

This code is strictly academic, for learning purposes only. Its creators thankfully included some handy debugging functionality in the project, but they didn’t imagine people would need to properly test this code. From the complexity of how the code is linked together to the static inline functions that make some important components impossible to access, this is not convenient code to test.

I write this the day of my well learned lesson. In the coming weeks, I hope to learn a number of things about software testing in general, testing C/C++, and understanding some of the internals that have given me so much grief today.

References:

  1. http://www.netlib.org/blas/
  2. Yea, I’ve done that. Not proud of it. I got what I deserved.
  3. Static inline functions actually made a lot of sense in this context since we’re trying to optimize the code to be wicked fast. Why declare a C function as static inline?

What is software engineering?

This was the prompt for a small assignment in my software engineering course. At first, I was skeptical of such a trivial assignment, but it actually captivated my attention and helped me reflect on an important work experience. What this question is really about is understanding the responsibilities and expectations of the professionals in this industry and of their work. And that is no trivial matter.

Though I’m not able to fully answer this question yet, what I have thus far comes from an experience I had during my short time at Northrop Grumman and from a textbook.

I had the opportunity to get to know about other projects at the Northrop facility. One of them was the digitization of the cockpit for the UH-60L Black Hawk helicopter (the UH-60V).[2] Developing software that controls the flight of military aircraft requires adherence to strict guidelines and regulations, namely DO-178B. For this reason, the software developers on the UH-60V team don’t do any programming, at least not in the way computer science students instinctively think about programming. They use Model-Based (or Model-Driven) Development and the SCADE software suite. For users of SCADE, their models are their documentation (at least one form of it) and their code. SCADE generates code from the developed models, and the code it generates is certified to DO-178B Level A.[3] “The levels are defined in term of the potential consequence of an undetected error in the software certified at this level.” [4] And Level A means that if there is an error in the code, the outcome is more than likely “catastrophic.”[4]

There are many industries that software is developed for, each with its own set of regulations that need to be followed, some stricter than others. Developing software to keep aircraft in the sky has some of the strictest regulations. “Software engineering” in this sub-industry means almost zero hand-written code. I don’t think projects are prohibited from coding, but the hoops that must be jumped through to prove that the code is error free and meets DO-178B Level A standards make it financially impractical to introduce hand-written code. With SCADE and Model-Based Development, they can provide the level of quality that is expected by the customer and do so on budget.

As a kid straight out of school, I found this foreign. It didn’t look anything like the software engineering I’d seen. But I got to know people on that team, and they shipped great software.

In his book, Software Engineering: A Practitioner’s Approach, Roger Pressman describes the realities and challenges of software in the twenty-first century before defining software engineering. Below are the summaries of those challenges and their takeaways [1]:

  • Software is now in such high demand that there are many forces pushing and pulling the direction of a project. “We need to understand the problem before building a solution”.
  • The requirements demanded by customers become more and more complex each year. “Design is a pivotal activity”.
  • More people and organizations are relying on software and to a higher degree than ever before. “Software should exhibit high quality”.
  • Software projects can see long term growth in their user base and increased demands in their capabilities. “Software should be maintainable”.

Pressman begins with the challenges we will face developing software to prevail upon us that software projects won’t succeed by accident. Software will need to be ENGINEERED.[1] He puts an emphasis on the verb engineering. Pressman uses the diagram below to highlight the importance of various aspects of software engineering.[1]

The foundation of software engineering, Pressman teaches, is a focus on quality. A focus on quality dictates the processes, methods, and tools that we’ll use to reach that standard of quality. And what’s better, if our focus is building quality software, then over time we’ll iterate and develop better processes, methods, and tools than the ones we already have.

So why did the Black Hawk team’s version of software engineering look so different from my preconceived notions? Because all I saw were their processes, methods, and tools. And if I’m being totally honest, all I really saw were their tools. What I didn’t see was the massive part of the iceberg hidden underneath the surface of the water: their processes and definition of quality. If you start with a focus on quality, then everything else can be great software engineering.

References:

  1. “Software and Software Engineering.” Software Engineering: A Practitioner’s Approach, by Roger S. Pressman, 7th ed., McGraw Hill, 2010, pp. 12–16.
  2. “UH-60V Black Hawk Integrated Mission Equipment Package.” Northrop Grumman, www.northropgrumman.com/what-we-do/air/uh-60v-black-hawk-integrated-mission-equipment-package/.
  3. “SCADE Suite: Integrated Model-Based Design & Development Environment.” Ansys, www.ansys.com/products/embedded-software/ansys-scade-suite.
  4. “Airborne Software Certification Explained.” Open, www.open-do.org/about/software-certification-101/.

First Real Software Testing Experience

In the spring of 2020, I participated in a graduate software engineering course focused on the software principle of modularity. We read many canonical articles and put their messages into practice by refactoring an Android mobile game. My team and I implemented very few automated tests in our test suite, mostly because we spent so much time attempting to refactor the system. And when we did write tests, they became outdated within a week because of how rapidly we were making changes.

This summer, I set my sights on building a website for Toastmasters International using Python and Django. The tutorial I used to learn Django was Vitor Freitas’s. Vitor’s dedication to sharing knowledge with others is astounding and his Django tutorial is phenomenal. What surprised me most throughout the course was his emphasis on testing. Looking back, the tutorial was as much an intro to software testing as it was an intro to Django.

Once my project diverged from the tutorial material, most of my testing involved copying and pasting test classes to fit all my new views. Later, I learned that I could test whole models. And I was most proud of how I tested user permissions for some of the views.

Though there wasn’t much to my test suites, I had gotten far enough along to develop an eye for what could be improved in my system:

First, I was repeating myself a lot! I’m sure the Don’t Repeat Yourself (DRY) principle also applies to test suites. I attempted to utilize the setUp method of the Django TestCase class as best I could, and even used some inheritance to make the concrete test classes less repetitive. Still, I couldn’t shake the feeling that there wasn’t an “authoritative and unambiguous representation” [1] for components of my test suite. The clearest example was the user permission tests. Instead of directly testing the mixins that I created by inheriting from UserPassesTestMixin, I repeatedly tested the views that used these permission mixins.

Second, the dependency chains created by my models made it more complex to test individual components. If every Club could have Meetings, and Meetings could have Performances, and Performances could have Evaluations, then I was creating Clubs, Meetings, and Performances simply to test the Evaluation components of my system. Surely, this problem lies in a failure of mine to properly include abstractions between my components and utilize Dependency Inversion. [2]

This was my first real experience with software testing, and the following are just a few of my big takeaways, the reasons I’ll be taking testing more seriously in my projects from now on.

  1. I was much more confident to make changes when I knew I had tests backing me up. With each added test, I was less worried about making mistakes because my failed test cases were my safety harness and light in my tunnel.
  2. While learning C++, I remember relying on the compiler to be my test suite. This Django project has really taught me that you can’t rely on compiler errors to know you’ve done something wrong. The Django template language and many components within Django rely on strings. On several occasions I had features that simply weren’t working, and I couldn’t believe that undefined variables inside these strings failed silently instead of throwing helpful errors. All the more reason for better test suites.
  3. I even got into the habit of using test cases that always failed as a reminder that I hadn’t yet implemented a certain feature. With tons of links on each webpage, it was incredibly easy to forget a hyperlink here and there. Testing then became a habit that was more congruent with the David Allen “Getting Things Done” mindset I’ve been trying to develop. I let my testing suite be my reminder of unfinished tasks so I could free up space in my head for the task at hand.

You can find more info about my Toastmasters Feedback project here.

References:

  1. Orthogonality and the DRY Principle, A Conversation with Andy Hunt and Dave Thomas, Part II by Bill Venners, March 2003.
  2. Robert C. Martin, “The Dependency Inversion Principle”, C++ Report, May 1996.