Knowledge, Skills, and Learning to Love Databases

Trace Underwood
Jul 23, 2019
This post is part of my “Speedrunning College” series. If you’d like to see the background, start here.

Bookending the last class I wrote about (and still have more to say on), I had the great pleasure and frustration of taking two courses on database design: one providing an overview of databases as a whole, the other focusing my attention on SQL and actual use of databases. As I alluded to before, they were at once the most satisfying and the most frustrating of the courses I’ve taken thus far. This was a bit counterintuitive for me, since “database” as a word tends to conjure up images of grey office spaces and tired accountants, bland pictures and vague spreadsheets — images like what you’ll see throughout this article. What I found was, well, cool.

Before I explain, let me pause a moment to bore you with some education theory.

Boring you with some education theory

When learning any subject, there are two core parts: acquiring knowledge and developing skill. Some fields of learning — history, for example — are heavier on the first. Others — say, ear training — are heavy on the second.

The two require entirely different approaches. When acquiring knowledge, you spend the bulk of your energy at the outset, understanding, contextualizing, and committing a wide range of ideas to memory. Once you’ve learned them, retention is trivial and takes almost no time or effort if your reviews are spaced correctly. Skill works the other way around. It takes seconds to review the musical tones in an interval sequence, but you can’t hope to master them without hours of practice, spread out over weeks or months. Your mind needs that sort of concerted effort to adapt.

When your knowledge progresses faster than your skill, as with ear training, it creates an uncomfortable gap. You know the right way to do something, then you go out and immediately screw it up. On the other hand, if skill is prioritized too heavily over knowledge, someone can end up getting better and better at manually copying every cell from an Excel spreadsheet instead of learning a macro to do the whole job in seconds.

Ok, you may be saying, cool distinction, but what on earth does that have to do with databases?

Well, this gap was on my mind throughout the course, for two reasons.

The value of knowledge acquisition

The first is that this course helped me appreciate the knowledge-acquisition side of learning more. I’ve never disliked it, exactly, but like that obnoxious kid in math class asking “When am I going to use this?”, I’ve often been frustrated by just how many disconnected bits of information I’m asked to learn and how much it’s possible to “learn” without being competent at something. From that framework, it’s easy to be a bit skeptical towards knowledge acquisition.

As I worked through the first database course, though, a couple of stories shoved their way to the front of my mind, two quietly embarrassing bits of my history.

A few years back, a friend tried to get an online startup going with me. He wanted to work on the marketing and content side of things while I took care of building a website and handled the back end of things. I hadn’t built much from scratch before but was optimistic that I could work from what I learned and understand the rest of what I needed on the fly. That worked well enough when I was constructing a front-end for the website. For the rest, though, I realized pretty quickly I was out of my depth. I downloaded some server software, started looking at a bit of information about how to store the information we hoped to handle on the site, realized I had no clue what I needed to do, and never really picked the project back up after putting it on pause to hitchhike around the country.

More recently, around January of this year, I started a project intending to examine the placements of Utah, my home state, on various national quality-of-life rankings: crime rates, education, and so forth. I collected a lot of rankings in an Excel sheet. The data itself was fascinating, but I was inexperienced with Excel and hit some hiccups pretty quickly. One time, I accidentally shifted a bunch of rows to the left and had to carefully examine and reset things to make sure the rankings still matched what they were supposed to. Later, I wanted to compare Utah to a set of states loosely matched on demographics and realized the only way I really knew to do it was manually pulling the numbers for each state. The delays piled up, and midway through the project, knowing there was a more efficient way of doing things but not quite having a picture of what it was, I put the whole thing on hold.

Why did they come so forcefully to mind?

Those of you familiar with databases already know. The course material effectively looked at both my stalled projects, said, “Oh, that? Yeah, easy! All you do is…” and laid out exactly how to handle them.

Is it weird to be fascinated by something as dry and corporate-sounding as databases? Possibly, but I am. The whole thing was like a sped-up history of clever people solving complex problems in sensible ways, and I got to watch! Not only that, but suddenly chunks of the internet are much more legible to me. The whole time, I found myself nodding along: Oh, this is why all these websites use this format for forms. Wow, so that’s how this website looks behind the scenes. It was great.

It would be a bit too much to describe the entire course, but I’ll pull one of my favorite examples from it to show what I mean. Take a look at this spreadsheet:

All examples from here and WGU’s Data Management Foundations/Applications courses

Isn’t it fascinating?

Okay, maybe a made-up list of skill certifications for a random company’s employees doesn’t excite the rest of you as much as it does me, but take a look at the questions they bring out. These are the sort of things that become apparent the moment you start trying to wrangle data. In my Utah statistics project, I couldn’t figure out why my top Excel row was being excluded from a low-to-high sort at one point, then realized the software was assuming I’d have a title and needed to go back and reformat things. Little issues pop up constantly. And they went through and handed me elegant solutions on a silver platter: Just translate business rules into data model components, toss the data in relations with primary keys and connected by foreign keys in a relational database, ensure entity and relationship integrity, simplify composite attributes, remove repeating groups, add linking tables to implement M:N relations, and root out partial and transitive dependencies to normalize the database to 3NF or BCNF, optionally index it on an attribute, then use SELECT SQL queries from within a DBMS to retrieve the data you need.

Simple!
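For anyone who wants to see a slice of that sentence in action, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a flat list where each employee's skills are crammed into one cell (a repeating group), splits it into three tables joined by a linking table, and then runs a SELECT query. The rows, table names, and column names are made-up placeholders, not the course's actual example.

```python
import sqlite3

# Made-up flat rows (hypothetical data): one row per employee, with a
# repeating group of skills packed into a single cell.
flat_rows = [
    (101, "Ana Ruiz", "Data Entry I; Systems Analyst"),
    (102, "Ben Cole", "Data Entry I"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE employee (
        emp_num  INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL
    );
    CREATE TABLE skill (
        skill_id   INTEGER PRIMARY KEY,
        skill_name TEXT NOT NULL UNIQUE
    );
    -- Linking table: implements the M:N relation between employees and skills.
    CREATE TABLE certification (
        emp_num  INTEGER NOT NULL REFERENCES employee(emp_num),
        skill_id INTEGER NOT NULL REFERENCES skill(skill_id),
        PRIMARY KEY (emp_num, skill_id)
    );
""")

# Remove the repeating group: one certification row per (employee, skill).
for emp_num, emp_name, skills in flat_rows:
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_num, emp_name))
    for skill_name in (s.strip() for s in skills.split(";")):
        # Store each distinct skill once, then look up its key.
        conn.execute(
            "INSERT OR IGNORE INTO skill (skill_name) VALUES (?)", (skill_name,)
        )
        (skill_id,) = conn.execute(
            "SELECT skill_id FROM skill WHERE skill_name = ?", (skill_name,)
        ).fetchone()
        conn.execute("INSERT INTO certification VALUES (?, ?)", (emp_num, skill_id))


def employees_with(skill_name):
    """Return the names of employees certified in the given skill."""
    rows = conn.execute(
        """
        SELECT e.emp_name
        FROM employee AS e
        JOIN certification AS c ON c.emp_num = e.emp_num
        JOIN skill AS s ON s.skill_id = c.skill_id
        WHERE s.skill_name = ?
        ORDER BY e.emp_name
        """,
        (skill_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(employees_with("Data Entry I"))
```

Each skill name is now stored exactly once, so renaming a certification means updating one row instead of hunting through every employee's cell.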

Really, the information was interesting and useful, and immensely practical. And honestly, its presentation was all I could ask for as well. In courses, I look for a smooth flow with lots of opportunities for active recall: pre- and post-quizzes, activities, and so forth. I should never be pausing and wondering “wait, where should I go next?” or reading through a bunch of content that I’m never expected to remember again. The course did great with that. The material was readable and not unnecessarily drawn out, the examples were useful, and concepts got an appropriate amount of time. Best of all, the course contained well over 500 quiz questions on the material, asked about each important concept enough times that it was easy to remember by the time I reached the end, and by and large exceeded my expectations.

I do have some minor gripes with the way the quizzes were handled: a few simple design elements. Each question loaded individually, sometimes taking upwards of a few seconds. They had a review mode where you would “master” a question after getting it right three times, which sounds good except they would barely wait another question or two before showing you one you had just gotten right. That doesn’t help anyone, and I didn’t use that mode. In a quiz centered around information recall, people should be allowed and encouraged to go quickly, since they either know the information or not. Any barriers to that are frustrating. This is small, yes, but never underestimate the power of mild inconvenience. These design decisions matter.

You shouldn’t see screens like this mid-test

There were also some fascinating moments when I pre-tested for the chapters and started learning some of the information through the quiz itself. I have a lot to say about this, more than appropriate when I’m focusing on databases, so look for my thoughts there in a later article. Anyway, despite minor gripes, I was super excited about the information available in the courses.

The Other Shoe, Dropped

Alongside knowledge, though, comes skill. This is where both courses stared directly at some amazing opportunities, meandered around them, poked at them, and then wandered away. I wasn’t impressed. It was such a practical course full of so many useful tools that every few minutes I found myself wanting practice with some of them. Unfortunately, very little of that practice was forthcoming.

Let me give a few examples:

Remember that picture I shared up above? Like I said, the course spent a lot of time explaining how to translate data like that into a useful, elegant database through the process of normalization. The first step, it explained, was to ensure that each row in a table referred to one item, and that each could be identified by a unique code, or its primary key. For a list of employees assigned to projects, for example, you could sort them into a table by their employee number and project number, making sure no employee was assigned to the same project twice.

Like so
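To make the employee-and-project rule concrete, here is a small sketch in Python's built-in sqlite3 module. It assumes a hypothetical `assignment` table keyed on the pair of employee number and project number, so the database itself refuses a duplicate assignment. The table name, column names, and values are illustrative assumptions rather than the course's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE assignment (
        emp_num  INTEGER NOT NULL,
        proj_num INTEGER NOT NULL,
        hours    REAL,
        -- Composite primary key: each (employee, project) pair may appear once.
        PRIMARY KEY (emp_num, proj_num)
    )
""")


def assign(emp_num, proj_num, hours):
    """Return True if the row was stored, False if that key already exists."""
    try:
        conn.execute(
            "INSERT INTO assignment VALUES (?, ?, ?)", (emp_num, proj_num, hours)
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when the primary key constraint is violated.
        return False


results = [
    assign(101, 1, 12.5),
    assign(101, 2, 4.0),  # same employee, different project: accepted
    assign(101, 1, 3.0),  # same employee, same project: rejected
]
print(results)
```

The point of putting the rule in the key rather than in a person's head is that nobody has to remember to check for duplicates by eye.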

Anyway, it was great information, particularly since it’s something you can practice immediately. Just give a few examples of unprocessed data and have me go through them. Let me get used to it and build the skill a bit. To the course’s credit, it did that, a little bit, eventually. The second course let me drag and drop a few examples with a fair bit of hints and guidance. It just wasn’t much, and wasn’t immediate.

The course also spends some time explaining data modeling, giving examples of the way organizational structure translates into databases. For example, in a university, you have departments containing many separate courses, each taught in one or a few sections and learned by many students. How would you translate that to a database?

All clear, I hope!
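One possible answer, sketched with Python's built-in sqlite3 module: each "contains many" relationship becomes a foreign key on the "many" side, and the many-to-many link between students and sections becomes its own linking table. This is a sketch under my own assumptions about the structure, with placeholder table names, column names, and rows, not the course's diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    -- One department offers many courses (1:M).
    CREATE TABLE course (
        course_id INTEGER PRIMARY KEY,
        dept_id   INTEGER NOT NULL REFERENCES department(dept_id),
        title     TEXT NOT NULL
    );
    -- One course is taught in one or a few sections (1:M).
    CREATE TABLE section (
        section_id INTEGER PRIMARY KEY,
        course_id  INTEGER NOT NULL REFERENCES course(course_id)
    );
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    -- Students and sections are M:N, so they meet in a linking table.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        section_id INTEGER NOT NULL REFERENCES section(section_id),
        PRIMARY KEY (student_id, section_id)
    );

    -- Placeholder rows for illustration only.
    INSERT INTO department VALUES (1, 'Computer Science');
    INSERT INTO course VALUES (10, 1, 'Databases');
    INSERT INTO section VALUES (100, 10), (101, 10);
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO enrollment VALUES (1, 100), (2, 100), (3, 101);
""")

# Walk the relationships back up: enrollments -> sections -> courses.
headcount = conn.execute("""
    SELECT c.title, COUNT(*) AS enrolled
    FROM course AS c
    JOIN section AS s ON s.course_id = c.course_id
    JOIN enrollment AS e ON e.section_id = s.section_id
    GROUP BY c.course_id
""").fetchall()
print(headcount)
```

With foreign key enforcement switched on, trying to add a course for a department that does not exist fails with an `sqlite3.IntegrityError`, which is the referential integrity idea in miniature.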

Again, the course explained the idea, and it was useful information — sometimes intuitive, sometimes less so. I could have been asked to examine the structure of an example organization and come up with a way to translate it into database-usable relationships. But again, that practice just wasn’t there much, and when it was, it was very much a step-by-step process.

The worst offender for this was when it came to learning SQL, the most common language used to build and search databases. I went through, studied and memorized the important commands, learned how it all related to each other… and then practiced only sporadically, and with a couple of brief lines each time. It felt like being handed a Ferrari and then being told I could work the turn signal while they drove it for me. There was a bit of interactivity, and there were a few examples, but this was by far the most important part of the course in my eyes and it seemed almost like an afterthought.

One of a few opportunities for practice

Probably the most, ah, fun part of either course came when I went to the appendix of the first half and found a couple of projects. I was pretty excited to go through them until I actually started going through one and realized they were taking me on a fully guided tour of Microsoft Outlook and telling me everything I needed to click or type.

With helpful red arrows!

There are some potential objections to my desire for courses like this to focus on skill and knowledge in equal measure. Let me focus in on a few:

1. School is about gaining knowledge first and foremost. It’s not every course’s responsibility to develop these skills.

I’ll challenge this assertion. So far, spaced repetition has worked great helping me to remember anything I want to remember for these courses, but how often do students talk about cramming for finals and then brain-dumping everything afterwards? As I mentioned at the start, skill-building doesn’t work like that. Skills can’t be crammed. They reinforce information, often in surprising ways (see my previous post on rabbit holes for a reminder of just how many steps can lurk beneath the surface of something you “know how to do”). In technical fields, familiarity and practice are vital to go beyond the illusion of knowing things.

2. Skills take a lot of time to learn, and courses only have so much.

Skills are slow, and information is fast. These two database courses took me about two weeks between them, but in a typical university that would be stretched over a whole semester, and students are liable to brain-dump massive chunks of it. It might be odd for me to say this, given my goal, but the reality is that I shouldn’t be able to master a course like this in a week. If it had an integrated learning curve expecting me to practice and develop these skills to a high level, it would take longer, but if the information I’m using is actually useful — and in this case, it is — longer is okay.

3. It takes a lot of time and effort to prepare good interactive online tools.

I raise this objection because it’s true and because there’s a reason education, particularly online, is the way it is. Courses are large, degrees are larger, and the sum total of human knowledge that could be taught is immense. There’s a reason textbooks and videos are the default, and a large part of it is that they’re more practical to produce and scale.

But it takes a lot of time and effort to prepare all sorts of things, from games with dev teams of hundreds to movies with crews of thousands. People are already spending tremendous time and effort in education, but most of it goes to repeating the same curricula at a classroom level, where one teacher reaches a room of a few hundred students at most. Online courses are scalable, and a good tool can spread and be reused indefinitely. It’s *worth* the time and effort to provide courses with demanding, polished interactive tools.

More directly, I expect time and effort, since tuition is thousands of dollars a semester.

The bottom line

Please don’t take this to indicate I got no value out of the course, the whole thing was a joke, or anything like that. I understand far more about the structure of databases than the ground zero I came in with, I built a foundation for developing future skill in the area, I now have a clear path to improvement if and when I want to work with them, and I honestly appreciated the course. It just has so much potential, and I want to see that potential reached, so I tend to home in on flaws.

Here, the flaw was that it taught me a ton that I needed to know but spent less time developing my ability to do what I needed to do.

Like I said, these two courses were by far my favorite. It was a glimpse into a surprisingly beautiful world of creative problem-solving, hidden behind a previously intimidating corporate veneer. It provided new tools for problems I’ve struggled to solve, and did so in a way that gets me excited about technology. The whole time, though, I wanted to dive in and play around, to revel in the subject and to become really confident in my skill in the area going forward, and the course just wasn’t completely equipped for that.

For that reason, I say: Thanks! Now can I have more, please, and better?

Throughout this project, I intend to continually experiment on the fly and provide regular updates analyzing my successes and failures in this game of learning. If you’d like to follow along or read my meandering thoughts on education, optimization, and whichever other topics catch my eye, you’re welcome to follow me here, on Reddit, and maybe even on Twitter if I’m feeling particularly careless.

Trace Underwood

Passionate about learning, expertise, education, and the strength of narratives and deliberate restrictions. Rarely original, occasionally accurate.