Wingtip Labs Blog

Use Real Data in MySQL Performance Tests

MySQL’s mysqlslap utility helps you visualize how your database performance will improve with more hardware, new tuning, or different indexes.

The trouble is, it’s designed to use fake data. You can tune the fake data to look increasingly like your real data, and that will help you get a feel for changes like bigger hardware. But fake data teaches you almost nothing for deep changes like altering your schema, adding new indexes, or tweaking memory and cache parameters.

So we’re going to get completely authentic data from your database so you can plan and test performance changes with higher confidence.

Four Remarkable MySQL Storage Engines

Storage boxes from the Mythbusters workshop, M5 Industries.

Some DBAs go their whole career using just one or two storage engines. In this article, we’ll take a peek at four remarkable storage engines you might have overlooked, all of which ship with MySQL 5.5.

What’s a Storage Engine?

Different applications create different data with different needs. Some applications require consistency and crash safety, others require speed, others require vast storage and can accept slow queries. MySQL’s pluggable storage engines let DBAs pick a storage model that fits the data while the application continues to use the same MySQL client libraries and SQL statements.

Domain Specific Markup Language

HTML is the worst markup language, except all those others that have been tried.

Raw HTML is a low-level language, and it’s starting to bum me out. I’m working on a project that has me writing a large number of pages with relatively simple markup. (We’ll see the structure below.) In this post, I’m going to implement a Domain-Specific Markup Language using PHP. It’ll use nouns relevant to my subject as I write it, and output good old HTML when it’s time to render to a user.

What’s So Great About Sudoedit?

Yesterday I learned about a tool that’s going to change my daily behavior working on servers.

I was setting up replication on a new MySQL server, which starts with turning on binary logging by editing /etc/my.cnf. Of course, I was logged in as a low-privilege user, and /etc/my.cnf is owned by root, and I don’t have write privilege to it.

lurkdata ~ $ ls -l /etc/my.cnf
-rw-r--r-- 1 root root 480 Jan  3 19:19 /etc/my.cnf

Typically, I’d run sudo vi /etc/my.cnf. That works, but it wasn’t a good long-term fit here. I’m writing a hands-on MySQL course and I want to give students all the access they need to administer the MySQL database, but not access to, say, turn the lab server into a BitTorrent seed at my expense.

My Hands Remember

Wingtip Labs is hard at work on a course called My Hands Remember MySQL. This is a story about where that name comes from, and how it shapes a course that kicks dry theory to the curb in favor of hands-on experience.

A PHP Iterator for Amazon SimpleDB

I’m using Amazon SimpleDB to store all student progress information in my Regular Expressions tutorial. All the data I discuss in the article Why I Love My Error Logs (and You Should, Too) is in SimpleDB: student records, all the solutions students have tried, and a few summary tables for data I’d just GROUP BY to get in a relational store (e.g., total time on site, fastest time to complete a level).

SimpleDB (SDB) has been great to us, but the PHP API that Amazon provides fetches data in batches. Makes sense, except they expose that annoying implementation detail in a way that makes my calling code ~10x longer than it has to be. This is a story about how I fixed that, with a PHP 5 feature called Iterators.

Why I Love My Error Logs (and You Should, Too)

Don't worry, our staff is accustomed to dumb questions.

Wingtip Labs makes a Regular Expressions Tutorial that has given hundreds of programmers the chance to learn thousands of regular expressions.

Most people beat the first few levels with no trouble: match literal text, use | for alternatives, use [] for character ranges. But pretty soon you’re matching log lines by date, or validating IP addresses, and, because this is a learning tool, people start to make mistakes.

Teaching people to correct those mistakes is hands-down the best part of my job.

It turns out the mistakes people make while learning regular expressions follow a Pareto distribution: 80% of people make the same 20% of mistakes. So if we can anticipate a relatively small number of mistakes, we can help a large number of students.

Using Regular Expressions to Nose Around a Large PHP Project

I love PHP, and I love Regular Expressions. I’m putting together a few blog posts to show how I used regular expressions and PHP together in a relatively large project that I was in charge of for the past few years. (Unfortunately, this project is owned by my previous employer, so I’ll be sharing metadata and short code samples rather than the whole shebang.)

Why? I built a regular expressions tutorial, and part of what makes it amazing is that it uses real world examples. This was a chance to find more examples to incorporate.

In this post, I’ll be using regular expressions to dredge up some information about the project:

  1. How big is it?
  2. How many files are actually PHP? (vs. CSS, PNG, etc.)
  3. How often did I use regular expressions?
  4. Which of PHP’s regular expressions functions do I lean on?

Today you’ll see some practical uses for regular expressions with find and egrep.
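
As a rough sketch of how the four questions above might map onto find and egrep, here is one possible set of shell helpers. The function names are hypothetical, not taken from the project. The sketch assumes PHP files end in .php and that only the preg_ family of functions counts as a regular expression call. It spells egrep as grep -E (the same tool under its current name) and relies on the -o and -h options, which GNU and BSD grep both provide.

```shell
# 1. How big is it? Count every regular file under the given directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# 2. How many files are actually PHP? (Assumes a .php extension.)
count_php_files() {
    find "$1" -type f -name '*.php' | wc -l | tr -d ' '
}

# 3. How often are regular expressions used? Count each preg_*( call.
#    -o prints one line per match, -h drops the file name prefix.
count_regex_calls() {
    find "$1" -type f -name '*.php' \
        -exec grep -Eoh 'preg_[a-z_]+ *\(' {} + | wc -l | tr -d ' '
}

# 4. Which regex functions get leaned on? Tally calls per function name,
#    most frequent first.
tally_regex_functions() {
    find "$1" -type f -name '*.php' \
        -exec grep -Eoh 'preg_[a-z_]+' {} + | sort | uniq -c | sort -rn
}
```

To try it, call each function with a project directory, for example count_php_files . from the project root. The counts are approximate: a preg_ name inside a comment or string is counted like a real call.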