New site is coming soon.
I promise.
Perhaps some of the most annoying bugs to fix are charset encoding failures. If you’ve ever had to make an application that supports non-Latin characters (and these days any serious application should), you probably know what I’m talking about. This is the kind of problem that you don’t learn to fix until you’ve experienced it yourself a few times, and even then it can be quite difficult, as any part of the application could be the culprit. It can even be some part that you yourself did not write.
It’s important to realize that most of these problems can be avoided by making sure everything is stored correctly in the database. Doing any kind of character conversion upon retrieving data is a bad idea. (What if you also fetch the data in some other part of the application and forget to do the conversion there? What if the encoding changes yet again?) However, the problem cannot be solved just by making sure your columns are encoded in UTF-8: you must make sure the connection itself is in UTF-8, too. This is where set names comes in.
mysql> create database `a`;
Query OK, 1 row affected (0.07 sec)
mysql> use `a`;
Database changed
mysql> create table `test` ( `str` varchar(255) ) default charset=utf8;
Query OK, 0 rows affected (0.09 sec)
mysql> set names 'utf8';
Query OK, 0 rows affected (0.04 sec)
mysql> insert into `test` ( `str` ) values ( 'ä' );
Query OK, 1 row affected (0.03 sec)
mysql> set names 'latin1';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into `test` ( `str` ) values ( 'ä' );
Query OK, 1 row affected (0.00 sec)
mysql> set names 'utf8';
Query OK, 0 rows affected (0.00 sec)
mysql> select `str` from `test`;
+------+
| str  |
+------+
| ä    |
| Ã¤   |
+------+
2 rows in set (0.00 sec)
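As you can see, even though we wanted to insert the UTF-8 string "ä" into the UTF-8 column str, the data ended up being malformed under the latin1 connection. The client still sent UTF-8 bytes, but the server read each byte as a separate latin1 character and converted those characters to UTF-8 for storage.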
Of course, this is only one possibility, but one that I’ve found to be the cause quite often. Keep in mind that you may not experience this problem until you deploy, as the production server’s default connection encoding might differ from your local server’s.
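To make the latin1 insert above a bit more concrete, here is a rough sketch in JavaScript (run under Node.js) of what I assume happens to the bytes. The function names are made up for illustration, and this only mimics the conversion with Node's Buffer; it is not what MySQL literally runs. MySQL's latin1 is really closer to cp1252, but for the two bytes involved here the result is the same.

```javascript
// Hypothetical helper: mimic sending UTF-8 text over a connection
// that the server believes is latin1, into a utf8 column.
function storeViaLatin1Connection(text)
{
    // The bytes the client actually sends (the terminal is UTF-8).
    var sent = Buffer.from(text, 'utf8');
    // The server reads every byte as one latin1 character...
    var misread = sent.toString('latin1');
    // ...and converts those characters to UTF-8 for storage.
    return Buffer.from(misread, 'utf8');
}

// Hypothetical helper: reverse the damage, handy for confirming the diagnosis.
function undoDoubleEncoding(storedBytes)
{
    var garbled = storedBytes.toString('utf8');
    return Buffer.from(garbled, 'latin1').toString('utf8');
}

var stored = storeViaLatin1Connection('\u00e4'); // '\u00e4' is "ä"
console.log(stored.toString('hex'));     // c383c2a4 (four bytes instead of two)
console.log(stored.toString('utf8'));    // Ã¤
console.log(undoDoubleEncoding(stored)); // ä
```

The round trip back to "ä" is only there to show that the garbage is a double encoding and not random corruption. The real fix is still to get the connection encoding right before anything is written.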
Everybody has their own coding style, as they should.
Okay, not everybody does; some, especially new programmers, still flounder about helplessly as they try to maintain their junk. I’m sure every experienced programmer has, at some point, edited a file just to get rid of random spaces, line breaks and tabs, as well as huge wads of commented-out code that’s kept around “just in case” even though it’s in the repository anyway.
A popular question to ask programmers is what they do first when they start a new project from scratch. Some would start with something tangible like a file of constants they’re likely to use; I start off by editing whatever source code’s available (assuming I use a framework) to conform to K&R style (specifically, 1TBS). I love it for its simplicity, even if it isn’t perfect.
function do_something_fancy(arr, number)
{
    var collection = [];
    for (var a = 0, z = arr.length; a < z; ++a) {
        var current = arr[a];
        if (current.value != number) {
            continue;
        }
        collection.push(current);
    }
    return collection;
}
Of course, problems occur when two programmers who have to work together on a project can’t agree on a style. This leads to a strange game of chicken in which the worst outcome is both programmers continuously editing each other’s code in the hope that the other will eventually concede. (Geez, can’t we all just get along?) It never ceases to amaze me how zealous people can be when faced with the thought of having to give up their preferred style. I mean, mine is obviously better. Come on now!
If I had any readers at all, I’d toss up a poll.
Welcome to WordPress. This is your first post. Eat it or delete it, then start blogging!