Friday, November 28, 2008

Keyfixer 0.4 for Firefox and Thunderbird

Firefox/Thunderbird keyfixer, starting with version 0.4, is now as a Mozilla Extension.

Keyfixer makes the keyboard bindings for Firefox or Thunderbird on Mac OS X behave like Windows. This is very useful for people who use both Windows and Mac (like I do), and don't want to have to continually remap your brain for each system

This new version is a regular Mozilla Extension (a type of Add-on), so now you don't have to uninstall and reinstall every time you upgrade Firefox.

Thanks to Jim Mendenhall of Starry Hope for the original version!

Let me know if you have any issues!

Cheers, -Matt

[Update: Version 0.4.2 is now available as a feature release to additionally bind the Control key instead of just the Command key. Some people prefer to use Control, and now you can use either...)

Monday, September 15, 2008

Joel Spolsky Co-Launches "Stack Overflow" programmer's forum

One of my favorite programming bloggers, Joel Spolsky, recently announced his latest co-project:  Stack Overflow.  This is basically a 'fixed' version of the question forum that you frequently encounter when searching for programming questions.  The improvement is that this is moderated, and the answer gets promoted to the top so that you don't have to wade through endless comments to find the answer (if you do find the answer).

It's still in beta, and it doesn't have too many users yet, so it's hard to say whether Google will give it enough PageRank to reach critical mass.  Just in case, I registered myself (using OpenID, with my blogger website -- this blog) so that I could get a low enough user number to have Web Cred.  I got #7448.  Not too bad.  If this ever grows to SlashDot proportions, people will be like "ahh, he's got a four digit user number -- he must know what he's talking about"  (says the 7-digit user number user).  Of course, it will be impossible to ever win the user number war against Jeff Atwood:  #1.

I browsed the site a little bit and couldn't find any big complaints about the user interface, although I haven't tried posting yet (I did get a 'bronze medal', though, for filling out my biographical information!).  They've got syntax highlighting on the code samples, so it can't be too bad...

I'm half tempted to move a bunch of the programming tidbits I've collected on this site over to Stack Overflow.  I just want to make sure that the answers still link over to this blog so that this blog can build web presence as well.  (One of the great things about posting useful answers in your blog is that it raises the blogs PageRank -- if these same answers are posted elsewhere, your blog doesn't improve).  Stack Overflow lets you link your website through your user profile, but you have to click the user profile first.  I've also noticed that the last person to edit the question gets the credit for asking it -- not the original asker.  This creates a situation like that game where you try to put your hand on top of another person's hand, who tries to put their hand on top of yours, until you both end up slapping each other instead of putting hands on hands...

In any case, I hope that this website builds to the point that Google gives it top PageRank for the questions it answers.  This would let me reap the benefits of this system without all the work... :)

Monday, September 1, 2008

Disabling Spotlight Search on Mac OS X 10.5

(Update 2009-11-09: WARNING: It appears that the information in this blog post is not compatible with Snow Leopard 10.6, and may render your system unbootable)

I recently installed the Google Desktop application on my MacBook, and have had good luck with using it. I know there's all kinds of privacy issues, but for now I'm willing to exchange this a little for the convenience of Google searching of all my personal documents... :)

One problem, though, is that Spotlight is still on and chewing up CPU time, in addition to Google's indexing service. Since I'm using Google search and not Spotlight search, it would be nice to my battery life and fan life (and ears) to not have Spotlight needlessly burn cycles.

Unfortunately, there doesn't appear to be any good way to disable Spotlight. The only way that I could easily find was to go to "Apple Icon->System Preferences...->Spotlight->Privacy" and select my whole hard drive under the "Prevent Spotlight from searching these locations:" box. This worked, but after looking at the Google Desktop settings (in System Preferences...->Google Desktop), I discovered that "Google Desktop will not search items in Spotlight's Privacy list". So essentially by disabling Spotlight (in this manner), I've also disabled Google Desktop search.

With more searching, I found a way to disable Spotlight by getting a little dirty with system settings. There are a number of 'wrong' ways to do this, as evidenced by several blogs that suggested moving or deleting system files, or changing there permission flags to 0000. From the best I can infer, the 'correct' way to disable Spotlight for Mac OS X 10.5 is by standard system services calls. Open a terminal and run the following commands:
> sudo launchctl unload /System/Library/LaunchDaemons/
> sudo launchctl unload -w /System/Library/LaunchDaemons/
The second command will result in an "launchctl: Error unloading:" error, but (from what I've read) you can ignore this error.

To later re-enable Spotlight search, type these commands:
> sudo launchctl load /System/Library/LaunchDaemons/
> sudo launchctl load -w /System/Library/LaunchDaemons/

Sources: Comments within the following blogs (don't follow the main article suggestions):
Hope this helps!

Saturday, August 16, 2008

Hiding Password in Registration E-mail for Joomla

I use Joomla 1.0 for the IEEE Security in Storage Working Group (SISWG) homepage, and discovered that when new users register, their e-mail addresses are e-mailed to them in clear text before being hashed using MD5 and stored in the database. Since SISWG is a security group, it's important to provide a little better security than for the usual Joomla user. Things like sending a plaintext password in e-mail are a no-no.

Unfortunately, it looks like Joomla 1.0 does not provide a way to disable having the users' e-mail sent to them if the admin chooses to require registration. The closest thing is to change "Use New Account Activation:" to "No" in the Global Configuration->Site tab, but then users can register without a valid e-mail address.

Fortunately, this is just a one-line change to the appropriate file, which shouldn't be a problem for those who don't mind getting a little dirty. Here are the edit instructions:

Open the file components/com_registration/registration.php and change this line:

154: $pwd = $row->password;


154: $pwd = "********";

If you're not using version 1.0.12, the line number may be a little different.

That should do it!

Wednesday, July 16, 2008

Thales UK Acquires nCipher for US$100 Million

On July 11, 2008, Thales UK (a subsidiary of the French defense company Thales) submitted a proposal to acquire all the assets of nCipher for roughly US$100 million. nCipher produces cryptographic hardware, and recently acquired most of the assets of NeoScale Systems for just under US$2 million.

The deal is still pending a vote by the nCipher board, but the offer carries a 2x premium on previous stock prices, so it doesn't seem likely that this vote will fail.

The technical editor for the IEEE P1619.3 was employed by NeoScale and is currently employed by nCipher, soon to be acquired by Thales. nCipher is a strong supporter of the P1619.3 key management effort and I'm hoping that Thales will continue in this strong support.

It's a little unclear to me how to pronounce "Thales". The namesake is an ancient greek philosopher, and as such the name should be pronounced "Thay' - leaz". However, the French owners prefer to pronounce it more like "Tal-less".


Sunday, July 6, 2008

TrueCrypt Releases version 6.0

On July 4th, 2008, the TrueCrypt Foundation released TrueCrypt version 6.0. TrueCrypt is a very popular open-source disk encryption tool that is currently based on the XTS-AES encryption mode that the IEEE P1619 Task Group developed and standardized in December 2007. As the chair of the IEEE Security in Storage Working Group (SISWG) -- the group that oversaw the development of XTS -- I'm very pleased to see the continued adoption of XTS in the industry.

On a related note, NIST is currently considering XTS as an Approved Mode of Operation for protecting U.S. government confidential data under FIPS 140-2. If NIST accepts XTS, this will be a huge boon to TrueCrypt and similar tools that use XTS. If you use TrueCrypt or other tools that use XTS, please send NIST a comment (before Sept 2008).

For a limited time, you can pick up a free copy of XTS from IEEE. After September, you'll have to buy it from the IEEE store. See the P1619 homepage for instructions and other information.

Thursday, June 19, 2008

BITs "Whitepaper" on key management

Here's an interesting paper on cryptographic key management:

BITS: Enterprise Key Management

Apparently, the ANSI X9 group is considering starting a standards effort based on the BITS whitepaper. I think that key management is starting to get a little crowded for standards, but until one standard emerges as the most adopted in the industry, the hunting season is still open.

The IEEE Key Management Summit on September 23-24 (Baltimore, MD) has a 30-minute slot for ANSI X9, so we might be able to get more details then...

Friday, May 23, 2008

How to Prevent the Random Number Bug in Debian OpenSSL in Other Implementations

As probably the entire hacker community has heard by now, there was a bug recently discovered in Debian's OpenSSL implementation that crippled the random number generator. For background, see Schneier's Coverage, Slashdot's Coverage, Debian's Announcement, Ubuntu's Announcement, and a Cartoon.

On the surface, this just looks like a stupid mistake by one Debian maintainer. But if you look at the details, it's not that obvious. Here one of the two lines in question, within md_rand.c
MD_Update(&m,buf,j); /* purify complains */
This function seeds the cryptographically secure pseudo-random number generator, which then generates important things like cryptographic keys. The security of a cryptographic key is solely in the difficulty of an attacker to guess the value (like a house key's tumbler positions), and if it is predictable, there is no security. The maintainer removed this line because the Purify and Valgrind tools complained about uninitialized data.

Truthfully, if I were in the same position as a maintainer, there's a good chance I might have commented out these lines too. Code analysis tools are very useful in helping to maintain high code quality, and crippling these tools also has consequences. The right action is not always obvious, and when you go through hundreds of lines of code it's easy to forget the significance of a single line like this.

The fundamental issue with random number generators (RNGs) is that they are infamously difficult to test. A standard software regression test takes a known input to a program and checks for a known output. RNGs aren't like that -- at least when used with good seeds. A good seed never repeats and cannot be tested against known answers. Instead, you have to perform statistical tests on several samples from the seed source (examples: DIEHARD, NIST's RNG suite).

This problem is much more widespread than people think. This is also a very common problem in embedded systems. Many instantiations of SSL don't properly seed their RNG. This won't cause the system to fail, testers won't ever catch it, and customers won't complaint. So from the vendor perspective, there's really no incentive to make it work. Most of the engineers adding OpenSSL don't know much about cryptography, and often won't know to or even bother to hook up a seed. Some systems don't have a good way to generate this seed.

Even NIST' Computer Security Division (the owner of FIPS 140-2, a major cryptographic standard for government agencies) has mostly washed their hands of this problem. FIPS 140-2 used to include statistical tests on the entropy (i.e, 'randomness') source used to create the seed, but now the only requirement is that the vendor justify a certain entropy level.

To solve this problem, I think that OpenSSL (and other SSL implementations) needs to add some kind of sanity check to the seed to make sure this mistake doesn't happen again. Here's a rough outline of this test:
  1. During OpenSSL's initialization, immediately collect several samples from the seed source. The number depends on the constraints of the system, but NIST's old FIPS 140-2 statistical test collected 20,000 bytes, which is a reasonable number.
  2. Run simple statistical tests on these samples (see FIPS 140-2 for an example) and make sure the entropy source is reasonable.
  3. Store the first 100 bytes or so of this sample set in non-volatile memory, (e.g, hard disk, flash), and keep a history of several thousand of these initial samples. Discard the other samples (don't use them as part of the real seed)
  4. Run another statistical test on the series of first samples. If there is a correlation between these samples (i.e,. the values tend to be the sample after initialization), then fail with a big obnoxious error that you'd hope no distribution maintainer or embedded software engineer would miss.
This test would have caught the bug in questions because the same seed would likely have occurred across power cycles. The default for this seed is the process ID, which by default is at most 32,768. According to the birthday paradox, you will on average see a duplicate random number in the range 1 to N after roughly the square root of N samples. In this case, the square root of 32,768 is 181. Someone would have seen the horrible error message by then.

Caveat: Make sure that the files that store these seeds are properly protect from access and modification. If the entropy source is poor, it's possible to leak information about the rest of the system, or even give hints as to what the subsequent seed will be as used by the real random number generator.

When it comes to security, you really can't rely on people to catch these kinds of mistakes through code reviews. You need to have good tools to automatically catch this. Unfortunately, making the tools is difficult, which is why we still mostly rely on code reviews.

Although code reviews are better than nothing, and in this case, it sounds like even a code review would have helped...

Thursday, May 15, 2008

Can Firefox keyconfig fix Home/End buttons in textboxes?

In poking around a little further with the problem of creating custom key mappings in Firefox for Mac, I found the plugin named keyconfig (see forum discussion). The plugin seems reasonable on the surface, but it's not actively supported, and its web presence is poor. The best article I found was here on random($foo). Unfortunately, the shortcuts it includes seem to be just for navigating, and not for text edit boxes. I suspect it's possible to add custom code to make keyconfig do what Starry Hope's Firefox keyfixer does -- remap the home/end keys to go to the beginning/end of the current line -- but I couldn't find out how without a lot of poking around.

Keyfixer is a patch utility that modifies the Firefox configuration file named platformHTMLbindings.xml (kept inside the zip file named /Applications/ with the appropriate key board shortcuts. For example, these are the lines that change Home/End to go to the beginning/end of the current line instead of top/bottom of current edit window.

<!-- Additions to fix home/end -->
<handler event="keypress" keycode="VK_HOME" command="cmd_beginLine"/>
<handler event="keypress" keycode="VK_END" command="cmd_endLine"/>
<handler event="keypress" keycode="VK_HOME" modifiers="shift" command="cmd_selectBeginLine"/>
<handler event="keypress" keycode="VK_END" modifiers="shift" command="cmd_selectEndLine"/>

The question I have is whether similar changes are possible within the keyconfig firefox plugin. If this is possible, it would be much easier to maintain because you have to uninstall Firefox keyfixer before upgrading firefox.

I see a lot of potential in keyconfig because it's a firefox plugin, and is something that should be in Firefox anyways, but right now I can't make it work better than the keyfixer patch.

Ada beats Colossus in Cipher Challenge

I caught this article on Slashdot about an Ada program beating Colossus in a challenge to quickly decrypt messages from the Lorenz SZ42 cipher. The Ada program took 46 seconds versus what Colossus (the World War II machine in Bletchley Park that cracked German messages) solved in about three and a half hours.

Ada has me very fascinated because I'm interested in programming languages that lend themselves to computational correctness and high execution speed. I haven't used Ada, but I tend to hear good things about it, and would like to try using it next time I start a project. I'm particularly interested in how it would help programs that use cryptography and that are going through a Common Criteria evaluation.

According to the TIOBE Index, as of May 2008, Ada is sitting at #17 with 0.431% popularity and a rating of 'B'. From that point of view it doesn't look like the hottest thing out there, but it has Objective-C (Mac's darling) way out-ranked. Objective-C is #47 with 0.083% popularity, which makes me wonder if it's worth learning for writing Mac OS X Widgets. Fortunately, Mac's Xcode provides some good hooks for Python (#7, 4.6% popularity) and Ruby (#10, 2.85%)...

Saturday, May 10, 2008

Fixing Home and End Keys on Firefox 3 for Mac OS X

[Update: Keyfixer 0.4 is now available; see new blog post]

Since I'm a heavy gmail user, I make significant use of the editor window in my web browser. Ever since moving to a Macbook as my primary computer, I've been struggling with re-learning the Mac navigation key shortcuts. Since I still use Windows a lot, I decided to instead reconfigure the Mac shortcuts to emulate Windows shortcuts.

This strategy worked well, except for in Firefox, which doesn't respect the Mac DefaultKeyBinding.dict file. For Firefox 2, I solved this problem by running Keyfixer as published by Starry Hope. Unfortunately, this stopped working for me when I updated to Firefox I switched to Safari for a while, but Safari's other bugs and "features" started to annoy me. I wanted my Firefox back.

After digging into what Keyfixer does, I've put together an updated version 0.3 that should work for Firefox and Firefox 3.0 beta 5 (tested on Mac OS X 10.5.2). The new solution performs patches instead of straight copies of the keymapping xml file, so I'm hoping it is more robust against future changes in Firefox.

Click here to get Firefox Keyfixer 0.3.

Compared to version 0.2, this new version has the following updates:
  • Support for both Firefox 2 and 3 (versions on or after May 2008)
  • Running the program twice will uninstall the patch. This is useful when performing upgrades (Firefox won't upgrade if Keyfixer has been applied -- you have to remove it first)
  • PageUp and PageDown now moves the cursor instead of just moving the screen. This is more consistent with Firefox on Windows.
As a side note, it looks like there was an intention in Firefox to follow standard Mac behavior by mapping Command-Left Arrow and Command-Right Arrow to move the cursor to the beginning and end of line, respectively. However, this doesn't seem to work (at least when using gmail). I'm interested to know if anyone else has seen this issue, because it's a bug.

If you have any problems or questions with this version, please drop a comment and I'll see what I can do to help!

Tuesday, April 29, 2008

Follow-up to Hitachi's Announcement of AES-256 Encryption Within a Hard Disk

As many readers noticed, last week Slashdot covered a announcement that Fujitsu is the first to offer 256-bit AES encryption in their MHZ2 CJ Series 320 GB 2.5" hard drives.  As chair of the IEEE P1619 Security in Storage Working Group, I felt an obligation to get more details on exactly what 'AES-256' encryption means. So I clicked on the handy box to submit questions, and got the following responses from Fujitsu:
1. What is the mode of operation for the AES block cipher (e.g., ECB, CBC, CTR, etc)?
===> We don't disclose this.

2. How are the 256-bit AES keys managed?
===> We don't disclose this.

3. Is Fujitsu considering NIST FIPS 140-2 certification for this disk drive (like Seagate is doing)?
===> under consideration.  
I had similar questions about Seagate's Full Disk Encryption (FDE) hard drive, and couldn't get any answers there, either.  According to AES Certificate #587, Seagate is using Electronic Code Book (ECB) for their FDE.  Unfortunately, ECB is a very insecure mode-of-operation, one that I hope NIST eventually withdraws.  To visually see what I mean, take a look at the ECB encryption of Tux the penguin.  The latest rumors I've heard is that Seagate is moving to cipher-block-chaining (CBC) encryption (a much more secure mode-of-operation) for subsequent encrypting hard disks.  Fujitsu will likely take a similar course, although there is expected to be some flexibility in the algorithms.

In contrast, tape drive vendors have been much more open about the details of their tape encryption. According to the LTO-Technology page, LTO uses the AES-GCM mode as specified in IEEE P1619.1 (soon to be published as IEEE Std 1619-2007).  Sun's T10000 uses AES-CCM, both as specified in P1619.1 and in NIST SP 800-38C.  IBM's TS1120 also uses AES-GCM.

So why aren't hard disk vendors disclosing the technical details about their encryption implementation?

Here are my thoughts:
  1. Hard disk vendors don't think that the mode of encryption is too important because it is difficult to get direct access to the encrypted data (this would require bypassing the firmware or putting the hard disk on a spin stand)
  2. Hard disk vendors are afraid that weaknesses will be found in their encryption mode, whether real or perceived
  3. There are no good standards to use for hard disk encryption
While it is true that most users don't understand enough about encryption to even know what a mode-of-operation is, I believe that these details will become increasingly important as buyers become better educated and demand open details about the encryption.  Otherwise there is no way to know whether you've been sold snake oil that doesn't actually provide measurable benefits (for example, weak ECB encryption of the entire hard disk using the otherwise strong AES block cipher).

Concerning standards, this is an example of how the late arrival of IEEE 1619 has caused confusion in the storage encryption industry.  When IEEE 1619 start about 6 years ago, the goal was to create a strong encryption standard suitable for data storage devices.  First came the wide-block EME mode.  This mode fell when Antoine Joux found a vulnerability that sent Shai Halevi and Phil Rogaway back to the drawing board.  Next was the LRW mode.  This fell when Niels Ferguson of Microsoft noted in Crypto 2006 that you can leak the tweak key if encrypted with itself (Microsoft has no control over where the keys are).  About this same time, the Trusted Computing Group wanted to endorse LRW (this was dropped).  About two years ago during the LRW unrest, Mart Somermaa pointed the group to the XEX mode as proposed by Phil Rogaway.  The P1619 group added ciphertext-stealing to this mode and called it XTS-AES.  

The XTS-AES algorithm was approved last December by IEEE as part of IEEE 1619-2007, and is nearly published.  After it is published, IEEE will submit XTS to NIST for consideration as an Approved Mode of Operation for FIPS 140-2.  If NIST accepts XTS, then this will become an excellent mode for hard disk vendors to consider.

Saturday, April 26, 2008

Fixing up the Mac Key Bindings for Windows Users

[Note: Edited on 2010-02 to switch the order of the shift and command key modifiers.  Apparently, Mac OS is particular about the order.]

[Edited on 2010-10-22 to describe how to use TextEdit to apply this key mapping]
[Edited on 2010-11-12 to mention that TextEdit sometimes adds a .txt extension]

I'm a longtime Windows user who recently purchased a Macbook. Overall I'm very impressed with the machine, but it does have a learning curve, especially for the key bindings.

The first thing I noticed was that the Macbook does not have the Del and Ins keys at all, and the Home, End, PageUp and PageDown keys require pressing 'Fn' and then an Arrow key (which is understandable because the keyboard on a small Macbook is somewhat cramped -- also, I've asked a couple users who have not used PCs much before using a Mac, and they did not even know these keys existed, or what they would do with them)

However, when I'm not on the road, I like to use a nice full sized Microsoft Natural Keyboard, to reduce tendinitis. As a (former) hard-core programmer, I very extensively use the Home, End, PageUp and PageDown keys to quickly navigate code or text documents. I was very dismayed to discover that Apple pretty much doesn't do anything with these keys.

In hunting through all the configuration options, I noticed that you can reconfigure a lot of key mappings through the System Preferences utility (go to the apple in the upper-left corner, select System Preferences..., click on Keyboard & Mouse, and click the Keyboard Shortcuts tab). This was useful for a start, but I quickly determined that the Mac wouldn't allow me to bind keys to any of the 6 special keys (Home, End, PageUp, PageDown, Delete, Insert). This made me sad.

I did discover, however, that it is possible to switch the Control and Command keys. This is a big help because now all the windows favorites like Ctrl+c for copy, Ctrl+v for paste, Ctrl+x for cut, and Ctrl+z for undo now work the same on both systems. I still switch frequently between Windows and Mac platforms, so it's very nice to have the same key mappings.

Most recently, I discovered that there is a special file you can create that allows special mappings to the 6 special keys. This made me happy. I was now able to get much closer to having a unified key mapping. For more details, see this article.

To do this, create a new file called ~/Library/KeyBindings/DefaultKeyBinding.dict and put the following text into it (You'll probably have to create the directory the first time -- this is okay).

/* ~/Library/KeyBindings/DefaultKeyBinding.Dict
This file remaps the key bindings of a single user on Mac OS X 10.5 to more closely
match default behavior on Windows systems.  This particular mapping assumes
that you have also switched the Control and Command keys already.

This key mapping is more appropriate after switching Ctrl for Command in this menu:
Apple->System Preferences->Keyboard & Mouse->Keyboard->Modifier Keys...->
Change Control Key to Command
Change Command key to Control
This applies to OS X 10.5 and possibly other versions.

Here is a rough cheatsheet for syntax.
Key Modifiers
^ : Ctrl
$ : Shift
~ : Option (Alt)
@ : Command (Apple)
# : Numeric Keypad

Non-Printable Key Codes

Up Arrow:     \UF700        Backspace:    \U0008        F1:           \UF704
Down Arrow:   \UF701        Tab:          \U0009        F2:           \UF705
Left Arrow:   \UF702        Escape:       \U001B        F3:           \UF706
Right Arrow:  \UF703        Enter:        \U000A        ...
Insert:       \UF727        Page Up:      \UF72C
Delete:       \UF728        Page Down:    \UF72D
Home:         \UF729        Print Screen: \UF72E
End:          \UF72B        Scroll Lock:  \UF72F
Break:        \UF732        Pause:        \UF730
SysReq:       \UF731        Menu:         \UF735
Help:         \UF746

NOTE: typically the Windows 'Insert' key is mapped to what Macs call 'Help'.  
Regular Mac keyboards don't even have the Insert key, but provide 'Fn' instead, 
which is completely different.

"\UF729"   = "moveToBeginningOfLine:";                       /* Home         */
"@\UF729"  = "moveToBeginningOfDocument:";                   /* Cmd  + Home  */
"$\UF729"  = "moveToBeginningOfLineAndModifySelection:";     /* Shift + Home */
"@$\UF729" = "moveToBeginningOfDocumentAndModifySelection:"; /* Shift + Cmd  + Home */
"\UF72B"   = "moveToEndOfLine:";                             /* End          */
"@\UF72B"  = "moveToEndOfDocument:";                         /* Cmd  + End   */
"$\UF72B"  = "moveToEndOfLineAndModifySelection:";           /* Shift + End  */
"@$\UF72B" = "moveToEndOfDocumentAndModifySelection:";       /* Shift + Cmd  + End */
"\UF72C"   = "pageUp:";                                      /* PageUp       */
"\UF72D"   = "pageDown:";                                    /* PageDown     */
"$\UF728"  = "cut:";                                         /* Shift + Del  */
"$\UF727"  = "paste:";                                       /* Shift + Ins */
"@\UF727"  = "copy:";                                        /* Cmd  + Ins  */
"$\UF746"  = "paste:";                                       /* Shift + Help */
"@\UF746"  = "copy:";                                        /* Cmd  + Help (Ins) */
"@\UF702"  = "moveWordBackward:";                            /* Cmd  + LeftArrow */
"@\UF703"  = "moveWordForward:";                             /* Cmd  + RightArrow */
"@$\UF702" = "moveWordBackwardAndModifySelection:";   /* Shift + Cmd  + Leftarrow */
"@$\UF703" = "moveWordForwardAndModifySelection:";   /* Shift + Cmd  + Rightarrow */
Remember: These key mappings assume that you've switched Control and Command. If you don't want to make this switch, replace each @ (command) with ^ (control).

Here are steps to take to apply these changes:
  • Open TextEdit under the Applications folder.  If TextEdit was already open, create a new document using File->New.  There should be a window labeled 'Untitled'.
  • Select the text within the window above, copy it, and then paste it into your new TextEdit window.
  • In TextEdit, convert this to plain text (the default is rich text) by selecting Format->Make Plain Text.
  • Next, select File->Save As...  In the "Save As" dialog box, navigate to your home directory (look under PLACES on the left side for a house picture that has your name next to it).  In your home directory, double-click on the Library folder.  If you see a KeyBindings folder then double-click on it.  If not, then click on "New Folder" (within the Library directory), name the new folder KeyBindings (with no space), and then double-click on it.  Type DefaultKeyBinding.dict for the filename (at the top) and then click Save.
  • Warning:  TextEdit will sometimes try to 'help' you by appending a .txt extension to the filename.  Make sure this doesn't happen.  If asked to use a .txt extension, tell TextEdit to instead use .dict.  It will not work if you use .txt.  If you have trouble, see comment by Nathan below.
  • Before these changes take effect, you need to log out and then log back in.

There you have it! I know this emulation isn't perfect (not all applications honor this mapping), but it's a good start. Please drop comments if you have any questions or suggestions for improvements.

Tuesday, April 8, 2008

RSA 2008: Cryptographer's Panel

As one of the great highlights of the RSA Conference is the cryptographer's panel with the great experts of modern public key cryptography: Whitfield Diffie, Martin Hellman of Diffie-Hellman fame (discrete log crypto) and Ron Rivest and Adi Shamir of RSA fame (crypto based on the integer factorization problem -- used in SSL).

This is a rough draft post that will be cleaned up later, but contains the last part of the discussion:

Question from Burt: Where would you put your research effort?
Diffie: I'd put research into genetics - We'll see the first child made from two women, showing that men are an expensive and unnecessary thing to have around.
Hellman: We need to become more rational in our approach to security

Closing remarks:
Diffie: I'm optimistic about this subject. People are going to get along just fine -- cyber security is very important. The most important thing in the 20th century is client server computing. By putting important information onto a single computer, it's possible to control access. -- Something's going to happen that we don't expect, from younger people
Hellman: Don't be afraid to tackle problems
Rivest: (countering Diffie): There is a lot of cryto still to be discovered. We're still at the early stages of tying worst-case complexity to best-case complexity -- how to run crypto protocols in parallel so that they don't interfere -- we need the secure platform -- next problem is user interfaces
Shamir: It's about subtlety behind the schenes --- multiple lines of defense -- most of the basic elements are there. But we haven't reached nirvana -- we need to develop tools and techniques -- a GPS for data, need the ability to located where your data is. Use 160-bit sha-1 summary to help locate this data. This could help the information management problem

Burt: 1024-bit RSA -- how much longer before the publicly announced factorization
Shamir - next year
Hellman - I was at Certicom -- Elliptic curves have been rock-solid since inception
Rivest - Use Moores law - There are low-probability algorithms that are hard to predict
Burt -2010 is the transition to 2048 bit keys

These guests will be in the crypto commons for more discussion.

My RSA Conference 2008 Schedule

I will be at the RSA Conference from Monday April 7 to Friday.  For those who would like to meet up during this week, here is my anticipated schedule:

(I'll update this entry periodically as things change...)

10:25 - 11:20: Cryptographer's Panel (with Diffie, Hellman, Rivest,  and Shamir (no Adleman))
11:20 - 1:30: Lunch -- Meet at the nCipher exhibit
1:30 - 2:40: RED 309 - Real World Key Management: News from the trenches
3:00 - 3:50: RED 309 - Cryptographic Security for Ruby on Rails Web Services
4:10 - 5:20: RED 309 - Security Usability: The New Challenge (with Phillip Hallam-Baker)
5:40 - 6:30: RED 309 - Beyond the Coding Errors: The Complete View of Software Security
6:30 - 9:00: Dinner (TBD)

8:00 - 8:50: RED 300 - Improved AES Implementations
9:10 - 10:20 RED 300 - Public Key Encryption with Special Properties
10:40 - 11:50 RED 300 - Side Channel Cryptanalysis
2:00 - 6:00 Key Notes
6:30-7:30 "Dinner for 6"

8:00 - 8:50: RED 310 - High-Speed Risks in 802.11n Networks
9:10 - 10:20 RED 308 - Extended Validation: Raising the Bar for Internet Trust
10:40 - 11:50 RED 308 - PCI DSS Security Standards Foundation and future
2:00 - 5:00 Keynotes

9:00 - 9:50: RED 308 - Standardizing Key Management for Trusted Storage
10:05 - 10:55: RED 308 - The New FIPS 140-3 Standard
11:10 - 12:00 RED 308 - What's New with XACML, the Access Control Standard?
Catch bus for flight: around 1:00 pm

If anyone is interested in meeting up, please shoot me an e-mail or give me a call on my cell phone: 303-717-2717

Sunday, March 9, 2008

Seagate Includes IEEE P1619.3 in an FDE Whitepaper

Seagate recently published a white paper depicting the IEEE 1619.3 key management protocol used in a system containing Seagate Full Disk Encryption (FDE) hard disks. It's an interesting read if you're into the hardware encryption scene.

The white paper mentions using existing key management systems, like IBM's EKM (Enterprise Key Management) system, with storage systems that include Seagate FDE hard disks

The FDE encrypts the hard disk data using an AES-128 encryption key (NIST's Advanced Encryption Standard), and stores the only copy of this encryption key on the hard disk in encrypted form. To decrypt the encryption key, you need an 'authentication key'. The FDE also stores a cryptographic hash of the authentication key, which is used to verify whether the user entered the correct authentication key.

The beauty of this setup is that it is possible to perform a fast secure-erase of the hard disk by simply erasing the encrypted encryption key. Also, if an attacker was able to open the hard disk or compromise the firmware, the only available information is the encrypted encryption key and the hash of the authentication key. Without the authentication key, it is impossible to get any data off the hard disk.

There are a few caveats here, however:
  1. In the absence of a key management server, the authentication key is likely a password entered by the user, which makes the strength of the encryption only as strong as the weaker of the entropy of the password (which is typically very low) or the physical security of the hard disk (which is unknown). If someone is able to comprise the firmware of the FDE hard disk to reveal the hashed authentication key or encrypted encryption key, then it becomes possible to launch an off-line dictionary attack against likely passwords, making it possible to decrypt the data.
  2. Neither the white paper nor any other source I've seen describes the AES encryption mode used for protecting the data and the encryption key in the FDE. Just using AES-128 is not sufficient to ensure a high-level of security -- you need to use AES in a secure mode of operation. For example, using AES in Electronic Code Book (ECB) mode is notorious for leaking a significant amount of data -- see an example of Tux (the Linux penguin) encrypted using ECB as compared to other modes. I'm not saying that Seagate is using a bad mode of operation -- it's just that we don't know.
  3. The white paper mentions P1619.3 even though the standard is still in relatively early stages. On the one hand, I like seeing publicity for P1619.3, but on the other it's hard to say exactly how it will look in the end. It may not be what we expect.
Overall, I'm very happy to see encryption enter the hard disk market and to see increased interest in the 1619.3 work. The FDE hard disk is certainly sufficient for most user's security needs. However, for the agencies with high security needs (like the government), the lack of FIPS 140-2 certification and encryption mode disclosure makes it a difficult (if not impossible) purchase. Hopefully after P1619.3 helps create interchangeable key management solutions, we'll see the FDE volumes increase enough to justify improvements like FIPS certification.

Saturday, March 8, 2008

Is 91 Prime?

Quick -- without looking it up, tell me: Is 91 prime?

Hard to say, isn't it? It looks fairly prime. It's odd; it's not divisible by 5 or 11 -- or 3 either...

In this article, I'd like to show to several ways to quickly factor numbers in your head, and even some hints for tricks to speed up number factoring for computer programs. Factoring numbers has many important applications, including RSA (which is used in SSL encryption on your web browser). For me, I like to use the process of factoring numbers as a way to memorize a number. By putting significance on a number (and by going through the process of putting significance on a number), it's easier to recall numbers.

But first, a little lead-in...

In Neil Stephenson's Cryptonomicon, Lawrence Waterhouse shows his cryptographic prowess to some British World War II officers while discussing Special Detachment 2701:
"Do you have any questions?" the Main Guy asks.
"Did Alan Choose the number?" [asks Waterhouse]
"You mean Dr. Turing?"
"Yes. Did he choose the number 2701?"
This level of detail is clearly several ranks beneath the station of the men in the Broadway Buildings. They look startled and almost offended, as if Waterhouse has suddenly asked them to take dictations.
"Possibly," says the Main Guy. "Why do you ask?"
"Because," Waterhouse says, "the number 2701 is the product of two primes, and those numbers, 37 and 73, when expressed in decimal notation, are, as you can plainly see, the reverse of each other."
All heads swivel toward the don, who looks put out. "We'd best change that," he says, "it is the sort of thing that Dr. von Hacklheber would notice." He stands up, withdraws a Mont Blanc fountain pen from his pocket, and amends the organizational chart so that it reads 2702 instead of 2701. As he is doing this, Waterhouse looks at the other men in the room and thinks that they look satisfied. Clearly, this is just the sort of parlor trick they have hired Waterhouse to perform.
This change sets in action one of the main subplots. You'll have to read the book to find out more (which I highly recommend).

Factoring Tricks
Most (mathematically inclined) people can quickly pick off the first several primes, just from memory: 2, 3, 5, 7, 11, 13, 17, 19, ... It's when you start getting past about fifty where things get a little sketchy.

Here are some tricks for factoring numbers by dividing by small primes.

The first "trick" is to check whether the number is odd or even. If it's even (and not 2), then it's not prime because it is divisible by two.

The next check is to see if the last digit is a '0' or '5'. If so, it's divisible by 5.

These first two tricks work because our numbering system -- decimal -- does some of the work for us. Since 10 is divisible by both 2 and 5, base 10 already computes the result in the last digit of dividing by 2, 5, or 10. If we used something like hexadecimal (base 16), then this trick wouldn't work for 5 -- it would only work for 2 (or 4 or 8). If we used base 12 (like American clocks do), then looking at the last digit would tell us whether the number is divisible by 2 or 3 (or 4 or 6) -- but not 5.

The next trick is to add all the digits of the number, and see whether the result is divisible by 3. If so, then the entire number is divisible by 3. In our example, we can test 91 by adding 9 + 1 = 10 -- which is not divisible by 3.

Let's test 123456:
1 + 2 + 3 + 4 + 5 + 6 = 21 (= 7 x 3).
This process can be repeated, so to test 21 we have 2 + 1 = 3, which is divisible by 3.

This trick works because we're representing numbers in a base such that "base - 1" is divisible by three (10 - 1 = 9, which is divisible by 3 and 9). The process of adding digits works because we are subtracting 10 times the digit, but then adding 1 times the digit, for a net change of subtracting 9 times each digit. Take this example:

345 = [(3 * 100)] + [(4 * 10)] + [5]
sum of digits = 345 - (3 * 100) + 3 - (4 * 10) + 4
sum of digits = 345 - (3 * 99) - (4 * 9)
sum of digits = 12
Adding or subtracting multiples of 9 (or 99, or 999, ...) does not change the remainder when dividing by 9 (or by 3). That's basically why it's possible to add the digits of a number to test divisibility by 9 (or 3).

Testing divisibility by 11 is pretty easy for two digit numbers: if the two digits are the same, then the whole result is divisible by 11. It's when the number gets larger that things get a little trickier.

The method I'd like to describe here for testing divisibility by 11 is to group the original number into pairs of digits starting from the right, and then add these pairs and check whether the sum is divisible by 11. If the resulting sum is in 3 (or more) digits, repeat the process until 2 digits are remaining and see whether the digits are the same.

Here's an example:
Is 13574 divisible by 11?
1 + 35 + 74 = 110 (add pairs)
1 + 10 = 11 (add pairs again)
11 is divisible by 11 -- therefore 13574 is divisible by 11
The reason why this trick works is the same basic reasoning as for the trick with 3 and 9. The difference is that adding pairs of digits is equivalent to subtracting a multiple of 100 (or 10,000, etc) from the original number, and adding back that value one time, for a difference of 99 (or 9999, etc). Since 99 = 3 * 3 * 11, the result still equals the original number, modulo 11 (or 3).

The same basic principle applies to larger groupings of digits in any base (binary, decimal, hexadecimal...). I'll get into more details in a subsequent article about how this can speed up factoring large numbers.

For 7, the trick is to successively subtract twice the last digit from the other digits to the left. If you get a number that is divisible by 7 during any stage of this process, then the original number is divisible by seven.

Here's an example:
Is 8638 divisible by 7?
863 - 2*8 = 847 (subtract twice the last digit)
84 - 2*7 = 70 (subtract twice the last digit)
70 = 7 * 10, which is divisible by 7 -- therefore 8638 is divisible by 7
How about our original number?
Is 91 divisible by 7?
9 - 2*1 = 7 -- therefore 91 is divisible by 7 (91 = 7 * 13)

There you have it -- 91 is not prime.

It's unclear whether factoring will get you a hot date or the next promotion, but it will help keep your mind sharp and give you some memory tools, or best of all, the illusion that you too can have the cryptographic genius of Stephenson's Waterhouse. In the next installment, I hope to entertain my programming geek readers with a discussion of how to speed up the process of finding large prime numbers using a binary generalization of the methods for finding divisibility by 3 or 11. For the rest of you, I hope to have given you an amusing diversion before your boss notices that you're reading random blogs instead of doing real work. If you're caught, just ask your boss if 91 is prime... :)

Tuesday, February 26, 2008

Who uses IEEE 1619 and 1619.1?

Last December, IEEE approved the standards two encryption standards (See Press Release):
  • IEEE 1619 (specifying the XTS encryption mode, commonly used for disk encryption); and
  • IEEE 1619.1 (specifying GCM, CCM, CBC-HMAC, and XTS-HMAC encryption modes, typically for tape) .
Byte and Switch interviewed me in an article discussing this and other related standards.

Currently the IEEE Security in Storage Working Group (SISWG) is investigating the possibility of submitting the IEEE 1619 XTS mode to NIST for consideration as an Approved Mode of Operation for FIPS 140-2 certification. One of the questions asked on the SISWG e-mail reflector was whether there is widespread industry support for these newly approved modes.

From doing some web searching, I've come up with the following list of companies who are claiming compliance to these newly approved standards.

IEEE 1619 (XTS-AES) Support
  • IBM, HP, Quantum, Tandberg's LTO-4 Tape Drive uses IEEE 1619.1 GCM-AES
  • IBM's TS1120 Enterprise Tape Drive uses GCM
  • Sun's T10000 Enterprise Tape Drive uses 1619.1 CCM
  • Many core vendors (basically same list as for 1619)
I expect that these lists will grow considerably after IEEE officially publishes 1619 and 1619.1 in the next couple months. Right now, it's hard to assess compliance because a product cannot technically claim compliance until the standard is published.

If you know of implementations that expect IEEE 1619 or 1619.1 compliance, please post a comment with the vendor and product name, with a link to the appropriate webpage.

Sunday, February 3, 2008

The Microcomputer Trainer

Back when I was 7 years old, I wanted to buy a TRS-80 Color Computer, but my dad wanted to make sure that I was committed to programming before spending the $100 -- so he bought me a Science Fair "Microcomputer Trainer" and wanted me to learn to program that first. After I attempts to read the book with my second grade (or was it first?) reading level, I finally managed to do a little programming, but I ended up destroying the system by shorting the clock out with my fingers. My dad was satisfied and bought me the CoCo for Christmas later that year, and I've been programming ever since...

The microcomputer trainer used a simple 4-bit processor with 16 hexadecimal buttons, 4 control buttons, 7 LEDS, a 7 segment HEX LED, and a speaker. The CPU was a Texas Instruments TMS1100, which was based on the Speak and Spell's TMS1000.

I downloaded a copy of the manual here. The primary goal was to teach machine language programming to a young audience.

My question is whether it's possible to teach kids machine language in today's world by using a similar device. If so, what changes would be necessary to keep enough interest to teach a little bit about machine language and pointers? I think a small instruction set would be appropriate, but how small could you make it and still be Turing complete, and still make it fun to use?

To get the ideas started, here is the instruction set for the Microcomputer Trainer:

Main commands:
  • 0: KA = "Key into Ar" - If a key is pressed, then set Ar to key value and set flag to 0; else set flag to 1.
  • 1: AO = "A Output" - Displays contents of Ar on HEX LED and sets flag to 1.
  • 2: CH = "exChange" - Exchanges the contents of Ar with Br, exchanges Zr with Yr, and sets flag to 1.
  • 3: CY = "exChange Ar with Yr" - Exchanges the contents of Yr and Ar
  • 4: AM = "Ar to Memory" - Moves contents of Ar to the memory location indicated by following nibble and sets flag to 1.
  • 5: MA = "Memory to Ar" - Moves contents of memory indicated by following nibble into Ar and sets flag to 1.
  • 6: M+ = "Memory +" - Adds contents of memory pointed to by Yr to Ar, stores the result in Ar, and sets flag to 1 if there is a carry.
  • 7: M- = "Memory -" - Subtracts the contents of Ar from the memory location pointed to by Yr and stores the result in Ar. If an underflow occurs, flag is set to 1.
  • 8: TIA = "Transfer Immediate into Ar" - Moves following nibble into Ar and sets flag to 1.
  • 9: AIA = "Add Immediate into Ar"- Adds following nibble into Ar and sets flag to 1 if there is a carry.
  • A: TIY = "Transfer Immediate into Yr" - Moves following nibble into Yr and sets flag to 1.
  • B: AIY = "Add Immediate into Yr" - Adds following nibble into Yr and sets flag to 1 if there is a carry.
  • C: CIA = "Compare Immediate to Ar" - Sets flag to zero if Ar equals following nibble; else sets flag to 1.
  • D: CIY = "Compare Immediate to Yr" - Sets flag to zero if Yr equals following nibble; else sets flag to 1.
  • E: CAL = "Call" - Execute extended command (see below) if flag is set to 1 in previous command; else do nothing.
  • F: JUMP = "Jump" - If flag is 1, jump to byte address in following 2 nibbles; else do nothing
CALL commands (preceded by 'E'):
  • E0: RSTO = "Reset port O" - Turns off the HEX LED.
  • E1: SETR = "Set LED" - Sets LED according to number indicated by Yr (0-6)
  • E2: RSTR = "Clear LED" -Clears the LED according to number indicated by Yr (0-6).
  • E3: Not Used
  • E4: CMPL = "Complement" - Replaces Ar with its ones-complement (i.e. result of F - Ar)
  • E5: CHNG = "Change registers" - Exchanges Ar, Br, Yr & Zr with Ar', Br', Yr' & Zr'.
  • E6: SIFT = "Shift" - Shifts the contents of Ar one bit to the right and sets the flag to the opposite of the least significant bit of Ar before shifting.
  • E7: ENDS = "End Sound" - Emits an 'end' sound
  • E8: ERRS = "Error" - Emits an 'error' sound
  • E9: SHTS = "Short Sound" - Emits a 'blip' through the speakers
  • EA: LONS = "Long Sound' - Emits a long sound through the speakers
  • EB: SUND = "Sound" - Emits a note according to value in Ar (0 = no sound, 1 = la, 2 = ti, 3 = do, ... E = sol, F = no sound)
  • EC: TIMR = "Timer" - Pauses for the following number of seconds: seconds = (Ar + 1) / 10
  • ED: DSPR = "Display on port R" - Displays contents of memory locations E and F in binary on the 7 LEDs. Memory F contains the right-most 4 LEDs and memory E contains the leftmost 3 LEDs.
  • EE: DEM- = "Decimal conversion of M- result" - similar to DEM+, but subtracts instead of adds.
  • EF: DEM+ = "Decimal conversion of M+ result" - Adds together the decimal contents of Ar and the pointed address to give a decimal answer and stores that answer at the pointed address. If there is a carry, 1 is added to the pointed address less 1. After the command has been executed, the pointer is left pointer one address below the pointed address (If the number to be added is in 54, the answer is put in 54 and the pointer is reduced by 3. If there is a carry, one is added to 53)

Wednesday, January 9, 2008

iDrone reprint: Writing an accumulator generating function in Ruby

[This is an iDrone reprint, Written by Mark W on Feb 9th, 2007]

Talking with Matt, this morning, our conversation turned to the very large differences in power between various programming languages. While most would readily agree that higher level programming languages are far more expressive than assembly, many people draw the line there. Java, C++, VB, Python, Fortran are all “about the same” to quite a few people, including ,unfortunately, most of my previous bosses.

One example Paul Graham used to illustrate differences between higher level programming languages is how difficult it is to write a function that generates accumulators— a function that takes a number n as a parameter, and returns a function that takes another number i and returns n + i.

In Python, he said the commonly accepted solution was as follows:

def foo(n):
class acc:
def __init__(self, s):
self.s = s
def inc(self, i):
self.s += i
return self.s
return acc(n).inc

In Javascript, a surprisingly powerful language, considering it’s almost completely ignored outside of web page development, the solution is shorter and cleaner:

function acc(n) {
return function (i) {
return n += i

One language Paul Graham did not mention, was Ruby. I’m still fairly new to Ruby, but the solution was not difficult. Due to the power of blocks in Ruby, this sort of functional programming is extremely clean. Here is the accumulator generating function:

def acc(n)
lambda {|i| n += i}

Here’s a quick example of it in use:

f1 = acc(3)           # a function that increments by three # returns 8 # returns 2
f2 = acc(1.5) # a function that increments by 1.5 # returns 2.5 # returns 4.5 # returns 5.5

In some less expressive languages, such as VB, C++, Fortran or Java, it isn’t even possible[1] to write a simple function that can generate accumulators. Obviously since each of those languages are Turing complete, the problem would be solvable, just not with a simple function. You could, given enough time, write a Ruby interpreter in C++, and then input

def acc(n)
lambda {|i| n+i}
to get your accumulator.

[1]: Ken Anderson did offer an example of a way to kludge together an accumulator generating function in Java in Graham’s essay above. He used an integer to integer interface. Naturally enough, it fails to work for anything but integers. Generating a polymorphic function in Java, similar to the functions listed in this article would be a Sisyphean undertaking. If you can do it, I’d love to see a link to it in a comment.

iDrone reprint: How to Make a CSS Menu Bar

[This is an iDrone reprint, written by Mark W sometime in 2006; I'm posting it now because I finally got the syntaxhighlighter tool working]

Last week, a friend of mine asked how I made the CSS menu bar on my blog. Though it may seem a bit daunting to those who haven’t been using style sheets, it’s actually very simple. First, I made a list in my HTML:

<div id=”menu”>
<ul id=”navlist”>
<li><a href=”/”>Home</a></li>
<li><a href=”/chinese-tools”>Tools</a></li>
<li><a href=”/archives”>Archives</a></li>

Notice that I put the list inside <div> tags. That’s important. It’s also important that the <div> block and the <ul> have IDs, “menu” and “navlist”, in this case. After adding the above to my HTML, I had to add the following to my style sheet:

 /* Horizontal Menu */
#menu {
height: 20px;
margin: 0px 0px 6px;
background: #eec;

ul#navlist {
margin: 0px -15px 6px -15px;
padding: 0px;
white-space: nowrap;
font-weight: bold;

#navlist li {
display: inline;
list-style-type: none;
float: left;
padding: 0px 0;

#navlist a {
padding: 5px 12px;

#navlist a:link, #navlist a:visited {
color: #246;
background-color: #eec;
text-decoration: none;
#navlist a:hover {
color: #fff;
background-color: #369;
text-decoration: none;

The #menu portion just sets the height and background of the menu, the ul#navlist aligns the list to fit nicely with my layout, #navlist a determines how much space is between each item in my menubar, and #navlist a:hover is what causes the color and the background of each link to change when you hover your mouse over them. Voila, a navigation menu that changes colors when your mouse is over it. Best of all, there’s no Javascript involved. Idrone also uses a very similar CSS menu bar.

iDrone reprint: Thoughts on the Subset Sum Problem

[This is an iDrone reprint, written by Matt Ball sometime in 2006]

Wanna win a million dollars? All you have to do is solve one of the seven Millennium Problems. These are seven of the greatest currently unsolved mathematical problems. My favorite Millennium problem is the question of ‘P vs. NP’. In short, the challenge is to prove that problems we think are “hard” (i.e. NP) are either truly hard or are “easy” (i.e. P). Sometimes you’ll see this problem expressed as ‘P = NP’ or ‘P != NP’. For a more detailed description of ‘P vs. NP’, see this Wikipedia article.

You’ve probably heard of several NP problems. There’s the famous Traveling Salesman problem, in which a salesman attempts to find the shortest path that connects several cities. There’s also Sudoku, the new puzzle craze coming out of Japan. Even tetris and minesweeper are NP problems. All of these examples are from a class called ‘NP-complete’, which means that finding an “easy” solution to one of them within this class results in an easy solution to them all. Conversely, if it’s possible to prove that any of these is hard, you prove that they are all hard.

Probably the simplest NP-complete problem to describe is the ‘subset sum problem’. Here’s the problem: Given a set of integers (both positive and negative), find a subset whose sum is zero (not including the empty set). Here’s a simple example:

Find the set of integers from (-8, -5, 2, 3, 14) whose sum is zero.

Not to keep you in suspense, the answer is (-5, 2, 3). Pretty easy, right? Sure - if there are only 5 numbers in the set! But difficulties quickly arise when you start adding a lot of numbers to the set. If I had a set of 100 numbers, it would probably take something like a year to solve it on my computer. If my set had 1000 numbers in it, it would probably take more energy than is in the known universe to find all the solutions, using the best-known algorithm. This is because the best-known algorithm to solve the subset sum problem takes roughly O(N2^(N/2)) operations (using Big-O notation). Given a set of 1000 numbers, the solution requires roughly 10002^500 operations, which is a very, very large number. This is what I mean by hard.

The catch is that we don’t know for sure that the best algorithm for solving the subset sum problem is O(N*2^(N/2)). Nobody has proven either way. The first person who does will be a million dollars richer, so there’s definitely incentive to find a solution. There’s also a lot of other money to be made by quickly solving certain NP-complete problems, including circuit optimization. On the other hand, there is also incentive to prove that NP-complete problems are hard. Several cryptographic algorithms are based on NP-complete problems. If someone finds a way to efficiently solve them, all these algorithms become worthless for cryptography.

So now that we’ve got the preliminaries taken care of, I want to describe some of my thoughts on a solution to the subset sum problem.

First, let me explain a simple algorithm to solve the subset sum problem. If the integers in the problem set aren’t too large (less than 2^N, roughly), it’s possible to record all of the possible solutions in an array, then just look to see if zero is among the solutions. To create this array, you compute all possible sums that both include and exclude each integer. The trick is that the number of operations grows according to the span between the largest positive sum and the largest negative sum.

The process I just described can be accomplished using a discrete convolution. For each integer, create a graph that has a 1 at both zero (on the x-axis) and the integer’s value. All other values are zero (on the y-axis). This represents the two possible uses of the integer; either you include the integer and add its value, or you don’t include the integer and add zero. The result is the convolution of all the integer’s graphs. Each value on the resulting graph shows the number of solutions that sum to that given value.

For example, let’s take 3 integers: (2, 3, 5). As we convolve these values, we get the following solutions:

Each graph for this example is expressed as an array that starts at zero. For simplicity, I’ll only use positive integers. Start with the array [1], which means there is a solution at zero (the sum of the NULL set).

  • Add 2: [1] * [1 0 1] = [1 0 1]
  • Add 3: [1 0 1] * [1 0 0 1] = [1 0 1 1 0 1]
  • Add 5: [1 0 1 1 0 1] * [1 0 0 0 0 1] = [1 0 1 1 0 2 0 1 1 0 1]

Here is the resulting set of possible sums: (0, 2, 3, 5, 7, 8, 10), with two ways to get 5. (Note: the above example is easy to compute using MATLAB’s convolution operator).

Now here’s a trick: it is possible to quickly compute convolutions using Fourier analysis. Stated differently, it is possible to solve the subset sum problem using waves. The convolution of two functions is the same as the product of their frequency representations.

Before I get too much further, I want to step back and talk about how to solve problems involving Fourier transforms.

It is possible to perform an optical Fourier transform by shining light through a small hole and measuring where the light hits a wall on the other side. The angle of diffraction depends on the wavelength of the light. The end result is a sinc function (sin(x)/x), which happens to be the Fourier transform of a square function, as represented by the slit.

By representing each integer as a beam of light with a frequency that corresponds to the integer’s value, it could be possible to modulate several beams of light together, then take the fourier transform (i.e. shine through a diffraction grating) of the combined beam to determine the possible solutions. This is an example of using physics to solve a mathematical problem.

Another interesting consideration is that quantum computers also perform a Fourier transform of sorts, so it could be possible to use a quantum computer to quickly solve a subset-sum problem. I don’t know a lot more about quantum computers, so I’ll stop here before I get into trouble. Based on this possibility of using physics to solve mathematical problems, I suspect that eventually we will have computing devices that are capable of quickly solving NP-complete problems. It’ll just be a matter of working out the details.

iDrone reprint: On Google's Evilness

[This is a reprint from iDrone, written by John P sometime in 2006]

There’s been a fair amount of talk lately about Google’s evilness. Google co-founder Sergey Brin recently admitted that with regards to the censorship issue, Google slid a bit closer to the Dark Side. Meanwhile Lore Sj√∂berg at Wired News ponders ways that Google could be more evil.

All this got me thinking about various philosophical questions regarding Google. For example:

  1. If Google caves in to the censors in China, it becomes more evil. But what if it were to use such a deal to get a stronger foothold, then turn the tables at some crucial point and suddenly supply mainland China with a regime-horrifying massage dose of truth? Is backstabbing the evil another act of evil or an act of virtue?

  2. What if Google totally stood up to China, and got the boot as a result? This would obviously be deemed by most as a virtuous deed. Everyone in China, however, would be really pissed about being denied Google. They might be forced to turn to government-sanctioned alternatives, thus causing a surge in the competition. But that would be an overall increase in evil, right?

  3. If Google were then to launch a grassroots cyber-subversion campaign, it would garner tremendous support. If the Google uber-brain and global resources were applied to this problem, who knows how poewrful it could be. But how would the world react? What could China do? Would some see this as evil hegemony?

  4. What if Google, with its information domination, really were that powerful? What if China reached a critical point in its history with regards to availability of information that certain actions on Google’s part could change—dare I say topple?—a regime? Would Google be evil to act on it, or to not act on it? Who would make such a call?

These are all pretty wild fantasies, but when it comes to evil, some pretty wild stuff can occur. (You know, sharks with lasers and the like.)

iDrone reprint: The Perfect Programming Language

iDrone reprint: The Perfect Programming Language
By Matt Ball, written some time in 2006

I’ve recently become interested in finding the ‘perfect’ programming language. This is mostly because I use one of the more out-dated languages, “C”. I work as an embedded software (firmware) engineer, where I create firmware for tape drives. Embedded systems is one of the few industries that hasn’t switched wholesale to Java or one of the .NET variants. As a result, I’m stuck in a sea of developers who have been programming C and nothing but C for the last 20 years. Just to retain my sanity, I’ve got to look at other approaches.

The idea of a ‘perfect’ language is a little far-fetched, but makes a good started point for knowing what to look for in a new language. If a perfect language existed, it would likely have some of these features:

  • Compiles to code that runs as fast as C (or even faster). This removes the excuse to ever use C (or C++).
  • Does not use a pre-processor
  • Contains built-in garbage collection
  • Allows creation of low-level data structures and data types.
  • Natively supports integers of any size, between a single bit and an arbitrary large number of bits (BigNum). Along the same lines, natively supports modulo arithmetic to facilitate RSA and D-H cryptography (e.g. X * Y mod Z).
  • Supports all the ‘good’ object oriented features, including single inheritance (not multiple like C++)
  • Provides a build-in mechanism for determining whether parameters and data can be trusted.

The list goes on from there.

Relating to this subject, I found an interesting Slashdot article talking about someone from IBM that is doing a series on a similar topic. He’s approaching the problem from the perspective of a Java programmer looking at other languages, but the goal is roughly the same: look at several programming languages and find their strengths and weaknesses.

One good language that I found is the “D” programming language. This language is meant as an enhancement to C and C++, and provides several of the things in the list above. Unfortunately, D is still in the beta stage, so it’s not possible to use this in a real product. It also doesn’t have any books that I know of (having a book is one of Paul Graham’s requirements for a good programming language).

Of course, there are many other good programming languages out there, like Ruby, Python, and Lisp, but none of these languages compile tightly enough, or provide enough low-level control for embedded systems. This ‘perfect’ language needs to have the ability to compile small, if needed.

Finding the perfect language will be a long journey, but in the end I expect it will be well-worth the time, and if nothing else, will provide insight into how to use existing languages more effectively.

R.I.P.: Intelligent Drone (iDrone)

Back a couple years ago, I embarked on a joint blog project with Mark of Doubting to Shuo and John of Sinosplice. This blog was named "Intelligent Drone" (or iDrone for short), and was meant to be a common place for all of us to put our blogs that related to technology. The theory was that the readership of a blog increases significantly if the posting is regular, and with 3 authors the posting would be roughly 3 times as often as 3 independent blogs. We all had the occasional technology blog, but these typically wouldn't fit very well with each of our individual blogs.

Over the years, we had a number of good stories, but the blog never really took off. I had a new baby, and never had the time I wanted for blogging. Both Mark and John still continue to regularly post to their main blogs and posted more often to iDrone than I did, but it wasn't too often. At the end, we invited a few guest bloggers, but it still wasn't enough to revive the interest in iDrone.

Then someone hacked DreamHost (the host for iDrone) and corrupted a bunch of pages with spam. John was never quite able to get iDrone back the way it used to be, and didn't want to devote the time to fix it.

The last nail in the coffin came just recently when the name lapsed, and was bought by someone else who has posted two stories so far relating to license plates and things. It's not even quite clear to me that the new is a real blog or some strange auto-content generation system. It doesn't allow posting comments, and each article is a short paragraph with a single link to a page that sells services of some kind.

To preserve the old stories for posterity, I will grab several of the old iDrone articles and post them here (based on the Google Reader cache). The next several stories on this site will be reprints of these articles.

For those who want to remember the old iDrone, please post comments here with anecdotes and anything else that might get you teary-eyed.

R.I.P. iDrone