Thursday, July 15, 2010

Coding for Effect vs Coding a Model

One of the software engineers I've worked with in the past has a very different way of doing things. He will always take the most expedient way of implementing a particular piece of software in order to get the job done. Every time. (Admittedly, there is the odd time where this is the most effective way to get a job done, and potentially avoid getting fired, but that's for another discussion.) The person in question takes no pride in his own work (by his own admission) and doesn't care about its quality. Essentially, he codes with only the desired effect of his work in mind. I've found that this way of doing things, while faster, is a great way to introduce bugs and to cause problems later that arise from a lack of planning and forethought, i.e. poor maintainability and an inability to add features to the software.

Whenever I write my own software, I write it to model what's going on in a business or scientific process. (Really, isn't that what software's supposed to do? ;P) I think about what's going on, and I try to model it in the software, taking into account all the factors that may influence the process (or at least everything that reasonably occurs to me). This typically results in a reliable system that very rarely breaks down. When a breakdown does occur, it's always been the result of something that I hadn't anticipated. I then go back to the software, re-evaluate the model to see if there's something I need to change, and change it. The result is software that's easy to maintain, (virtually) free of side effects, and largely free of undesired behaviour.

Monday, July 05, 2010

Working with CSV files in Excel the way you want ... rather than the way Microsoft tells you to by default

In both my current job and my past jobs, I've worked with CSV files in Microsoft Excel as a matter of necessity. Excel is just too useful not to use. (OpenOffice has its own quirks, but that's for another post.) The only problem with Excel is that it formats fields automatically when you open a CSV file by double-clicking on it or by going through the File -> Open option (or Ribbon -> Open in Office 2007 and later). Sometimes this is nice, but most times it's a pain in the ass, especially when you want to be able to save that CSV data right back out to CSV again. What happens is that Excel inspects the data in the cells and converts the values to whatever data types it guesses they are. This is annoying and stupid, especially when you have financial, scientific and engineering data in a format that doesn't fit well into Microsoft's algorithm for dealing with numbers, dates, times, and currencies.

There is a way around this giant annoyance. Instead of taking the easy way of opening the file, you can open Excel directly, with a fresh spreadsheet. Then, on the ribbon, go to the Data tab, and click on the 'From Text' button. This lets you open a delimited text file and treat it as a data source. Excel will then give you options for how to open the file and how to deal with its contents, including the data format of each column (setting a column to Text keeps its values exactly as they appear in the file). This is a much better way of dealing with the data, especially with tab-delimited files.
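
To show why keeping every field as text matters, here's a minimal Python sketch. It is not Excel's actual algorithm. `naive_guess` is a hypothetical stand-in for any "guess the type" step, `round_trip` is a made-up helper, and the sample data is invented for illustration. The idea is simply that once a value like 00123 or 1E5 has been turned into a number, the original text is gone, whereas fields left as plain strings can be written back out unchanged.

```python
import csv
import io


def naive_guess(value):
    """Hypothetical type guesser: try int, then float, else keep the text."""
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            pass
    return value


def round_trip(csv_text):
    """Read CSV text and write it back out, leaving every field as a string."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Invented sample: an ID with leading zeros and a part code that looks
# like scientific notation.
sample = "id,code\n00123,1E5\n"

# Guessing types loses the original text of each field.
print(repr(naive_guess("00123")))  # leading zeros are dropped
print(repr(naive_guess("1E5")))    # the code is read as a float

# Treating everything as text preserves the file exactly.
print(round_trip(sample) == sample)
```

Python's `csv` module never converts types on its own, so the round trip here is lossless for this sample. That is the same effect you get in Excel by importing the columns as Text.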