Category Archives: Code

n-1

Given the nth item, grab the n-1 item in a list.

Simple, right? You essentially create a stack of size 1 to hold the previous element.

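public T GetPrevious<T>(IOrderedEnumerable<T> list, T target)
{
    // Holds the element seen just before the current one (default(T) if there is none).
    T previous = default(T);
    var comparer = EqualityComparer<T>.Default;

    foreach (var item in list)
    {
        // An unconstrained T has no == operator, so compare through the default comparer.
        if (comparer.Equals(item, target))
        {
            break;
        }

        previous = item;
    }

    return previous;
}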

As usual, LINQ provides a more condensed and elegant solution:

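// orderBy selects the sort key, e.g. i => i.Updated
public T GetPrevious<T, TKey>(IEnumerable<T> list, T target, Func<T, TKey> orderBy)
{
    var comparer = EqualityComparer<T>.Default;

    return list
        .OrderByDescending(orderBy)                   // newest first, so "previous" comes after the target
        .SkipWhile(i => !comparer.Equals(i, target))  // drop everything newer than the target
        .Skip(1)                                      // drop the target itself
        .FirstOrDefault();                            // the element just before it, or default(T)
}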

Getting the (n-i)th element follows the same logic: skip i elements instead of one.  This is actually the first time I’ve used SkipWhile! You can use TakeWhile to do something similar, coming from the other end of the list.
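As a rough sketch of that TakeWhile variant (the sample list and variable names here are made up for illustration, and the list is assumed to be in order already): take everything before the target, then keep the last of those.

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, already in ascending order.
var items = new List<string> { "a", "b", "c", "d" };

// Everything before "c" is { "a", "b" }; the last of those is the previous element.
var previous = items.TakeWhile(i => i != "c").LastOrDefault();

// The first element has nothing before it, so the result is null.
var beforeFirst = items.TakeWhile(i => i != "a").LastOrDefault();

Console.WriteLine(previous);            // b
Console.WriteLine(beforeFirst == null); // True

Note that if the target is missing altogether, TakeWhile never stops and this version hands back the last element of the list, so check for that case if it matters.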

IndexOf

The IList<T> interface has an IndexOf method (and a corresponding Item property). The IEnumerable<T> interface does not have an equivalent IndexOf method (though it does have an extension method ElementAt<T> that mirrors the Item property).

Naturally, you can add your own extension method:

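public static int IndexOf<TSource>(this IEnumerable<TSource> source, TSource t)
{
    // An unconstrained TSource has no != operator, so compare through the default comparer.
    var comparer = EqualityComparer<TSource>.Default;

    // Count the elements that come before the first match.
    var index = source.TakeWhile(s => !comparer.Equals(s, t)).Count();

    // If nothing matched, every element was counted.
    return index == source.Count() ? -1 : index;
}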

Deferred Execution

Or, not putting things back where they belong.

I am a big fan of using LINQ, but it’s not without its gotchas.  I recently wrote some code that was behaving unexpectedly, and it took me a bit to figure out what was going on, even though it was pretty simple.  (Sometimes it’s the simple things that we forget.)

So, let’s bake a cake.  Here’s a snippet.

. . .

var unusedDry = ingredients.Where(i => i.IsDry() && !i.Used);

foreach (var ingredient in unusedDry)
{
    this.Measure(ingredient);
    this.Combine(ingredient);

    ingredient.Used = true;

    this.ReturnToCupboard(ingredient);
}

this.Mix();

. . .

But let’s be honest.  No matter how well your parents taught you, no one actually puts things away immediately after using them.  Usually we do this all in one go at the end.  That’s a simple enough change:

var unusedDry = ingredients.Where(i => i.IsDry() && !i.Used);

foreach (var ingredient in unusedDry)
{
    this.Measure(ingredient);
    this.Combine(ingredient);

    ingredient.Used = true;
}

this.Mix();

. . .

await Task.WhenAll(unusedDry.Select(i => i.ReturnToCupboardAsync()));

Sweet!

Or is it?  Turns out that none of the unused dry ingredients got put back where they belong.

When you write LINQ, it feels so natural that it’s easy to forget that your variable doesn’t actually contain the results of your query; it holds a data structure that describes the query itself, which is executed later, each time it is enumerated.  In this case, Task.WhenAll() enumerates the query that’s passed in, namely ingredients.Where(…).Select(…).  Since the Used property has already been set on each of the unused dry ingredients by this point, the query returns nothing, and Task.WhenAll() is given an empty IEnumerable<Task>.

At times like these, you can use the ToArray() or ToList() methods to force the evaluation of the query and to cache the results.
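Here is a minimal sketch of the difference.  It is not the cake code above: a HashSet named used stands in for the Used property, and the ingredient names are made up.

using System;
using System.Collections.Generic;
using System.Linq;

var ingredients = new[] { "flour", "sugar", "salt" };
var used = new HashSet<string>(); // stand-in for the Used flag

// Deferred: this only describes the query; nothing has run yet.
var deferred = ingredients.Where(i => !used.Contains(i));

// Cached: ToList() runs the query now and stores the three results.
var cached = ingredients.Where(i => !used.Contains(i)).ToList();

foreach (var ingredient in cached)
{
    used.Add(ingredient); // mark as used
}

// The deferred query runs only now, after everything has been marked as used.
Console.WriteLine(deferred.Count()); // 0
Console.WriteLine(cached.Count);     // 3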

For more, see Classification of Standard Query Operators.

Implicit Typing: To var or not to var?

Since the beginning of time, wars have been fought over differences in religion and beliefs.  I think this is one such war.  C# 3.0 introduced the concept of implicit typing back in 2007 with the keyword var.  Since then, you are bound to find people vehemently opposed to (and disgusted by) the concept, and others that embrace it to the point of no return.  I’m going to try to take an impartial look at implicit typing here (but you may conclude that I have failed).

First off, the facts.  MSDN’s “Implicitly Typed Local Variables” has a very concise explanation, but let’s try to distill it even further.

  • Local variables (and only local variables) can be declared with an inferred “type” of var instead of an explicit type.
  • The var keyword tells the compiler to infer the type of the variable based on the right-hand side of the initialization statement.  (i.e. you must be in a statement that both declares and initializes the local variable to use var; there is no: var x;)
  • In many cases, the use of var is optional and is just a syntactic convenience.
  • var is not optional when the variable is initialized with an anonymous type.
  • var may be useful with query expressions where the exact constructed type of the query variable is difficult to determine.
  • var can be useful when the specific type of the variable is tedious to type on the keyboard, or is obvious, or does not add to the readability of the code.
  • var does have the potential to make your code more difficult to understand for other developers.
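A small sketch of those rules in action (the variable names and values are made up for illustration):

using System;
using System.Collections.Generic;
using System.Linq;

var count = 5;                                    // inferred as int
var name = "cake";                                // inferred as string
var lookup = new Dictionary<string, List<int>>(); // the type is obvious from the right-hand side
var point = new { X = 1, Y = 2 };                 // anonymous type: var is the only option
var evens = new[] { 1, 2, 3, 4 }.Where(n => n % 2 == 0); // IEnumerable<int>

// var x;        // does not compile: an implicitly typed local must be initialized
// var y = null; // does not compile: no type can be inferred from null

Console.WriteLine(count.GetType());   // System.Int32
Console.WriteLine(name.GetType());    // System.String
Console.WriteLine(point.X + point.Y); // 3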

When a member of my team, CS, started using the keyword var, I despised it.  Though the concept is so simple, it’s actually a fairly large paradigm shift.  I thought it made code less readable, and also broke my IDE’s autocompletion feature.  At first he suggested that we only use it for LINQ queries, initializations with constructors, and casting statements.  Trying to keep an open mind, I gave it a shot, and it didn’t take long before I found out that I actually liked it.  Code was looking nicer, cleaner, and simpler.  Talk about heading over to the dark side!

Dictionary<string, List<Item>> dictionary = new Dictionary<string, List<Item>>();

    vs.

var dictionary = new Dictionary<string, List<Item>>();

Which looks better?  (In case it’s not obvious, I think the latter is leaps and bounds better.)  Or:

DateTime dateTime = (DateTime)eventArgs;

    vs.

var dateTime = (DateTime)eventArgs;

At this point though, CS had moved on to using it practically everywhere (except for value types)!  What the heck!?  It is a slippery slope.  I acknowledge that this is probably not as obvious:

foreach (var kvp in this.dictionary) { … }

    vs.

foreach (KeyValuePair<string, List<Item>> kvp in this.dictionary) { … }

… but it’s actually way more readable.  Anyway, you can probably tell that I slid down that slippery slope.  Now I, too, use it when I can (except for value types, though I would not be surprised if I soon start using it literally everywhere).  There has been backlash against this usage, though.  Why?

By far the largest argument against the usage of var that I’ve seen is that it’s not always clear from looking at the declaration of the variable what the type really is (though this is certainly not true in the previous two examples):

Con: I can’t tell what the type is.
Pro: If you use VS and hover over the variable, it tells you what the type is.
Con: I don’t use VS.
Pro: …
Con: I can’t tell what the type is in Notepad.
Pro: O__o’

Actually I don’t use VS either (I use a combination of SourceInsight and vim, neither of which grok var), yet I like var a lot.  Why?  At the end of the day, I think that argument comes down to the naming of local variables.  When you look at a variable being used in code, you are usually looking at its usage instead of its declaration.  I find that naming the variable with a descriptive name often helps much more than the declaration statement far above, because you don’t need to search for the declaration at all!  For instance:

stringBuilder.Append("foo");

    vs.

s.Append("foo");

What the heck is “s”?  (Yes, sometimes I will shorten it to “sb” too.)  It has been brought up many times before that the benefit of “better naming for local variables” “really means it compels developers to use longer Hungarian style variable names.” –Carnage4Life.  Personally I think that is a good thing to follow in general; I don’t find it adding to the “noise” in the code.

One other benefit of var that I don’t see touted too often is that when you change a type, property, or return type of a method, you don’t have to go change every declaration that uses that type as well.  Take, for instance:

var fraction = new { Numerator = 9, Divisor = 11 };
var value = fraction.Numerator / fraction.Divisor;

The second variable, value, would be implicitly typed to be an ‘int’ in this scenario.  But if I were to change ‘Numerator’ to 9.0 instead of 9, value would be typed as a double (9.0 is a double literal in C#), without me having to change that line.  Obviously this is somewhat of a contrived scenario, but one could easily imagine changing a property in a class (or a return type of a method) and having all the callers of that property “just work.”
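To see the inferred types for yourself, here is a quick sketch (the variable names are mine, chosen so both versions can sit side by side):

using System;

var whole = new { Numerator = 9, Divisor = 11 };
var wholeValue = whole.Numerator / whole.Divisor; // int / int: integer division

var real = new { Numerator = 9.0, Divisor = 11 };
var realValue = real.Numerator / real.Divisor;    // double / int: the int is promoted

Console.WriteLine(wholeValue.GetType()); // System.Int32
Console.WriteLine(realValue.GetType());  // System.Double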

I can’t imagine that the debate to var or not to var will ever end, but I for one, love it.  After having used it dutifully for several months, I cringe at the thought of typing out types—it’s so 20th century!  I’m completely hooked.

Let’s end off with a comment thread from ReSharper:

C:  Variable type declaration is not equal to variable instantiation.

P:  If this were true, the compiler would be unable to infer the type.

C:  I’ve fought far to many battles in classic ASP and vbscript where the type I thought was going into the variable wasn’t the type I ended up with.

P:  Comparing a dynamic language with a strongly-typed one isn’t a great way to develop best practices.  Yet again, I have to ask why people are scrolling to the point of variable declaration to determine the type, rather than just hovering over the variable.

C: That’s supported in Visual Studio only. What if you’re reading code on paper, on a web page, pdf or plain text editor?

P:  If that’s a primary design concern, I worry about your development process.

C:  I guess you’ve never heard of code reviews.

P:  Sure, we do them too, just not on stone tablets.

It’s certainly interesting how something so seemingly trivial could incite such passionate debates among people.

OEMCP vs. ACP

One of the trepidations I had when starting to work on client software three years ago was dealing with globalization and localization.  I had heard horror stories about how much busy work localization involved (they have teams that deal with that issue alone), and it was a black box for me.  (Yes, I realize that they’re just strings in resource files.)  Fortunately enough, since I have been working on a client platform, instead of the UX on top of the platform, I never had to deal with localization at all.  In fact, I was pleasantly surprised when I first heard that sync’ing files like 好嗎.txt actually worked.  Effortless!

My naïveté lasted up until a month ago, when I investigated a bug report from a customer.  I sat on the bug for a while, as I was a little perplexed by the whole report.  It was most certainly some sort of localization issue, and it took some head-banging before I figured it out.  The details aren’t particularly important here, so let me just get to the gist of the bug.

At the command prompt: (Yes, the command prompt.  Does that still surprise you?  Seven of my 10 previous “code” posts have to do with the command prompt.)

C:\>echo Comment ça va?
Comment ça va?

I took French in high school.  That’s a cedilla under the ‘c’.  Anyway, there’s nothing unexpected here.  Echo does its job as asked.  Now for some redirection:

C:\>echo Comment ça va? > salut.txt

C:\>type salut.txt
Comment ça va?

Again, nothing unexpected here.  Yawn.

But wait!  (There’s more!)

C:\>notepad salut.txt

Notepad will open up.  And what does it show?

Comment ‡a va?

What the heck?!  Blink.  Twice.  Huh?  What’s going on?  I did a bunch of digging and found all sorts of interesting information online about code pages.  From Wikipedia:

Code page is the traditional IBM term used to map a specific set of characters to numerical code point values.  …  [T]he term is most commonly associated with the IBM PC code pages. Microsoft, a maker of PC operating systems, refers to these code pages as OEM code pages, and supplements them with its own "ANSI" code pages.

It turns out that on an EN-US operating system, the OEM code page is CP437, whereas the default ANSI code page (ACP) is CP1252.  Apparently the command prompt uses a different code page than GUI programs do, so this must be an encoding issue.  If you look at the file in a hex editor, you’ll see that the character ‘ç’ is encoded as 0x87, as expected from the OEMCP: 87 = U+00E7 : LATIN SMALL LETTER C WITH CEDILLA.  But in the ACP, 0x87 is ‘‡’: 87 = U+2021 : DOUBLE DAGGER!
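You can reproduce the mismatch without the command prompt by decoding the same byte with both code pages.  This is only a sketch: it assumes a runtime where CodePagesEncodingProvider is available (on .NET Core and later, code pages 437 and 1252 have to be registered this way before GetEncoding will find them).

using System;
using System.Text;

// Make the legacy Windows/DOS code pages available to Encoding.GetEncoding.
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

byte[] bytes = { 0x87 }; // the byte the command prompt wrote for the cedilla

string oem = Encoding.GetEncoding(437).GetString(bytes);   // OEM code page
string ansi = Encoding.GetEncoding(1252).GetString(bytes); // ANSI code page

// Print code points rather than glyphs so the console's own encoding can't interfere.
Console.WriteLine(((int)oem[0]).ToString("X4"));  // 00E7 (c with cedilla)
Console.WriteLine(((int)ansi[0]).ToString("X4")); // 2021 (double dagger)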

So does that mean that the issue goes away if you change the code page the command prompt is using to the one UI apps use?

C:\>chcp /?
Displays or sets the active code page number.

CHCP [nnn]

  nnn   Specifies a code page number.

Type CHCP without a parameter to display the active code page number.

C:\>chcp
Active code page: 437

C:\>chcp 1252
Active code page: 1252

C:\>echo Comment ça va? > salut.txt

C:\>notepad salut.txt

Comment ça va? (in Notepad)

Check that out!  Ah, oui!  Bien sûr!

Raymond Chen has a bit more on the details and the historical reasons behind the schism.  Thanks to Michael Kaplan for a series of excellent posts on the subject.  I still know next to nothing about localization, but it’s pretty cool when you figure things out, even if those things were discussed ad nauseam years before you even came across it.  (And just to wrap this up, the Win32 function GetACP() will allow you to get the current ANSI code page, but that isn’t scriptable unless you wrap it in an exe.)

I’m floating the idea of doing a “Bugs of Live Mesh/Framework” series.  (Bonus points for anyone that can figure out how this bug relates to Live Mesh.  Hint: It’s related to a previous post I made here over a year ago.)  I was thinking of covering some of the more ‘interesting’ bugs: why they exist, what fixes were/will be made, what workarounds there are, or even to solicit feedback.  What say you?

Where is my Desktop?

[Image: Desktop Properties dialog]

Some people tend to save their documents and files on a separate drive.  This allows the flexibility of flattening the Windows/OS partition at any time, without losing any data.  This is one reason I have never been a fan of using the %USERPROFILE% folders (e.g. C:\Users\<UserName> on Vista or C:\Documents and Settings\<UserName> on XP).

It turns out, though, that you can redirect these ‘special’ shell folders easily on Vista, or with the Tweak UI PowerToy for XP (or hacking up the registry).  These folders include things like “My Pictures,” “My Music,” “Favorites,” or even your “Desktop” folder.  On Vista, for instance, when you right-click the Desktop folder and select Properties, you get the dialog on the right.

If you’ve done this, obviously it’s not good enough to assume that your desktop is in %USERPROFILE%\Desktop.  You’ve got to open up the registry to find the location of these “User Shell Folders”:

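REM Default location, used if the registry lookup finds nothing.
set OUTPUTDIR=%USERPROFILE%\Desktop

REM No spaces around "=": otherwise the space becomes part of the variable name.
set SHFOLDER_REGISTRY_KEY="HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders"
for /f "tokens=2*" %%i in (
    'REG QUERY %SHFOLDER_REGISTRY_KEY% /v Desktop'
) do (
    REM %%j holds the REG_EXPAND_SZ data; "call" expands the variables embedded in it.
    call set OUTPUTDIR=%%~j
)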

Notes:

  • We’re querying HKCU, so no elevation on Vista is required.
  • Most of the values underneath this key are of type REG_EXPAND_SZ.  This is why we need to ‘call’ to expand the value.
  • Bonus points if you can tell me why I start with token #2.

In some cases, you may also find that there is a “DesktopDirectory” value as well as a “Desktop” value underneath that key.  I cannot eloquently explain the difference, but perhaps you may be able to interpret Wikipedia’s explanation:

The "Desktop" virtual folder is not the same thing as the "Desktop" special folder. The Desktop virtual folder is the root of the Windows Shell namespace, which contains other virtual folders.

I believe that DesktopDirectory is the correct value to use, but sometimes it is not available.  Most of the time the two values are equivalent.
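If you happen to be in managed code rather than a batch script, the same two notions show up in the Environment.SpecialFolder enumeration, and the framework does the lookup for you.  This is a different technique from the registry query above, shown only as a sketch:

using System;

// The directory that physically stores the files shown on the desktop.
string desktopDirectory = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory);

// The logical desktop; on most machines this resolves to the same path.
string desktop = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);

Console.WriteLine(desktopDirectory);
Console.WriteLine(desktop);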

I bet you someone on the shell team could explain this properly, if he hasn’t already.  I’m sure there’s an interesting story behind it.

Extension Methods and LINQ

I know very little about the new features of C# 3.0 and LINQ.  I’ve been playing around with it a little bit and was puzzled by how it all works.  This is probably old news for a lot of people, but it’s all new to me.  For instance, let’s take the simplest possible query:

int[] numbers = { 4, 8, 15, 16, 23, 42 };
var query = from n in numbers where IsPrime(n) select n;

Trying to compile, I got this compiler error:

Error: Could not find an implementation of the query pattern for source type ‘int[]’.  ‘Where’ not found.  Are you missing a reference to ‘System.Core.dll’ or a using directive for ‘System.Linq’?

You might get this error if you’re missing some references (as the error message says).  I was getting it due to some other external issue.  But in trying to fix/investigate the issue, I first tried to figure out what exactly the compiler was doing behind the scenes.  The method ‘Where’ isn’t available on IEnumerable or Array, so it seemed like a valid error.  What was I missing?

I couldn’t find any good information online as to how the compiler actually takes the LINQ syntax and generates code that resembles something more IL-ish.  There are plenty of sample queries online, but I couldn’t find any good site that really explains how it all works.  So after a bunch of digging I managed to piece together that the compiler actually translates the above query to something like this:

var query = numbers.Where(n => IsPrime(n));

Interesting.  That kind of explains where the error message is coming from.  But still, I couldn’t for the life of me figure out how the heck this worked.  There’s no ‘Where’ method on Array or IEnumerable<T>.  Where is this method coming from?

It turns out that the compiler further translates the query to something like this:

IEnumerable<int> query = Enumerable.Where<int>(numbers, delegate(int n) { return IsPrime(n); });

How the heck?!  Where did the Enumerable class come from?

Apparently there is this new concept of “extension methods” in C#.  You can add methods to existing types without creating a new derived type, or recompiling, or modifying the original type.  Wow!!  Far out!

System.Linq has a static class ‘Enumerable’ that does exactly this; it extends IEnumerable<T>.  (Its sibling class ‘Queryable’ does the same for IQueryable<T>, which is what you get from sources like databases, or by calling AsQueryable().)

public static IEnumerable<T> Where<T>(this IEnumerable<T> source, Func<T, bool> predicate);
public static IQueryable<T> Where<T>(this IQueryable<T> source, Expression<Func<T, bool>> predicate);

All you have to do is add the directive “using System.Linq;” to take advantage of this functionality.  Check out the rules on how to implement a custom extension method.
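For an in-memory array, the extension methods that the query syntax binds to live in System.Linq.Enumerable.  Here is a small sketch showing that the query syntax, the extension-method call, and the plain static call all do the same thing.  The IsPrime helper is my own stand-in, since the original isn’t shown:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: simple trial division.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; d * d <= n; d++)
    {
        if (n % d == 0) return false;
    }
    return true;
}

int[] numbers = { 4, 8, 15, 16, 23, 42 };

var viaQuery = (from n in numbers where IsPrime(n) select n).ToList();
var viaExtension = numbers.Where(n => IsPrime(n)).ToList();
var viaStatic = Enumerable.Where(numbers, n => IsPrime(n)).ToList();

Console.WriteLine(string.Join(", ", viaQuery));     // 23
Console.WriteLine(string.Join(", ", viaExtension)); // 23
Console.WriteLine(string.Join(", ", viaStatic));    // 23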

And now it all makes sense.  Very cool!

Date Time

I do many more things than batch scripts all day, and someday (soon) I’ll blog about some of that stuff, but for now, this will have to suffice.

I had several lines of batch code I created several years ago that created a unique filename from the current date/time with the help of two environment variables:

C:\> echo %DATE% %TIME%
Tue 04/08/2008 22:48:43.84

So the code I had:

set NOW=%DATE% %TIME: =0%
set NOW=%NOW::=-%
set NOW=%NOW:.=-%

set YEAR=%NOW:~10,4%
set MONTH=%NOW:~4,2%
set DAY=%NOW:~7,2%
set NOW=%YEAR%-%MONTH%-%DAY%_%NOW:~15%

gave a nice:

C:\> echo %NOW%
2008-04-08_22-48-43-84

which would allow you to sort files by filename in chronological order.  It worked quite well and I had used it for years.  Recently though, I started seeing some weird behaviour from some people.

C:\> echo %NOW%
Tue-8/-00_22-48-43-84

This left me scratching my head for a while, until I asked one of these people to run the following on their machine:

C:\> echo %DATE% %TIME%
04/08/2008 Tue 22:48:43:84

Gosh darn those locale settings!  It’s not enough to handle just this specific case: in Canada, for instance, dates are written DD/MM/YYYY as opposed to the US’s MM/DD/YYYY.  So I went online and found all sorts of strange ways to figure out the format of the date.  The best way I read about to get the locale was to crawl through the registry.  I wasn’t particularly motivated to do this, so I left it at that.  Fortunately this particular script got deprecated in favour of a new one I wrote that didn’t require date/time uniqueness.  Saved!
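For comparison, outside of batch the usual way around this is to format the timestamp with an explicit pattern and the invariant culture, so the machine’s locale never gets a say.  A sketch in C# (not what the original script did), using a fixed sample time so the result is predictable:

using System;
using System.Globalization;

// Fixed sample value matching the example above; a real script would use DateTime.Now.
var sample = new DateTime(2008, 4, 8, 22, 48, 43, 840);

// Explicit pattern plus invariant culture: the output no longer depends on locale settings.
string stamp = sample.ToString("yyyy-MM-dd_HH-mm-ss-ff", CultureInfo.InvariantCulture);

Console.WriteLine(stamp); // 2008-04-08_22-48-43-84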

Progressive Dots

I just found out about this (thanks to Live Search!) from a fellow Canadian’s blog entry "batch file snippets" and it’s got me super excited.

Haven’t you ever found it annoying that you can’t print to a line without a newline (carriage return/line feed) in a batch script?  (At least, I didn’t know how to.)  Well, now you can!

For instance:

echo Copying files .

for %%f in (A B C D) do (
    echo .
    xcopy %%f %DEST% /cqy >NUL 2>&1
)

would give you:

Copying files .
.
.
.
.

as opposed to a progressive:

Copying files .....

So what you actually want is:

set /p CRLF=Copying files .<NUL
for %%f in (A B C D) do (
    set /p CRLF=.<NUL
    xcopy %%f %DEST% /cqy >NUL 2>&1
)

Don’t tell me that’s not cool!  How does it work?  The ‘/p’ option given to set is asking for user input:

SET /P variable=[promptString]

The /P switch allows you to set the value of a variable to a line of input entered by the user.  Displays the specified promptString before reading the line of input.  The promptString can be empty.

The key is that it prompts for input *on the same line*!  And by redirecting the NUL device into it, you get an immediate return.  How absolutely clever.

I love it!

CabIt!

I hesitated about posting this because I was talking to Steve last week about someone that had a Ruby fetish, much like someone else seems to have picked up a Python fetish.  I then realized that one could make a case that I have a Batch script fetish, seeing as how I’ve made three posts with it.  I actually really like Perl, but those skills have gotten rusty as it’s not installed natively on a Windows machine.  That’s why I turn to batch scripts sometimes–those scripts will work on all boxes.

Something else that’s not natively installed in Windows is the ability to zip a file from a command line.  (Please correct me if I’m wrong.)  I presume there are legal issues with bundling this functionality with the OS.  Instead, Windows gives the ability to create CABinets.  I couldn’t find any good (quick/dirty) documentation on how to make a single CAB from a number of files, while preserving hierarchy (i.e. folders).  I was eventually forced to wade through the official SDK to figure out how to do this.

So here’s a simple script I "whipped" up through trial and error (it actually took a lot of time to figure out what options to give MakeCab.exe).  Give it a path and it will CAB up everything under that path into a file "Zipped.cab".  Admittedly, there are some ‘hacks’ here, as I think CABs were originally intended to zip up files to span across disks (like floppies).

 

REM -----------
REM * CAB IT! *
REM -----------
:CABIT

set OUTPUT=Zipped.cab

set DIRECTIVEFILE=%TEMP%\Schema.ddf
set TARGET=%1
set TEMPFILE=%TEMP%\TEMP

if not exist %TARGET% (
    echo %TARGET% does not exist.
    goto :EOF
)

pushd %TARGET%

echo. > %DIRECTIVEFILE%
echo .set CabinetNameTemplate=%OUTPUT% >> %DIRECTIVEFILE%
echo .set DiskDirectoryTemplate= >> %DIRECTIVEFILE%
echo .set InfFileName=%TEMPFILE% >> %DIRECTIVEFILE%
echo .set RptFileName=%TEMPFILE% >> %DIRECTIVEFILE%
echo .set MaxDiskSize=0          >> %DIRECTIVEFILE%
echo .set CompressionType=LZX    >> %DIRECTIVEFILE%

call :CAB_DIR .

MakeCab /f %DIRECTIVEFILE%

del /f %DIRECTIVEFILE%
del /f %TEMPFILE%

popd

goto :EOF

REM CAB Helper
:CAB_DIR
echo .set DestinationDir=%1 >> %DIRECTIVEFILE%
for /f %%i in ('dir /b /a:-d %1') do (
    echo %1\%%i >> %DIRECTIVEFILE%
)
for /f %%i in ('dir /b /a:d %1') do (
    call :CAB_DIR %1\%%i
)
goto :EOF

 

To extract the cab file, you would use "Extract.exe", also found natively on Windows.  Note that if you view the Cab in Windows Explorer, you don’t actually see hierarchy like you do with zipped files.  That threw me off for a bit; weird behaviour.  Using "Extract.exe" would deflate the file into its hierarchy, or you can use Explorer to unzip specific files.

Notes:

  • I couldn’t figure out how to get MakeCab to stop generating the .inf and .rpt files either, so my workaround for that is kind of ugly.
  • MakeCab doesn’t like empty directories, so it’ll spit out errors.  But I think you can safely ignore them.

Comments welcome.