Introduction
I must admit I hadn’t noticed this behavior of the GetFiles() method until now. It doesn’t come up often, but it can happen, and it’s dangerous.
As this post and the MSDN library itself state, when you call the GetFiles() method with a search pattern that contains the asterisk wildcard and a three-character extension (like *.xml or *.jpg), the method will return any file whose extension STARTS with the one you provided. That means that a search for *.jpg will return files with extensions such as .jpg, .jpg2, .jpgfileformat, etc.
This is quite a weird behavior (and not too elegant, I should say), introduced to support the 8.3 file name format. As stated in the above-mentioned post:
“A file with the name “alongfilename.longextension” has an equivalent 8.3 filename of “along~1.lon”. If we filter the extensions “.lon”, then the above 8.3 filename will be a match.”
That’s why the GetFiles() method behaves this way. The official MSDN explanation:
Note |
---|
When using the asterisk wildcard character in a searchPattern (for example, "*.txt"), the matching behavior varies depending on the length of the specified file extension. A searchPattern with a file extension of exactly three characters returns files with an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern. A searchPattern with a file extension of one, two, or more than three characters returns only files with extensions of exactly that length that match the file extension specified in the searchPattern. When using the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files in a directory, "file1.txt" and "file1.txtother", a search pattern of "file?.txt" returns only the first file, while a search pattern of "file*.txt" returns both files. |
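To see the note in action, here is a minimal sketch that recreates the two files from the MSDN example in a temporary folder and lists what GetFiles() returns for each pattern. The class name WildcardDemo and the helper Match are made up for this illustration. As far as I know, the "three characters" rule is specific to the .NET Framework on Windows, and newer runtimes may return only file1.txt for both patterns, so I am not showing an expected output here.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical demo class, not part of the original post.
public static class WildcardDemo
{
    // Creates the two files from the MSDN note in a fresh temporary folder
    // and returns the file names that GetFiles() reports for the given pattern.
    public static string[] Match(string searchPattern)
    {
        string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        DirectoryInfo dinfo = Directory.CreateDirectory(folder);
        try
        {
            File.WriteAllText(Path.Combine(folder, "file1.txt"), "");
            File.WriteAllText(Path.Combine(folder, "file1.txtother"), "");

            return dinfo.GetFiles(searchPattern)
                        .Select(f => f.Name)
                        .OrderBy(n => n)
                        .ToArray();
        }
        finally
        {
            dinfo.Delete(true); // remove the temporary folder and its files
        }
    }

    public static void Main()
    {
        // The result of the second line depends on the runtime and file system.
        Console.WriteLine("file?.txt -> " + string.Join(", ", Match("file?.txt")));
        Console.WriteLine("file*.txt -> " + string.Join(", ", Match("file*.txt")));
    }
}
```

Running it on the machine where your application is deployed is a quick way to check whether you are affected at all.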
In my case, I had a bug in my software because I had temporarily renamed an XML file to xxx.XML2222, just to take it out of the application. The program was still reading it, which made it misbehave.
A workaround for this issue
If you want to prevent this behavior, you will need to check the returned array of FileInfo objects manually and remove those that do not match your pattern. An elegant way to do so is to write an extension method for the DirectoryInfo class, like the following one:
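using System;
using System.Collections.Generic;
using System.IO;

// Extension methods must live in a static class; the class name here is just an example.
public static class DirectoryInfoExtensions
{
    /// <summary>
    /// Returns the files that match the search wildcard, with an exact (case-insensitive) match for the extension.
    /// </summary>
    /// <param name="dinfo">Directory to search in</param>
    /// <param name="pSearchWildcard">Search wildcard, in the format: *.xml or file?.dat</param>
    /// <returns>Array of FileInfo objects</returns>
    public static FileInfo[] GetFilesByExactMatchExtension(this DirectoryInfo dinfo, string pSearchWildcard)
    {
        FileInfo[] files = dinfo.GetFiles(pSearchWildcard);
        if (files.Length == 0)
            return files;

        // Extension of the pattern, including the leading dot (for example ".xml").
        string extensionSearch = Path.GetExtension(pSearchWildcard);
        List<FileInfo> filtered = new List<FileInfo>();
        foreach (FileInfo finfo in files)
        {
            // Keep only the files whose extension is exactly the requested one.
            if (string.Equals(finfo.Extension, extensionSearch, StringComparison.OrdinalIgnoreCase))
                filtered.Add(finfo);
        }
        return filtered.ToArray();
    }
}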
This way, right next to the regular GetFiles() method of the DirectoryInfo class, you will now find the brand new GetFilesByExactMatchExtension(), which has the desired behavior.
Note: To use this method in a class, just as with any other extension method, you will need to add a “using” directive for the namespace that contains it.
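As a rough sketch of how the pieces fit together, the sample below puts the extension method in its own namespace and calls it from another one through a “using” directive. The namespaces MyApp.Extensions and MyApp, the class Demo and the file names are assumptions made up for this example, and the method body is a compact LINQ restatement of the one above so that the sample compiles on its own.

```csharp
using System;
using System.IO;
using System.Linq;
using MyApp.Extensions; // hypothetical namespace; brings the extension method into scope

namespace MyApp.Extensions
{
    public static class DirectoryInfoExtensions
    {
        // Compact restatement of the method above: keep only exact extension matches.
        public static FileInfo[] GetFilesByExactMatchExtension(this DirectoryInfo dinfo, string pSearchWildcard)
        {
            string extension = Path.GetExtension(pSearchWildcard);
            return dinfo.GetFiles(pSearchWildcard)
                        .Where(f => string.Equals(f.Extension, extension, StringComparison.OrdinalIgnoreCase))
                        .ToArray();
        }
    }
}

namespace MyApp
{
    public static class Demo
    {
        // Creates two sample files in a fresh temporary folder and returns
        // the names found by the exact-match search for "*.xml".
        public static string[] FindXmlFiles()
        {
            string folder = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
            DirectoryInfo dinfo = Directory.CreateDirectory(folder);
            try
            {
                File.WriteAllText(Path.Combine(folder, "settings.xml"), "<root/>");
                File.WriteAllText(Path.Combine(folder, "settings.XML2222"), "<root/>");

                return dinfo.GetFilesByExactMatchExtension("*.xml")
                            .Select(f => f.Name)
                            .ToArray();
            }
            finally
            {
                dinfo.Delete(true); // remove the temporary folder and its files
            }
        }

        public static void Main()
        {
            foreach (string name in FindXmlFiles())
                Console.WriteLine(name);
        }
    }
}
```

The renamed settings.XML2222 file is the same situation that caused my bug, and with the exact-match method it should no longer be picked up.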
Hope it helps!