Porting Word2MediaWikiPlus to VB.NET: Part 6

[This series has six previous articles: the prologue, Part 1, Part 2, Part 3, Part 4 and Part 5.] 

My first round of debugging

Figuring that it would be wise to debug the code so far before moving on to the next module, I fired up F5 and let ‘er rip:

Type mismatch. (Exception from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))

************** Exception Text **************
System.Runtime.InteropServices.COMException (0x80020005): Type mismatch. (Exception from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))
   at Microsoft.Office.Core.CommandBarsClass.get_Item(Object Index)
   at Word2MediaWiki__.ThisAddIn.ThisAddIn_Startup(Object sender, EventArgs e) in C:\Documents and Settings\msmithlo\My Documents\personal\VS2005 Projects\Word2MediaWiki++\Word2MediaWiki++\ThisAddIn.vb:line 11
   at Microsoft.Office.Tools.AddIn.OnStartup()
   at Word2MediaWiki__.ThisAddIn.FinishInitialization() in C:\Documents and Settings\msmithlo\My Documents\personal\VS2005 Projects\Word2MediaWiki++\Word2MediaWiki++\ThisAddIn.Designer.vb:line 65
   at Microsoft.VisualStudio.Tools.Applications.Runtime.AppDomainManagerInternal.ExecutePhase(String methodName)
   at Microsoft.VisualStudio.Tools.Applications.Runtime.AppDomainManagerInternal.ExecuteCustomizationStartupCode()
   at Microsoft.VisualStudio.Tools.Applications.Runtime.AppDomainManagerInternal.ExecuteEntryPointsHelper()
   at Microsoft.VisualStudio.Tools.Applications.Runtime.AppDomainManagerInternal.Microsoft.VisualStudio.Tools.Applications.Runtime.IExecuteCustomization2.ExecuteEntryPoints()

************** Loaded Assemblies **************

That sure didn’t take long.  .NET is barfing on the following statement:

Dim MyControl As Microsoft.Office.Core.CommandBarControl = Application.CommandBars(W2MWPPBar).FindControl(Microsoft.Office.Core.MsoControlType.msoControlButton, Tag:="W2MWPP Convert")

I’m not entirely sure where the “type mismatch” is coming from, but given that this is a command directly ported from VBA, it’s probably safe to assume there’s a different/better approach in VB.NET.  [If I had to guess, however, I’d think that the MyControl can’t be equated to Application.CommandBars() when it’s just been declared As Microsoft.Office.Core.CommandBarControl – there’s something about casting Application.CommandBars to Office.CommandBars that tells me “Application” and “Office” just can’t be equated like that.  I really wish I knew exactly why, but someday I’m sure I’ll look back and wonder why I didn’t “get” it.]
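One clue in the stack trace above: the top frame is CommandBarsClass.get_Item(Object Index), so the exception is raised by the CommandBars(...) lookup itself, before FindControl runs and before anything is assigned to MyControl. That makes the value of W2MWPPBar worth checking first. Here is a minimal sketch of a more defensive lookup, assuming W2MWPPBar is meant to hold the toolbar's name as a String (its declaration isn't shown here). FindConvertButton is a hypothetical helper name, and the code needs a reference to the Office primary interop assembly, so it only runs inside an Office project.

```vbnet
Imports System
Imports Office = Microsoft.Office.Core

Public Module CommandBarLookup
    ' Hypothetical helper: returns the "W2MWPP Convert" button, or Nothing if
    ' either the toolbar or the button cannot be found.
    Public Function FindConvertButton(ByVal bars As Office.CommandBars, _
                                      ByVal barName As String) As Office.CommandBarButton
        ' Assumption: the toolbar is looked up by name. An empty or missing
        ' name is rejected up front instead of being handed to the indexer.
        If String.IsNullOrEmpty(barName) Then Return Nothing

        Dim bar As Office.CommandBar = Nothing
        Try
            ' The indexer takes an Object: a String name or an Integer position.
            bar = bars(barName)
        Catch ex As ArgumentException
            ' The interop layer typically surfaces a missing toolbar this way.
            Return Nothing
        End Try
        If bar Is Nothing Then Return Nothing

        ' FindControl returns the general CommandBarControl type, so narrow it
        ' explicitly. TryCast yields Nothing rather than throwing on a mismatch.
        Return TryCast(bar.FindControl(Office.MsoControlType.msoControlButton, _
                                       Tag:="W2MWPP Convert"), _
                       Office.CommandBarButton)
    End Function
End Module
```

From ThisAddIn_Startup this would be called as FindConvertButton(Application.CommandBars, W2MWPPBar); a result of Nothing means the bar or button still has to be created.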

Based on my experience with a previous Word VSTO add-in, I tried Office.CommandBarButton rather than CommandBarControl, and I think I’ll rework some of the logic to parallel the way that I’d made this same Bar + Button construction work in the past.

I’ve let this task sit for a few days while I debate with myself the merits of (a) making as few syntactic and functional changes to the code as humanly possible, (b) replacing whole swaths of code with stuff that I know “works” – even if I’m not convinced that the replacement code is any more “elegant” or performant than the stuff I’m replacing, or (c) bashing my head against a wall with a few attempts at [a] until I confirm that I won’t be able to figure it out easily, at which point I revert to [b].

I was originally inclined towards [a], so that I didn’t offend the original author or act like some impatient, arrogant twit who thinks he “knows better”.  However, the longer I stall on any piece of code where I understand the overall function but can’t tell which specific calls, variables or syntax are causing the incompatibility, the more I’d rather put out something that works than spend another six months trying to be as surgically precise as possible in my upgrade.

I guess the most important things are to document (a) what I’m replacing and (b) how I expect the replacement code to accomplish the same thing, and (c) to use as many of the same variable names as possible so that anyone familiar with the old code would have the best chance of understanding the new code as well.

So my thinking is, I’ll try to preserve the variable names and as much of the overall logic as possible, but I’ll replace any non-functioning code with whatever I know (or can find online) that works, so that this project doesn’t get permanently stalled (which is a serious risk with me).

Response to Redistribution Request for Wikifunctions.dll

The whole loosely organized nature of open source means that you might never get a straight answer to any question, but I did receive a response to my inquiry about whether it would be acceptable to redistribute the compiled binary of Wikifunctions.dll.  The response indicated,

“No problem if you include the binary you downloaded yourself. Don’t forget, it’s copyleft. Of course, you’ll have to honor GPL by mentioning that “this software includes parts from AWB developed by blah blah blah…”. Be advised though that WikiFunctions is designed primarily for AWB and thus has some limitations such as requirement for approval to be able to edit pages. Probably you will find something else more useful. For example, WikiAccess (with docs in Russian, hehe).” 

 

Join us again… well, you know the drill by now…

Porting Word2MediaWikiPlus to VB.NET: Part 4

[This series has four previous articles: the prologue, Part 1, Part 2 and Part 3.] 

Proposed Class Hierarchy

I’m trying to think through how this will be basically grouped and structured.  This is the basic class hierarchy I’m thinking of implementing:

  • W2MW++: functions to perform the basic setup & teardown, general functions
  • W2MW++.Format: functions for converting the formatting & layout
  • W2MW++.Image: functions for manipulating images embedded in the Word document
  • W2MW++.Text:
  • W2MW++.Table
  • W2MW++.Wiki
  • W2MW++.Wiki.Publish: functions for making the remote calls necessary to publish the translated document to the Wiki
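As a rough sketch, that grouping could map onto VB.NET like this. Since “W2MW++” isn't a legal identifier, the root namespace is assumed to be W2MWPP; the class names follow the bullets above, and every body is an empty placeholder rather than a design decision.

```vbnet
Namespace W2MWPP
    ' Basic setup and teardown, general functions.
    Public Module Core
    End Module

    ' Converting the formatting and layout.
    Public Class Format
    End Class

    ' Manipulating images embedded in the Word document.
    Public Class Image
    End Class

    Public Class Text
    End Class

    Public Class Table
    End Class

    Public Class Wiki
        ' Remote calls needed to publish the translated document to the wiki.
        Public Class Publish
        End Class
    End Class
End Namespace
```

Nesting Publish inside Wiki keeps the dotted name from the list (W2MWPP.Wiki.Publish); a sub-namespace would give the same name if Wiki ends up holding several classes.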

Aside: Searching SourceForge.net (D’oh!?)

It just occurred to me that I should probably dig through SF.net before diving into a headlong rush of code… it would really suck if I got halfway through, 40 hours into it and then found out that someone else had already developed a really robust Office add-in or set of libraries for MediaWiki upon which I could’ve built.  Although that would sure be in the spirit of the open source community… 😉

I went over to SF and searched on “wiki edit” and came up with very little, so I tried just “wiki” and came back with 60+ pages of results.  Filtering on Operating System = “All 32-bit Windows” reduces that list to 8 pages.  [It’s unfortunate that there’s no way to filter Programming Language = “{.NET languages}”, but so it goes in the land of SF – best I can do is re-run this three times, for C#, Visual Basic and Visual Basic.NET.]

Here’s some of the more interesting, potentially complementary codebases I’ve found so far:

  • WikiAccess Library: (C#, GPL)  Seems to have implemented a comprehensive set of methods and properties for accessing a MediaWiki-based site.  [This might just end up supplying the W2MW++.Wiki class I was thinking about.]
  • DotNetWikiBot Framework: (C#, MIT (X11) license) Seems to have a similarly-large set of methods & properties for accessing a MediaWiki-based site.
    • What’s even more coincidental is that the author (according to the embedded copyright) shares an almost-identical name (Vassiliev) with the author of WikiAccess Library (Vasiliev). 
    • I’m assuming it’s the same guy with two accounts, though I don’t know why he’d reimplement the same stuff over again.
    • However, I have to admit the CHM file documentation that’s included, as well as the much greater amount of activity, makes me feel very good about importing the library as a foundation for this app.  [If I understand these licensing arrangements, I don’t have to worry about license conflict if I don’t re-use any of the source code, but only place a dependency on the compiled DLL.]
  • Wiki Word Importer: (VB.NET, GPL) An empty project.  Next!
  • Wiki Editing Suite: (VB.NET, GPL) A populated project, with completely barren source files.  Next!
  • wikiTech: (VB.NET, APL) Yet another empty project.  Next!
  • excel2Wiki: (Visual Basic, GPL) Empty project.  What a surprise – Next!
  • Wiki2HTML: (Visual Basic, LGPL) A VBScript that converts Wiki-formatted content to common HTML.
  • AutoWikiBrowser: (C#, GPL) A very well-staffed project with an amazing amount of code activity.  Has a page on Wikipedia as well, which describes the WikiFunctions.dll API library that they encourage others to use in their standalone projects.  Perhaps they’d be okay with including it here…?

Fascinating, just fascinating.  This is a real decision point for this project then:

  • Do we take a dependency on one or more external libraries?  If so, which one(s)?
  • If so, then how do we negotiate/maintain redistribution rights?  [Presumably it’d be a pain in the arse to force our users to go download and decompress the DLL(s) necessary from other packages.]
  • If not, do we import source code from one of these projects?  If so, do we only focus on GPL code, so that there won’t be any conflicts with the Word2MediaWikiPlus basis?

AutoWikiBrowser’s WikiFunctions Assembly

I’m pretty much decided on taking the dependency on AutoWikiBrowser’s WikiFunctions.dll (as it appears at this time to have the most robust support), so now it’s just a question of whether to add any others:

  • The AutoWikiBrowser talk archives from October 2006 indicate that DotNetWikiBot “…is significantly more advanced/complicated than WikiFunctions.Editor”.  I’ll see how much editing functionality is needed and only take the dependency on DotNetWikiBot if there are major functions that it has that we need.

I’ve left a note on the AutoWikiBrowser Talk page to inquire about redistribution rights, and on the assumption that I’ll be able to include this (or at least, that I’ll be able to find some way to minimize the pain of separately downloading the DLL), I’ll start working on this add-in now.

 

Join us again next time — same Bat time, same Bat channel!

Porting Word2MediaWikiPlus to VB.NET: Part 3

[This series has three previous articles: the prologue, Part 1 and Part 2.]

Digging into modWord2MediaWikiPlus

This is the motherlode, right?  Here’s where all the action happens — all the reformatting, text extraction and Wiki-izing.  Yes, this monster VBA module probably has the majority of the code I’ll be porting over to VSTO.

So here’s a canonical list of the Functions and Sub’s contained therein — the best way for me to get to know this beast is to strip it down, piece by piece:

  • MW_CloseProgramm()
  • Word2MediaWikiPlus_Config()
  • Word2MediaWikiPlus_Upload()
  • MediaWikiConvert_CleanUp()
  • MediaWikiConvert_Comments()
  • MediaWikiConvert_EscapeChars()
  • MediaWikiConvert_Fields()
  • MediaWikiConvert_FontColors()
  • MediaWikiConvert_FootNotes()
  • MediaWikiConvert_FormFields()
  • MediaWikiConvert_Headings()
  • MediaWikiConvert_HTMLChars()
  • MediaWikiConvert_Hyperlinks()
  • MW_ImageInfoReset()*
  • MediaWikiExtract_Images()*
  • MediaWikiExtract_ImagesHtml()*
  • MediaWikiExtract_ImagesPhotoEditor()*
  • MediaWikiConvert_Indention()
  • MediaWikiConvert_IndentionTab()
  • MediaWikiConvert_Lists()
  • MediaWikiConvert_Paragraphs()
  • MediaWikiConvert_Prepare()
  • MediaWikiConvert_Tables()
  • MediaWikiConvert_TabTables()
  • MediaWikiConvert_TextFormat()
  • MediaWikiImageUpload()*
  • MediaWikiOpen()
  • MediaWikiReplaceQuotes()
  • MW_CheckFileName()
  • MW_CheckFileNameTitle()
  • MW_ClearFormatting()
  • MW_Convert_Table()
  • MW_ReplaceSpecialCharactersFirst()
  • MW_ReplaceCharacter()
  • MW_FindNormalWidth()
  • MW_FontFormat()
  • MW_FormatCategoryString()
  • MW_GetEditorPath()
  • MW_GetImageNameFromFile()*
  • MW_GetImagePath()*
  • MW_GetScaleIS()*
  • MW_GetUserLanguage()
  • MW_ImageExportPowerpointPNG()*
  • MW_ImageExtract()*
  • MW_ImageExtract2()*
  • MW_ImagePathName()*
  • MW_ImageUpload_File()*
  • MW_Initialize()
  • MW_InsertPageHeaders()
  • MW_LanguageTexts()
  • MW_PhotoEditor_Convert()*
  • MW_PowerpointQuit()
  • MW_ReplaceString()
  • MW_ScaleMax()*
  • MW_ScaleMaxOK()*
  • MW_SearchAddress()
  • MW_SetWikiAddressRoot()
  • MW_ChangeView()
  • MW_Statusbar()
  • MW_SurroundHeader()
  • MW_TableInfo()
  • MW_WordVersion()
  • GetRegValidate()
  • RemoveDir()
  • MediaWikiExtract_ImagesHtml2002()*
  • MakeDir()
  • TestSendMessage()
  • TestImageInfo()*
  • TestUnicode()
  • TestReadUnicode()
  • TestCopyDoc()
  • MW_SetOptions_2003()

My initial reaction

There’s a few things that stand out for me so far from this code module:

  1. Some functions are prefixed “MW_”, others “MediaWiki”, but only those with the “MediaWiki” prefix have the “copyright by Gunter Schmidt” notice.  I imagine the “MW_” functions are inherited from another codebase.  Just something to watch out for.
  2. There are references to Word 97 and Word 2000 in this code, but it occurs to me that VSTO probably doesn’t support anything less than Word XP or Word 2003.  I should check on that, and then I’ll know what code has to be cut out.
  3. There’s a ton of code that’s focused on migrating images from the source document to the target wiki page…including a bunch of SendKeys operations that I at first suspected couldn’t be easily implemented in .NET.  However, it looks like .NET implements this as the System.Windows.Forms.SendKeys class.
  4. I’m puzzled by the appearance of PowerPoint in the code — I wonder what it’s really being used for?
  5. There’s also a bunch of talk of using Microsoft Photo Editor (which seems to have died off at Office XP) — I wonder if .NET 2.0 has any image-manipulation classes that will relieve us of this dependency?  Photo Editor’s replacement (Picture Manager) doesn’t provide much in the way of editing functionality, and Microsoft’s other suggestion (Digital Image Pro) has recently been “de-hired” as well.
  6. There is a framework in this code for supporting a wide variety of languages, but in the worst case, it appears that only English and German are actually enabled.
  7. There are frequent instances of debug code in there, which I presume is just used as the equivalent to the built-in Breakpoint & exception handling in Visual Studio 2005.  I’ll likely drop it entirely, except where DEBUG or TRACE functionality would be useful.
  8. I’m also noticing a trend towards using the Registry to store a bunch of temporary settings that are only relevant to this Macro, not to Word or Windows in general.  I’ll likely convert this over to the XML Settings approach of .NET — there’s something that just seems “wrong” about using the Registry for such ephemeral data (except of course when there are no other options).
  9. I’m fascinated by the naming of all the objects in this code, and looking forward to making things a lot easier to understand.  All these cryptic variables, the prefixes that won’t be needed once the OO hierarchy is actually available to get the context for functions.  I’ll do my best to implement all the Microsoft guidelines for writing good .NET code – the more of others’ code I read, the more I appreciate those few who follow these guidelines.
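For the Registry-to-XML idea in item 8, the designer-generated My.Settings is probably the more natural route inside a Visual Studio project. The sketch below hand-rolls the same idea with XmlSerializer purely to stay self-contained. ConverterSettings, its two fields and SettingsStore are hypothetical names chosen for illustration, not settings taken from the macro.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical settings holder; the fields are placeholders.
Public Class ConverterSettings
    Public WikiAddressRoot As String = ""
    Public UserLanguage As String = "en"
End Class

Public Module SettingsStore
    ' Writes the settings object to an XML file, replacing any existing file.
    Public Sub Save(ByVal settings As ConverterSettings, ByVal fileName As String)
        Dim serializer As New XmlSerializer(GetType(ConverterSettings))
        Using writer As New StreamWriter(fileName)
            serializer.Serialize(writer, settings)
        End Using
    End Sub

    ' Reads the settings back, falling back to defaults if the file is missing.
    Public Function Load(ByVal fileName As String) As ConverterSettings
        If Not File.Exists(fileName) Then Return New ConverterSettings()
        Dim serializer As New XmlSerializer(GetType(ConverterSettings))
        Using reader As New StreamReader(fileName)
            Return DirectCast(serializer.Deserialize(reader), ConverterSettings)
        End Using
    End Function
End Module
```

A per-user file under Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) would be one reasonable place to keep it, which also makes the data trivial to delete on uninstall.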

Plan of attack

  1. I think I’m going to have to carve this down into releases, and not try to implement this all at once.
  2. I’m thinking that the image-manipulation code, while very sexy and useful, isn’t critical for the core use cases for a Word-based DOC-to-MediaWiki converter.  [All those functions are labelled with a *.]  These are probably for “release 3”.
  3. The table-manipulation code seems more important, but still it’s probably not essential.  That seems like a “release 2” kind of thing.

Right now I’m feeling like I have a pretty good handle on what’s in store for me, and the “itch” to stop planning and start coding is getting to be too much to resist.  I think I’ll make a list of the top ten things I’ll want to start on, and then just get down to it.

Join us again — same Bat time, same Bat channel!

Porting Word2MediaWikiPlus to VB.NET: Part 2

[This series has two previous articles: the prologue and Part 1.]

Initial source code examination

So I figured I’d get right into it.  I opened up Word 2003, went into the Tools > Macro > Macros… menu, and thinking there’d be some obvious Import function, hit the Organizer button.  No such luck, so I tried again via Tools > Macro > Visual Basic Editor, muddled around for a few minutes, and eventually just imported all the files via the File > Import File… menu option.

I don’t know much about how VBA multi-file applications are organized, but I figured that the one listed under the Class Modules folder would likely be the right one to start — Classes are the basic building block for all OO programs, right?

I had a quick look through the Sub’s defined in Class Modules > ThisDocument1 (which must correspond to the ThisDocument.cls file), and thought, “heh, this’ll be easy — there’s only a couple hundred lines of code and a handful of Sub’s”:

  • cmdCopyModuls_Click()
  • CreateSymbol()
  • CreateSymbol2()
  • CopyModulesToNormal()
  • cmdSymbols_Click()
  • cmdUninstall_Click()

Ah, but wait: there’s _Click() routines in them thar hills…which means there’s Form UI to deal with, and now that I look more closely, those .BAS files ended up as a list of entries under the Modules folder.

modW2MWP_FileDialog

Yipes!  There’s a few more complicated questions to deal with than I’d originally thought.  For example, modW2MWP_FileDialog contains at least some code with a copyright heading, which could make life a bit tougher:

'***************** Code Start **************
'This code was originally written by Ken Getz.
'It is not to be altered or distributed, except as part of an application.
'You are free to use it in any application, provided the copyright notice is left unchanged.
'
' Code courtesy of:
' Microsoft Access 95 How-To
' Ken Getz and Paul Litwin
' Waite Group Press, 1996

This may not be so bad though, as the Functions in this module referring to file operations (e.g. GetOpenFile(), ahtCommonFileOpenSave()) may be superfluous with the System.IO namespace (and the OpenFileDialog and SaveFileDialog classes in System.Windows.Forms) available in VSTO.  We might not need these file I/O functions, though TestIt(), ahtAddFilterItem(), TrimNull() and TrimTrailingNull() may be needed.

This brings up a really good point that I’ve seen mentioned in a couple of places, including John R. Durant’s blog from two years ago:

The real issue I see in migrating from VBA to VSTO is not the language or switching to a new runtime. […]  The more important factor is how the migration will affect the architecture of the application.  It is important to ask questions like: Does the new runtime make it possible for me to code this differently at a more fundamental level?  Can I cut lots of code? […]

This’ll be a fine line for me to walk: I’d like to make the conversion as quickly as possible, but it may not always be easy to figure out what the original code was meant to do.  It doesn’t help matters that many of the Comments in the source code are in German — as much as I’d like to think I can still understand German, it’s been sixteen years since I was in Germany and I’m more than a little rusty.  Hopefully the online German-to-English dictionaries will be able to sort out the gist of it.

modW2MWP_Registry

The module modW2MWP_Registry is almost entirely focused on Registry interactions (which are well represented by the Microsoft.Win32.Registry class) except for the Uninstall_Word2MediaWikiPlus() Sub.  However, any necessary code has a more appropriate home in the VSTO add-in’s ThisAddIn_Shutdown() Sub.
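For reference, a minimal sketch of the Microsoft.Win32 calls that would stand in for that kind of hand-written Registry code. The key path and value name are hypothetical and exist only for this demonstration; the real macro's keys aren't shown here.

```vbnet
Imports System
Imports Microsoft.Win32

Module RegistryDemo
    ' Hypothetical key path, for illustration only.
    Private Const SettingsKey As String = "Software\W2MWPP-Demo"

    Sub Main()
        ' CreateSubKey opens the key for writing, creating it if needed.
        Using key As RegistryKey = Registry.CurrentUser.CreateSubKey(SettingsKey)
            key.SetValue("WikiAddressRoot", "http://localhost/wiki/")
        End Using

        ' OpenSubKey returns Nothing when the key does not exist.
        Using key As RegistryKey = Registry.CurrentUser.OpenSubKey(SettingsKey)
            If key IsNot Nothing Then
                Console.WriteLine(CStr(key.GetValue("WikiAddressRoot", "")))
            End If
        End Using

        ' Clean up again, roughly what an uninstall routine would have to do.
        Registry.CurrentUser.DeleteSubKeyTree(SettingsKey)
    End Sub
End Module
```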

modWord2MediaWikiPlus

This “module” is a monster — it’s a wonder it hasn’t been broken down into a whole namespace of classes — or I suppose in the VBA world, the closest they have is “a set of Modules” [no hierarchy.]  And it’s the central code:

' Function: Converts a word document to the wiki syntax

Anyway, it has a great deal of functionality — I’ll dig into it later.  This’ll be the majority of the work here.

modWord2MediaWikiPlusGlobal

This module is much smaller than modWord2MediaWikiPlus, but has some interesting functionality as well.  After studying them for a bit, I think I’ve figured out what general category of purpose each of them has:

  • Filesystem functions: GetShortPath(), GetSpecialFolder(), DirExists(), DirExistsCreate(), FileExists(), GetFileName(), GetFilePath(), FormatPath(), KillToBin()
  • Process functions: IExplorer(), AppActivatePlus(), fGetCaption(), GetWindowTitleHandle()
  • Text functions: ReplaceStr(), RGB2HTML(), OleConvertColor(), AnonymizeText(), AnonymizeWords()

Oh, and a DisplayError() for good measure.
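Several of these helpers look as though they have direct framework equivalents. The sketch below pairs a few of them with .NET 2.0 calls; the pairings are guesses based on the function names alone, not on reading the VBA bodies, and the sample path is made up. It needs a reference to System.Drawing for the colour conversion.

```vbnet
Imports System
Imports System.Drawing
Imports System.IO

Module FrameworkEquivalents
    Sub Main()
        ' Made-up sample path under the user's temp folder.
        Dim docPath As String = Path.Combine(Path.GetTempPath(), "W2MWPP-demo\sample.doc")
        Dim folder As String = Path.GetDirectoryName(docPath)

        ' DirExists() / DirExistsCreate()
        If Not Directory.Exists(folder) Then Directory.CreateDirectory(folder)

        ' FileExists()
        Console.WriteLine(File.Exists(docPath))

        ' GetFileName() / GetFilePath()
        Console.WriteLine(Path.GetFileName(docPath))
        Console.WriteLine(folder)

        ' GetSpecialFolder()
        Console.WriteLine(Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData))

        ' OleConvertColor() / RGB2HTML(): an OLE colour value of 255 is pure red.
        Dim red As Color = ColorTranslator.FromOle(255)
        Console.WriteLine(ColorTranslator.ToHtml(red))

        ' Remove the demo folder again.
        Directory.Delete(folder)
    End Sub
End Module
```

If the pairings hold up once the VBA is read properly, most of the filesystem group could be deleted rather than ported.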

 

I think I’m ready to start digging through the code in modWord2MediaWikiPlus – wish me luck!

Join us again — same Bat time, same Bat channel!

Porting Word2MediaWikiPlus to VB.NET: Part 1

[This series of articles starts off with a prologue here.]

Initial Setup

First up, I’d downloaded the source code from here.  When I downloaded it the browser would only let me choose between htm, mht and txt format (I chose txt).  Then, figuring that my work would be easiest if I opened the file in Visual Studio 2005 Team edition (with the VSTO SE addition), I needed to figure out what file extension I should assign to ensure Visual Studio recognized it for the kind of code that it is (and get as much of the Visual Studio Intellisense auto-completion and auto-colouring as possible).

I tried .BAS and .VBA, but neither of those extensions had any associated icons, and when I opened the .VBA, every line was displayed entirely in black.

Then, in exploring the filesystem to re-open the file, I noticed I’d created a C#-based project [since my VS2005 install is oriented to C# – having forced myself to write an app in it to prove to myself I wasn’t forever tainted by my earlier VB.NET experiences] instead of VB.NET, so I deleted that one and tried again (File > New > Project > Other languages > Visual Basic > Office > 2003 Add-ins > Word Add-in).  I chose Word 2003 because (a) that’s what I’m currently using, (b) that’s the platform I have a little experience with, and (c) it’s what most folks around the world would be using – at least, much more so than Word 2007.  I chose to name the solution Word2MediaWiki++, to honour the fine work upon which I’m building.

[I’ve thought about whether this will become just one Project in a larger Solution, and I assume at some point there’ll be a need to rename either the Solution or this first Project to distinguish between this single piece of functionality and the overall Word-to-MediaWiki functionality.  It’s quite likely in fact, but I don’t want to overthink this or over-engineer the proceedings too badly – I have to remind myself that just getting the same functionality in VSTO add-in form is my primary goal.]

Quick Aside

I’ve done some reading about VBA to VSTO conversions in the past (while I was still at Microsoft), and I’d actually developed another Word 2003 add-in — from scratch — as a way to provide a platform for re-implementing a very comprehensive VBA macro-based application we were using.  So I already had some idea that this was possible, and that there are many resources (from Microsoft* and elsewhere) to help in making both the high-level transition and in converting many of the low-level VBA constructs.

A simple search on Google for “vba to vsto” comes up with 828,000 results, and many of them on the first page or two are good, first-person lessons.  However, the articles/white papers/blog posts I’d previously read that had given me the confidence I’m building on now include these:

Licensing

Oh, and before I actually go do any work on this code, I should double-check that the original author hasn’t put any restrictions on it (copyright, restrictive license, “all rights reserved”)…

… So according to the SourceForge home for W2MWP, this is licensed under the GPL.  It looks like I’m in the clear, so long as I publish this back out under GPL too – wouldn’t want to run afoul of the GPL police now would we?

 

Join us again – same Bat time, same Bat channel!

[*Footnote: My only gripe about Microsoft’s VSTO SE add-in work is that they’re so freakin’ focused on Office 2007 development that I have to do some serious detective work – either trolling through old VBA or pre-add-in documentation and hoping the classes are similar, or winding through the “everything is wonderful” Office 2007 documentation and hoping the stuff I need hasn’t been reinvented for the wondrous new world of Office 12.]

From VBA to VSTO: porting Word2MediaWikiPlus to VB.NET

I got religion about Wikis a while back, and recently I’ve had an incentive to look into the world of conversion applications.  I’m looking into the applications that are available to convert from one format (e.g. Office documents) to MediaWiki format (i.e. the engine behind the venerable Wikipedia).

I went wandering through the Wikipedia Tools pages and when I found the Office-oriented tools, I was surprised that many of them were implemented as VBA macros.  One in particular caught my eye: the one labeled Word2MediaWikiPlus.  It’s a single VBA macro, with a small number of functions, and it looks just ripe for creating a VSTO add-in.  Further, as compared to the other Word macros, it seems like it’s a superset of the others, and that the others preceded this and/or have since been abandoned.

I’m really getting into the community mindset, and I figure that not only will some of my colleagues appreciate having this kind of functionality, but that there’d be many folks on the ‘net who’d probably use something like this as well.  And once a basic CommandBar framework was put in place, and the source code available for re-use, then anyone who’d like to add their own functionality to a VSTO add-in for Office should be able to leverage the basic framework I could establish.

Sounds simple eh?  Just a little parsing through the object model, some digging through VBA-to-VB.NET conversions, and a little refresher on Office.CommandBar and its brethren – and voila!  One VSTO add-in that implements the same functionality as you see in the original VBA macro.  [Famous last words.]

No, I’m not quite that naive – I suspect it’ll take a good deal more than that before I’m through.  With that in mind, I’m considering a novel approach to this project: as I’m going through each stage of the conversion, I’ll blog about my experiences – the dead-end code paths I pursue, the inconsistencies in the Word object model, the under-documented features of the original source code, and all the places I find anything useful that keeps this moving towards completion.

Wish me luck, and I’ll keep you posted on my progress!

Troubleshooting RunOnce entries: Part One

I’ve been investigating the root cause for a critical issue affecting my CacheMyWork app (for those of you paying attention, it has come up in the past in this column). Ever since I received my (heavily-managed) corporate laptop at work, I’ve been unable to get Windows XP to launch any of the entries that CacheMyWork populates in RunOnce.

Here’s what I knew up front

  • On other Windows XP and Windows Vista systems, the same version of CacheMyWork will result in RunOnce entries that all launch at next logon
  • On the failing system, the entries are still being properly populated into the Registry – after running the app, I’ve checked and the entries under RunOnce are there as expected and appear to be well-formatted
  • The Event Log (System and Application) doesn’t log any records that refer even peripherally to RunOnce, let alone that there are any problems or what might be causing them
  • The entries haven’t disappeared as late as just before I perform a logoff (i.e. they’re not being deleted during my pre-reboot session).
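The “are the entries still there?” check is easy to script, which helps when repeating it before and after each logoff. A minimal sketch follows; it only reads the per-user and per-machine RunOnce keys and prints whatever it finds. RunOnceDump and DumpKey are hypothetical names.

```vbnet
Imports System
Imports Microsoft.Win32

Module RunOnceDump
    Private Const RunOncePath As String = "Software\Microsoft\Windows\CurrentVersion\RunOnce"

    ' Prints every value under the RunOnce key of the given hive.
    Sub DumpKey(ByVal hive As RegistryKey)
        Using key As RegistryKey = hive.OpenSubKey(RunOncePath)
            If key Is Nothing Then
                Console.WriteLine("{0}: key not present", hive.Name)
                Return
            End If
            For Each valueName As String In key.GetValueNames()
                Console.WriteLine("{0}\{1} = {2}", hive.Name, valueName, key.GetValue(valueName))
            Next
        End Using
    End Sub

    Sub Main()
        DumpKey(Registry.CurrentUser)
        DumpKey(Registry.LocalMachine)
    End Sub
End Module
```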

Here’s what I tried

UserEnv logging

  • I added HKLM\Software\Microsoft\Windows NT\CurrentVersion\Winlogon\UserEnvDebugLevel = 30002 (hex).
  • This is able to show that the processes I’m observing are firing up correctly, but there is nothing in the log that contains “runonce” or the names of the missing processes, and I haven’t spotted any entries in the log that point me to any problems with the RunOnce processing.

ProcMon boot-time logging

  • I’ve got over 3.3 million records to scan through, so while I haven’t found anything really damning, I may never be 100% sure there wasn’t something useful.
  • After a lot of analysis, I found a few interesting entries in the ProcMon logs:

Process | Request | Path | Data
mcshield.exe | RegQueryValue | HKLM\SOFTWARE\Network Associates\TVD\Shared Components\On Access Scanner\BehaviourBlocking\FileBlockRuleName_2 | Prevent Outlook from launching anything from the Temp folder
mcshield.exe | RegQueryValue | HKLM\SOFTWARE\Network Associates\TVD\Shared Components\On Access Scanner\BehaviourBlocking\FileBlockRuleName_10 | Prevent access to suspicious startup items (.exe)
mcshield.exe | RegQueryValue | HKLM\SOFTWARE\Network Associates\TVD\Shared Components\On Access Scanner\BehaviourBlocking\FileBlockWildcard_10 | **\startup\**\*.exe
BESClient.exe | RegOpenKey | HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnce | Query Value
Explorer.exe | RegEnumValue | HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnce | Index: 0, Length: 220
waatservice.exe | RegOpenKey | HKLM\Software\Microsoft\Windows\CurrentVersion\RunOnce | Desired Access: Maximum Allowed

Windows Auditing

I finally got the bright idea to put a SACL (audit entry) on the HKCU\…\RunOnce registry key (auditing any of the Successful “Full Control” access attempts for the Everyone special ID). After rebooting, I finally got a hit on the HKCU\…\RunOnce key:

Event Log data | First log entry | Second log entry | Third log entry
Category | Object Access | Object Access | Object Access
Event ID | 560 | 567 | 567
User | (my logon ID) | (my logon ID) | (my logon ID)

And here are the interesting bits of Description data for each:

Description field | First log entry | Second log entry | Third log entry
Title | Object Open: | Object Access Attempt: | Object Access Attempt:
Object Type | Key | Key | Key
Object Name | \REGISTRY\USER\S-1-5-21-725345543-602162358-527237240-793951\Software\Microsoft\Windows\CurrentVersion\RunOnce | [n/a] | [n/a]
Image File Name | C:\WINDOWS\explorer.exe | C:\WINDOWS\explorer.exe | C:\WINDOWS\explorer.exe
Accesses | DELETE, READ_CONTROL, WRITE_DAC, WRITE_OWNER, Query key value, Set key value, Create sub-key, Enumerate sub-keys, Notify about changes to keys, Create Link | [n/a] | [n/a]
Access Mask | [n/a] | Query key value | Set key value

Not that I’ve ever looked this deep into RunOnce behaviour (nor can I find any documentation to confirm), but this seems like the expected behaviour for Windows. Except for the fact that something is preventing the RunOnce commands from executing, of course.

Blocking the Mystery App?

Then I thought of something bizarre: maybe Explorer is checking for RunOnce entries to run during logon, and it isn’t finding any. Is it possible some process has deleted them during boot-up or logon, but before Explorer gets to them?

This flies in the face of my previous theory, that the entries were still there when Windows attempted to execute them, but something was blocking their execution. Now I wondered if the entries are even there to find – whether some earlier component hasn’t already deleted them (to “secure” the system).

If so, the only way to confirm my theory (and catch this component “in the act”) is if the component performs its actions on the Registry AFTER the LSA has initialized and is protecting the contents of the Registry. [It’s been too long since I read Inside Windows NT, so I don’t recall whether access to securable objects is by definition blocked until the LSA is up and ready.]

Hoping this would work, I enabled “Deny” permission for Everyone on the HKCU\…\RunOnce key for both “Set Value” and “Delete” (not knowing which one controls the deletion of Registry values in the key). This also meant that I had to enable Failure “Full Control” auditing for the Everyone group on this key as well.

However, while I’ve firmly confirmed that the Deletion takes place when I remove this Deny ACE, I can’t get Windows to log any information to indicate what process or driver is deleting the registry entries (and thus preventing Windows from executing them). It looks like – beyond what I’ve already found – there’s nothing else for which the LSA is asked to make any access control decisions for the HKCU\…\RunOnce key.
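On the earlier question of which permission governs the deletion of values: going by the Win32 documentation, RegDeleteValue needs the key opened with KEY_SET_VALUE (the “Set Value” permission), while the DELETE right covers removing the key itself. A small Python sketch of that mapping follows. Only the bit values come from winnt.h; the operation names and the denied_operations helper are hypothetical.

```python
KEY_SET_VALUE = 0x0002        # "Set Value" in the registry permissions dialog
KEY_CREATE_SUB_KEY = 0x0004   # "Create Subkey"
DELETE = 0x00010000           # "Delete"

# Right that must be granted on the key for each operation (per the Win32 docs).
REQUIRED_RIGHT = {
    "set value": KEY_SET_VALUE,
    "delete value": KEY_SET_VALUE,   # RegDeleteValue
    "create sub-key": KEY_CREATE_SUB_KEY,
    "delete key": DELETE,            # deleting the key itself
}


def denied_operations(deny_mask):
    """List the operations that a Deny ACE with this mask would block."""
    return sorted(op for op, right in REQUIRED_RIGHT.items() if deny_mask & right)


if __name__ == "__main__":
    # The Deny ACE described above: both "Set Value" and "Delete".
    print(denied_operations(KEY_SET_VALUE | DELETE))
```

If that reading is right, denying “Set Value” alone should be enough to protect the RunOnce values, and the “Delete” deny only matters for attempts to remove the key as a whole.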

“Run Away!”

That’s all for now – I’m beat and need to regroup. If anyone has any bright ideas on ways to dig deeper into this system and figure out what’s missing, I’d love to hear them.

To be continued…

Threat Modeling Google group is now available

I’ve been using Microsoft’s Threat Analysis and Modeling (TAM) tool for about a year now, and I’ve gotten to really love how much easier and user-friendly this tool is than anything else I’ve found so far on the ‘net.  I’ve tried to find anything that was as comprehensive, easy for beginners, flexible and extensible as TAM is (let alone free), and there’s nothing else that even comes close.  Anytime I’m asked now to do any Threat Modeling for a product or technology, the only tool I would seriously consider is TAM.

That said, the more I work with it, the more I find enhancements I’d like to make, or things I’d like to better understand:

  • What are the key steps that I should never skip?
  • What tools are useful for generating additional XSLT Report templates?
  • How does TAM merge overlapping content when importing Attack Libraries?
  • What extensibility classes are available for .NET-friendly developers to add to this tool?
  • What’s a reasonable number of Components or Attacks to include in any one threat model?

I’ve worked with the TAM team at Microsoft to get some ideas on this, but they’re pretty much working flat-out on the Security Assessments for which they built this tool in the first place.  I’ve scoured their old blog entries (here, here and here) to glean tidbits, but I’d really like to work with more folks who are also using this – to share what I’ve learned and get their input and ideas as well.

I’d hoped that Microsoft would have a Community forum for this great tool, but since they don’t, I’ve taken the bull by the horns and created one myself.  You can find it here on the Google Groups site.  Yes, Google.  Horrors!

I’ve tried to use MSN Spaces in the past as a collaboration workspace, but I’ve found Google Groups and Yahoo Groups are both better platforms for this sort of thing.  They give you more control, with less futzing around trying to make things “look right”, and they’re investing significant effort into these platforms.  Frankly, I’m a lazy guy at heart, and it was really freakin’ easy to set up the Google Group.  Sue me.

Call to Action: if you’re using Microsoft’s TAM tool already, or you know someone who’s responsible for things like “Secure Coding”, “Risk Assessments” or “Threat Modeling”, I’d encourage you (and them) to check out the Group, post some sample Files, start some Discussions or even just lurk for good ideas!

Debugging persistent Outlook crashing – can only go so far…

I’ve been experiencing a persistent crash in Outlook for months now – often Outlook will crash when I Send an email.  I suspect it’s related to the fact that in the main Outlook form (Mail/Inbox) the Reply, Reply All and Forward buttons and keyboard shortcuts are inactive.  [They work from the context menu of an individual message, or if I open a message and use the buttons from the Toolbar displayed above the message itself.  Yes, very weird.]  My guess is that it’s the result of an unclean uninstall of the Getting Things Done add-in for Outlook – which I used to like, but which has been supplanted by the combination of MindManager and ResultsManager (at least this week).

In any case, I’ve captured multiple .DMP files, but when I try to debug them I get very sketchy results.  I used to think it was because I don’t have access to symbols for the add-ins that I’ve installed for Outlook – which is normally where these kinds of crashes come from.  However, I’ve disabled all the add-ins that are listed in Outlook > Tools > Options > Other > Advanced Options > “Add-in Manager” & “COM Add-Ins”, and Outlook still crashes the same way.  The debugging results are still spotty too, which tells me I don’t even have symbols for Outlook itself (to map the function offsets that are listed in the dump), and I’ve been beating my head against a wall trying to figure out how to get access to them.

I’m almost positive I’ve got the Microsoft Internet Symbol Server configured correctly, and yet I continue to get errors like this:

*** ERROR: Symbol file could not be found.  Defaulted to export symbols for OUTLLIB.DLL –
OUTLLIB!OlkGetUIlangID+0xd434:

I discovered that you can debug the loading operations with the !sym noisy command.  Once I enabled this, I saw this in my .reload output:

0:000> .reload
…………………………………………………………………………………………………………………………………………………..
SYMSRV:  C:\symbols\OUTLLIB.DLL\4566283D749000\OUTLLIB.DLL not found
SYMSRV:  http://msdl.microsoft.com/download/symbols/OUTLLIB.DLL/4566283D749000/OUTLLIB.DLL not found
DBGENG:  C:\Program Files\Microsoft Office\OFFICE11\OUTLLIB.DLL – Mapped image memory
SYMSRV:  C:\symbols\outllib.pdb\0EAE667B6A73417A9D7DC2E4C81382232\outllib.pdb not found
SYMSRV:  http://msdl.microsoft.com/download/symbols/outllib.pdb/0EAE667B6A73417A9D7DC2E4C81382232/outllib.pdb not found
DBGHELP: outllib.pdb – file not found
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for OUTLLIB.DLL –
DBGHELP: OUTLLIB – export symbols
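The paths in that output follow the symbol server’s lookup convention, which makes them easy to rebuild by hand when testing whether a file is published. Below is a minimal Python sketch under a few assumptions: the image key is the PE header timestamp (8 hex digits) followed by SizeOfImage in hex, the PDB key is the 32-digit GUID followed by the age in hex, and the trailing “2” in the outllib.pdb path is read as the age. The function names image_key and pdb_key are hypothetical.

```python
SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"


def image_key(file_name, time_date_stamp, size_of_image):
    """Build <name>/<timestamp as 8 hex digits><SizeOfImage in hex>/<name>."""
    return "%s/%08X%x/%s" % (file_name, time_date_stamp, size_of_image, file_name)


def pdb_key(pdb_name, guid_hex, age):
    """Build <name>/<GUID as 32 hex digits><age in hex>/<name>."""
    return "%s/%s%X/%s" % (pdb_name, guid_hex.upper(), age, pdb_name)


if __name__ == "__main__":
    # Values taken from the .reload output above; the GUID/age split is an assumption.
    print(SYMBOL_SERVER + "/" + image_key("OUTLLIB.DLL", 0x4566283D, 0x749000))
    print(SYMBOL_SERVER + "/" + pdb_key("outllib.pdb", "0EAE667B6A73417A9D7DC2E4C8138223", 2))
```

With those inputs the two functions reproduce the two URLs that SYMSRV reported as not found, so the debugger is asking for the right files; they simply are not on the server.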

I wanted to double-check that it wasn’t just a lack of some specific version of OUTLLIB.DLL, so I browsed to http://msdl.microsoft.com/download/symbols/OUTLLIB.DLL/, and received a 404 error.  To make sure there wasn’t some subtle IIS configuration issue, I tested http://msdl.microsoft.com/download/symbols/KERNEL32.DLL (a known good library), which gave me a 403 (Forbidden) error.

That tells me that Microsoft still hasn’t published Office symbols to the Internet – even while they’re trying to push application developers to use Office as a “platform” on which to build enterprise-class applications (VSTO, VSTA).  That’s a really noticeable gap, at least to me.

In any case, this is as good a debug output as I’m able to get:

FOLLOWUP_IP:
OUTLLIB!OlkGetUIlangID+d434
301b7b82 ff506c          call    dword ptr [eax+6Ch]

SYMBOL_STACK_INDEX:  0

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: OUTLLIB

IMAGE_NAME:  OUTLLIB.DLL

DEBUG_FLR_IMAGE_TIMESTAMP:  4566283d

FAULTING_THREAD:  000014d0

DEFAULT_BUCKET_ID:  NULL_CLASS_PTR_DEREFERENCE

PRIMARY_PROBLEM_CLASS:  NULL_CLASS_PTR_DEREFERENCE

BUGCHECK_STR:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE

SYMBOL_NAME:  OUTLLIB!OlkGetUIlangID+d434

FAILURE_BUCKET_ID:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE_OUTLLIB!OlkGetUIlangID+d434

BUCKET_ID:  APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE_OUTLLIB!OlkGetUIlangID+d434

Unfortunately, neither Google Groups nor the wider web turns up anything helpful for understanding what “APPLICATION_FAULT_NULL_CLASS_PTR_DEREFERENCE” even means, let alone how to fix the problem it flags.
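One hedged reading of the output: the faulting instruction, call dword ptr [eax+6Ch], is an indirect call through a table of function pointers, which fits the bucket name if eax came from a null object pointer. Also, because the debugger fell back to export symbols, “OlkGetUIlangID” is only the nearest exported name below the faulting address, and an offset as large as 0xd434 suggests the crash is really in some unexported function further along. A small Python sketch for pulling those pieces apart follows; the function names and the 0x1000 threshold are arbitrary assumptions, not debugger rules.

```python
import re

# Matches strings such as "OUTLLIB!OlkGetUIlangID+0xd434" (the 0x prefix is optional).
SYMBOL_RE = re.compile(
    r"^(?P<module>[^!]+)!(?P<symbol>[^+]+)\+(?:0x)?(?P<offset>[0-9a-fA-F]+)$"
)


def parse_symbol(text):
    """Split 'MODULE!Symbol+offset' into (module, symbol, offset as an int)."""
    match = SYMBOL_RE.match(text.strip().rstrip(":"))
    if match is None:
        raise ValueError("not a module!symbol+offset string: %r" % text)
    return match.group("module"), match.group("symbol"), int(match.group("offset"), 16)


def probably_wrong_function(offset, threshold=0x1000):
    """Heuristic: a large offset from an export usually means a different function."""
    return offset >= threshold


if __name__ == "__main__":
    module, symbol, offset = parse_symbol("OUTLLIB!OlkGetUIlangID+0xd434:")
    print(module, symbol, hex(offset), probably_wrong_function(offset))
```

Here the offset works out to roughly 54 KB past the export, so the function name in the bucket ID is best treated as a landmark rather than the actual culprit.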

If I ever come up with a good answer to this, I’ll be sure to post it.  In the meantime, if anyone has any clues or hints for me on what I could do to narrow this down any further (e.g. any cool or powerful commands that I don’t know about – which probably includes anything beyond .symfix, .reload, k and !analyze -v), I’d love to hear them.

EFS Certificate Configuration Updater tool is released!

After weeks of battling with Visual Studio over some pretty gnarly code issues, I’ve released the first version of a tool that will make IT admins happy the world over (well, okay, only those few sorry IT admins who’ve struggled to make EFS predictable and recoverable for the past seven years).

EFS Certificate Configuration Updater is a .NET 2.0 application that will examine the digital certificates a user has enrolled and will make sure that the user is using a certificate that was issued by a Certificate Authority (CA).

“Yippee,” I hear from the peanut gallery. “So what?”

While this sounds pretty freakin lame to most of the planet’s inhabitants, for those folks who’ve struggled to make EFS work in a large organization, this should come as a great relief.

Here’s the problem: EFS is supposed to make it easy to migrate from one certificate to the next, so that if you start using EFS today but decide later to take advantage of a Certificate Server, then the certs you issue later will replace the ones that were first enrolled. [CIPHER /K specifically tried to implement this.]

Unfortunately, there are some persistent but subtle bugs in EFS that prevent the automatic migration from self-signed EFS certificates to what are termed “version 2” certificates. Why are “version 2” certificates so special? Well, they’re the “holy grail” of easy recovery for encrypted files – they allow an administrator to automatically and centrally archive the private key that is paired with the “version 2” certificate.

So: the EFS Certificate Configuration Updater provides a solution to this problem, by finding a version 2 EFS certificate that the user has enrolled and forcing it to be the active certificate for use by EFS. [Sounds pretty simple eh? Well, there are plenty of organizations out there that go to a lot of trouble to try to do it themselves.]
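To make the “finding” half of that concrete, here is a minimal Python sketch of the selection step only. It is not the tool’s actual code (the tool is a .NET 2.0 application); the Cert class, its fields, the from_v2_template flag and the “longest validity wins” tie-break are all hypothetical. The one real-world constant is the Encrypting File System enhanced key usage OID.

```python
from dataclasses import dataclass
from datetime import datetime

EFS_EKU_OID = "1.3.6.1.4.1.311.10.3.4"  # Encrypting File System enhanced key usage


@dataclass
class Cert:
    """Hypothetical, simplified view of a certificate in the user's store."""
    thumbprint: str
    subject: str
    issuer: str
    not_after: datetime
    eku_oids: tuple
    from_v2_template: bool  # assumption: caller already knows the template version


def is_self_signed(cert):
    """Treat a certificate whose issuer equals its subject as self-signed."""
    return cert.subject == cert.issuer


def pick_efs_certificate(certs, now):
    """Return a CA-issued, unexpired version 2 EFS certificate, or None."""
    candidates = [
        c for c in certs
        if EFS_EKU_OID in c.eku_oids
        and c.from_v2_template
        and not is_self_signed(c)
        and c.not_after > now
    ]
    # Assumed tie-break: prefer the certificate that stays valid the longest.
    return max(candidates, key=lambda c: c.not_after, default=None)


if __name__ == "__main__":
    store = [
        Cert("AA", "CN=user", "CN=user", datetime(2107, 1, 1), (EFS_EKU_OID,), False),
        Cert("BB", "CN=user", "CN=Corp CA", datetime(2008, 1, 1), (EFS_EKU_OID,), True),
    ]
    chosen = pick_efs_certificate(store, datetime(2007, 1, 1))
    print(chosen.thumbprint if chosen else "no suitable certificate")
```

The “forcing” half, which tells EFS to actually use the chosen certificate, is deliberately left out of the sketch.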

Even though this application fills a significant need, it doesn’t (at present, anyway) do everything that might be needed in all scenarios. The additional steps that you might need to cover include:

  • Enrolling a version 2 EFS certificate. [You can automate this with autoenrollment policy and the Windows Server 2003-based CA that is already in place for issuing v2 certificates and Key Archival.]
  • Updating EFS’d files to use the new certificate. [You can automate this by using CIPHER /U, but it’ll take a while if the user has a lot of encrypted files. The good news, however, is that the update only has to re-encrypt the FEK, not re-encrypt the entire file, so it’s much quicker than encrypting the same set of files from scratch.]
  • Ensuring that the user’s EFS certificate doesn’t expire before a new or renewed certificate is enrolled. [This is very easy to accomplish with Autoenrollment policy, but without the use of Autoenrollment, there is a significant risk that when the user’s preferred EFS certificate expires, the EFS component driver could enroll for a self-signed EFS certificate.]
  • Archiving unwanted EFS certificates. [This is different from deleting a digital certificate – which also invalidates the associated private key, and is NOT recommended. Archiving keeps the certificates in the user’s certificate store and preserves the private key, so that any files encrypted with that old certificate are still accessible. This is hard to do from UI or script, but is a feature I’m hoping to add to the EFS Certificate Configuration Updater in the near future. This is also optional – it just minimizes the chances of a pre-existing EFS certificate being used if the preferred certificate fails for some reason.]
  • Publishing the user’s current EFS certificate to Active Directory. [This is also optional. It is only necessary to make it possible — though still hardly scalable — to use EFS to encrypt files for access by multiple users (see MSDN for more information). This can be automated during Autoenrollment, but some organizations choose to disable publishing a 2nd or subsequent EFS certificate since the EFS component driver may get confused by multiple EFS certificates listed for a single user in Active Directory.]
  • Synchronizing the user’s EFS certificate and private key across all servers where encrypted files must be stored. [This is not needed if you’re merely ensuring that all sensitive data on the user’s notebook/laptop PC is encrypted, so that the loss or theft of that PC doesn’t lead to a data breach. However, if you must also enforce EFS encryption on one or more file servers, the EFS Certificate Configuration Updater will not help at all in this scenario.]

Try it out — Tell your friends (you have friends who’d actually *use* this beast? Man, your friends are almost as lame as mine – no offense) — Let me know what you think (but no flaming doo-doo on my front porch, please). And have a very crypto-friendly day. 😉